Posts

Showing posts from September 2024

Exploding, Impacting: looking at bioRxiv preprint view dynamics with R

One of the joys of posting a preprint is seeing that people are viewing, downloading and (hopefully) reading your paper. On bioRxiv you can check out the statistics for your paper in the metrics tab. We posted a preprint recently and it clocked up over 1,000 views in the first day ... Continue reading: Exploding, Impacting: looking at bioRxiv preprint view dynamics with R http://dlvr.it/TDdnpq

How to Remove Outliers from Multiple Columns in R: A Comprehensive Guide

Introduction Outliers can significantly skew your data analysis results, leading to inaccurate conclusions. For R programmers, effectively identifying and removing outliers is crucial for maintaining data integrity. This guide will walk you thr... Continue reading: How to Remove Outliers from Multiple Columns in R: A Comprehensive Guide http://dlvr.it/TDdngF

Cover and modify, some tips for R package development

I’ve recently been dealing with legacy code refactoring both in theory and in practice: while I’m continuing some work on the igraph R package, I’ve started reading Working Effectively with Legacy Code by Michael Feathers, that had be... Continue reading: Cover and modify, some tips for R package development http://dlvr.it/TDdnVf

New R Package: Data Science Looks at Discrimination (dsld)

I’m very pleased to announce a new package, dsld, available on CRAN. This is the work of eight talented undergrad students. I provided the concept and some general guidance, but this is their work. The package is aimed at dealing with discrimination — race, gender, age — in the workplace, education, ... Continue reading: New R Package: Data Science Looks at Discrimination (dsld) http://dlvr.it/TDd6tY

Unlocking Chemical Volatility: How the volcalc R Package is Streamlining Scientific Research

The R Consortium recently interviewed Kristina Riemer, director of the CCT Data Science Team at the University of Arizona, and Eric Scott, Scientific Programmer and Educator in the CCT Data... The post Unlocking Chemical Volatility: How the volcalc R Package is Streamlining Scientific Research appeared first on R Consortium. Continue reading: Unlocking Chemical Volatility: How the volcalc R Package is Streamlining Scientific Research http://dlvr.it/TDbx3f

R Solution for Excel Puzzles

Puzzles no. 544–548. Puzzles. Author: ExcelBI. All files (xlsx with puzzle and R with solution) for each and every puzzle are available on my Github. Enjoy. Puzzle #544: Wake up Cinderellas, we have some cleaning to do. To be exact we need to pick up numbers fr... Continue reading: R Solution for Excel Puzzles http://dlvr.it/TDbX1Y

Dual licensing R packages with code and data

Licenses are an important topic within open source. Without licenses, information or code can be publicly available but not legally available for reuse or redistribution. The open source software community’s most common licenses are the MIT lice... Continue reading: Dual licensing R packages with code and data http://dlvr.it/TDbWqD

How to Switch Two Columns in R: A Beginner’s Guide

Introduction Welcome to the world of R programming, where data manipulation is a crucial skill. One common task you may encounter is the need to switch two columns in a data frame. Understanding how to efficiently rearrange data can significant... Continue reading: How to Switch Two Columns in R: A Beginner’s Guide http://dlvr.it/TDbDcx

Modeling loss aversion with extended-support beta regression

The recently-proposed extended-support beta regression model in R package betareg is illustrated by simultaneously modeling the occurrence and extent of loss aversion in a behavioral economics experiment. Motivation To illustra... Continue reading: Modeling loss aversion with extended-support beta regression http://dlvr.it/TDZtHM

Introducing gt_summarytools: Analyze Your Data Faster With R

Hey guys, welcome back to my R-tips newsletter. In today’s fast-paced data science environment, speeding up exploratory data analysis (EDA) is more critical than ever. This is where gt_summarytools() comes in. A new function I’ve developed, gt_summaryt... Continue reading: Introducing gt_summarytools: Analyze Your Data Faster With R http://dlvr.it/TDYNVm

Introducing OptimalTransportNetworks.jl: Optimal Transport Networks in Spatial Equilibrium

I’m happy to announce the release of OptimalTransportNetworks.jl, a modern Julia translation of the MATLAB OptimalTransportNetworkToolbox implementing the quantitative spatial model and algorithms described in Fajgelbaum, P. D., & Schaal, E. (2... Continue reading: Introducing OptimalTransportNetworks.jl: Optimal Transport Networks in Spatial Equilibrium http://dlvr.it/TDYBGs

A Bayesian Plackett-Luce model in Stan applied to pinball championship data

Sometimes it feels a bit silly when a simple statistical model has a fancy-sounding name. But it also feels good to drop the following in casual conversation: “Ah, then I recommend a Plackett-Luce model, a straightforward generalization of the Bradley–Terry model, you know”, when a friend wonders how they ... Continue reading: A Bayesian Plackett-Luce model in Stan applied to pinball championship data http://dlvr.it/TDXxy9

5 Books added to Big Book of R

It’s been about 3 months since the last update and I’m looking forward to getting back to more regular additions to the collection which now stands at over 400 free, open-source R books! I have a small backlog to get through, but as always if you wish to submit a … ... Continue reading: 5 Books added to Big Book of R http://dlvr.it/TDWRD3

Prime numbers as sums of three squares. by @ellis2013nz

I was interested by a LinkedIn post about the number 397: “397 is conjectured to be the largest prime that can be represented uniquely as the sum of three positive squares”. That is, 3^2 + 8^2 + 18^2 = 397. This led to some confusion in the commen... Continue reading: Prime numbers as sums of three squares. by @ellis2013nz http://dlvr.it/TDWR55

How to Import Data into R | Load Data file in R Programming

Key points: R provides multiple methods to import data files, making it a versatile tool for data analysis. Efficient CSV Import Methods: Different functions like read.csv, read_csv, and fread cater to different dataset sizes and performance n... Continue reading: How to Import Data into R | Load Data file in R Programming http://dlvr.it/TDVCXT

Mastering Linux Commands: ls, file, and less for Beginners

Introduction Thank you for joining me today as we explore the fundamental Linux commands ls, file, and less. These commands are essential for navigating and managing files in a Linux environment. If you are new to Linux like me or looking to de... Continue reading: Mastering Linux Commands: ls, file, and less for Beginners http://dlvr.it/TDTdPX

Mastering Data Transformation in R with pivot_longer and pivot_wider

Artwork by: Shannon Pileggi and Allison Horst. Introduction: Data analysis requires a deep understanding of how to structure data effectively. Often, datasets are not in the format most suitable for analysis or visualization. That’s wher... Continue reading: Mastering Data Transformation in R with pivot_longer and pivot_wider http://dlvr.it/TDTd9D

Keep It Simple: Extracting Value from the Noise of Data Overload

Disclaimer: While my work in this series draws inspiration from the IBCS® standards, I am not a certified IBCS® analyst or consultant. The visualizations and interpretations presented here are my personal attempts to apply these principles and may not f... Continue reading: Keep It Simple: Extracting Value from the Noise of Data Overload http://dlvr.it/TDRjBL

How to Use cat() in R to Print Multiple Variables on the Same Line

Introduction Printing multiple variables on the same line is a fundamental skill for R programmers. This guide will introduce you to the cat() function, a powerful tool for efficient and flexible output in R. Introduction to cat() The cat() f... Continue reading: How to Use cat() in R to Print Multiple Variables on the Same Line http://dlvr.it/TDRhx6

Mastering printf() in C: A Beginner’s Guide

Introduction to printf() in C In the world of C programming, understanding how to effectively use printf() is crucial for any beginner. As one of the most widely used functions, it plays a pivotal role in outputting formatted text to the consol... Continue reading: Mastering printf() in C: A Beginner’s Guide http://dlvr.it/TDP3xj

PowerQuery Puzzle solved with R

#217–218. Puzzles. Author: ExcelBI. All files (xlsx with puzzle and R with solution) for each and every puzzle are available on my Github. Enjoy. Puzzle #217: Turn around, transpose, pivot: it all means almost the same. Like table asked us to spin itself around.... Continue reading: PowerQuery Puzzle solved with R http://dlvr.it/TDLlM9

Rhino 1.10.0 Update: Automated Styling & Auto-complete for box Modules

In Rhino 1.7.0, we began introducing the Rhino style guide and linting for box::use() calls to promote best practices for code quality. These linter functions were eventually separated from Rhino 1.8.0 into the {box.linters} package. Among the checks performed by {box.linters} are rules covering box::use() calls. For existing ... Continue reading: Rhino 1.10.0 Update: Automated Styling & Auto-complete for box Modules http://dlvr.it/TDLMGj

Introducing Shiny Assistant – You Can Now Build Shiny Applications with GPT and GenerativeAI

It finally happened. It’s been almost two years since the first release of ChatGPT. It took the world by storm, to put it mildly. In the short period that followed, we now have multiple companies building Generative AI platforms that were unimaginable before late 2022. What a great time to ... Continue reading: Introducing Shiny Assistant – You Can Now Build Shiny Applications with GPT and GenerativeAI http://dlvr.it/TDKQXY

Unveiling ‘RandomWalker’: Your Gateway to Tidyverse-Compatible Random Walks

Introduction Welcome to the world of ‘RandomWalker’, an innovative R package designed to simplify the creation of various types of random walks. Developed by myself and my co-author, Antti Rask, this package is in its experimental phase but pro... Continue reading: Unveiling ‘RandomWalker’: Your Gateway to Tidyverse-Compatible Random Walks http://dlvr.it/TDKQPN

Parallel for loops (Map or Reduce) + New versions of nnetsauce and ahead

Parallel for loops (Map or Reduce) using R package misc + New versions of nnetsauce and ahead Continue reading: Parallel for loops (Map or Reduce) + New versions of nnetsauce and ahead http://dlvr.it/TDKQFY

R Solution for Excel Puzzles

Puzzles no. 539–543. Puzzles. Author: ExcelBI. All files (xlsx with puzzle and R with solution) for each and every puzzle are available on my Github. Enjoy. Puzzle #539: Many companies have to diversify their source of resources, to optimize prices etc. So they usually have their orders somehow mixed between vendors. In this case ... Continue reading: R Solution for Excel Puzzles http://dlvr.it/TDJ6H8

Extended-support beta regression for [0, 1] responses

New arXiv working paper introducing extended-support beta regression models which can capture probabilities for boundary observations at 0 and/or 1. It is available in the latest R package betareg, also accompanied by a new altdoc web page. ... Continue reading: Extended-support beta regression for [0, 1] responses http://dlvr.it/TDHRfk

How to Analyze Your Data Faster With R Using summarytools

Hey guys, welcome back to my R-tips newsletter. Getting quick insights into your data is absolutely critical to data understanding, predictive modeling, and production. But it can be challenging if you’re just getting started. Today, I’m going to show ... Continue reading: How to Analyze Your Data Faster With R Using summarytools http://dlvr.it/TDGjcH

forcats::fct_lump_n() with weights “overall”

Sometimes I want to summarize some categories which don’t have much impact on my analysis. So the best way to do this is using some of the forcats::fct_lump*() functions. But I often struggle to find the way using the weights to order the categor... Continue reading: forcats::fct_lump_n() with weights “overall” http://dlvr.it/TDDyH2

Stepwise selection of variables in regression is Evil. by @ellis2013nz

I’ve recently noticed that stepwise regression is still fairly popular, despite being well and truly frowned upon by well-informed statisticians. By stepwise regression, I mean any modelling strategy that involves adding or subtracting variables from a... Continue reading: Stepwise selection of variables in regression is Evil. by @ellis2013nz http://dlvr.it/TDDF1B

How to Use the duplicated Function in Base R with Examples

Introduction In data analysis, one of the common tasks is identifying and handling duplicate entries in datasets. Duplicates can arise from various stages of data collection and processing, and failing to address them can lead to skewed results... Continue reading: How to Use the duplicated Function in Base R with Examples http://dlvr.it/TDCDQg

Empowering Data Science: How R is Transforming Research in Cameroon

NyAvo RATOVO-ANDRIANARISOA, the co-founder of the R Community Cameroon, recently spoke with the R Consortium about the rapid growth of the R community in Cameroon and the impact of R... The post Empowering Data Science: How R is Transforming Research in Cameroon appeared first on R Consortium. Continue reading: Empowering Data Science: How R is Transforming Research in Cameroon http://dlvr.it/TDCD9D

Labels For Technical Writing Projects

Over the past thirty years I have written five technical books, co-written three others, and edited a further six. Since 2007 they have all lived in GitHub repositories, as did the first versions of the Software Carpentry lessons that I helped to writ... Continue reading: Labels For Technical Writing Projects http://dlvr.it/TDCCwy

Express to Impress: Leveraging IBCS Standards for Powerful Data Presentations

Attention: The article looks long by word count, but remember that it contains pretty long chunks of code. Disclaimer: While my work in this series draws inspiration from the IBCS® standards, I am not a certified IBCS® analyst or consultant. The visualizations and interpretations presented here are my personal attempts to apply ... Continue reading: Express to Impress: Leveraging IBCS Standards for Powerful Data Presentations http://dlvr.it/TD8zsw

How to Print Tables in R with Examples Using table()

Introduction Tables are an essential part of data analysis, serving as a powerful tool to summarize and interpret data. In R, the table() function is a versatile tool for creating frequency and contingency tables. This guide will walk you throu... Continue reading: How to Print Tables in R with Examples Using table() http://dlvr.it/TD8f20

How to Use lapply() Function with Multiple Arguments in R

Introduction R is a powerful programming language primarily used for statistical computing and data analysis. Among its many features, the lapply() function stands out as a versatile tool for simplifying code and reducing redundancy. Whether yo... Continue reading: How to Use lapply() Function with Multiple Arguments in R http://dlvr.it/TD8dgz

Probabilistic Network Inference and Analysis in R and Python workshop

Join our workshop on Probabilistic Network Inference and Analysis in R and Python, which is a part of our workshops for Ukraine series!  Here’s some more info:  Title: Probabilistic Network Inference and Analysis in R and Python Date: Thursday, October 3rd, 18:00 – 20:00 CEST (Rome, Berlin, Paris timezone)  Speaker: Guillermo de ... Continue reading: Probabilistic Network Inference and Analysis in R and Python workshop http://dlvr.it/TD5zmB

Create a free Llama 3.1 405B-powered chatbot on any GitHub repo in 1 minute (cross-posted from Paired Ends)

This blog has moved. This is reposted from Paired Ends: https://blog.stephenturner.us/p/create-a-free-llama-405b-llm-chatbot-github-repo-huggingface Llama 3.1 405B is the first open-source LLM on par with frontier models GPT-4o and Claude 3.5 Sonnet... Continue reading: Create a free Llama 3.1 405B-powered chatbot on any GitHub repo in 1 minute (cross-posted from Paired Ends) http://dlvr.it/TD5zNb

Introduction to Interpretable Machine Learning in R

Join our workshop on Introduction to Interpretable Machine Learning in R, which is a part of our workshops for Ukraine series!  Here’s some more info:  Title: Introduction to Interpretable Machine Learning in R Date: Thursday, October 10th, 18:00 – 20:00 CEST (Rome, Berlin, Paris timezone)  Speaker: Andreas Hofheinz, Andreas is a Data ... Continue reading: Introduction to Interpretable Machine Learning in R http://dlvr.it/TD5JGM

Please Version Data

Introduction An important goal of our Win Vector LLC teaching offerings is to instill in engineers some familiarity with, and empathy for, how data is likely to be used for analytics and business. Having such engineers in your organization greatly increases the quality of the data later available to your […] Continue reading: Please Version Data http://dlvr.it/TD2Xh2

R Solution for Excel Puzzles

Puzzles no. 534–538. Puzzles. Author: ExcelBI. All files (xlsx with puzzle and R with solution) for each and every puzzle are available on my Github. Enjoy. Puzzle #534: Palindromes and other symmetric numbers are a pretty common theme of our challenges, but this t... Continue reading: R Solution for Excel Puzzles http://dlvr.it/TD2XXf

How to Use grep() and Return Only Substring in R: A Comprehensive Guide

صورة
Introduction When working with text data in R, you often need to search for specific patterns or extract substrings from larger strings. The grep() function is a powerful tool for pattern matching, but it doesn’t directly return only the matche... Continue reading: How to Use grep() and Return Only Substring in R: A Comprehensive Guide http://dlvr.it/TD2XQ7

New versions of nnetsauce and ahead

New versions of nnetsauce and ahead Continue reading: New versions of nnetsauce and ahead http://dlvr.it/TD2XG4

Exploratory Data Analysis: Economic Performance of China

China’s GDP growth rate for the second quarter was lower than both expectations and the previous quarter. In addition, the performance of the China Fund has been significantly worse over the past year. Is China heading towards a recession? Source code: Continue reading: Exploratory Data Analysis: Economic Performance of China http://dlvr.it/TD1MFT

Mind reader game, and Unicode symbols

Mind reader game, and Unicode symbols, by Jerry Tuttle. Perhaps you've seen this Mind Reader game? Think of a two-digit positive whole number, such as 54. Subtract each of the two digits from your number, such as 54 - 5 - 4 = 45, and call 45 the RES... Continue reading: Mind reader game, and Unicode symbols http://dlvr.it/TD055b

Gender and sexuality in Australian surveys and census by @ellis2013nz

The 2021 ABS Standard for ‘Sex, Gender, Variations of Sex Characteristics and Sexual Orientation Variables’. Over the past two weeks there has been quite a controversy relating to questions about sexuality and gender in the next Australian Census of Po... Continue reading: Gender and sexuality in Australian surveys and census by @ellis2013nz http://dlvr.it/TCyzDR

JSON, NULL values and as_tibble

When working with data provided by common APIs you will almost always come into contact with JSON-formatted data. Using R’s rjson::fromJSON will transform JSON to R’s lists. So far so good. Converting those lists to a tibble using tibble::as_tib... Continue reading: JSON, NULL values and as_tibble http://dlvr.it/TCyVz7

📦 {alone} v0.5 is now available

Alone Season 11 has finished and is now available in the package and ready for analysis. As per usual install […] The post 📦 {alone} v0.5 is now available appeared first on Dan Oehm | Gradient Descending. Continue reading: 📦 {alone} v0.5 is now available http://dlvr.it/TCx6tC

Dr Drang and the Electoral College

The other week, the Internet’s most beloved creepy snowman wrote a blog post where he showed how to use a little Python to group states by their number of electoral college votes to make a table like this:

Electors  States                      PopPct  ECPct
3         AK, DE, DC, ND, SD, VT, WY  1.61%   3.90%
4 ...

Continue reading: Dr Drang and the Electoral College http://dlvr.it/TCx6mM

R-Ladies Bariloche in Argentina: Fostering a Different Approach to Leadership

Lina Moreno, founder and organizer of the R-Ladies Bariloche chapter in Argentina, recently shared her journey with the R Consortium. A biologist focusing on evolutionary ecology, she discussed her experience... The post R-Ladies Bariloche in Argentina: Fostering a Different Approach to Leadership appeared first on R Consortium. Continue reading: R-Ladies Bariloche in Argentina: Fostering a Different Approach to Leadership http://dlvr.it/TCwXn6

Navigating Linux with ‘pwd’, ‘cd’, and ‘ls’: A Beginner’s Guide

Introduction I have mentioned in my previous Linux post that I am on my own personal journey to learn it. I have been using it for some time but not really understanding the commands. So I have started this blog post series on Linux for Friday’s... Continue reading: Navigating Linux with ‘pwd’, ‘cd’, and ‘ls’: A Beginner’s Guide http://dlvr.it/TCvsM1

Guarding Against Misleading Data

The IBCS ‘Check’ Principle. In the complex world of business intelligence (BI), the ability to present data accurately and transparently is critical. Whether crafting a dashboard for executive decision-making or generating a report for operational analys... Continue reading: Guarding Against Misleading Data http://dlvr.it/TCs0fD

Harness the Full Potential of Case-Insensitive Searches with grep() in R

Introduction to grep() in R The grep() function in R is a powerful tool for searching and matching patterns within text data. It is commonly used in data cleaning, manipulation, and text analysis to find specific patterns or values in strings o... Continue reading: Harness the Full Potential of Case-Insensitive Searches with grep() in R http://dlvr.it/TCr4C1

R-Change Number of Bins in Histogram

The post R-Change Number of Bins in Histogram appeared first on Data Science Tutorials. When you draw a histogram in R, the default number of bins is determined by Sturges’ Rule. However, you can override this ... Continue reading: R-Change Number of Bins in Histogram http://dlvr.it/TCqfG0

How to Specify Histogram Breaks in R

The post How to Specify Histogram Breaks in R appeared first on Data Science Tutorials. When you draw a histogram in R, you may want to specify the number of breaks or bins to use. ... Continue reading: How to Specify Histogram Breaks in R http://dlvr.it/TCqf4m

PowerQuery Puzzle solved with R

#213–214. Puzzles. Author: ExcelBI. All files (xlsx with puzzle and R with solution) for each and every puzzle are available on my Github. Enjoy. Puzzle #213: Two warehouses gave us some summary, but products there are sometimes in packages, sometimes in batche... Continue reading: PowerQuery Puzzle solved with R http://dlvr.it/TCqdty

Boost your shiny app with sparkling data visualizations: a deep dive into Chart.js JavaScript library

You can read the original post in its original format on Rtask website by ThinkR here: Boost your shiny app with sparkling data visualizations: a deep dive into Chart.js JavaScript library Let’s continue our exploration of integrating JavaScript code into a {shiny} application! We will show how to ... Continue reading: Boost your shiny app with sparkling data visualizations: a deep dive into Chart.js JavaScript library http://dlvr.it/TCmHW9

Our Experience at posit::conf 2024

posit::conf 2024 was nothing short of amazing! While Posit has already shared their top highlights, we wanted to offer our own take on the experience—what really stood out to us, what we’re excited about, and a few answers to questions that came up during Marcin Dubel’s session. ... Continue reading: Our Experience at posit::conf 2024 http://dlvr.it/TCmGvJ

Mastering the grep() Function in R: Using OR Logic

Introduction For R programmers, mastering the built-in functions is key to efficient data manipulation. One such powerful tool is the grep() function, which is commonly used for pattern matching within character vectors. While many are familiar... Continue reading: Mastering the grep() Function in R: Using OR Logic http://dlvr.it/TCmGCb

Five ways to improve your chart axes

When it comes to crafting visualisations, people often put a lot of thought into what type of plot they’re going to make and what colour scheme they’re going to use. One thing that sometimes gets less attention than it should is the choice o... Continue reading: Five ways to improve your chart axes http://dlvr.it/TCmFMz

Stable Diffusion 3 in R? Why not? Thanks to {reticulate} 🙏❤️🙌

‘Fascinating’ describes my journey with Stable Diffusion 3. It’s deepened my appreciation for original art and masterpieces. Understanding how to generate quality art is just the beginning—it drives me to explore the underlying struc... Continue reading: Stable Diffusion 3 in R? Why not? Thanks to {reticulate} 🙏❤️🙌 http://dlvr.it/TCl0Z4

Sampling without replacement with unequal probabilities by @ellis2013nz

Not proportional to w. A week ago I was surprised to read on Thomas Lumley’s Biased and Inefficient blog that when using R’s sample() function without replacement and with unequal probabilities of individual units being sampled: “What R currently h... Continue reading: Sampling without replacement with unequal probabilities by @ellis2013nz http://dlvr.it/TCjB9N

How to Use grep() for Exact Matching in Base R: A Comprehensive Guide

Understanding grep() in R The grep() function is a powerful tool in base R for pattern matching and searching within strings. It’s part of R’s base package, making it readily available without additional installations. grep() is versatile, but ... Continue reading: How to Use grep() for Exact Matching in Base R: A Comprehensive Guide http://dlvr.it/TCdKZb

Reproducible data science with Nix, part 12 — Nix as a polyglot build automation tool for data science

Nix is not only a package manager, but also a build automation tool, and you can use it to build polyglot data science pipelines in a completely reproducible way. For example, suppose that you need to mix Python, R and maybe some other tools for a project (by the way, ... Continue reading: Reproducible data science with Nix, part 12 — Nix as a polyglot build automation tool for data science http://dlvr.it/TCdBGf