Posts

Showing posts from December 2023

The 6 steps to finally lose weight (or solve almost any other problem) (with R code)

OK, I’m sorry for the clickbait title. But I really do think a lot of the insights in this guide are trivial and very easy to understand yet surprisingly unknown to many people. In this guide we will discuss how to lose weight or solve (almost) any other kind ... Continue reading: The 6 steps to finally lose weight (or solve almost any other problem) (with R code) http://dlvr.it/T0pllM

Unlocking the Power of Time: Transforming Data Frames into Time Series in R

Introduction Hey there, fellow R enthusiasts! Today, we’re diving into the realm of time series, where data dances along the temporal dimension. To join this rhythmic analysis, we’ll first learn how to convert our trusty data frames into time se... Continue reading: Unlocking the Power of Time: Transforming Data Frames into Time Series in R http://dlvr.it/T0plfT

Python Rgonomics

Interoperability was a key theme in open-source data languages in 2023. Ongoing innovations in Arrow (a language-agnostic in-memory standard for data storage), growing adoption of Quarto (the language-agnostic heir apparent to R Markdown), and even p... Continue reading: Python Rgonomics http://dlvr.it/T0plYx

100 days of Python and R

Today, together with a friend who is looking to get into data analytics, I started doing the 100 days of Python challenge at Replit. I thought it would be a good idea to do the challenges in R, because why not. :) So, I am at day 2 (whoohooo I did two ... Continue reading: 100 days of Python and R http://dlvr.it/T0plTB

Reversion to the Mean: Unraveling a Pervasive Misconception in Business and Beyond

In the realm of business and leadership, one statistical phenomenon often goes unrecognized yet significantly influences our understanding of performance and success. This is the concept of reversion to the mean. This seemingly simple statistical occurrence can profoundly impact how we perceive management strategies, leadership effectiveness, and even the fate ... Continue reading: Reversion to the Mean: Unraveling a Pervasive Misconception in Business and Beyond http://dlvr.it/T0l6ZM

Usage shares of programming languages in economics research

My Shiny app Finding Economics Articles with Data now contains over 8000 economics articles with replication packages. You can use it here: https://ejd.econ.mathematik.uni-ulm.de Some of the data on articles and file types in the reproduction packages can be downloaded as a zipped SQLite database from my ... Continue reading: Usage shares of programming languages in economics research http://dlvr.it/T0kn4h

Unveiling the Time Traveler: Plotting Time Series in R

Introduction Ready to journey through time with R? Buckle up, because we’re about to explore the art of visualizing time-dependent data, known as time series analysis. Whether you’re tracking monthly sales patterns or analyzing yearly climate tr... Continue reading: Unveiling the Time Traveler: Plotting Time Series in R http://dlvr.it/T0j36S

Salt Lake City R User Group’s Success Story: Blending In-Person and Online Events

Last year, Julia Silge, co-organizer of the Salt Lake City R User Group, discussed the group’s plans to meld in-person and online activities with the R Consortium. This year, Andrew... The post Salt Lake City R User Group’s Success Story: Blending In-Person and Online Events appeared first on ... Continue reading: Salt Lake City R User Group’s Success Story: Blending In-Person and Online Events http://dlvr.it/T0gg4K

Old Art, New Code: The Typesetter’s Guide to Memoization

In the world of programming, where complexity often intertwines with the need for efficiency, there exists a practice as ancient as it is modern: memoization. This concept, akin to the meticulous art of typesetting in the days of yore, stands as a testa... Continue reading: Old Art, New Code: The Typesetter’s Guide to Memoization http://dlvr.it/T0gfxW

Creating Time Series in R with the ts() Function

Introduction Time series analysis is a powerful tool in the hands of a data scientist or analyst. It allows us to uncover patterns, trends, and insights hidden within temporal data. In this blog post, we’ll explore how to create a time series in... Continue reading: Creating Time Series in R with the ts() Function http://dlvr.it/T0gfnj

Characterization-based approach for construction of goodness-of-fit test for Lévy distribution

Introduction The Lévy distribution, together with the Normal and Cauchy distribution, belongs to the class of stable distributions, and it is among the only three distributions for which the density can be derived in a closed form. The density func... Continue reading: Characterization-based approach for construction of goodness-of-fit test for Lévy distribution http://dlvr.it/T0gfd9

Introduction to Time Series Analysis (with applications in R)

Hey guys, welcome back to my R-tips newsletter. Time series analysis has been critical in my career. But it took me 3 years to get comfortable. In today’s R-Tip, I’ll share 3 years of experience in time series in 3 minutes. Let’s go! Table of Contents... Continue reading: Introduction to Time Series Analysis (with applications in R) http://dlvr.it/T0dVkT

STC

Projects for STC: https://gitlab.com/stc_dev/replica7 Analytics module for Speech Recognition system Continue reading: STC http://dlvr.it/T0dVc5

qeML Example: Nonparametric Quantile Regression

In this post, I will first introduce the concept of quantile regression (QR), a powerful technique that is rarely taught in stat courses. I’ll give an example from the quantreg package, and then will show how qeML can be used to do model-free QR estimation. Along the way, I ... Continue reading: qeML Example: Nonparametric Quantile Regression http://dlvr.it/T0dVTv

Assessing relationships with correlograms

We often find ourselves with a complex dataset containing numerous variables. One of the initial steps in the discovery phase - the initial analysis where you get familiar with the data - is using correlations to understand the relationships between the variables. A good tool for getting a quick glimpse ... Continue reading: Assessing relationships with correlograms http://dlvr.it/T0dVKN

non-equi joins in data.table

A quick note to understand (non-equi) joins in data.table Continue reading: non-equi joins in data.table http://dlvr.it/T0dVBJ

Advent of 2023, Day 11 – Starting data science with Microsoft Fabric

In this Microsoft Fabric series: We have looked into creating the lakehouse, checked the delta lake and delta tables, got some data into the lakehouse, and created a custom environment and Spark job definition. And now we need to see, ... Continue reading: Advent of 2023, Day 11 – Starting data science with Microsoft Fabric http://dlvr.it/T01PFw

femR: Bridging Physics and Statistics in R with Support from the R Consortium

Laura M. Sangalli is a professor of Statistics at Politecnico di Milano, Italy. Her research interests include functional data analysis, high-dimensional and complex data, spatial data analysis, and biostatistics. With... The post femR: Bridging Physics and Statistics in R with Support from the R Consortium appeared first on R Consortium. Continue reading: femR: Bridging Physics and Statistics in R with Support from the R Consortium http://dlvr.it/T01Bxq

Reading notes on The Pragmatic Programmer by David Thomas and Andrew Hunt

In my quest to keep reading notes on the tech books I read, and while waiting for code to run, I recently re-read The Pragmatic Programmer by David Thomas and Andrew Hunt. That book, whose second edition was published in 2019, offers an overview of m... Continue reading: Reading notes on The Pragmatic Programmer by David Thomas and Andrew Hunt http://dlvr.it/T00HBk

Edge of Tomorrow: Preparing R Functions for the Unexpected

Anticipating the Unanticipated In the dynamic world of data science and programming, one of the most valuable skills is the ability to anticipate and handle unexpected scenarios. When working with data, the unexpected comes in various forms: unusual dat... Continue reading: Edge of Tomorrow: Preparing R Functions for the Unexpected http://dlvr.it/T00GtM

Advent of 2023, Day 10 – Creating Job Spark definition

In this Microsoft Fabric series: An Apache Spark job definition is a single computational action that is normally scheduled and triggered. In Microsoft Fabric (same as in Synapse), you could submit batch/streaming jobs to Spark clusters. By uploading a binary ... Continue reading: Advent of 2023, Day 10 – Creating Job Spark definition http://dlvr.it/SzyddT

The 10 most popular R books of 2023

The Big Book of R has a collection of almost 400 free R books and as we round out 2023 it’s the perfect time to look back at which have been the most popular. I track the stats and they’re openly accessible. Some of these also have print versions. Get … ... Continue reading: The 10 most popular R books of 2023 http://dlvr.it/SzydRM

Customizing slides and documents using Quarto extensions workshop

Join our workshop on Introduction to mixed frequency data models in R, which is a part of our workshops for Ukraine series! Here’s some more info: Title: Customizing slides and documents using Quarto extensions Date: Thursday, January 11th, 18:00 – 20:00 CET (Rome, Berlin, Paris timezone) Speaker: Nicola Rennie is a Lecturer ... Continue reading: Customizing slides and documents using Quarto extensions workshop http://dlvr.it/Szy9V1

Advent of Array Elegance (AoC2023 Day 7)

I’m solving Advent of Code this year using relaxed criteria compared to last year in that I’m allowing myself to use packages where they’re helpful, rather than strictly base R. Last year I re-solved half of the exercises using Rust which helped me l... Continue reading: Advent of Array Elegance (AoC2023 Day 7) http://dlvr.it/SzxGCR

Adaptive Asset Allocation Replication

The paper, “Adaptive Asset Allocation: A Primer” by Adam Butler, Mike Philbrick, Rodrigo Gordillo, and David Varadi addresses flaws in the traditional application of Modern Portfolio Theory related to Strategic Asset Allocation. It shows ... Continue reading: Adaptive Asset Allocation Replication http://dlvr.it/SzvWw0

Exploring TidyAML: Simplifying Regression Analysis in R

Introduction If you’re a data enthusiast diving into the world of regression analysis in R, you’ve likely encountered the challenges of managing code complexity and juggling different modeling engines. The good news is that there’s a powerful to... Continue reading: Exploring TidyAML: Simplifying Regression Analysis in R http://dlvr.it/SzvWpr

R Validation Hub Community Meeting – December Recap

After a brief hiatus, the R Validation Hub recently reconvened for its community meeting, celebrating a year of remarkable achievements and setting the stage for future endeavors. In 2023, the... The post R Validation Hub Community Meeting – December Recap appeared first on R Consortium. Continue reading: R Validation Hub Community Meeting – December Recap http://dlvr.it/SztrXw

Analysing Shiny App start-up Times with Google Lighthouse

This is part two of a three-part series on Lighthouse for Shiny Apps. Part 1: Using Google Lighthouse for Web Pages Part 2: Analysing Shiny App start-up Times with Google Lighthouse (This post) Part 3: Effect of Shiny Widgets with Google Lightho... Continue reading: Analysing Shiny App start-up Times with Google Lighthouse http://dlvr.it/Szry91

Unraveling Patterns: A Step-by-Step Guide to Piecewise Regression in R

Introduction Hey there, fellow R enthusiasts! Today, let’s embark on a fascinating journey into the realm of piecewise regression using R. If you’ve ever wondered how to uncover hidden trends and breakpoints in your data, you’re in for a treat. ... Continue reading: Unraveling Patterns: A Step-by-Step Guide to Piecewise Regression in R http://dlvr.it/Szry2X

The two phases of commits in a Git branch

I seem to have at last entered my Git era. 🎉 Reading and applying Git in practice was probably the best thing I did for my upskilling this year. One Git workflow aspect I’ve finally realized is that it’s fine to have two phases of work in a... Continue reading: The two phases of commits in a Git branch http://dlvr.it/SzrxwT

Posit x Databricks: A Game-Changing Synergy for Data Teams

On Thursday, December 5th, 2023, Posit and Databricks held a joint event, revealing several key developments in their ongoing collaboration. These updates, building on announcements from July 2023, showcase tangible progress in data analysis and cloud computing integration. James Blair, Product Manager of Cloud Integrations for Posit, alongside Rafi Kurlansik, Lead Product ... Continue reading: Posit x Databricks: A Game-Changing Synergy for Data Teams http://dlvr.it/Szrxmb

Dorling Cartograms

I was writing some examples for next semester’s dataviz class and shared one of them—a Dorling Cartogram—on the socials medias. Some people don’t like cartograms, some people do like cartograms; in conclusion, we live in a world... Continue reading: Dorling Cartograms http://dlvr.it/Szpcwy

Ph Profiles

why— layout: post title: That’s a (W)RAP! published: true date: 2023-12-06 image: path: /assets/img/blog/officer.png tags: rstats description: __ An ambition realised as a suite of R powered publications enter the public domain — Continue reading: Ph Profiles http://dlvr.it/SzpQh7

What is the probability that two persons have the same initials?

Introduction How likely is it? For our team For teams of different sizes Conclusion Introduction Last week, I joined a team to work on a collaborative project. The team was already established for a few months, with several scientists workin... Continue reading: What is the probability that two persons have the same initials? http://dlvr.it/SzpQYk

Shiny Express: Blending the Best of Shiny and Streamlit for Dashboard Development

Shiny has long stood as a pillar of functionality and flexibility for R-based web applications. Let’s explore Shiny Express — an emerging Python-centric framework in development that combines the ease of Streamlit with the robustness of Shiny. This article explores the nuances of Shiny Express and what differentiates it from ... Continue reading: Shiny Express: Blending the Best of Shiny and Streamlit for Dashboard Development http://dlvr.it/SznwXC

A Complete Guide to Stepwise Regression in R

Introduction Stepwise regression is a powerful technique used to build predictive models by iteratively adding or removing variables based on statistical criteria. In R, this can be achieved using functions like step() or manually with forward a... Continue reading: A Complete Guide to Stepwise Regression in R http://dlvr.it/SznwJx

PowerQuery Puzzle solved with R

# 135–136 Puzzles: PQ_135: content file PQ_136: content file PQ_135 Let’s imagine that we have binning machines in our sport centre. We can set how many balls need to be grouped in one bin/bucket/chest/whatever. We have 10 balls and the machine is placing them in ... Continue reading: PowerQuery Puzzle solved with R http://dlvr.it/SzlP21

Advent of 2023, Day 4 – Delta lake and delta tables in Microsoft Fabric

In this Microsoft Fabric series: Yesterday we looked into lakehouse and learned that Delta tables are the storing format. So, let’s explore what and how we can go around understanding and working with delta tables. But first we must understand ... Continue reading: Advent of 2023, Day 4 – Delta lake and delta tables in Microsoft Fabric http://dlvr.it/SzlNsT

{checkhelper} is on CRAN: so you don’t have to be afraid to run a check

You can read the original post in its original format on Rtask website by ThinkR here: {checkhelper} is on CRAN: so you don’t have to be afraid to run a check You’ve put together a great package that you’re proud of and you’d like to share ... Continue reading: {checkhelper} is on CRAN: so you don’t have to be afraid to run a check http://dlvr.it/Szjsql

Chi-square distribution and test in R

Greetings, humanists, social and data scientists! Was there an association or relationship between gender and the verdicts in investigations in 18th-century London? If an inquest concerned a man, did this fact influence the final verdict of the in... Continue reading: Chi-square distribution and test in R http://dlvr.it/SzjsWh

Re-Release: `traktok`

I’m happy to announce that traktok, my package to get content from TikTok, has returned from the dead. That’s slightly exaggerated, because it actually always worked in some shape or form, but up until about September, the most recent state on Githu... Continue reading: Re-Release: `traktok` http://dlvr.it/SzjsB5

Are Birth Dates Still Destiny for Canadian NHL Players?

In the first chapter of Malcolm Gladwell’s Outliers he discusses how in Canadian Junior Hockey there is a higher likelihood for players to be born in the first quarter of the year. In his words: Because these kids are older within their year they make all the important teams at ... Continue reading: Are Birth Dates Still Destiny for Canadian NHL Players? http://dlvr.it/Szht6P

A Comparison of Several qeML Predictive Methods

Is machine learning overrated, with traditional methods being underrated these days? Yes, ML has had some celebrated successes, but these have come after huge amounts of effort, and it’s possible that similar effort with traditional methods may have produced similar results. A related issue concerns the type of data. ... Continue reading: A Comparison of Several qeML Predictive Methods http://dlvr.it/Szhswh

Trying out timeplyr

The timeplyr R package, created by my colleague Nick, was accepted on CRAN in October 2023. A direct quote from the CRAN page is that it provides a set of fast tidy functions for wrangling, completing and summarising date and date-time data. It look... Continue reading: Trying out timeplyr http://dlvr.it/Szg4DG

Finding the most unique land cover spatial pattern

Spatial signatures represent spatial patterns of land cover in a given area. Thus, they can be used to search for areas with similar spatial patterns to a query region or to quantify changes in spatial patterns. The approaches above are implement... Continue reading: Finding the most unique land cover spatial pattern http://dlvr.it/SzdYmp

gssr Update

The General Social Survey, or GSS, is one of the cornerstones of US public opinion research and one of the most-analyzed datasets in Sociology. My colleague Steve Vaisey aptly describes it as the Hubble Space Telescope of American social science. It is... Continue reading: gssr Update http://dlvr.it/SzdYZS

A R graphic in a Yesod app

Yesod is a web framework for Haskell. In this post I show how to build a Yesod application that lets you upload data from a CSV or XLSX file and displays an R graphic representing two selected columns of the ... Continue reading: A R graphic in a Yesod app http://dlvr.it/SzdYLF

Get a Git repo where your team can stow their throwaway data science code!

When I started working as a Data Scientist nearly ten years ago, the data science team I joined did something I found really strange at first: They had a single GitHub repo where they put all their “throwaway” code. An R script to produce s... Continue reading: Get a Git repo where your team can stow their throwaway data science code! http://dlvr.it/SzdY3Y

tidyAML: Now supporting gee models

Introduction I am happy to announce that a new version of tidyAML is now available on CRAN. This version includes support for gee models. This is a big step forward for tidyAML as it now supports a wide variety of regression and classification m... Continue reading: tidyAML: Now supporting gee models http://dlvr.it/SzbXpt

Back-transformations with emmeans()

I am one of those old guys who still uses the stabilising transformations, when the data do not conform to the basic assumptions for ANOVA. Indeed, apart from counts and proportions, where GLMs can be very useful, I have not yet found a simple way t... Continue reading: Back-transformations with emmeans() http://dlvr.it/SzbXhQ

Geocomputation with R competition: book cover for the 2nd edition

Introduction The 2nd edition of Geocomputation with R is due to be published in 2024. Now, we’re looking for a new cover image and we’d like your help. The competition is open to all and we have some prizes (see below). We’re launching this map c... Continue reading: Geocomputation with R competition: book cover for the 2nd edition http://dlvr.it/SzZ7zf