Posts

Showing posts from April 2024

tabulapdf: Extract Tables from PDF Documents

Motivation: I had to extract multiple tables from PDF files and do some data analysis in R. I found that updating tabulizer (now retired from CRAN) to use a Java version newer than Java 8 (deprecated) was worth it to complete this task. tabulapdf ... Continue reading: tabulapdf: Extract Tables from PDF Documents http://dlvr.it/T6DgjH

Sketchy waffle charts in R

Waffle charts are a common way to visualise counts or percentages of categorical data. There are already several excellent ways of creating waffle charts in R - including approaches using {ggplot2} or {waffle}. This blog post uses neither of those. Instead, it describes a somewhat back-to-basics approach of simply drawing ... Continue reading: Sketchy waffle charts in R http://dlvr.it/T6B3WY

Guest Post: Introducing the polyglotr package

Announcing the polyglotr package. The package polyglotr is a tool for language translation within the R programming environment. This package stands out for its ability to integrate with a variety of free translation services, making... Continue reading: Guest Post: Introducing the polyglotr package http://dlvr.it/T69gTk

Evenly Spaced Month Charts

I recently noticed that ggplot2 spaces date axes literally even when grouped by month. I’ve been using ggplot2 extensively for years and I don’t remember noticing before, so this is not really a big deal, but now that I know it bugs me a lot. Take a... Continue reading: Evenly Spaced Month Charts http://dlvr.it/T68JFS

Office365 AddIns for R (Part III)

A while back, I introduced the ExcelRAddIn (Office365 AddIns for R (Part I): https://adam-gladstone.github.io/r-project/Office365AddIns-for-R-part-I/). This is an Office365 AddIn that allows you to evaluate an R script from within Excel and use the results. This blog post describes some of the recent updates to the ExcelRAddIn. Continue reading: Office365 AddIns for R (Part III) http://dlvr.it/T67k16

Backtesting

The key to successful backtesting is to ensure that you only use the data that were available at the time of the prediction. No “future” data can be included in the model training set, otherwise the model will suffer from look-ahead bias (having unrealistic access to future data). Continue reading: Backtesting http://dlvr.it/T67jmS

{emayili} Support for Mailtrap

The {emayili} package has adapters which make it simple to send email via a variety of services. For example, it caters specifically for ZeptoMail, MailerSend, Mailfence and Sendinblue. The latest version of {emayili}, 0.8.0, published on 23 April 2024, adds an adapter for Mailtrap. Continue reading: {emayili} Support for Mailtrap http://dlvr.it/T64PcZ

Optimal policy learning based on causal machine learning in R workshop

Join our workshop on Optimal policy learning based on causal machine learning in R, which is part of our workshops for Ukraine series! Here’s some more info: Title: Optimal policy learning based on causal machine learning in R. Date: Thursday, May 16th, 18:00 – 20:00 CEST (Rome, Berlin, Paris timezone). Speaker: ... Continue reading: Optimal policy learning based on causal machine learning in R workshop http://dlvr.it/T5X8yr

How to Make Mobile Apps with R Shiny

Hey guys, welcome back to my R-tips newsletter. It’s no secret that modern businesses run on mobile apps. Today, I’m going to share how to turn your R Shiny web apps into Mobile-First business applications. Table of Contents Here’s what you’re learni... Continue reading: How to Make Mobile Apps with R Shiny http://dlvr.it/T5VLlY

Eclipse map

Image: 2017 Solar Eclipse with Totality - Composite (CC-BY-NC, Jeff Geerling). I saw this post showing a map of the year of the most recent total eclipse, and people mentioning that we can find the data on the Five Millennium Canon of Solar Eclipses Database (the data also mentioned in the ... Continue reading: Eclipse map http://dlvr.it/T5RJ8q

Taking the data out of the glue with regex in R

Introduction: Regular expressions, or regex, are incredibly powerful tools for pattern matching and extracting specific information from text data. Today, we’ll explore how to harness the might of regex in R with a practical example. Let’s dive i... Continue reading: Taking the data out of the glue with regex in R http://dlvr.it/T5RJ42

Announcing vvdoctor Alpha Release

Announcing the Launch of vvdoctor: A Shiny App for Effortless Data Analysis. We are excited to announce the launch of vvdoctor, a new R Shiny app (and package) that provides a user-friendly interface for statistical testing from dat... Continue reading: Announcing vvdoctor Alpha Release http://dlvr.it/T5RHv1

R-Universe Documentation Gets a Boost from Google Season of Docs

We are excited to announce that R-Universe has been awarded a Google Season of Docs. R-Universe is rOpenSci’s platform for testing, building, distributing, and discovering R packages, led by Jeroen Ooms. It provides no-setup continuous integrati... Continue reading: R-Universe Documentation Gets a Boost from Google Season of Docs http://dlvr.it/T5QPfD

Leverage Effect

The models we have been looking at do not differentiate between positive and negative residuals: both errors are treated the same. However, this does not align with reality, where the volatility resulting from a large negative return is higher than that for the corresponding positive return. Continue reading: Leverage Effect http://dlvr.it/T5Q1Dq

Crafting Elegant Scientific Documents in RStudio: A LaTeX and R Markdown Tutorial

Introduction: In the world of scientific research and academic writing, the clarity, precision, and aesthetics of your documents can significantly impact their reception and comprehension. LaTeX, a powerful typesetting system, has long been revered for its ability to create beautifully formatted documents, especially those requiring complex mathematical expressions and detailed layouts. ... Continue reading: Crafting Elegant Scientific Documents in RStudio: A LaTeX and R Markdown Tutorial http://dlvr.it/T5NkR8

simaerep release 0.5.0

Simulate adverse event reporting in clinical trials with the goal of detecting under-reporting sites. Monitoring of Adverse Event (AE) reporting in clinical trials is important for patient safety. We use bootstrap-based simulation to assign an AE ... Continue reading: simaerep release 0.5.0 http://dlvr.it/T5Nk7c

R-hub v2

After eight years, we are retiring the current version of R-hub, in favor of a better, faster, modern system. We call the new system R-hub v2. R-hub v2 runs R package checks on GitHub Actions. R-hub v2 works best if your R package is in a GitHub reposi... Continue reading: R-hub v2 http://dlvr.it/T5NPWt

The Impact of R on Academic Excellence in Manchester, UK

The R Consortium recently spoke with the organizing team of the R User Group at the University of Manchester (R.U.M.). R.U.M. aims to bring together R users of all levels... The post The Impact of R on Academic Excellence in Manchester, UK appeared first on R ... Continue reading: The Impact of R on Academic Excellence in Manchester, UK http://dlvr.it/T5LlcQ

The Truth About Tidy Wrappers

These are the packages we will need for this analysis:

library(tidyverse)
library(data.table)
library(dtplyr)
library(duckdb)
library(duckplyr)
library(polars)
library(tidypolars)
library(arrow)
library(tictoc)
library(microbenchmark)
library(gt)

The Tidyverse: I love the Tidyverse from Posit.co. The biggest evolution of the R language ecosystem ... Continue reading: The Truth About Tidy Wrappers http://dlvr.it/T5LlV3

A Guide to Removing Multiple Rows in R Using Base R

Introduction: As data analysts and scientists, we often find ourselves working with large datasets where data cleaning becomes a crucial step in our analysis pipeline. One common task is removing unwanted rows from our data. In this guide, we’ll ... Continue reading: A Guide to Removing Multiple Rows in R Using Base R http://dlvr.it/T5Kyq3

Simple and Fast Visualization of Biodiversity Occurrence Data using GBIF and R Shiny

As an ecologist, being able to easily visualize biodiversity occurrence data is essential, as this kind of visualization provides critical insights into species distribution patterns and ecological requirements, which are key to understanding biodiversity dynamics in space and time. Moreover, for pragmatic reasons, fast and simple biodiversity ... Continue reading: Simple and Fast Visualization of Biodiversity Occurrence Data using GBIF and R Shiny http://dlvr.it/T5Hsm7

How to Remove Rows with Some or All NAs in R

Introduction: Handling missing values is a crucial aspect of data preprocessing in R. Often, datasets contain missing values, which can adversely affect the analysis or modeling process. One common task is to remove rows containing missing value... Continue reading: How to Remove Rows with Some or All NAs in R http://dlvr.it/T5Hsb0

PowerQuery Puzzle solved with R

#171–172 Puzzles. Author: ExcelBI. All files (xlsx with puzzle and R with solution) for each and every puzzle are available on my GitHub. Enjoy. Puzzle #171: Power Query puzzles are very often focused on transformation of tables in very different ways. And we ... Continue reading: PowerQuery Puzzle solved with R http://dlvr.it/T5HCVY

EARL Early Bird Tickets Are Now Available!

Contributed by Abbie Brookes, Senior Data Analyst at Datacove. Datacove is pleased to announce the availability of tickets for the upcoming EARL (Enterprise Applications of the R Language) conference. The... The post EARL Early Bird Tickets Are Now Available! appeared first on R Consortium. Continue reading: EARL Early Bird Tickets Are Now Available! http://dlvr.it/T5GQGM

Data Frame Merging in R (With Examples)

Introduction: Merging multiple data frames is a pivotal skill in data manipulation. Whether you’re handling small-scale datasets or large-scale ones, mastering the art of merging can significantly enhance your efficiency. In this tutorial, we’ll ... Continue reading: Data Frame Merging in R (With Examples) http://dlvr.it/T5FgVh

Loading Financial Time Series

I’m going to be writing a series of posts which will look at some applications of R (and perhaps Python) to financial modelling. We’ll start here by pulling some stock data into R, calculating the daily returns and then looking at correlations and simple volatility estimates. Continue reading: Loading Financial Time Series http://dlvr.it/T5DN43

Conformalized predictive simulations for univariate time series on more than 250 data sets

Conformalized predictive simulations for univariate time series on more than 250 data sets. Continue reading: Conformalized predictive simulations for univariate time series on more than 250 data sets http://dlvr.it/T5DMht

S.P.I.C.E of Causal Inference

SUTVA, Positivity, Identifiability, Consistency, and Exchangeability: the essential ingredients of causal inference that help us bring out the true flavor of the causal model. Here is my understanding of each assumption (main course), with examples (side dish) and accompanied by simulation (paired with beverages). Bon Appétit! Since the multiple ... Continue reading: S.P.I.C.E of Causal Inference http://dlvr.it/T5CFw0

Tetley caffeine meter replication with ggplot2

Tetley tea boxes feature a caffeine meter. In R we can replicate this meter using ggplot2. Move the information to a tibble: library(dplyr) caffeine_meter ... Continue reading: Tetley caffeine meter replication with ggplot2 http://dlvr.it/T5CFk2

Extrapolation to unseen domains: from theory to applications

Extrapolation to unseen domains: from theory to applications. Monday, April 22nd, 2024, 8:00 PT / 11:00 ET / 17:00 CET. 3rd joint webinar of the IMS New Researchers Group, Young Data Science Researcher Seminar Zürich and the YoungStatS Project. When ... Continue reading: Extrapolation to unseen domains: from theory to applications http://dlvr.it/T5CFbM

A Practical Guide to Merging Data Frames Based on Multiple Columns in R

Introduction: As a data scientist or analyst, you often encounter situations where you need to combine data from multiple sources. One common task is merging data frames based on multiple columns. In this guide, we’ll walk through several step-by... Continue reading: A Practical Guide to Merging Data Frames Based on Multiple Columns in R http://dlvr.it/T56r2R

Running MLwiN using mlnscript via the R2MLwiN R package on Apple Silicon Macs

Introduction: MLwiN from the Centre for Multilevel Modelling (CMM) at the University of Bristol (disclaimer: where I also work) is a fantastic piece of software (Charlton et al. 2024). The name suggests it only works on Windows, but as we’ll find out this is very much not the case. However, ... Continue reading: Running MLwiN using mlnscript via the R2MLwiN R package on Apple Silicon Macs http://dlvr.it/T55fF4

Make Your Own NOAA Sea Temperature Graph

Sea-surface temperatures in the North Atlantic have been in the news recently as they continue to break records. While there are already a number of excellent summaries and graphs of the data, I thought I’d have a go at making some myself. The starting point is the detailed data ... Continue reading: Make Your Own NOAA Sea Temperature Graph http://dlvr.it/T54fRg

Unveiling Car Specs with Multidimensional Scaling in R

Introduction: Visualizing similarities between data points can be tricky, especially when dealing with many features. This is where multidimensional scaling (MDS) comes in handy. It allows us to explore these relationships in a lower-dimensional ... Continue reading: Unveiling Car Specs with Multidimensional Scaling in R http://dlvr.it/T54fFv

Achieving Reporting Excellence: R Packages for Consistency and Diverse Outputs

In the era of data-driven decision making, the ability of businesses to communicate complex information effectively has never been more critical. Yet, as companies navigate through vast oceans of data, the challenge of not just analyzing but also presenting this data in a coherent, consistent, and compelling manner is a ... Continue reading: Achieving Reporting Excellence: R Packages for Consistency and Diverse Outputs http://dlvr.it/T54JLF

R Highcharts: How to Make Animated and Interactive Data Visualizations in R

If you’re looking to take your R data visualization skills to the next level, interactivity is the name of the game. There aren’t too many packages that offer it out of the box, but you don’t need quantity if you have quality. Highcharts is among the most ... Continue reading: R Highcharts: How to Make Animated and Interactive Data Visualizations in R http://dlvr.it/T53St7

Scaling Your Data to 0-1 in R: Understanding the Range

Introduction: Today, we’re diving into a fundamental data pre-processing technique: scaling values between 0 and 1. This might sound simple, but it can significantly impact how your data behaves in analyses. Why Scale? Imagine you have data on ... Continue reading: Scaling Your Data to 0-1 in R: Understanding the Range http://dlvr.it/T51JP3

Discover great_tables: The Python Answer to R’s {gt} Package for Table Formatting in Quarto and PyShiny

Crafting compelling narratives often depends on presenting insights clearly and effectively. This skill is key for data science. For R users, the {gt} package has provided a powerful and flexible way to create publication-quality tables in Quarto reports or Shiny apps, such as clinical trial reports, research documents, ... Continue reading: Discover great_tables: The Python Answer to R’s {gt} Package for Table Formatting in Quarto and PyShiny http://dlvr.it/T51J0f

PowerQuery Puzzle solved with R

#169–170 Puzzles. Author: ExcelBI. All files (xlsx with puzzle and R with solution) for each and every puzzle are available on my GitHub. Enjoy. Puzzle #169: In today’s challenge we have a certain pattern to extract from given texts. As you may notice I really li... Continue reading: PowerQuery Puzzle solved with R http://dlvr.it/T4z4zf

A Practical Guide to Data Normalization in R

Introduction: Data normalization is a crucial preprocessing step in data analysis and machine learning workflows. It helps in standardizing the scale of numeric features, ensuring fair treatment of all variables regardless of their magnitude. In ... Continue reading: A Practical Guide to Data Normalization in R http://dlvr.it/T4z4kT

Navigating ShinyConf 2024: A First-Timer’s Guide to Virtual Conferences

With ShinyConf 2024 just around the corner, first-time attendees may find themselves grappling with a mix of excitement and uncertainty. Fear not! This comprehensive guide aims to equip you with the tools and knowledge you need to prepare, thrive, and make the most out of your ShinyConf experience. Curious about what’... Continue reading: Navigating ShinyConf 2024: A First-Timer’s Guide to Virtual Conferences http://dlvr.it/T4ygCQ

adhan package: retreiving and aligning the prayer times in R

The adhan package is available here! Prayer times cannot always be estimated accurately in some places, such as countries located at higher latitudes (e.g. the Nordic countries), as for instance during midsummer the Fajr may b... Continue reading: adhan package: retreiving and aligning the prayer times in R http://dlvr.it/T4xHXS

gssr Update

NORC released version 2a of the 1972-2022 General Social Survey cumulative file. I’ve updated {gssr}, an R package that makes it more convenient for R users to work with GSS Data. One handy feature of {gssr} is that it lets you see documentation for individual GSS variables as R ... Continue reading: gssr Update http://dlvr.it/T4xHJt

organize blocks of code in R with with() ?

In their “Object-Oriented Programming is Bad” video, Brian Will mentions a desired reserved word, use (timestamp), that could look something like this: The variables you specify come from the enclosing scope and would be available as copies within a separate … Continue reading: organize blocks of code in R with with() ? http://dlvr.it/T4vRNS