Posts

Showing posts from August 2021

How to confuse your shareholders by bad data visualization

Like many people during the COVID19 crisis, I turned to the stock market as a new hobby. Like the ignorant investor that I am, I thought it wise to hop on the cloud computing bandwagon. Hence, I bought, among others, a small position in Rackspace Technologies. A long way down ... Continue reading: How to confuse your shareholders by bad data visualization http://dlvr.it/S6hrxk

Announcing Calendar Based Versioning for All Commercial RStudio Products

Photo by Eric Rothermel on Unsplash. RStudio is shifting to a calendar-based versioning scheme for future releases of all our commercial products. We are making this transition to deliver a more transparent experience for our customers: ... Continue reading: Announcing Calendar Based Versioning for All Commercial RStudio Products http://dlvr.it/S6h0Sp

Exploring Stock Market Listing Mortality since 1986

Click to see R set-up code # Libraries if(!require("pacman")) { install.packages("pacman") } pacman::p_load( data.table, re2, scales, ggplot2, plotly, DT, patchwork, survival, ggfortify, scales) # Set knitr params knitr::opts_chunk$set( comment = NA, fig.width = 12, fig.height = 8, out.width = '100%' ) NOTE: The read time for this post is overstated because of the formatting of the Plotly code. There are ~2,500 words, so read time should be ~10 minutes. Click to see R code generating plot # Load function to plot dual y-axis plot source("train_sec.R") # Get data series from FRED symbols http://dlvr.it/S6h0S4

RStudio Connect 2021.08.0 Python Updates

Python Updates: At RStudio we know that many data science teams leverage both R and Python in their work, so it’s important that we build products to support the best tools available in both languages. For an overview of all the ways... Continue reading: RStudio Connect 2021.08.0 Python Updates http://dlvr.it/S6dHZM

A Latin American R community for HR

By Sergio Garcia Mora. R4HR, formerly known as the Club de R para RRHH, is a Latin American-based community whose mission is to spread the adoption of R in... The post A Latin American R community for HR appeared first on R Consortium. Continue reading: A Latin American R community for HR http://dlvr.it/S6cRzg

ShinyProxy 1-click App

ShinyProxy is a great way to deploy containerized Shiny apps to production and it is only 1 click away. Continue reading: ShinyProxy 1-click App http://dlvr.it/S6cRxY

R : a combined usage of split, lapply and do.call

This post explains the combined use of the split(), lapply() and do.call() R functions, the so-called Split-Apply-Combine approach. These are frequently used for group-based calculations such as weighted averages or aggregation. It will be useful when compl... Continue reading: R : a combined usage of split, lapply and do.call http://dlvr.it/S6YJSD
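The Split-Apply-Combine pattern named in the excerpt above can be sketched in base R as follows. This is only an illustration, not the linked post's code: the data frame `df` and its columns `group`, `value` and `weight` are made up for the example.

```r
# Hypothetical example data (not taken from the linked post)
df <- data.frame(
  group  = c("a", "a", "b", "b", "b"),
  value  = c(10, 20, 1, 2, 3),
  weight = c(1, 3, 1, 1, 2)
)

# Split: one data frame per group
pieces <- split(df, df$group)

# Apply: compute a weighted average within each piece
summaries <- lapply(pieces, function(d) {
  data.frame(
    group = as.character(d$group[1]),
    wavg  = sum(d$value * d$weight) / sum(d$weight)
  )
})

# Combine: stack the per-group results into a single data frame
result <- do.call(rbind, summaries)
print(result)
```

`do.call(rbind, summaries)` is what turns the list returned by `lapply()` back into one data frame; calling `rbind(summaries)` directly would not unpack the list.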

Detecting time series outliers

The tsoutliers() function in the forecast package for R is useful for identifying anomalies in a time series. However, it is not properly documented anywhere. This post is intended to fill that gap. The function began as an answer on CrossValidated and was later added to the forecast package because ... Continue reading: Detecting time series outliers http://dlvr.it/S6X1J9

Comparing SQLite, DuckDB and Arrow with UN Trade Data

Context: This is not a competition; the idea is to show how to use the hardware with relative efficiency, in a collaborative rather than competitive spirit. Assume that you work at customs and your boss asked you to obtain the a... Continue reading: Comparing SQLite, DuckDB and Arrow with UN Trade Data http://dlvr.it/S6VZkx

Practical Advice for R in Production – Answering Your Questions

This is a guest post by Colin Gillespie from Jumping Rivers, a Full Service RStudio Partner. Earlier this month, Jack Walton and I delivered a webinar with RStudio on the benefits of putting R into production environments, and how to do... Continue reading: Practical Advice for R in Production – Answering Your Questions http://dlvr.it/S6TFpD

How to Perform Dunnett’s Test in R

Dunnett’s test in R: after the ANOVA test has been completed, the next step is to determine which group means are significantly different from one another. Different types of post hoc tests are available... The post How to Perform Dunnett’s Test in R appeared first on finnstats. Continue reading: How to Perform Dunnett’s Test in R http://dlvr.it/S6ScHS

Monte Carlo Cost Estimates: Engineers Throwing Dice

Estimating the cost of a complex project is not a trivial task. Traditional cost estimates are full of assumptions about the future state of the market and the final deliverable. Monte Carlo cost estimates are a tool to better understand the risks ... Continue reading: Monte Carlo Cost Estimates: Engineers Throwing Dice http://dlvr.it/S6R7fZ
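To make the idea of a Monte Carlo cost estimate concrete, here is a minimal base R sketch. The cost items, their low/high bounds and the choice of a uniform distribution are all assumptions made for illustration; the linked post may use different inputs and distributions.

```r
set.seed(42)

# Hypothetical cost items with assumed lower and upper bounds
items <- data.frame(
  item = c("design", "materials", "labour"),
  low  = c(10, 50, 30),
  high = c(20, 90, 60)
)

n_sims <- 10000

# Each simulation draws one cost per item and sums them to a project total
totals <- replicate(
  n_sims,
  sum(runif(nrow(items), min = items$low, max = items$high))
)

# Summarise the simulated distribution instead of quoting a single number
print(quantile(totals, probs = c(0.1, 0.5, 0.9)))
```

Reporting the 10th, 50th and 90th percentiles gives a range for the total cost rather than a single point estimate.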

Using the R-Universe

The R-Universe, created by Jeroen Ooms, provides a very simple way to create personal CRAN-like repos, which gives you a way to show your collection of tools to the community. In addition, you can use it to publish articles by using rmarkdown, an R package that ... Continue reading: Using the R-Universe http://dlvr.it/S6PqDk

A lightweight data validation ecosystem with R, GitHub, and Slack

Data quality monitoring is an essential part of any data analysis or business intelligence workflow. As such, an increasing number of promising tools have emerged as part of the Modern Data Stack to offer better orchestration, testing, and reporting.... Continue reading: A lightweight data validation ecosystem with R, GitHub, and Slack http://dlvr.it/S6P5nq

EARL Online 2021: Dr Branka Subotić, keynote speaker

The opening keynote at the Enterprise Applications of the R Language Conference presentation day will be Dr Branka Subotić. Branka... The post EARL Online 2021: Dr Branka Subotić, keynote speaker appeared first on Mango Solutions. Continue reading: EARL Online 2021: Dr Branka Subotić, keynote speaker http://dlvr.it/S6MWLd

When Yahoo Finance doesn’t have de-listed tickers needed

Click to see R set-up code # Libraries if(!require("pacman")) { install.packages("pacman") } pacman::p_load( data.table ) # Set knitr params knitr::opts_chunk$set( comment = NA, fig.width = 12, fig.height = 8, out.width = '100%' ) Introduction As we discussed in our last post Introducing the Redwall ‘Red Flag’ Explorer with New Constructs Data, we were able to test the response of 125,000 quarterly and annual financial statements to incidence of “red flag” ratios, but some of the most interesting ... Continue reading: When Yahoo Finance doesn’t have de-listed tickers needed http://dlvr.it/S6L7sV

New Bundesliga Forecasting Tool: Can Underdog Herta Berlin beat Bayern Munich?

The Bundesliga is Germany’s primary football league. It is one of the most important football leagues in the world, broadcast on television in over 200 countries. If you want to get your hands on a tool to forecast the result of any game (and perform some more statistical analyses), read ... Continue reading: New Bundesliga Forecasting Tool: Can Underdog Herta Berlin beat Bayern Munich? http://dlvr.it/S6K0j1

Appsilon Talks at the 2021 R/Medicine Conference: Focus on Automation

The 2021 R/Medicine Conference is already upon us! This week in August (24-27) will see a virtual gathering of the R community with a keen eye on medical data science. Damian Rodziewicz, President and Co-Founder of Appsilon, and Oriol Senan, R Shiny Developer and computational biology expert, will represent Appsilon ... Continue reading: Appsilon Talks at the 2021 R/Medicine Conference: Focus on Automation http://dlvr.it/S6Flcg

ggalt: Make a Lollipop Plot to Compare Categories in ggplot2

A Lollipop Plot shows the relationship between categories using a dot and a line that connects to a baseline (similar to a Bar Plot). In this short tutorial, we use ggalt to create a Lollipop Plot with the geom_lollipop() function. R-Tips Weekly This... Continue reading: ggalt: Make a Lollipop Plot to Compare Categories in ggplot2 http://dlvr.it/S6FlbD

Calculate Geometric Mean in R

Calculate Geometric Mean in R: the geometric mean is the nth root of the product of n values of a set of observations. It can be expressed as $GM = (x_1 \times x_2 \times x_3 \times \cdots \times x_n)^{1/n}$. The advantage... The post Calculate Geometric Mean in R appeared first on finnstats. Continue reading: Calculate Geometric Mean in R http://dlvr.it/S6FK4p
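The definition above translates directly into base R. The helper name `geometric_mean` is introduced here for illustration; computing it as the exponential of the mean log is a standard, numerically safer equivalent of taking the nth root of the product, and assumes all values are strictly positive.

```r
# Illustrative helper (the name is an assumption, not from the linked post)
geometric_mean <- function(x) {
  stopifnot(is.numeric(x), length(x) > 0, all(x > 0))
  # exp(mean(log(x))) equals prod(x)^(1/n) but avoids overflow for long vectors
  exp(mean(log(x)))
}

x <- c(1, 3, 9)

# Both forms agree: the product is 27 and its cube root is 3
gm_log  <- geometric_mean(x)
gm_root <- prod(x)^(1 / length(x))
print(c(gm_log, gm_root))
```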

Securing ShinyProxy with Caddy Server

Use Caddy server to obtain TLS certificates for your custom domain and to serve Shiny apps securely with ShinyProxy. Continue reading: Securing ShinyProxy with Caddy Server http://dlvr.it/S6BF1f

Wildfires Comparison with ggplot2 dual Y-axis and Forecasting with KNN

In recent days, wildfires have escalated in Turkey. The government and the people were on the alert and mobilized to put out the fires, but they couldn’t for days. A significant part of people thought ... Continue reading: Wildfires Comparison with ggplot2 dual Y-axis and Forecasting with KNN http://dlvr.it/S69q0Z

matrixStats: Consistent Support for Name Attributes via GSoC Project

Author: Angelina Panagopoulou, GSoC student developer, undergraduate in the Department of Informatics & Telecommunications (DIT), University of Athens, Greece. We are glad to announce recent CRAN releases of matrixStats with support for hand... Continue reading: matrixStats: Consistent Support for Name Attributes via GSoC Project http://dlvr.it/S69KFY

Test For Randomness in R-How to check Dataset Randomness

Test For Randomness in R: how to check dataset randomness? Assume that a and b are symbols indicating the kind of items or numbers that make up a sequence and the test hypothesis is... The post Test For Randomness in R-How to check Dataset Randomness appeared first on finnstats. Continue reading: Test For Randomness in R-How to check Dataset Randomness http://dlvr.it/S66k8c

{emayili}: Rudimentary Email Address Validation

A recent issue on the {emayili} GitHub repository prompted me to think a bit more about email address validation. When I started looking into this I was somewhat surprised to learn that it’s such a complicated problem. Who would have thought that something as apparently simple as an email ... Continue reading: {emayili}: Rudimentary Email Address Validation http://dlvr.it/S63lw2

New Workshops Series Kick-off

We are happy to announce our third series of workshops, to take place in the fall of 2021. After the success of the past series that took place at the end of last year and this year in spring we decided to offer again a series of workshops based on ... Continue reading: New Workshops Series Kick-off http://dlvr.it/S62Z4V

The “Youth Bulge” of Afghanistan: The Hidden Force behind Political Instability

In view of the current dramatic events in Afghanistan many wonder why the extensive international efforts to bring some stability to the country have failed so miserably. In this post, we will present and analytically examine a fascinating theory that seems to be able to explain political (in-)stability almost ... Continue reading: The “Youth Bulge” of Afghanistan: The Hidden Force behind Political Instability http://dlvr.it/S61s6v

Feature Subsampling For Random Forest Regression

TLDR: The number of subsampled features is a main source of randomness and an important parameter in random forests. Mind the different default values across implementations. Randomness in Random Forests: Random forests are very popular machine learning models. They are built from easily understandable and well visualizable decision trees and ... Continue reading: Feature Subsampling For Random Forest Regression http://dlvr.it/S5zYB4

Reinforcement Learning With TicTacJoe: A Simple Brain Coded Explicitly in R

Reinforcement Learning: Introduction Reinforcement Learning is a scheme of training machine learning models in which a certain agent’s actions in an environment (typically in the form of moves in a game played by the agent) are adjusted over time. Adjustments are made by reinforcing those which lead to a ... Continue reading: Reinforcement Learning With TicTacJoe: A Simple Brain Coded Explicitly in R http://dlvr.it/S5ywtR

Olympics, Reaction Times, Volleyball, and a New Version of SwimmeR

There’s a new version of SwimmeR available, 0.12.0, which includes capabilities for parsing swimming results from the 2020 Tokyo Olympics. Naturally I’m going to use it to investigate the theory I have about volleyball. To play along at home you’ll need a version of SwimmeR that’s at least 0.12.0, ... Continue reading: Olympics, Reaction Times, Volleyball, and a New Version of SwimmeR http://dlvr.it/S5yTqX

Announcing bookdown v0.23

Happy summer from the R Markdown family! We are proud to share that bookdown (https://pkgs.rstudio.com/bookdown/) version 0.23 is on CRAN. bookdown is a package that helps you write books and long-form articles/reports, knitting togethe... Continue reading: Announcing bookdown v0.23 http://dlvr.it/S5w6ST

RTutor: Exploring Economic Impacts of COVID-19

An earlier post described the impressive collection of US data that Chetty et al. made available to illuminate the economic impacts of Covid and accompanying policy measures. As part of her Master thesis at Ulm university Alexandra Aehle has created a... Continue reading: RTutor: Exploring Economic Impacts of COVID-19 http://dlvr.it/S5vT5r

Monitoring systemic risk with R

Hi everyone! This is my very first post… Many researchers, students and practitioners from all over the world write to me regularly about my R package SystemicR. I’m glad to contribute to the community but questions about data management and plotting often come up. I guess that the package notice ... Continue reading: Monitoring systemic risk with R http://dlvr.it/S5sSDs

Setting up a transparent reproducible R environment with Docker + renv

For my PhD I’m currently writing a paper using rmarkdown. Since I care about reproducibility, I’m using renv to register the versions of the R packages I use and to manage a local library that doesn’t affect the rest of my system. With that, anyone who wants ... Continue reading: Setting up a transparent reproducible R environment with Docker + renv http://dlvr.it/S5rfZS

rOpenSci Introduces Monthly Social Coworking and Office Hours

We’re excited to announce that we’ll be hosting monthly social coworking + office hours sessions via Zoom, starting September 7th! Coworking is a great way to be productive and reduce feelings of social isolation (especially important over... Continue reading: rOpenSci Introduces Monthly Social Coworking and Office Hours http://dlvr.it/S5r0H7

How to make your home Shiny or Rstudio Server accessible from the public internet

Prerequisites; Open the Required Ports in the Server; Configure Port Forwarding in your Router; Optional Extra Steps; Setup a Dynamic DNS Service (DDNS); Configure a Reverse Proxy. ⚠️ Some assembly required! This project is going to require you to investigate the specifics of your network equipment on your own. You have ... Continue reading: How to make your home Shiny or Rstudio Server accessible from the public internet http://dlvr.it/S5nwX5

Programmer’s confidence

It’s the middle of August now and the book (“Advanced R Solutions”) my friend Malte and I have been writing in our spare time is currently being printed. If you’re interested in obtaining a hardcopy, you may already (pre-)order - the book will be sh... Continue reading: Programmer’s confidence http://dlvr.it/S5n70C

Keep your R scripts locally sourced

A few weeks ago, I had a bad debugging session. The code was just not doing what I expected, and I went down a lot of deadends trying to fix or simplify things. I could not get the problem to happen in a reproducible example (reprex) or interactively (in RStudio). ... Continue reading: Keep your R scripts locally sourced http://dlvr.it/S5mV0S

R dataframe merge while keeping orders of row and column

This post builds a useful wrapper around the R function merge() for a left outer join that preserves the row and column order of the input x data. It is not essential, but it is useful in cases where we prefer these fixed orders. Left Outer Join which we try to... Continue reading: R dataframe merge while keeping orders of row and column http://dlvr.it/S5kR88
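One way such an order-preserving left join could look in base R is sketched below. The function name `merge_keep_order` and the helper column `.row_id` are hypothetical, and the sketch assumes `x` and `y` share no column names other than the join key; it is not the linked post's implementation.

```r
# Hypothetical wrapper: left outer join that keeps x's row and column order.
# Assumes x and y have no overlapping column names apart from `by`.
merge_keep_order <- function(x, y, by) {
  x$.row_id <- seq_len(nrow(x))              # remember the original row order
  m <- merge(x, y, by = by, all.x = TRUE, sort = FALSE)
  m <- m[order(m$.row_id), , drop = FALSE]   # restore x's row order
  new_cols <- setdiff(names(m), names(x))    # columns contributed by y
  m <- m[, c(setdiff(names(x), ".row_id"), new_cols), drop = FALSE]
  rownames(m) <- NULL
  m
}

# Made-up inputs: x has its key in the second column and is unsorted
x <- data.frame(name = c("c", "a", "b"), id = c(3, 1, 2))
y <- data.frame(id = c(1, 2), score = c(10, 20))

joined <- merge_keep_order(x, y, by = "id")
print(joined)
```

A plain `merge(x, y, by = "id", all.x = TRUE)` would move `id` to the first column and sort the rows by it, which is the behaviour the wrapper undoes.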

Predict housing prices in Austin TX with tidymodels and xgboost

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just getting started to tuning more complex models. My screencasts lately have focused on xgboost as I have participated in SLICED, a competitive d... Continue reading: Predict housing prices in Austin TX with tidymodels and xgboost http://dlvr.it/S5jfpp

Visualise Org-Roam Networks With igraph and R

The Emacs package Org-Roam provides a powerful tool to take notes following the idea of the Zettelkasten method. You can write notes with all the power that Emacs provides while linking your thoughts to each other and with your bibliography. This a... Continue reading: Visualise Org-Roam Networks With igraph and R http://dlvr.it/S5hS6T

How to Overlay Plots in R-Quick Guide with Example

To overlay plots in R, we can make use of the lines() and points() functions. Let’s create a scatter plot first and overlay another... The post How to Overlay Plots in R-Quick Guide with Example appeared first on finnstats. Continue reading: How to Overlay Plots in R-Quick Guide with Example http://dlvr.it/S5gDSx

Could there be incentives to cycle through a red light?

This is of course a rhetorical question! Because cyclists must stop when the light is red! … But … there is always that moment, on a bicycle, when you stop, and then you say to yourself: the worst part is that the lights are badly regulated, and I know that the next ... Continue reading: Could there be incentives to cycle through a red light? http://dlvr.it/S5f2gg

Introducing the fastverse: An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation

The fastverse is a suite of complementary high-performance packages for statistical computing and data manipulation in R. Developed independently by various people, fastverse packages jointly contribute to the objectives of: Speeding up R throug... Continue reading: Introducing the fastverse: An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation http://dlvr.it/S5dJZp

RStudio Voices – Julia Silge

For the first piece in our new RStudio Voices series, we decided to interview one of our open source package developers as their work defines our organization’s focus on making data science tools available to everyone. We spoke with Julia... Continue reading: RStudio Voices – Julia Silge http://dlvr.it/S5cPhB

$GME To The Moon: How Much of an Outlier Was Gamestop’s January Rise?

Introduction: Between January 13th and January 27th, 2021, the stock price for Gamestop (GME) rose more than 10x, from $31 to $347. This rise was in part due to increased popularity on the Reddit forum r/wallstreetbets, whose members were looking to create a short squeeze and “liked the stock”. This rapid rise also drew ... Continue reading: $GME To The Moon: How Much of an Outlier Was Gamestop’s January Rise? http://dlvr.it/S5bBcY

Reverting Git Commits

Reverting Local Git Commits: You have made a commit. You discover an error / something you left out /... Continue reading: Reverting Git Commits http://dlvr.it/S5Z8Wk

R / Medicine 2021

R/Medicine 2021, the premier conference for the use of R in clinical applications, is less than two weeks away! This conference reflects the increasing importance of data science, computational statistics and machine learning to clinical applications, and emphasizes the effectiveness of the R language as a vehicle for making data ... Continue reading: R / Medicine 2021 http://dlvr.it/S5Y1R0

EARL online: Interview with Emily Riederer

We caught up with Emily Riederer ahead of her presentation at the upcoming Enterprise Applications of the R Language Conference... The post EARL online: Interview with Emily Riederer appeared first on Mango Solutions. Continue reading: EARL online: Interview with Emily Riederer http://dlvr.it/S5WCXh

ggplot: Easy as pie (charts)

This post by no means endorses the use of pie charts. But, if you must, here’s how… For some reason, the top Google results for “ggplot2 pie chart” show some very convoluted code to accomplish what should be easy: make slices, add labels to the mid... Continue reading: ggplot: Easy as pie (charts) http://dlvr.it/S5VhG5

How to Calculate Cross-Correlation in R

How to Calculate Cross-Correlation in R: cross-correlation measures the degree of resemblance between a time series and a lagged version of another time series... The post How to Calculate Cross-Correlation in R appeared first on finnstats. Continue reading: How to Calculate Cross-Correlation in R http://dlvr.it/S5TXsV
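As a small illustration of cross-correlation, the sketch below uses `ccf()` from base R's stats package on simulated data. The series `x` and `y`, the two-step delay and the noise level are assumptions chosen so that the lag with the strongest correlation is easy to recognise; this is not the linked post's example.

```r
set.seed(1)
n <- 200
x <- rnorm(n)

# Build y as x delayed by two steps, plus a little noise (simulated data)
y <- c(rnorm(2), x[1:(n - 2)]) + rnorm(n, sd = 0.1)

# ccf(x, y) at lag k estimates the correlation between x[t + k] and y[t],
# so a series y that trails x shows its peak at a negative lag
cc <- ccf(x, y, lag.max = 5, plot = FALSE)

lags <- as.vector(cc$lag)
acfs <- as.vector(cc$acf)
best_lag <- lags[which.max(abs(acfs))]

print(data.frame(lag = lags, correlation = round(acfs, 2)))
print(best_lag)
```

Setting `plot = FALSE` returns the estimates without drawing; leaving it at the default draws the familiar bar chart of correlations by lag.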

Old ‘Hood, New ‘Hood

I recently moved from suburban South Africa to rural England. I’m figuring out my new environment. Making some maps seemed to be a good way to get familiar with the surroundings. In the process I wanted to figure out two things: how to get maps wi... Continue reading: Old ‘Hood, New ‘Hood http://dlvr.it/S5Rl6w

How to Calculate Cosine Similarity in R

How to Calculate Cosine Similarity in R: cosine similarity is a measure of similarity between two vectors in an inner product space. The formula... The post How to Calculate Cosine Similarity in R appeared first on finnstats. Continue reading: How to Calculate Cosine Similarity in R http://dlvr.it/S5Q6C0
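Cosine similarity is the dot product of two vectors divided by the product of their Euclidean norms. A minimal base R sketch follows; the function name `cosine_similarity` and the example vectors are made up for illustration, and the linked post may rely on a package function instead.

```r
# Illustrative helper (name is an assumption): cosine of the angle between a and b
cosine_similarity <- function(a, b) {
  stopifnot(is.numeric(a), is.numeric(b), length(a) == length(b))
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Parallel vectors give 1, orthogonal vectors give 0
parallel   <- cosine_similarity(c(1, 2, 3), c(2, 4, 6))
orthogonal <- cosine_similarity(c(1, 0), c(0, 1))
print(c(parallel, orthogonal))
```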

Democratizing Data with R, Python, and Slack

This is a guest post by Matthias Mueller, Director of Marketing Analytics at CM Group. Matthias oversees a team of data scientists, engineers, and analysts tasked with optimizing the marketing mix, building customer lifetime value models, ... Continue reading: Democratizing Data with R, Python, and Slack http://dlvr.it/S5PkwL

Introducing the Redwall ‘Red Flag’ Explorer with New Constructs Data

Click to see R set-up code # Libraries if(!require("pacman")) { install.packages("pacman") } pacman::p_load( data.table, scales, ggplot2, plotly, DT) # Set knitr params knitr::opts_chunk$set( comment = NA, fig.width = 12, fig.height = 8, out.width = '100%' ) # Load annual data only path http://dlvr.it/S5LvZT

The Quickest Way to Add New Apps to ShinyProxy

Once your ShinyProxy server is up and running, you need to maintain and update it. Last time we looked at how to update existing apps. This section explains how to add new ones hassle-free. Continue reading: The Quickest Way to Add New Apps to ShinyProxy http://dlvr.it/S5Lcgx

Fast SQL Server Imports with R

Writing large datasets to SQL Server can be very slow using the DBI package with an odbc connection. The issue with writing data is that individual INSERT statements are generated for each row of data. I’ve also had issues with remote connections... Continue reading: Fast SQL Server Imports with R http://dlvr.it/S5Jp61

Using R: plyr to purrr, part 1

This is the second post about my journey towards writing more modern Tidyverse-style R code; here is the previous one. We will look at the common case of taking subset of data out of a data frame, making some complex R object from them, and then extracting summaries from those ... Continue reading: Using R: plyr to purrr, part 1 http://dlvr.it/S5GgTr

Pricing of FX Forward in R and Excel

This post explains how to price an FX forward. We assume that 1) USD is the foreign currency and KRW the domestic one, 2) USD IRS zero curve and KRW FX implied zero curve are given. Before writing the R code, we use an Excel spreadsheet for the clear under... Continue reading: Pricing of FX Forward in R and Excel http://dlvr.it/S5GgQx

Tune xgboost models with early stopping to predict shelter animal status

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just getting started to tuning more complex models. I participated in this week’s episode of the SLICED playoffs, a competitive data science stream... Continue reading: Tune xgboost models with early stopping to predict shelter animal status http://dlvr.it/S5F6BD

How to actually make a quality scatterplot in R

Scatterplots are one of the most common types of data visualizations you will encounter as a biologist. They present the relationship between two continuous variables. We might take them for granted by their simplicity, but we shouldn’t assume the seeming intuition with which we can see and comprehend these ... Continue reading: How to actually make a quality scatterplot in R http://dlvr.it/S5BKkg