In today’s blog post I will look at historical data from the Tour de France. This data was used in the TidyTuesday series back in April, but I thought I’d take a closer look at it now, as I am currently suffering from Tour de France withdrawal symptoms (a July without the Tour de France is like a December without Christmas).
First, let’s load the data and have a look at the structure.
Today’s blog post was written in collaboration with my talented colleague Eivind Kvitstein, an exceptionally versatile native of southern Norway who is an actuary, data scientist, auditor and now also a hobby epidemiologist.
The Covid-19 virus has spread across the whole world and turned our everyday lives upside down. In online newspapers such as VG we can follow the development in the number of deaths and the number of infections, both in Norway and worldwide. This has led to a number of misinterpretations of the data.
Today I will continue looking at Strava data (see my previous post: https://www.andrewaage.com/post/analyzing-strava-data-using-r/).
This time, the goal is to build a Poisson model to determine how to become popular on Strava, answering vital questions such as:
What time of the day should I post my training rides? What is more important, distance or speed? Should I include pictures on my activity? Which day of the week should I ride?
A few days ago I returned home after a lovely trip to Toulouse, where I attended the annual useR! conference, which is the largest or second-largest R conference in the world, with more than 1,000 participants (rstudio::conf seems to be about the same size).
In this blog post, I will recap what I experienced and learned during the trip.
Day 0 - The curse strikes again
First, we need to start with some backstory: I have been to France for cycling camps in each of the last two years, and both years there were issues with the flights to France.