Sorry, this feed does not validate.

line 33019, column 65: (9 occurrences)
  <guid>https://www.rstudio.com/blog/rstudio-1-2-preview-cpp/</guid>

In addition, interoperability with the widest range of feed readers could be improved by implementing the following recommendations.

line 19, column 0: (1066 occurrences)
  <guid>https://www.rstudio.com/blog/publishing-your-own-binary-packages ...
line 70, column 0: (109 occurrences)
  <description><p>What a week! Thank you for a fantastic rstudio:: ...
line 138, column 0: (780 occurrences)
  </description>
line 541, column 0: (107 occurrences)
  <guid>https://www.rstudio.com/blog/how-i-use-stories-to-share-data-at- ...
line 769, column 0: (12 occurrences)
  <guid>https://www.rstudio.com/blog/rstudio-recap-from-the-appsilon-shi ...
line 1466, column 0: (37 occurrences)
  <guid>https://www.rstudio.com/blog/changes-for-the-better-in-gt-0-6-0/ ...
line 5941, column 0: (34 occurrences)
  <guid>https://www.rstudio.com/blog/designing-the-data-science-classroo ...
Source: https://blog.rstudio.com/index.xml
<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>RStudio | Open source &amp; professional software for data science teams on RStudio</title><link>https://www.rstudio.com/blog/</link><description>Recent content in RStudio | Open source &amp; professional software for data science teams on RStudio</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Mon, 08 Aug 2022 00:00:00 +0000</lastBuildDate><atom:link href="https://www.rstudio.com/blog/index.xml" rel="self" type="application/rss+xml" /><item><title>Bring Your Own Binary Packages with RSPM</title><link>https://www.rstudio.com/blog/publishing-your-own-binary-packages-with-rspm-2022-07/</link><pubDate>Mon, 08 Aug 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/publishing-your-own-binary-packages-with-rspm-2022-07/</guid><description><h2 id="save-time-with-binary-r-packages-on-linux">Save Time With Binary R Packages on Linux</h2><p>Installing R packages from source can be a slow process. This is compounded by the challenge of making sure you have all the right system libraries and compilers installed. CRAN eases the burden on most desktop R users by providing pre-built binary packages for both Windows and macOS, but Linux users (or anyone using a Linux-based environment like Docker) are still expected to build from source.</p><p>RStudio comes to the rescue of Linux and Docker users with our free <a href="https://packagemanager.rstudio.com">Public Package Manager</a> (PPM) service. We provide binary versions of CRAN packages for the most popular Linux distributions, including Ubuntu, Red Hat, and SUSE. 
Installing these binary packages from PPM can save you hours of frustration compared to building them yourself.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># install the tidyverse from source</span>
<span style="color:#06287e">system.time</span>(<span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tidyverse&#34;</span>, lib<span style="color:#666">=</span><span style="color:#06287e">tempdir</span>(), dependencies<span style="color:#666">=</span><span style="color:#007020;font-weight:bold">TRUE</span>))
<span style="color:#60a0b0;font-style:italic">#&gt; user system elapsed</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 709.348 41.983 757.044</span>

<span style="color:#60a0b0;font-style:italic"># set the repository to PPM (Ubuntu focal)</span>
<span style="color:#06287e">options</span>(repos<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">https://packagemanager.rstudio.com/all/__linux__/focal/latest&#34;</span>)

<span style="color:#60a0b0;font-style:italic"># install the tidyverse</span>
<span style="color:#06287e">system.time</span>(<span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tidyverse&#34;</span>, lib<span style="color:#666">=</span><span style="color:#06287e">tempdir</span>(), dependencies<span style="color:#666">=</span><span style="color:#007020;font-weight:bold">TRUE</span>))
<span style="color:#60a0b0;font-style:italic">#&gt; user system elapsed</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 22.342 3.898 70.377</span></code></pre></div><p>On a Linux system, installing the entire tidyverse suite of packages from PPM can be roughly ten times faster than building it from source!</p><p>You can start using binary packages from PPM by 
making a few changes to your R configuration:</p><ol><li>Open <a href="https://packagemanager.rstudio.com">packagemanager.rstudio.com</a> in your favorite web browser.</li><li>Click <strong>Get Started</strong>.</li><li>Click on <strong>Source</strong> in the upper right corner of the page and select your Linux distribution from the dropdown.</li><li>Click the <strong>Setup</strong> button from the top menu and follow the instructions to reconfigure R (or RStudio) to use PPM as your CRAN repository.</li></ol><p><strong>NOTE</strong>: If you are on Linux and not using RStudio, you may need to update your R configuration to support downloading binary packages from PPM. See <a href="https://packagemanager.rstudio.com/__docs__/admin/serving-binaries/#binaries-r-configuration-linux">R Configuration Steps (Linux)</a> for more details.</p><h2 id="bring-your-own-r-binaries">Bring Your Own R Binaries</h2><p>Using PPM can save you time and frustration for CRAN packages, but what if the package you need isn&rsquo;t available on CRAN? Maybe it&rsquo;s an internally developed package used widely within your own group. Maybe the package is only available on GitHub. Or maybe your organization builds packages only with approved libraries or tools.</p><p>For those users, our commercial <a href="https://www.rstudio.com/products/package-manager/">RStudio Package Manager</a> (RSPM) product may be a solution. With the 2022.07 release, you can now upload custom binary packages for internal, GitHub-only, or otherwise non-CRAN packages and make these binaries available to everyone on your team.</p><h2 id="publish-packages-remotely">Publish Packages Remotely</h2><p>We know many of our customers who maintain their own internal R packages already have systems in place for building and updating them. 
With the 2022.07 release, RSPM also introduces remote publishing with <a href="https://packagemanager.rstudio.com/__docs__/admin/admin-cli/#api-tokens">API tokens</a>, making it easier to integrate securely with your existing package build process or pipeline &ndash; wherever it lives. Administrators have full control over API token creation and lifetimes, and tokens can even be limited in scope to restrict publishing to only specific sources.</p><p>Check out our <a href="https://github.com/rstudio/package-manager-demo">package-manager-demo</a> project on GitHub for an example of enabling API tokens, building, and publishing a package with GitHub Actions.</p><h2 id="learn-more">Learn More</h2><p>In addition to hosting Linux and Windows binary packages, our <a href="https://packagemanager.rstudio.com">Public Package Manager</a> service has other free features such as historic CRAN snapshots and Bioconductor support. For more advanced needs, <a href="https://www.rstudio.com/products/package-manager/">RStudio Package Manager</a> adds additional features to help you easily manage and distribute packages within your organization such as:</p><ul><li>offline CRAN mirrors</li><li>curated repositories</li><li>internal package support</li><li>custom binary packages</li></ul><p>Try switching your CRAN repository to PPM today and see the benefits for yourself! And if you have questions or ideas on how we can make your package management easier, reach out on our <a href="https://community.rstudio.com/c/r-admin/package-manager/21">RStudio Community</a> page.</p></description></item><item><title>Four announcements from rstudio::conf(2022)</title><link>https://www.rstudio.com/blog/four-announcements-from-rstudio-conf-2022/</link><pubDate>Mon, 08 Aug 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/four-announcements-from-rstudio-conf-2022/</guid><description><p>What a week! Thank you for a fantastic rstudio::conf(2022). 
It was so exciting to learn and share with you during these eventful four days.</p><p>This post will share some of the big announcements from RStudio. We will highlight amazing packages, resources, and processes shared by others during conf in upcoming posts.</p><ul><li><a href="#rstudio-pbc-is-changing-its-name-to-posit">RStudio, PBC is changing its name to Posit</a></li><li><a href="#use-quarto-for-creating-content-with-python-r-julia-and-observable">Announcing Quarto, a new open-source scientific and technical publishing system</a></li><li><a href="#new-developments-in-shiny-shiny-for-python-shiny-without-a-server-a-visual-shiny-ui-editor-and-more">New developments in the Shiny ecosystem</a></li><li><a href="#updates-from-the-tidymodels-and-vetiver-teams">Updates from the tidymodels and vetiver teams</a></li></ul><h2 id="rstudio-pbc-is-changing-its-name-to-posit">RStudio, PBC is changing its name to Posit</h2><p>RStudio is <a href="https://www.rstudio.com/blog/rstudio-is-becoming-posit/" target = "_blank">changing its name to Posit</a>. Our mission is to create free and open source software for <em>data science, scientific research, and technical communication</em>. In the decade since RStudio became a company, we’ve learned much about creating a sustainable and organic model for open-source software.</p><ul><li>First is the importance of independence and community. As a Public Benefit Corporation (PBC), we’re legally bound to consider the benefits of our actions on our customers, employees, and the community at large in every decision we make.</li><li>Second is upholding the virtuous cycle. The free and open source tools we build are core productivity tools that anybody can access, regardless of economic means. Our commercial products provide features needed in enterprise settings while enabling us to invest back into our open source tools. 
We will always respect the difference between these lines of work; we aim not to grow at all costs but to build a company still fulfilling its mission in 100 years.</li></ul><p>We also want to impact the practice of science more broadly. For years, RStudio has made R more approachable and usable for millions of users. More recently, we’ve also worked on open source tools for other programming languages, such as reticulate, Quarto, vetiver, and Shiny for Python.</p><blockquote><p>We are not changing our name because we are changing what we are doing. We are changing our name to reflect what we are already doing. We have been working multilingual for many years. Now, we want to announce that to the world.&mdash; Hadley Wickham, Chief Scientist</p></blockquote><p>Our new name tells our multilingual story and reflects our ambitions to make scientific communication better for everyone. With Posit, we’re excited to share what we love about R and RStudio with the wider world.</p><p>Read more in <a href="https://www.rstudio.com/blog/rstudio-is-becoming-posit/" target = "_blank">J.J. Allaire and Hadley Wickham’s blog post</a>. RStudio will officially rebrand as Posit in October 2022. Until then, we will continue to do business as RStudio.</p><h2 id="use-quarto-for-creating-content-with-python-r-julia-and-observable">Use Quarto for creating content with Python, R, Julia, and Observable</h2><p><a href="https://quarto.org/" target = "_blank">Quarto</a> is a new open-source scientific and technical publishing system that works with R, Python, Julia, Javascript, and many other languages. While R Markdown is fundamentally tied to R, the goal of Quarto is to bring the power and flexibility of R Markdown to everyone. Crucially, Quarto enables Python users who prefer to write code in Jupyter Notebooks or VS Code to enjoy the benefits that R Markdown has brought to R users for years.</p><p>With Quarto, you can make websites, books, blogs, and more. 
The <a href="https://quarto.org/docs/guide/" target = "_blank">User Guide</a> is a resource with detailed walkthroughs of Quarto’s functionality. Check out the <a href="https://quarto.org/docs/gallery/" target = "_blank">Gallery</a> to see examples of what’s possible.</p><p>Read more in <a href="https://www.rstudio.com/blog/announcing-quarto-a-new-scientific-and-technical-publishing-system/" target = "_blank">J.J. Allaire’s blog post</a> and join us on August 9th for Tom Mock’s <a href="https://www.youtube.com/watch?v=yvi5uXQMvu4" target = "_blank">Welcome to Quarto workshop</a>.</p><h2 id="new-developments-in-shiny-shiny-for-python-shiny-without-a-server-a-visual-shiny-ui-editor-and-more">New developments in Shiny: Shiny for Python, Shiny without a server, a visual Shiny UI editor, and more</h2><p>🎂 Happy 10th birthday, Shiny!</p><p>Shiny is a framework for building interactive web applications without knowing CSS, HTML, and JavaScript. Released ten years ago, it is a powerful tool used across many contexts and industries. R programmers use Shiny to <a href="https://calcat.covid19.ca.gov/cacovidmodels/" target = "_blank">track COVID cases in California</a> and <a href="https://shiny.rstudio.com/gallery/didacting-modeling.html" target = "_blank">teach linear regression</a>. There are conferences dedicated to Shiny, from Appsilon’s <a href="https://appsilon.com/shiny-conference/" target = "_blank">Shiny Conference</a> earlier this year to Jumping Rivers’ <a href="https://www.jumpingrivers.com/blog/shiny-in-production-conference/" target = "_blank">Shiny in Production</a> event in October. 
As stated in <a href="https://mastering-shiny.org/preface.html" target = "_blank">Mastering Shiny</a> by Hadley Wickham, “Shiny gives you the ability to pass on some of your R superpowers to anyone who can use the web.”</p><p>Presenters at this year’s rstudio::conf unveiled new, exciting developments for Shiny, expanding these superpowers to a larger audience.</p><h3 id="write-shiny-web-applications-with-python">Write Shiny web applications with Python</h3><p>In his keynote speech, Joe Cheng announced <a href="https://shiny.rstudio.com/py/" target = "_blank">Shiny for Python</a>. Python programmers can now try out Shiny’s approachable, reactive framework to create interactive web apps.</p><p>Shiny for Python is currently in alpha, and many resources exist for those interested in trying it. <a href="https://shiny.rstudio.com/py/" target = "_blank">The Shiny for Python website</a> provides API documentation, examples, and articles. VS Code users can download the <a href="https://marketplace.visualstudio.com/items?itemName=rstudio.pyshiny" target = "_blank">Shiny for Python extension</a> to write and preview apps in the editor. Deployment options include <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a>, shinyapps.io, Shiny Server Open Source, and static web servers.</p><p>Check out the <a href="https://www.youtube.com/watch?v=R0vu3zSdvgM&amp;list=PL9HYL-VRX0oTJtI1dWaT9T827fe7OqFhC" target = "_blank">Shiny for Python YouTube playlist</a> to see it in action.</p><h3 id="test-shiny-applications-with-shinytest2">Test Shiny applications with shinytest2</h3><p>Barret Schloerke presented <a href="https://rstudio.github.io/shinytest2/" target = "_blank">shinytest2</a>, a new package on CRAN that leverages the <a href="https://testthat.r-lib.org/" target = "_blank">testthat</a> library for Shiny. Shinytest2 provides regression testing for Shiny applications: users can check existing behavior for consistency over time. 
Written entirely in R, shinytest2 is a streamlined toolkit for unit testing Shiny applications.</p><p>Explore the <a href="https://www.youtube.com/watch?v=7KLv6HdIxvU&amp;list=PL9HYL-VRX0oR_tSCCvpNKBdFtTXfohdTK" target = "_blank">shinytest2 YouTube playlist</a> to get started.</p><h3 id="use-a-visual-editor-for-designing-shiny-apps">Use a visual editor for designing Shiny apps</h3><p>Nick Strayer demonstrated two tools for easier, faster development of Shiny apps:</p><ul><li><a href="https://github.com/rstudio/gridlayout" target = "_blank">gridlayout</a>, a package that helps you build dashboard layouts using an intuitive table-like declaration format</li><li><a href="https://rstudio.github.io/shinyuieditor/index.html" target = "_blank">shinyuieditor</a>, a drag-and-drop visual tool for creating and editing the UI of your Shiny app. The editor produces code so that the app is reproducible.</li></ul><p>Now, it’s easier than ever for anyone to get started designing Shiny user interfaces, even without detailed knowledge of Shiny’s UI functions or HTML layout.</p><p>Watch a <a href="https://www.youtube.com/watch?v=Zac1qdaYNsY" target = "_blank">tour of Shiny UI Editor</a> and a <a href="https://www.youtube.com/watch?v=gYPnLiudtGU" target = "_blank">project walking through how to use the editor</a>.</p><script src="https://fast.wistia.com/embed/medias/lvc3v4p834.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:52.08% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_lvc3v4p834 videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><h3 id="run-shiny-without-a-server">Run Shiny without a server</h3><p>Winston Chang showed how to run “ShinyLive” — Shiny for Python without a server. 
The application runs on the client with no computational load on the server. This is possible because Python can be compiled to <a href="https://webassembly.org/" target = "_blank">WebAssembly</a> (Wasm), a binary format that can run in the browser. With ShinyLive, you can share Shiny apps with just a URL or deploy them to a static web hosting service.</p><p>Winston walks through a <a href="https://www.youtube.com/watch?v=sG2dWWothoM" target = "_blank">Beginner’s Guide to ShinyLive on YouTube</a>. See <a href="https://shinylive.io/py/examples/" target = "_blank">some ShinyLive examples</a> on the Shiny for Python website.</p><h2 id="updates-from-the-tidymodels-and-vetiver-teams">Updates from the tidymodels and vetiver teams</h2><p><a href="https://www.tidymodels.org/" target = "_blank">tidymodels</a> is a collection of R packages for modeling and machine learning using tidyverse principles. It provides users with a consistent, modular, and extensible framework for working with models in R. During their keynote, Julia Silge and Max Kuhn shared how tidymodels helps create ergonomic, effective, and safe code (and announced their new book, <a href="https://www.oreilly.com/library/view/tidy-modeling-with/9781492096474/" target = "_blank">Tidy Modeling with R</a>!).</p><p>The tidymodels team also demonstrated several new packages during conf, extending the framework to more areas and applications.</p><h3 id="deploy-and-maintain-machine-learning-models-with-vetiver">Deploy and maintain machine learning models with vetiver</h3><p>Machine learning operations, or MLOps, is a set of practices to deploy and maintain machine learning in production reliably and efficiently. 
Isabel Zimmerman showed how the new <a href="https://vetiver.rstudio.com/" target = "_blank">vetiver</a> framework provides fluent tooling for MLOps in R and Python.</p><p><img src="images/vetiver.png" alt="Isabel’s slide showing the vetiver workflow where you version, deploy, monitor a model, collect data, understand and clean model, and train and evaluate. Different parts are represented by a cookie. The tools associated with the steps are shown and the vetiver hex sticker is on the left hand side."></p><center><caption><a href="https://isabelizimm.github.io/rstudioconf2022-mlops/#/section" target = "_blank">Slides from Isabel's presentation</a></caption></center><h3 id="run-survival-analysis-with-the-censored-package">Run survival analysis with the censored package</h3><p>Survival analysis is a statistical procedure for analyzing data where the outcome variable of interest is the time until an event occurs. Hannah Frick showcased the <a href="https://censored.tidymodels.org/" target = "_blank">censored</a> package, a <a href="https://parsnip.tidymodels.org/" target = "_blank">parsnip</a> extension that provides support for survival analysis in tidymodels. The package offers several models, engines, and prediction types for users.</p><h3 id="build-unsupervised-models-with-the-tidyclust-package">Build unsupervised models with the tidyclust package</h3><p>Unsupervised learning finds patterns and provides insight from unlabeled data. Emil Hvitfeldt unveiled the <a href="https://emilhvitfeldt.github.io/tidyclust/" target = "_blank">tidyclust</a> package, a reimplementation of tidymodels for clustering models. Users can apply the tidy, unified tidymodels framework to unsupervised learning algorithms like k-means clustering.</p><p>Follow the tidyverse blog to receive the <a href="https://www.tidyverse.org/blog/2022/07/tidymodels-2022-q2/" target = "_blank">quarterly tidymodels digest</a>.</p><h2 id="learn-more">Learn more</h2><p>We have so much more to come. 
Stay in touch:</p><ul><li>See a preview of our new brand at <a href="https://posit.co/" target = "_blank">posit.co</a></li><li><a href="https://www.rstudio.com/blog/subscribe/" target = "_blank">Subscribe to our blog updates</a></li><li><a href="https://rstd.io/glimpse-newsletter" target = "_blank">Subscribe to the new open source rstudio::glimpse() newsletter</a></li><li><a href="https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-august-2022/" target = "_blank">Attend an upcoming Data Science Hangout or Enterprise Meetup</a></li></ul></description></item><item><title>Workbench Session Information Improvements</title><link>https://www.rstudio.com/blog/homepage-session-information-improvements/</link><pubDate>Tue, 02 Aug 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/homepage-session-information-improvements/</guid><description><h2 id="where-it-all-began">Where it all began</h2><p>Sometimes it takes just a passing comment to inspire a mountain of change. 
For this effort, it began with a discussion focused on addressing &ldquo;confusing status differences on the homepage.&rdquo; A broader collection of issues was also found to be caused by these &ldquo;status differences.&rdquo; As we dug deeper into the changes needed to resolve this user experience problem, we realized that further changes would be required to address it properly.</p><p>We set out to achieve several critical goals with this project:</p><ul><li>Improve consistency between statuses wherever the homepage presents them to the user</li><li>Update the back end to provide a reliable and responsive source of truth for session information</li><li>Refine the session creation process to allow for more predictable session launches</li><li>Rename ambiguous or imprecise usages of the term &ldquo;Job(s)&rdquo; that could be confused with other features in the IDE or even the Homepage itself</li></ul><h2 id="the-back-end">The Back End</h2><p>The server received several major improvements to its communication with both the Homepage and sessions. Most notably, the server now tracks the session status explicitly and updates it directly. Before this change, the Homepage would &ldquo;assemble&rdquo; a session&rsquo;s status from several attributes, which would occasionally display unexpected results in edge cases. With the new approach, the Homepage displays exactly what the server reports.</p><p>Sessions now send an update to the server when they become active, which makes it possible to know with much greater accuracy when a session is ready to use. 
We have leveraged this to vastly improve the <a href="#auto-join">auto-join</a> functionality for new sessions.</p><p>Additionally, changes made to <em>how</em> and <em>when</em> we store session metadata also made it possible to rename VS Code and Jupyter sessions (or any other non-R session).</p><p><img src="./images/vs-code-rename.png" alt="Rename a VS Code session"></p><h2 id="the-front-end">The Front End</h2><h3 id="renamed-elements">Renamed Elements</h3><p>Throughout the IDE and the Homepage, mentions of &ldquo;Jobs&rdquo; have been carefully renamed in order to create a clear distinction between their functionality and their associations.</p><p>The &ldquo;Jobs&rdquo; tab in the session information modal has been renamed to &ldquo;Launcher Diagnostics&rdquo; to better illustrate that these are processes running via the Launcher in support of the session itself. A &ldquo;Job&rdquo; better describes a unit of work launched from the IDE.</p><p><img src="./images/launcher-diagnostics-tab.png" alt="Launcher diagnostics tab"></p><p>In the IDE, all instances of &ldquo;Local Jobs&rdquo; are now called &ldquo;Background Jobs.&rdquo; The term &ldquo;background&rdquo; clarifies how and where the job is running in relation to the R session itself.</p><p><img src="./images/ide-renamed-tabs.png" alt="IDE renamed tabs"></p><p>In the IDE and on the Homepage, we have made references to &ldquo;Jobs&rdquo; consistent with each other. In the IDE, &ldquo;Launcher Jobs&rdquo; are now called &ldquo;Workbench Jobs&rdquo; to make their association with the Workbench system clearer. 
On the Homepage, the section that was previously titled &ldquo;Jobs&rdquo; is now titled &ldquo;Workbench Jobs.&rdquo;</p><p><img src="./images/workbench-jobs.png" alt="Workbench jobs"></p><p>Now, when you run a Workbench job, an entry appears in the &ldquo;Workbench Jobs&rdquo; section on the Homepage, as expected.</p><h3 id="status-indicator">Status Indicator</h3><p>The status indicators throughout the Homepage have been updated for visual consistency and use more distinct colors. For the session entries on the Homepage, a tooltip now appears on hover with a brief description of that status.</p><p><img src="./images/session-active.png" alt="New session status indicator"></p><p>This tooltip will also display diagnostic messages, if applicable, when the session is pending. You no longer have to open the session information and check the Launcher Diagnostics tab to see why a session is stuck in a &ldquo;pending&rdquo; state.</p><p><img src="./images/pending-tooltip-with-message.png" alt="Pending tool tip with message"></p><h3 id="information-and-ui-feedback">Information and UI Feedback</h3><p>With our back end communication changes, the Homepage is more responsive to launcher process status updates, and it will now try to obtain the session status as soon as it observes a launcher process status change. For operations like launching sessions, in which the session quickly (or slowly, depending on server capacity) progresses through multiple statuses, you can now follow each status change on the Homepage. For VS Code and Jupyter sessions, the entire launch sequence is now visible.</p><p>User-initiated actions like &ldquo;Suspend&rdquo; or &ldquo;Quit&rdquo; provide immediate visual feedback while the server performs the operation remotely. When you click on either, there is no delay between the action and the response. 
Once the Homepage receives the updated status from the server, it syncs its own state with the state reported by the server.</p><p>A drop-down section has been added to session entries that quickly summarizes the launcher processes associated with that session. For sessions in the &ldquo;Active&rdquo; state, we show a more detailed list. Clicking any of these entries will automatically open the &ldquo;Info&rdquo; modal and scroll to the most recent corresponding process in the &ldquo;Launcher Diagnostics&rdquo; tab.</p><p><img src="./images/session-active-drop-down.png" alt="Active session with drop-down info"></p><h3 id="auto-join">Auto Join</h3><p>Another important change was the re-design of the auto-join functionality when launching sessions. Previously, this was enabled by default, but only when the session launched within a specific time window. For Local Launcher, this often worked, but for the Slurm or Kubernetes plugins, for which session launches can take many minutes depending on server and network conditions, it led to confusing behavior. Sometimes you would be taken into your new session; other times you would be left on the Homepage.</p><p><img src="./images/auto-join-session.png" alt="Auto-join opt-in"></p><p>The launcher dialog now includes an option (enabled by default) that lets you either automatically join a session when it becomes active or stay on the Homepage. When auto-join is enabled, an arrow indicator will appear that clearly marks the pending session.</p><p><img src="./images/auto-join-indicator.png" alt="Auto-join indicator and pop-up"></p><p>There is no time limit for this; a session will be joined once it reports that it is ready, no matter how long that takes, unless the user cancels the pending auto-join with the <code>Escape</code> key or by clicking &ldquo;Cancel Auto-join&rdquo; in the pop-up. 
The pop-up will transition from a blue &ldquo;Info&rdquo; to a yellow &ldquo;Warning&rdquo; after 10 seconds, and eventually to a red &ldquo;Critical&rdquo; after 30 seconds. For Kubernetes setups, these longer waits are most likely to occur while images are being pulled.</p><p>These features are included in the newest version of Workbench, 2022.07.01 &ldquo;Spotted Wakerobin.&rdquo; Installing the upgrade on your servers will bring these improvements to your users the next time they log in to Workbench. In the coming months, Workbench will be re-branded with our new company name, so keep an eye out for that as well!</p></description></item><item><title>Keep the party going after rstudio::conf</title><link>https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-august-2022/</link><pubDate>Mon, 01 Aug 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-august-2022/</guid><description><p>It was amazing to meet so many people at the RStudio Conference last week. If you’d like to keep the party going and are thinking of ways to keep in touch with people you met at talks, Birds of a Feather sessions, or just walking around the hallways (in-person or virtually) - check out the upcoming meetup events!</p><p>This is the RStudio Community Monthly Events Roundup, where we update you on upcoming virtual events happening at RStudio this month. Missed the great talks and presentations from June? Find them listed under <a href="#icymi-june-2022-events">ICYMI: June 2022 Events</a>.</p><p>You can <a href="https://www.addevent.com/calendar/wT379734" target = "_blank">subscribe</a> to the Community Events Calendar so that new events will automatically appear on your calendar. Please note that by subscribing, all of the events in the calendar will appear on your own calendar. 
If you wish to add individual events instead, please use the links below.</p><p>We can’t wait to see you there!</p><ul><li>All are welcome - no matter your industry/experience</li><li>No need to register for anything</li><li>It&rsquo;s always okay to join for part of a session</li><li>You can just listen in if you want</li><li>You can ask anonymous questions too!</li></ul><h2 id="save-the-date">Save the Date</h2><ul><li>August 4, 2022 at 12 ET: Data Science Hangout with Lindsey Dietz, Stress Testing Production Function Lead at the Federal Reserve Bank of Minneapolis (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>August 9, 2022 at 12 ET: Welcome to Quarto Workshop | Led by Tom Mock, RStudio (<a href="http://rstd.io/quarto-meetup" target = "_blank">add to calendar</a>)</li><li>August 11, 2022 at 12 ET: Data Science Hangout with Adam Bly, CEO and Founder at System (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>August 16, 2022 at 12 ET: Functional Data Analysis (Part 2) | Led by Matthew Malloure, Dow Chemical (<a href="https://www.addevent.com/event/uU13446233" target = "_blank">add to calendar</a>)</li><li>August 18, 2022 at 12 ET: Data Science Hangout with Ivonne Carrillo Domínguez, Data Engineering Manager at Bixal (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>August 22, 2022 at 12 ET: Communicating the value of data science | Led by Merav Yuravlivker (<a href="http://rstd.io/champion-chats" target = "_blank">add to calendar</a>)</li><li>August 23, 2022 at 12 ET: RStudio Pharma Meetup Series: Data-as-a-Product - A data science framework for data collaborations | Led by Afshin Mashadi-Hossein, Bristol Myers Squibb (<a href="http://rstd.io/pharma-meetup" target = "_blank">add to calendar</a>)</li><li>August 25, 2022 at 12 ET: Data Science Hangout with Jay Sewell, Director of Analytics at Harry Rosen (<a 
href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>August 30, 2022 at 12 ET: Building a Blog with Quarto | Led by Isabella Velásquez, RStudio (<a href="http://rstd.io/quarto-blog" target = "_blank">add to calendar</a>)</li><li>September 1, 2022 at 12 ET: Data Science Hangout with Tiger Tang, Manager of Data Science at CARFAX (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>September 6, 2022 at 12 ET: RStudio Public Sector Meetup: Employee Engagement in VA with Shiny | Led by Ryan Derickson, Veterans Affairs (<a href="https://rstd.io/gov-meetup" target = "_blank">add to calendar</a>)</li><li>September 13, 2022 at 12 ET: RStudio Sports Analytics Meetup: NFL Big Data Bowl 2022 Winners discuss the Math behind the Path | Led by Robyn Ritchie, Brendan Kumagai, Ryker Moreau, Elijah Cavan (<a href="https://rstd.io/sports-meetup" target = "_blank">add to calendar</a>)</li></ul><h2 id="recurring-events">Recurring Events</h2><h3 id="data-science-hangout">Data Science Hangout</h3><p>The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders to connect and chat about some of the more human-centric questions around data science leadership. These happen every Thursday at 12 ET with a different leader featured each week.</p><p>The conversation is all audience-based and you can jump on whenever it fits your schedule. Add the weekly hangouts <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">to your calendar</a> and check out the <a href="https://www.rstudio.com/data-science-hangout/" target = "_blank">website</a> with all the recordings.</p><h3 id="rstudio-enterprise-community-meetups">RStudio Enterprise Community Meetups</h3><p>We also host industry meetups for teams to share the work they are doing within their organizations, teach lessons learned, and network with others. 
Join the group on <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target = "_blank">Meetup</a>.</p><h2 id="icymi-june-2022-events">ICYMI: June 2022 Events</h2><ul><li>June 1, 2022 at 12 ET: <a href="https://youtu.be/o36425S1-VU?t=163" target = "_blank">Using Python with RStudio Team</a> | Led by David Aja, RStudio</li><li>June 2, 2022 at 12 ET: <a href="https://youtu.be/GrPB-O0gDwU" target = "_blank">Data Science Hangout with Travis Gerke</a>, Director of Data Science at PCCTC</li><li>June 7, 2022 at 12 ET: <a href="https://youtu.be/k3PuGGmA7Hg" target = "_blank">Making microservices a part of your data science team</a> | Led by Tom Schenk &amp; Bejan Sadeghian at KPMG</li><li>June 9, 2022 at 12 ET: <a href="https://youtu.be/qdAroyFRFCg" target = "_blank">Data Science Hangout with Tanya Cashorali</a>, CEO and Founder at TCB Analytics</li><li>June 14, 2022 at 12 ET: <a href="https://youtu.be/Ino-SzgNHR4" target = "_blank">RStudio Healthcare Meetup: Translating facts into insights at Children&rsquo;s Hospital of Philadelphia</a> | Led by Jake Riley</li><li>June 16, 2022 at 12 ET: <a href="https://youtu.be/mr3TmyXOG_g" target = "_blank">Data Science Hangout with David Meza</a>, AIML R&amp;D Lead, People Analytics at NASA</li><li>June 21, 2022 at 12 ET: <a href="https://youtu.be/lCrd3BMVVqQ" target = "_blank">Enabling Citizen Data Scientists with RStudio Academy</a> | Led by James Wade, Dow Chemical</li><li>June 23, 2022 at 12 ET: <a href="https://youtu.be/KKy5kFTpjC0" target = "_blank">Data Science Hangout with Alec Campanini</a>, Senior Manager II, Omni MerchOps Innovation: Assortment &amp; Space Analytics at Walmart</li><li>June 28, 2022 at 12 ET: <a href="https://youtu.be/-FuEXMVbh4o" target = "_blank">RStudio Sports Analytics Meetup: SportsDataverse Initiative</a> | Led by Saiem Gilani, Houston Rockets</li><li>June 30, 2022 at 12 ET: <a href="https://youtu.be/Bo78Vc5h_DQ" target = "_blank">Data Science Hangout with Rebecca Hadi</a>, Head of 
Data Science at Lyn Health</li></ul><h2 id="call-for-speakers">Call for Speakers</h2><p>If you’re interested in sharing your work at a Meetup (or just starting to consider it for a future date down the road!), <a href="https://forms.gle/EtXMpSoTfhpGopiS8" target = "_blank">please fill out the speaker submission form</a>. We’re always looking for a diverse set of speakers — if you are a member of a group that is underrepresented in data science, including people of color, those with disabilities, elders/older adults, LGBTQ folks, and women/minority genders, we highly encourage you to submit!</p></description></item><item><title>Announcing Quarto, a new scientific and technical publishing system</title><link>https://www.rstudio.com/blog/announcing-quarto-a-new-scientific-and-technical-publishing-system/</link><pubDate>Thu, 28 Jul 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-quarto-a-new-scientific-and-technical-publishing-system/</guid><description><p>Today we&rsquo;re excited to announce <a href="https://quarto.org" target = "_blank">Quarto</a>, a new open-source scientific and technical publishing system. Quarto is the next generation of <a href="https://rmarkdown.rstudio.com" target = "_blank">R Markdown</a>, and has been re-built from the ground up to support more languages and environments, as well as to take what we&rsquo;ve learned from 10 years of R Markdown and weave it into a more complete, cohesive whole. While Quarto is a &ldquo;new&rdquo; system, it&rsquo;s important to note that it&rsquo;s highly compatible with what&rsquo;s come before. Like R Markdown, Quarto is also based on <a href="https://yihui.name/knitr" target = "_blank">Knitr</a> and <a href="https://pandoc.org" target = "_blank">Pandoc</a>, and despite the fact that Quarto does some things differently, most existing R Markdown documents can be rendered unmodified with Quarto. 
Quarto also supports <a href="https://jupyter.org" target = "_blank">Jupyter</a> as an alternate computational engine to Knitr, and can also render existing Jupyter notebooks unmodified.</p><p>Some highlights and features of note:</p><ul><li>Choose from multiple computational engines (Knitr, Jupyter, and Observable), which makes it easy to use Quarto with <a href="https://quarto.org/docs/computations/r.html" target = "_blank">R</a>, <a href="https://quarto.org/docs/computations/python.html" target = "_blank">Python</a>, <a href="https://quarto.org/docs/computations/julia.html" target = "_blank">Julia</a>, <a href="https://quarto.org/docs/computations/ojs.html" target = "_blank">Javascript</a>, and many other languages.</li><li>Author documents as plain text markdown or Jupyter notebooks, using a variety of tools including RStudio, VS Code, Jupyter Lab, or any notebook or text editor you like.</li><li>Publish high-quality reports, presentations, websites, blogs, books, and journal articles in HTML, PDF, MS Word, ePub, and more.</li><li>Write with scientific markdown extensions, including equations, citations, crossrefs, diagrams, figure panels, callouts, advanced layout, and more.</li></ul><p>Now is a great time to start learning Quarto as we recently released version 1.0, our first stable release after nearly two years of development. Get started by heading to <a href="https://quarto.org" target = "_blank">https://quarto.org</a>.</p><p>If you are a dedicated R Markdown user, fear not: R Markdown is by no means going away! 
See our <a href="https://quarto.org/docs/faq/rmarkdown.html" target = "_blank">FAQ for R Markdown Users</a> or <a href="https://yihui.org/en/2022/04/quarto-r-markdown/" target = "_blank">Yihui Xie&rsquo;s blog post</a> on Quarto for additional details on the future of R Markdown.</p><p>Below we&rsquo;ll go into more depth on why we decided to create a new system as well as talk more about Quarto&rsquo;s support for the Jupyter ecosystem.</p><h2 id="why-a-new-system">Why a new system?</h2><p>The goal of Quarto is to make the process of creating and collaborating on scientific and technical documents dramatically better. Quarto combines the functionality of R Markdown, bookdown, distill, xaringan, etc. into a single consistent system with “batteries included” that reflects everything we’ve learned from R Markdown over the past 10 years.</p><p>The number of languages and runtimes used for scientific discourse is very large and the Jupyter ecosystem in particular is extraordinarily popular. Quarto is, at its core, multi-language and multi-engine, supporting Knitr, Jupyter, and Observable today and potentially other engines tomorrow.</p><p>While R Markdown is fundamentally tied to R, which severely limits the number of practitioners it can benefit, Quarto is RStudio’s attempt to bring R Markdown to everyone! Unlike R Markdown, Quarto doesn’t require or depend on R. Quarto was designed to be multilingual, beginning with R, Python, Javascript, and Julia, with the idea that it will work even for languages that don’t yet exist.</p><p>While creating a new system has given us the opportunity for a fresh look at things, we have also tried to be as compatible as possible with existing investments in learning, content, and code. 
If you know R Markdown well, you already know Quarto well, and many of your documents are already compatible with Quarto.</p><h2 id="quarto-and-jupyter">Quarto and Jupyter</h2><p>While the R community has mostly focused on plain text R Markdown for literate programming, the Python community has a very strong tradition of using Jupyter notebooks for interactive computing and the interweaving of narrative, code, and output. With Quarto we are hoping to bring what we&rsquo;ve learned about publishing dynamic documents with R to the Jupyter ecosystem.</p><p>One compelling benefit of Quarto supporting both Knitr and Jupyter is that you can create websites and books that include content from both systems in a single project. Whether users prefer to author in plain markdown, computational markdown, or Jupyter notebooks, they can all contribute to the same project. Similarly, code written in R, Python, Julia, and other languages can co-exist in the same project. We believe that providing a common set of tools will facilitate collaboration and make it much easier to weave together contributions from diverse participants into a cohesive whole.</p><p>We also want to enable the many tools built around Jupyter to have access to state-of-the-art scientific publishing capabilities. A great example of this is some recent work we&rsquo;ve done with <a href="https://fast.ai" target = "_blank">https://fast.ai</a> to help integrate Quarto with the <a href="https://nbdev.fast.ai/" target = "_blank">nbdev</a> literate programming system. nbdev enables the development of Python libraries within Jupyter Notebooks, putting all code, tests, and documentation in one place. In nbdev 2, library documentation written in notebooks can be used to automatically create a Quarto website for the library with a single function call.</p><p>Getting more involved with Jupyter as part of working on Quarto has been a great experience. 
We&rsquo;re excited to do more with the Jupyter community and to continue supporting the ecosystem as a sponsor of <a href="https://numfocus.org/" target = "_blank">NumFOCUS</a>.</p><h2 id="learning-more">Learning more</h2><p>Here are some resources that will help you learn more about Quarto:</p><ul><li><a href="https://quarto.org/docs/get-started/">Get started</a> with Quarto by downloading it and following the tutorial for your tool of choice (including <a href="https://quarto.org/docs/get-started/hello/rstudio.html">RStudio</a>, <a href="https://quarto.org/docs/get-started/hello/vscode.html">VS Code</a>, and <a href="https://quarto.org/docs/get-started/hello/jupyter.html">Jupyter Lab</a>).</li><li>See the <a href="https://quarto.org/docs/guide/">User Guide</a> for articles on everything you can do with Quarto, including adding <a href="https://quarto.org/docs/interactive/">interactivity</a>, using <a href="https://quarto.org/docs/extensions/">extensions</a> and <a href="https://quarto.org/docs/extensions/formats.html">custom formats</a>, and <a href="https://quarto.org/docs/publishing/">publishing</a> to a wide variety of destinations.</li><li>Check out the <a href="https://quarto.org/docs/gallery/">Gallery</a> for examples of the things you can do with Quarto.</li><li>Watch all of the Quarto talks from this year&rsquo;s <a href="https://quarto.org/docs/blog/posts/2022-06-21-rstudio-conf-2022-quarto/">rstudio::conf</a>.</li><li>Report any <a href="https://github.com/quarto-dev/quarto-cli/issues">issues</a> you encounter or <a href="https://github.com/quarto-dev/quarto-cli/discussions">start a discussion</a> about using Quarto.</li><li>Follow us on Twitter at <a href="https://twitter.com/quarto_pub">@quarto_pub</a> and <a href="https://quarto.org/docs/blog/">subscribe</a> to our blog.</li></ul><p>We&rsquo;re excited to begin the journey of making Quarto the very best scientific publishing system we can, and look forward to sharing many more developments in the 
months and years ahead.</p></description></item><item><title>RStudio is becoming Posit</title><link>https://www.rstudio.com/blog/rstudio-is-becoming-posit/</link><pubDate>Wed, 27 Jul 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-is-becoming-posit/</guid><description><p>Today we are very excited to announce that RStudio has a new name: Posit. This is a big change, and below we&rsquo;ll talk about exactly why we are doing this and what it means. But first&mdash;why Posit? Posit is a <a href="https://www.vocabulary.com/dictionary/posit" target = "_blank">real word</a> that means to put forth an idea for discussion. Data scientists spend much of their day positing claims that they then evaluate with data. When considering a new name for the company, we wanted something that reflects both the work our community engages in (testing hypotheses!) as well as the scientific aspiration to build ever-greater levels of knowledge and understanding.</p><p>The R and RStudio communities have become something very special. We&rsquo;ve helped people pose and answer difficult and consequential questions with data. We&rsquo;ve built open source tools to make &ldquo;code-first&rdquo; data science accessible and approachable to millions of people, and established reproducibility as a baseline expectation for analysis and communication. And around all of this we&rsquo;ve seen the development of an inclusive, supportive, diverse community, sincerely interested in empowering each other to do more.</p><p>One of the central ideas that this community has rallied behind is the belief that it’s imperative to use open source software for scientific work. Scientific work needs to be reproducible, resilient (not captive to a software vendor), and must encourage broad participation in the creation of the tools themselves. 
At the same time, it is challenging to secure long-term, sustainable funding for the open source software needed to make this happen.</p><p>As the community has grown and we&rsquo;ve seen the impact of our collective efforts, we have realized that one of the most important problems that RStudio has solved is melding its core mission of creating open source software with the imperatives of sustaining a commercial enterprise. This is a tricky business, and especially so today, as corporations are frequently forced into doing whatever it takes to sustain growth and provide returns to shareholders, even against the interests of their own customers! To avoid this problem and codify our mission into our company charter, we re-incorporated as a <a href="https://en.wikipedia.org/wiki/Benefit_corporation" target = "_blank">Public Benefit Corporation</a> in 2019.</p><p>Our charter defines our mission as the creation of free and open source software for <em>data science, scientific research, and technical communication</em>. This mission intentionally goes beyond &ldquo;R for Data Science&rdquo;—we hope to take the approach that’s succeeded with R and apply it more broadly. We want to build a company that is still around in 100 years&rsquo; time and continues to have a positive impact on science and technical communication. We&rsquo;ve only just started along this road: we&rsquo;re experimenting with tools for Python and our new <a href="https://quarto.org/" target = "_blank">Quarto</a> project aims to impact scientific communication far beyond data science.</p><p>In many ways we are at the outset of a new phase of RStudio&rsquo;s development. For the first phase, we made the potentially confusing decision of naming our company after our IDE that was initially focused on R users. We kept that name even as our offerings grew to much more than just an IDE, and served many languages apart from R. 
While that made sense at the time, it’s become increasingly challenging to keep that name as our charter has grown broader.</p><p>While we of course feel sad moving away from the RStudio name that’s served us so well, we also feel excited about the future of Posit. We’re thrilled that we found a name that we think so accurately captures what people do with our tools and we’re excited to make our broader mission more clear to the outside world. We’re also happy that the RStudio name will live on, retaining its original purpose: identifying the best IDE for data science with R.</p><p>What does the new name mean for our commercial software? In many ways, nothing: our commercial products have supported Python for over 2 years. But we will rename them to Posit Connect, Posit Workbench, and Posit Package Manager so it’s easier for folks to understand that we support more than just R. What about our open source software? Similarly, not much is changing: our open source software is and will continue to be predominantly for R. That said, over the past few years we&rsquo;ve already been investing in other languages like reticulate (calling Python from R), Python features for the IDE, and support for Python and Julia within Quarto. You can expect to see more multilanguage experiments in the future.</p><p>So while you will see our name change in a bunch of places (including our main corporate website), we are still continuing on the same path. That path has widened as we have succeeded in the original mission, and we are excited at the chance to bring what we all love so much about the R community to everyone.</p></description></item><item><title>rstudio::conf(2022) Starts Today!</title><link>https://www.rstudio.com/blog/rstudio-conf-2022-starts-today/</link><pubDate>Mon, 25 Jul 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2022-starts-today/</guid><description><p>The day you’ve been waiting for is here: rstudio::conf(2022) starts today! 
We hope you’re ready for four days, 16 workshops, four keynotes, and countless ways to learn and collaborate!</p><p>With so much happening, we wanted to give an overview of what to expect and how to get ready.</p><ul><li><strong>Check out the conf schedule app.</strong> Create your itinerary on your <a href="https://itunes.apple.com/us/app/sched/id1629987732" target = "_blank">Apple</a> or <a href="https://play.google.com/store/apps/details?id=com.sched.rstudioconf2022" target = "_blank">Android</a> device.<ul><li>Find out more on the <a href="https://www.rstudio.com/conference/2022/schedule/" target = "_blank">schedule page</a> by clicking the Mobile App + iCal button.</li></ul></li><li><strong>Participate in a Birds of a Feather event.</strong> BoFs are short, scheduled sessions where people from similar backgrounds meet during rstudio::conf. BoFs include industry-specific groups such as academic research, finance, and pharma and groups with similar interests such as natural language processing or machine learning. Many thanks to the organizations hosting the sponsored BoFs.<ul><li>Add them to your calendar by filtering <a href="https://www.rstudio.com/conference/2022/schedule/" target = "_blank">the schedule</a> to “Social” events.</li></ul></li><li><strong>Join us in The Lounge!</strong> Come hang out and chat with RStudio employees about how you do data science in your day-to-day, what challenges you’re facing, how to learn or teach R on a broad scale, or check out the latest in our open source packages. 
You may even bump into your favorite package developer or software engineer.</li><li><strong>Take part in other social events to connect with the community.</strong> We have a book signing reception for workshop attendees on Monday evening, the welcome reception on Tuesday evening, dinner and activities on Wednesday evening, and an R-Ladies reception on Thursday evening.</li></ul><p>We have more ways to enjoy conf, whether you’re joining us in person or online!</p><ul><li><strong>Watch the keynotes and talks via live stream.</strong> The live streams will be available on the rstudio::conf(2022) website. No registration is required; they are open and free to all! Tune in and ask questions alongside other attendees.</li><li><strong>Join our Discord server to chat and network with RStudio folks and other attendees.</strong> Participate in fun community events, AMAs, and more! Sign up on the <a href="https://www.rstudio.com/conference/" target = "_blank">conference website</a>.</li><li><strong>Follow us on social media.</strong> We’ll be active on <a href="https://twitter.com/rstudio" target = "_blank">RStudio Twitter</a>, <a href="https://twitter.com/rstudio_glimpse" target = "_blank">RStudio Glimpse</a>, <a href="https://www.linkedin.com/company/rstudio-pbc" target = "_blank">LinkedIn</a>, <a href="https://www.instagram.com/rstudio_pbc/?hl=en" target = "_blank">Instagram</a>, and <a href="https://www.tiktok.com/t/ZTRBRBpvg/" target = "_blank">TikTok</a>!</li><li><strong>Use the #RStudioConf and #RStudioConf2022 hashtags to engage and share with others.</strong></li></ul><p>If you are wondering how to fit everything you want to do in these four days, Tracy Teal, RStudio&rsquo;s Open Source Program Director, recommends coming to conf with one or two goals in mind. Is it someone you want to meet, something you want to learn, a talk you want to see? Then organize your activities accordingly so that you can make them happen! 
And of course, please reach out to RStudio staff if we can help you achieve your goals.</p><p>We can’t wait to see you at rstudio::conf(2022)!</p></description></item><item><title>Join rstudio::conf(2022) Virtually</title><link>https://www.rstudio.com/blog/rstudio-conf-2022-virtual/</link><pubDate>Thu, 21 Jul 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2022-virtual/</guid><description><p>While we hope to see you in person at rstudio::conf(2022), we want to include as many of you as possible, so we invite you to join us virtually!</p><ul><li><strong>Live streaming:</strong> Keynotes and talks will be livestreamed on the <a href="https://www.rstudio.com/conference/" target = "_blank">rstudio::conf website</a>, free and open to all. No registration is required.</li><li><strong>Virtual networking on Discord:</strong> Sign up to access the conference Discord server so you can chat with other attendees, participate in fun community events, and keep up with announcements. This is open to both in-person and virtual attendees.<ul><li>A lot of RStudio folks are attending conf virtually. Come hang out!</li><li>Need help? 
Ask questions on the #🧐-discord-help-and-how-to channel.</li><li><a href="https://www.rstudio.com/conference/" target = "_blank">Sign up for Discord server access!</a> The sign-up form is under the &ldquo;Participate virtually&rdquo; heading.</li></ul></li></ul><p><img src="form.png" alt="rstudio conf webpage's participate virtually section in a red rectangle and the form sign up link highlighted in yellow"><center><i><caption>Discord Sign Up Form Location</caption></i></center></p><ul><li><strong>Social media:</strong> Follow our social media accounts and use the hashtags #RStudioConf and #RStudioConf2022 to share and engage with others!<ul><li><a href="https://twitter.com/rstudio" target = "_blank">RStudio Twitter</a></li><li><a href="https://twitter.com/rstudio_glimpse" target = "_blank">RStudio Glimpse Twitter</a></li><li><a href="https://www.linkedin.com/company/rstudio-pbc/" target = "_blank">RStudio LinkedIn</a></li><li><a href="https://www.instagram.com/rstudio_pbc/?hl=en" target = "_blank">RStudio Instagram</a></li><li><a href="https://www.tiktok.com/t/ZTRBRBpvg/" target = "_blank">RStudio TikTok</a></li></ul></li></ul><h2 id="plan-your-virtual-experience">Plan your virtual experience</h2><ul><li>Go to the <a href="https://www.rstudio.com/conference/2022/schedule/" target = "_blank">Schedule page</a> and add talks to your calendar.</li><li>On July 27-28th, head to the <a href="https://www.rstudio.com/conference/" target = "_blank">conference website</a> to watch the livestreams and ask questions alongside other attendees.</li><li><a href="https://www.rstudio.com/conference/" target = "_blank">Join the Discord server</a> to chat, network, and share with other attendees at conf.</li><li>Follow our activity on social media.</li><li>Use the hashtags #RStudioConf and #RStudioConf2022 to be part of the party online. 
Post photos, quotations, thoughts, or questions!</li></ul><p>Recordings of the keynotes and talks will be available on the RStudio website in a few weeks.</p><p>We can’t wait to see you there!</p></description></item><item><title>RStudio 2022.07.0: What's New</title><link>https://www.rstudio.com/blog/rstudio-2022-07-0-what-s-new/</link><pubDate>Wed, 20 Jul 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-2022-07-0-what-s-new/</guid><description><p>This post highlights some of the improvements in the latest RStudio IDE release 2022.07.0, code-named &ldquo;Spotted Wakerobin&rdquo;. To read about all of the new features and updates available in this release, check out the latest <a href="https://www.rstudio.com/products/rstudio/release-notes/" target="_blank">Release Notes</a>.</p><ul><li><a href="#find-in-files-improvements">Find in Files improvements</a></li><li><a href="#hyperlinks">Hyperlinks</a></li><li><a href="#support-for-r-4-2-0">Support for R (&gt;= 4.2.0)</a></li><li><a href="#more-info">More info</a></li></ul><h2 id="find-in-files-improvements">Find in Files improvements</h2><p>We&rsquo;ve made significant improvements to the Find in Files pane across all platforms, and particularly on Windows.</p><ul><li><p>The Find in Files pane has gained a Refresh button, so that users can manually refresh possible matches/replacements to capture any changes to the files since the search was last run.<img src="images/screenshot_refresh_button.jpg" alt="Refresh Find in Files"></p></li><li><p>We&rsquo;ve upgraded the version of <code>grep</code> we use with RStudio on Windows. This more modern version of <code>grep</code> enables improved searching through directories and subdirectories with non-ASCII characters in the path name, such as <code>C:\Users\me\Éñçĥìłăḏà</code> or <code>C:\你好\你好</code>.</p></li><li><p>We&rsquo;ve also changed the flavor of regular expressions supported by the Find in Files search. 
Previously, Find in Files supported only POSIX Basic Regular Expressions. As of this release, Find in Files is now powered by <a href="https://en.wikibooks.org/wiki/Regular_Expressions/POSIX_Basic_Regular_Expressions" target="_blank">Extended Regular Expressions</a> when the <code>Regular expression</code> checkbox is checked. What does this mean for your searches? Previously, if you used the special characters <code>?</code>, <code>+</code>, <code>|</code>, <code>{}</code>, or <code>()</code>, they were treated as character literals; now they will be interpreted according to their special regex meaning when unescaped, and as character literals only when escaped with a single backslash. This change also adds support for Find and Replace using regular expressions with <code>\b</code>, <code>\w</code>, <code>\d</code>, <code>\B</code>, <code>\W</code>, and <code>\D</code>, which now return the expected results in both Find and Replace mode. These changes bring Find in Files search more closely in line with the flavor of regular expressions supported by R&rsquo;s base <code>grep</code> function (using <code>perl = FALSE</code>), but note that where the <code>grep</code> function within R requires double backslashes, Find in Files requires only a single backslash as the escape character.</p></li></ul><p><img src="images/screenshot_regex_find.jpg" alt="Find in Files Regular expression Find"><img src="images/screenshot_regex_replace.jpg" alt="Find in Files Regular expression Replace"></p><ul><li>When using Find in Files with the search directory set to a Git repository, users will by default have the option to ignore searching through any files or subdirectories listed within the <code>.gitignore</code> for that repo. 
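The basic-versus-extended distinction described above can be tried directly with <code>grep</code>, the tool that backs Find in Files. This is a minimal sketch using standard grep flags; the grep binary RStudio bundles may differ in details, so treat it as an illustration of the two flavors rather than the IDE's exact behavior:

```shell
# BRE (grep's default): '+' is a literal character,
# so the pattern 'a+' matches only the literal text "a+".
printf 'aaa\n' | grep 'a+' || echo 'no BRE match'

# ERE (grep -E): '+' means "one or more", which is how the
# Find in Files regular-expression mode now treats it.
printf 'aaa\n' | grep -E 'a+'
```

The same distinction explains why <code>?</code>, <code>|</code>, <code>{}</code>, and <code>()</code> now need a backslash in Find in Files to be matched literally.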
Users can uncheck this option if they wish to include these files in their search.</li></ul><p><img src="images/screenshot_git_grep.jpg" alt="Find in Files Exclude .gitignore"></p><p>A number of other small bug fixes have been included in this release to improve the reliability and usability of Find in Files search. We hope this makes the feature more powerful and straightforward for users.</p><h2 id="hyperlinks">Hyperlinks</h2><p>Support for hyperlinks, as generated by <a href="https://cli.r-lib.org/reference/style_hyperlink.html" target="_blank">cli::style_hyperlink()</a>, has been added to the console output, build pane, and various other places. Depending on their URL, clicking a hyperlink will:</p><ul><li><p>go to a website <code>cli::style_hyperlink(&quot;tidyverse&quot;, &quot;https://www.tidyverse.org&quot;)</code>, a local file <code>cli::style_hyperlink(&quot;file&quot;, &quot;file:///path/to/file&quot;)</code>, or a specific line/column of a file <code>cli::style_hyperlink(&quot;file&quot;, &quot;file:///path/to/file&quot;, params = c(line = 10, col = 4))</code></p></li><li><p>open a help page <code>cli::style_hyperlink(&quot;summarise()&quot;, &quot;ide:help:dplyr::summarise&quot;)</code> or a vignette <code>cli::style_hyperlink(&quot;intro to dplyr&quot;, &quot;ide:vignette:dplyr::dplyr&quot;)</code>, with some preview information in the popup when the link is hovered over.<img src="images/screenshot_hyperlink_ide_help.jpg" alt="Help Page Hyperlink popup"></p></li><li><p>run code in the console <code>cli::style_hyperlink(&quot;Show last error&quot;, &quot;ide:run::rlang::last_error()&quot;)</code>. This also shows information about the function that will run when the link is clicked.<img src="images/screenshot_hyperlink_ide_run.jpg" alt="Run Hyperlink popup"></p></li></ul><p>Some packages (e.g. 
<code>testthat</code> and <code>roxygen2</code>) have started to take advantage of this feature to improve their user experience, and we hope this will inspire other packages.<img src="images/screenshot_hyperlink_testthat.jpg" alt="testthat failure link"></p><h2 id="support-for-r-4-2-0">Support for R (&gt;= 4.2.0)</h2><p>R 4.2+, officially released in April 2022, received extensive IDE support in the previous release of the RStudio IDE. In this release, we add support for some additional features and include some critical bug fixes.</p><ul><li>We resolved an issue where files would appear to be blank when opened in projects not using UTF-8 encoding on Windows with R 4.2.0, which could result in users inadvertently overwriting their files with an empty file.</li><li>We added further support for the R native pipe, first introduced in R 4.1. Code diagnostics now recognize and support the use of unnamed arguments in conjunction with the native pipe (e.g. <code>LETTERS |&gt; length()</code>) as well as the use of the new placeholder character (e.g. <code>mtcars |&gt; lm(mpg ~ cyl, data = _)</code>) added in R 4.2.0.</li><li>We&rsquo;ve also made it easier for users to configure whether they want to use the native R pipe <code>|&gt;</code> or the <code>magrittr</code> pipe <code>%&gt;%</code> when using the <em>Insert pipe</em> command (Cmd/Ctrl + Shift + M). Previously, this was only configurable at the global level, from the Global Options pane. 
As of this release, you can now inherit or override the global option in Project Options as well, to help maintain code style consistency within an RStudio project.<img src="images/screenshot_pipe_project_option.jpg" alt="Setting pipe operator in Project Options"></li><li>R 4.2 also introduced extensive changes to the Help system; we&rsquo;ve updated support for this new enhanced Help system to ensure it displays crisply and legibly in the IDE, especially when using a dark theme.<img src="images/screenshot_enhanced_help_pane.jpg" alt="Enhanced Help in R 4.2"></li></ul><h2 id="more-info">More info</h2><p>There&rsquo;s lots more in this release, and it&rsquo;s <a href="https://www.rstudio.com/products/rstudio/download/" target="_blank">available for download today</a>. You can read about all the features and bug fixes in the RStudio 2022.07.0 &ldquo;Spotted Wakerobin&rdquo; release in the <a href="https://www.rstudio.com/products/rstudio/release-notes/" target="_blank">RStudio Release Notes</a>. We&rsquo;d love to hear your feedback about the new release on our <a href="https://community.rstudio.com/c/rstudio-ide/9" target="_blank">community forum</a>.</p></description></item><item><title>rstudio::glimpse() Newsletter</title><link>https://www.rstudio.com/blog/rstudio-glimpse-newsletter-01/</link><pubDate>Thu, 14 Jul 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-glimpse-newsletter-01/</guid><description><div class="lt-gray-box">Tracy Teal is the Open Source Program Director at RStudio.<br></div><p>It&rsquo;s the lead-up to rstudio::conf(2022), and I&rsquo;m quickly learning what &lsquo;conf-driven development&rsquo; means. There&rsquo;s a lot going on, and more to come at conf!</p><p>I&rsquo;ve been here about a year now as the Open Source Program Director, and am excited about what we do every day. To share that excitement and provide a <code>glimpse()</code> into our tools and how to use them, I&rsquo;m kicking off this newsletter. 
I&rsquo;ll share highlights, new releases, learning resources and some quick tips and tricks. I hope it helps you keep up to date, learn some things, and have some fun too.</p><p>This is my first newsletter, so I&rsquo;m trying out the format and topic areas. I&rsquo;d love to hear what you think and what you&rsquo;d like to see. Comment in RStudio Community and follow along at our <a href="https://twitter.com/rstudio_glimpse" target = "_blank">@rstudio_glimpse Twitter account</a> too.</p><h2 id="roundup">Roundup</h2><ul><li><strong><a href="https://www.tidyverse.org/blog/2022/06/announce-vetiver/">MLOps in R and Python</a>.</strong> We’re thrilled to <a href="https://www.rstudio.com/blog/announce-vetiver/">announce vetiver</a> to provide fluent tooling to version, share, deploy, and monitor a trained model. Using <a href="http://vetiver.rstudio.com/">vetiver</a> for MLOps lets you use the tools you are comfortable with, in R or Python, for exploratory data analysis and model training/tuning, and provides a flexible framework for the parts of a model lifecycle not served as well by current approaches.</li><li><strong><a href="https://www.tidyverse.org/blog/2022/06/actions-2-0-0/">GitHub actions for R developers v2</a>.</strong> The <a href="https://github.com/r-lib/actions#readme">r-lib/actions repo</a> has a number of reusable actions that perform common R-related tasks and make continuous integration for R package development easier. <a href="https://www.tidyverse.org/blog/2022/06/actions-2-0-0/#what-is-new">Updates</a> include simpler workflows, snapshots as artifacts, and more!</li><li><strong>Try out <a href="https://www.tidyverse.org/blog/2022/05/case-weights/">case weights in tidymodels</a>.</strong> Case weights are non-negative numbers used to specify how much each observation influences the estimation of a model. 
This has been in the works for a while, and the tidymodels team would love feedback.</li><li><strong>Easier installation for TinyTeX and tinytex.</strong> TinyTeX and tinytex have <a href="https://yihui.org/en/2022/05/tinytex-changes/#migrating-to-the-rstudio-org-on-github">moved GitHub organizations</a> to make it safer to build, install, and manage future contributions. TinyTeX users, you can <a href="https://yihui.org/en/2022/05/tinytex-full/">install the full TeX Live now</a> if you prefer! It&rsquo;s a few gigabytes and contains all possible LaTeX packages.</li><li><strong>Need to test a Shiny app?</strong> <a href="https://github.com/rstudio/shinytest2">shinytest2</a> provides a streamlined toolkit for <a href="http://schloerke.com/presentation-2022-04-27-appsilon-shinytest2/#0">unit testing Shiny applications</a> and seamlessly integrates with the popular <a href="https://github.com/r-lib/testthat">testthat</a> framework for unit testing R code.</li><li><strong><a href="https://www.rstudio.com/conference/">rstudio::conf(2022)</a> is coming July 25th!</strong> See the <a href="https://www.rstudio.com/conference/2022/schedule/">full schedule</a>. There’s still time to <a href="https://na.eventscloud.com/ereg/newreg.php?eventid=665183&amp;">sign up</a> for in-person workshops or the conference. Head to the <a href="https://www.rstudio.com/conference/">conference page</a> to participate virtually and watch the live streams. You’ll also be able to catch recordings after the conference. There will be so much for all of us to share with each other!</li></ul><h2 id="learn-teach-share">Learn. Teach. Share.</h2><ul><li><strong>Playing games with Shiny.</strong> Dragons, numbers and Shiny, what could be better! Jesse Mostipak and Barret Schloerke <a href="https://www.youtube.com/watch?v=sD39WAZo99A&amp;t=216s">live code some Shiny games</a>. 
Come for the Shiny games, stay for the awesome example of paired coding.</li><li><strong>Wordle with Shiny.</strong> Winston Chang created a Shiny Wordle app, and shows how in this <a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oQnWIeY_ydYBdU76iQ-tchU">three-part video</a>. I loved it so much, I <a href="https://www.rstudio.com/blog/shiny-wordle-journey/">created my own</a> too!</li><li><strong>Tables. Tables. Tables.</strong> Looking to up your table game, or get started with tables? See the great examples from the <a href="https://community.rstudio.com/c/table-gallery/64">table gallery</a> and Rich Iannone and Jesse Mostipak’s <a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oR-ISYQHemal6VgxaXHS_GT">amazing tables videos</a> including tables battles, making beautiful tables, and new features in <a href="https://gt.rstudio.com/">gt</a>.</li><li><strong>Torch outside the box.</strong> Sometimes, a software’s best feature is the one you’ve added yourself. See an example of why you <a href="https://blogs.rstudio.com/ai/posts/2022-04-27-torch-outside-the-box/">may want to extend torch</a>, how to proceed, and an <a href="https://blogs.rstudio.com/ai/posts/2022-05-18-torchopt/">example of torchopt</a>.</li><li><strong>Getting started with deep learning in R.</strong> Announcing the <a href="https://blogs.rstudio.com/ai/posts/2022-05-31-deep-learning-with-r-2e/">release of “Deep Learning with R, 2nd Edition,”</a> from François Chollet, Tomasz Kalinowski and J. J. Allaire. The book is over a third longer, with more than 75% new content, so basically a new book! 
It shows you how to get started with deep learning in R, even if you have no background in mathematics or data science.</li><li><strong>Teaching the tidyverse in 2021.</strong> Mine Çetinkaya-Rundel continues her great series on <a href="https://www.tidyverse.org/blog/2021/08/teach-tidyverse-2021/">teaching the tidyverse</a> and has tips for folks who might want to update their teaching materials to include the last year’s changes in the tidyverse.</li></ul><h2 id="highlighted-how-tos">Highlighted How-To’s</h2><p><a href="https://www.rstudio.com/blog/how-i-use-stories-to-share-data-at-meetings/">How I Use Stories to Share Data at Meetings</a> by Ryan Estrellado. Learn how Ryan more effectively communicates information for data-driven decision making, using the Palmer penguins dataset as an example!</p><p><a href="https://www.rstudio.com/blog/r-markdown-tips-and-tricks-3-time-savers/">R Markdown Tips and Tricks #3: Time-savers and Troubleshooters</a> by Brendan Cullen, Alison Hill and Isabella Velásquez. There’s even more in <a href="https://www.rstudio.com/blog/r-markdown-tips-tricks-1-rstudio-ide/">Part 1: Working in the RStudio IDE</a> and <a href="https://www.rstudio.com/blog/r-markdown-tips-tricks-2-cleaning-up-your-code/">Part 2: Cleaning up your code</a>.</p><h2 id="selected-new-releases">Selected new releases</h2><p><b><a href="https://www.rstudio.com/blog/pins-for-python/">pins for Python</a></b><br>We’re excited to announce the release of pins for Python! pins removes the hassle of managing data across projects, colleagues, and teams by providing a central place for people to store, version and retrieve data. 
If you’ve ever chased a CSV through a series of email exchanges, or had to decide between data-final.csv and data-final-final.csv, then pins is for you.</p><p><b><a href="https://rstudio.github.io/reticulate/news/index.html">reticulate 1.25</a></b><br>The <a href="https://rstudio.github.io/reticulate/">reticulate</a> package provides a comprehensive set of tools for interoperability between Python and R. New release fixes some issues and has some exception handling changes.</p><p><b><a href="https://www.rstudio.com/blog/changes-for-the-better-in-gt-0-6-0/">gt 0.6.0</a></b><br><a href="https://gt.rstudio.com/">gt</a> 0.6.0 is here, along with several new functions! Join <a href="https://twitter.com/riannone">Rich Iannone </a> on a tour of these functions and how they work in <a href="https://t.co/vD7YUUi5rN">the release video</a>.</p><p><b><a href="https://github.com/rstudio/distill/releases/tag/v1.4">distill 1.4</a></b><br>New release of <a href="https://rstudio.github.io/distill/">distill</a> fixes some outstanding issues and improves highlighting and code folding.</p><p><b><a href="https://blogs.rstudio.com/ai/posts/2022-06-09-tf-2-9/">TensorFlow and Keras 2.9</a></b><br>These releases bring many refinements that allow for more idiomatic and concise R code.</p><h3 id="tidyverse">tidyverse</h3><p><b><a href="https://www.tidyverse.org/blog/2022/06/dbplyr-2-2-0/">dbplyr 2.2.0</a></b><br><a href="https://dbplyr.tidyverse.org/">dbplyr</a> is a database backend for dplyr that allows you to use a remote database as if it was a collection of local data frames: you write ordinary dplyr code and dbplyr translates it to SQL for you. 
This release has improvements to SQL translations and support for dplyr’s <code>rows_*()</code> functions.</p><p><b><a href="https://www.tidyverse.org/blog/2022/04/haven-2-5-0/">haven 2.5.0</a></b><br><a href="https://haven.tidyverse.org/">haven</a> allows you to read and write SAS, SPSS, and Stata data formats from R, thanks to the wonderful ReadStat C library written by Evan Miller.</p><p><b><a href="https://github.com/r-lib/pkgload/releases/tag/v1.3.0">pkgload 1.3.0</a></b><br>The goal of <a href="https://pkgload.r-lib.org/">pkgload</a> is to simulate the process of installing and loading a package, without actually doing the complete process, and hence making package iteration much faster. The main feature in this release is that <code>devtools::load_all()</code> now prompts you to install missing dev deps with pak.</p><p><b><a href="https://github.com/r-lib/rig/releases">rig 0.5.0</a></b><br><a href="https://github.com/r-lib/rig">rig</a> lets you install and manage multiple #rstats versions on macOS, Windows or Linux. Many exciting new features in this and previous releases, including a macOS menu bar app.</p><p><b><a href="https://rlang.r-lib.org/news/index.html">rlang 1.0.4</a></b><br><a href="https://rlang.r-lib.org/">rlang</a> is a collection of frameworks and APIs for programming with R. The new release features an opt-in display for backtraces that we are considering enabling by default in the future. 
This new display shows more frames by default and uses colours to deemphasise the parts that were previously hidden.</p><p><b><a href="https://www.tidyverse.org/blog/2022/05/roxygen2-7-2-0/">roxygen2 7.2.0</a></b><br><a href="https://roxygen2.r-lib.org/">roxygen2</a> allows you to write specially formatted R comments that generate R documentation files (<code>man/*.Rd</code>) and the NAMESPACE file.</p><p><b><a href="https://www.tidyverse.org/blog/2022/04/scales-1-2-0/">scales 1.2.0</a></b><br>The <a href="https://scales.r-lib.org/">scales</a> package provides much of the infrastructure that underlies <a href="https://ggplot2.tidyverse.org/">ggplot2’s</a> scales, and using it allows you to customize the transformations, breaks, and labels used by ggplot2.</p><h3 id="tidymodels">tidymodels</h3><p><b><a href="https://www.tidyverse.org/blog/2022/06/bonsai-0-1-0/">bonsai 0.1.0</a></b><br>This is the first release of the bonsai package on CRAN. bonsai is a parsnip extension package for tree-based models.</p><p><b><a href="http://censored.tidymodels.org/">censored 0.1.0 </a></b><br>This is a new package! <a href="http://censored.tidymodels.org/">censored</a> is a parsnip extension package for survival models.</p><p><b><a href="https://www.tidyverse.org/blog/2022/05/recipes-update-05-20222/">recipes extension</a></b><br><a href="https://recipes.tidymodels.org/">recipes</a> is a package for preprocessing data before using it in models or visualizations. You can think of it as a mash-up of model.matrix() and dplyr.</p><p><b><a href="https://www.tidyverse.org/blog/2022/06/spatialsample-0-2-0/">spatialsample 0.2.0</a></b><br>spatialsample is a package for spatial resampling, extending the rsample framework to help create spatial extrapolation between your analysis and assessment data sets.</p><p><b><a href="https://github.com/tidymodels/stacks">stacks 1.0.0</a></b><br>stacks is on CRAN, enabling stacked ensemble modeling with tidy data principles. 
See more in this <a href="https://joss.theoj.org/papers/10.21105/joss.04471">short paper on the package published in JOSS</a>.</p><h3 id="shiny">Shiny</h3><p><b><a href="https://rstudio.github.io/chromote/">chromote</a></b><br>chromote is an R implementation of the full Chrome dev tools protocol, providing a headless Chrome Remote Interface. The website has a TON of usage examples!</p><p><b><a href="https://rstudio.github.io/shinytest2/">shinytest2</a></b><br>shinytest2 provides a streamlined toolkit for unit testing Shiny applications and seamlessly integrates with the popular testthat framework for unit testing R code.</p><p><b><a href="http://rstudio.github.io/webshot2">webshot2</a></b><br>webshot2 enables you to take screenshots of web pages from R, replacing webshot by using chromote instead of PhantomJS.</p><h2 id="wrapping-up">Wrapping Up</h2><p>Thank you for reading our first rstudio::glimpse() newsletter. Let me know if you have any comments, ideas or suggestions for what you’d like to see in the newsletter by talking about this post on RStudio Community!</p><p>And finally:</p><details><summary>How do computers like their snacks?</summary>Byte size.</details></description></item><item><title>How I Use Stories to Share Data at Meetings</title><link>https://www.rstudio.com/blog/how-i-use-stories-to-share-data-at-meetings/</link><pubDate>Wed, 06 Jul 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/how-i-use-stories-to-share-data-at-meetings/</guid><description><div class="lt-gray-box"><p>Ryan Estrellado is a writer and educator. 
He is the author of the book <a href="https://ryanestrellado.com/the-k12-educators-data-guidebook" target = "_blank">The K–12 Educator’s Data Guidebook: Reimagining Practical Data Use in Schools</a> and a co-author of <a href="https://ryanestrellado.com/data-science-in-education-using-r" target = "_blank">Data Science in Education Using R</a>.</p>Ryan tells inspiring stories about the reality of education work, from overcoming a fear of data to finding a creative practice in the workplace. He has over twenty years of experience in public education. Ryan writes about data-driven decision making in schools and how to build creative education careers at <a href="https://ryanestrellado.com" target = "_blank">ryanestrellado.com</a>.</div><p><strong>Want to keep people awake at data presentations? Try sharing your findings backwards. You’ve worked it. Now put your chart down, flip it and reverse it.</strong></p><p>The first time I put someone to sleep with a data presentation was in 2003.</p><p>I was a school psychologist in a public school in the United States. That meant, among other things, that I tested elementary school students for learning disabilities. Then I’d share the results at meetings with parents and school staff so we could all help the student better.</p><p>At one of these meetings, as I soldiered through every single data point of my tests, I noticed a teammate had gone silent. At first, I thought she was using active listening skills. Then it struck me that one can’t do that with their eyes closed.</p><p>Yes, she was sleeping. In providing a safe place to nap, I had done a service for the public education community. But it wasn’t the service I was paid to do.</p><p>Over the next few years, people kept falling asleep at my meetings. And recently, I realized what I was doing wrong. 
In this article, I’ll be sharing those lessons with you.</p><h2>Sequence Well or Risk Slumber</h2><p>If this article were called, “How to Put People to Sleep With Data Presentations,” the main point would be this: Present findings to your audience in the same sequence that you discovered them. That’s exactly what I did all those years as a school psychologist. And we already know how that turned out. So what to do?</p><p>In Season 5, Episode 21 of <em>Seinfeld</em>, George Costanza laments his bad fortune and resolves to do the opposite of every choice he’s ever made:</p><blockquote><p>No, wait a minute. I always have tuna on toast. Nothing’s ever worked out for me with tuna on toast. I want the complete opposite of tuna on toast. Chicken salad. On rye. Untoasted.</p></blockquote><p>I had a similar experience reflecting on my data presentations. People fell asleep when I shared my findings in the same order that I discovered them. Like George, I wanted the complete opposite: Presenting the findings in the <em>reverse</em> order that I discovered them. I mean, if one way puts the audience to sleep, then shouldn’t the opposite way keep them awake?</p><p>Or, as Jerry puts it in that episode of <em>Seinfeld</em>, “If every instinct you have is wrong, then the opposite would have to be right.”</p><center><blockquote class="twitter-tweet"><p lang="en" dir="ltr">In short, I presented information in the same order I discovered it during the analysis. The whole game changed when I started presenting information in the REVERSE order I discovered it in the analysis. 
Let me explain: (2/7) <a href="https://t.co/Yzaub9YQFX">pic.twitter.com/Yzaub9YQFX</a></p>— Ryan Estrellado (<span class="citation">@ry_estrellado</span>) <a href="https://twitter.com/ry_estrellado/status/1409900670317064196?ref_src=twsrc%5Etfw">June 29, 2021</a></blockquote><script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></center><p>In the rest of this post, I’ll share the sequence of steps I take to discover my findings. Then I’ll reverse those steps to craft a story for a presentation. I’ll explore the Palmer Penguins R package to show how this idea works.</p><h2>The Analysis: Start With the Data Points</h2><p>When I do any kind of data analysis, I follow some version of this sequence:</p><ol style="list-style-type: decimal"><li>Look at all the data points</li><li>Note interesting details</li><li>Categorize details into interesting themes</li></ol><p>I go through these steps to explore the Palmer Penguins dataset in the next few sections. As I do, notice how I start with broad questions and arrive at more specific themes.</p><h3>Look at All the Data Points</h3><p>First, I’ll use a basic question to start my exploration. During this stage, I like to let my curiosity lead. 
When I look through the Palmer Penguins dataset, I notice there are variables that describe measurements:</p><pre class="r"><code>library(tidyverse)
library(palmerpenguins)</code></pre><pre class="r"><code>glimpse(penguins)</code></pre><pre><code>#&gt; Rows: 344
#&gt; Columns: 8
#&gt; $ species           &lt;fct&gt; Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
#&gt; $ island            &lt;fct&gt; Torgersen, Torgersen, Torgersen, Torgersen, Torgerse…
#&gt; $ bill_length_mm    &lt;dbl&gt; 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, …
#&gt; $ bill_depth_mm     &lt;dbl&gt; 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, …
#&gt; $ flipper_length_mm &lt;int&gt; 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186…
#&gt; $ body_mass_g       &lt;int&gt; 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, …
#&gt; $ sex               &lt;fct&gt; male, female, female, NA, female, male, female, male…
#&gt; $ year              &lt;int&gt; 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…</code></pre><p>So for this example, I use this question: Do different species of penguins have different measurements?</p><p>When I’m exploring a dataset, I visualize the data in various ways. Again, at this stage, I’m not overthinking it. It’s more like sketching ideas than it is painting a masterpiece.</p><p>I’m curious about differences in measurements, so I plot the bill length, bill depth, and flipper length. I use a scatter plot because I want to see each data point. 
I also color the points by species, based on the hunch that different species have different measurements:</p><pre class="r"><code>ggplot(penguins,
       aes(x = body_mass_g,
           y = bill_length_mm,
           color = species)) +
  geom_point() +
  labs(title = &quot;Bill Length By Species&quot;)</code></pre><p><img src="figsBill%20Length%20Plot-1.png" title="Scatterplot with body mass on the x-axis, bill length on the y-axis, colored by species, showing a slightly positive relationship with Chinstrap penguins having higher bill length but lower body mass, Adelie with low bill length and low body mass, and Gentoo with high body mass and high bill length" alt="Scatterplot with body mass on the x-axis, bill length on the y-axis, colored by species, showing a slightly positive relationship with Chinstrap penguins having higher bill length but lower body mass, Adelie with low bill length and low body mass, and Gentoo with high body mass and high bill length" width="2100" style="display: block; margin: auto;" /></p><pre class="r"><code>ggplot(penguins,
       aes(x = body_mass_g,
           y = bill_depth_mm,
           color = species)) +
  geom_point() +
  labs(title = &quot;Bill Depth By Species&quot;)</code></pre><p><img src="figsBill%20Depth%20Plot-1.png" title="Scatterplot with body mass on the x-axis, bill depth on the y-axis, colored by species, showing two distinctive groups with Adelie and Chinstrap in one group with low body mass and high bill depth, and another group of Gentoo with high body mass and low bill depth." alt="Scatterplot with body mass on the x-axis, bill depth on the y-axis, colored by species, showing two distinctive groups with Adelie and Chinstrap in one group with low body mass and high bill depth, and another group of Gentoo with high body mass and low bill depth." 
width="2100" style="display: block; margin: auto;" /></p><pre class="r"><code>ggplot(penguins,
       aes(x = body_mass_g,
           y = flipper_length_mm,
           color = species)) +
  geom_point() +
  labs(title = &quot;Flipper Length By Species&quot;)</code></pre><p><img src="figsFlipper%20Length%20Plot-1.png" title="Scatterplot with body mass on the x-axis, flipper length on the y-axis, colored by species, showing a positive relationship with Chinstrap and Adelie penguins with low body mass and lower flipper length, and Gentoo with high body mass and high flipper length" alt="Scatterplot with body mass on the x-axis, flipper length on the y-axis, colored by species, showing a positive relationship with Chinstrap and Adelie penguins with low body mass and lower flipper length, and Gentoo with high body mass and high flipper length" width="2100" style="display: block; margin: auto;" /></p><p>I also find the average length of these measurements by penguin species:</p><pre class="r"><code>penguins |&gt;
  group_by(species) |&gt;
  summarize(
    mean_bill_length = mean(bill_length_mm, na.rm = TRUE),
    mean_bill_depth = mean(bill_depth_mm, na.rm = TRUE),
    mean_flipper_length = mean(flipper_length_mm, na.rm = TRUE),
    mean_body_mass = mean(body_mass_g, na.rm = TRUE)
  )</code></pre><pre><code>#&gt; # A tibble: 3 × 5
#&gt;   species   mean_bill_length mean_bill_depth mean_flipper_length mean_body_mass
#&gt;   &lt;fct&gt;                &lt;dbl&gt;           &lt;dbl&gt;               &lt;dbl&gt;          &lt;dbl&gt;
#&gt; 1 Adelie                38.8            18.3                190.          3701.
#&gt; 2 Chinstrap             48.8            18.4                196.          3733.
#&gt; 3 Gentoo                47.5            15.0                217.          5076.</code></pre><p>I’ve plotted and summarized the data to help me spot interesting details. Now I’ll play with some language to describe what I notice.</p><h3>Note Interesting Details</h3><p>Here’s where my exploration pays off. In the last section, I led with my curiosity. In this section, I start focusing on details. Or, as the design thinkers would say, the last section was for divergent thinking. 
This section is for convergent thinking:</p><ul><li>The Chinstrap penguins have the highest mean bill length, though it’s similar to the Gentoo’s bill length</li><li>The Chinstrap penguins have the highest mean bill depth, though the Adelie’s are very close</li><li>The Gentoo penguins have the highest mean flipper length, followed by the Chinstrap, and then the Adelie</li></ul><h3>Categorize Details Into Interesting Themes</h3><p>I think of themes as a collection of interesting details. Interesting details on their own are just that—interesting details. But when I describe what they have in common, I get something else—the beginnings of a story.</p><p>For example, I can play with different ways to describe what I pointed out in the last section:</p><ul><li>Patterns in species measurements</li><li>Differences, but also similarities</li><li>The Chinstrap penguins trade bill size for flipper size</li></ul><p>By this point in the process, I’ve explored the data and found interesting details. Then I played with ways to talk about those details. Now, I’ll craft a way to share this with an audience.</p><p>The good news is most of the work is already done. But there’s one important move I need to make if I want to avoid another snooze cruise.</p><h2>The Presentation: Start With the Story</h2>Starting with the data points and ending with the story doesn’t set the tone for a compelling discussion. It’s like inviting your friends over for dinner, then showing them your timeshare presentation before bringing out the food. They were there for the shrimp cocktail and gossip. 
That needs to come first if you want to keep them happy.<br><br><center><img src="upside-down.gif" alt="Man saying 'everything is upside down'"></center><p>Here’s how I would present the information:</p><ol style="list-style-type: decimal"><li>Point out interesting themes</li><li>Describe selected details</li><li>Show selected data points</li></ol><p>I go through these steps to present the findings in the next few sections. As I do, notice how I start with a story before supporting it with selected data points.</p><h3>Point Out Interesting Themes</h3><p>The first part is the hook. It’s how you set the tone. So open with the story and let your audience know you’ve got something to share. And remember, you can’t <em>make</em> them interested, but you can show them that <em>you’re</em> interested.</p><p>You might open the presentation for the Palmer Penguins exploration like this:</p><blockquote><p>I was expecting to see some differences in measurements. These are different species of penguins, after all. But it turns out it’s not as simple as one species being bigger than the others.</p></blockquote><h3>Describe Selected Details</h3><p>The second part is about unpacking interesting details. It’s how you signal credibility. Whatever it is you found so interesting isn’t made up. You discovered it by doing an actual analysis:</p><blockquote><p>There’s one example of this that stuck with me. The Chinstrap penguins had bigger bills on average. But surprisingly, they didn’t have the biggest average flipper length.</p></blockquote><h3>Show Selected Data Points</h3><p>And now comes the data. This part is about digging a level deeper and showing your audience exactly what you saw. What really works here is the context. You’ve already set the stage with your theme and details. Now the data helps your audience see how you arrived at the story:</p><blockquote><p>Let me show you what I mean. 
Here’s a plot that compares the bill length of all three species:</p></blockquote><pre class="r"><code>ggplot(penguins,
       aes(x = body_mass_g,
           y = bill_length_mm,
           color = species)) +
  geom_point() +
  labs(title = &quot;Bill Length By Species&quot;,
       x = &quot;Body mass in g&quot;,
       y = &quot;Bill length in mm&quot;,
       caption = &quot;Data: Palmer Penguins&quot;,
       color = &quot;Species&quot;)</code></pre><p><img src="figsBill%20Length%20Presentation-1.png" title="Scatterplot with body mass on the x-axis, bill length on the y-axis, colored by species, showing a slightly positive relationship with Chinstrap penguins having higher bill length but lower body mass, Adelie with low bill length and low body mass, and Gentoo with high body mass and high bill length" alt="Scatterplot with body mass on the x-axis, bill length on the y-axis, colored by species, showing a slightly positive relationship with Chinstrap penguins having higher bill length but lower body mass, Adelie with low bill length and low body mass, and Gentoo with high body mass and high bill length" width="2100" style="display: block; margin: auto;" /></p><blockquote><p>And here’s one that compares the flipper length of all three species:</p></blockquote><pre class="r"><code>ggplot(penguins,
       aes(x = body_mass_g,
           y = flipper_length_mm,
           color = species)) +
  geom_point() +
  labs(title = &quot;Flipper Length By Species&quot;,
       x = &quot;Body mass in g&quot;,
       y = &quot;Flipper length in mm&quot;,
       caption = &quot;Data: Palmer Penguins&quot;,
       color = &quot;Species&quot;)</code></pre><p><img src="figsFlipper%20Length%20Presentation-1.png" title="Scatterplot with body mass on the x-axis, flipper length on the y-axis, colored by species, showing a positive relationship with Chinstrap and Adelie penguins with low body mass and lower flipper length, and Gentoo with high body mass and high flipper length" alt="Scatterplot with body mass on the x-axis, flipper length on the y-axis, colored by species, showing a positive relationship with 
Chinstrap and Adelie penguins with low body mass and lower flipper length, and Gentoo with high body mass and high flipper length" width="2100" style="display: block; margin: auto;" /></p><blockquote><p>See what I mean? The Chinstrap penguins tend to have longer bill lengths. But they didn’t tend to have the longest flippers. That crown goes to the Gentoo. You can see that more in this table of mean measurements:</p></blockquote><pre class="r"><code>penguins |&gt;
  group_by(species) |&gt;
  summarize(
    &quot;Mean Bill Length&quot; = mean(bill_length_mm, na.rm = TRUE),
    &quot;Mean Flipper Length&quot; = mean(flipper_length_mm, na.rm = TRUE)
  ) |&gt;
  rename(&quot;Species&quot; = species) |&gt;
  arrange(desc(`Mean Bill Length`))</code></pre><pre><code>#&gt; # A tibble: 3 × 3
#&gt;   Species   `Mean Bill Length` `Mean Flipper Length`
#&gt;   &lt;fct&gt;                  &lt;dbl&gt;                 &lt;dbl&gt;
#&gt; 1 Chinstrap               48.8                  196.
#&gt; 2 Gentoo                  47.5                  217.
#&gt; 3 Adelie                  38.8                  190.</code></pre><h2>Conclusion</h2><p>So that’s it. Start your data analysis by looking at many data points. Then describe it through an interesting story. Afterward, start your presentation by leading with a story. Then signal the rigor of your analysis with supporting data points.</p><p>When you do this, you’ll organize one fewer meeting where people nod off. And more to the point of a data presentation, you’ll create a fun environment where actual learning happens. Because who says truth can’t also be entertaining?</p><h2>Notes</h2><ol style="list-style-type: decimal"><li><strong>flip it and reverse it</strong>: Elliot, M (2002). 
“Work It,” <em>Under Construction</em>, The Goldmind, Inc.<br /></li><li><strong>I’ll be sharing those lessons with you</strong>: I’ve shared this technique in a video about <a href="https://ryanestrellado.com/how-to-present-data-without-putting-people-to-sleep">presenting school data</a> and in my book about <a href="https://ryanestrellado.com/the-k12-educators-data-guidebook">data-driven decision making in schools</a></li><li><strong>George Costanza laments his bad fortune</strong>: <a href="https://youtu.be/CizwH_T7pjg">George’s clip on YouTube</a></li><li><strong>we’ll see the sequence of steps</strong>: This post has an example of numerical data, but I use this sequence when reflecting on interview responses or <a href="https://podcasts.apple.com/us/podcast/donuts-in-the-lounge-a-podcast-for-educators/id1610942852">other stories from educators in the field</a>.</li><li><strong>Palmer Penguins Package</strong>: Horst AM, Hill AP, Gorman KB (2020). <em>palmerpenguins: Palmer Archipelago (Antarctica) penguin data</em>. R package version 0.1.0. <a href="https://allisonhorst.github.io/palmerpenguins/">https://allisonhorst.github.io/palmerpenguins/</a>. doi: 10.5281/zenodo.3960218.</li><li><strong>For this example, we’ll use this question</strong>: You’ll likely be exploring your own data more thoroughly, but I’m keeping it simple for this post to illustrate the technique.</li><li><strong>so I plot the bill length, bill depth, and flipper length</strong>: There were some warnings about missing data. I didn’t include them here for aesthetic reasons. But you can find all the code for this piece in its <a href="https://github.com/restrellado/presenting-data-rstats-edition">GitHub repository</a>.</li><li><strong>the last section was for divergent thinking</strong>: Read more about divergent and convergent thinking in Brown, T (2009). <em>Change by design</em>. 
HarperBusiness.</li></ol></description></item><item><title>RStudio Recap From the Appsilon Shiny Conference</title><link>https://www.rstudio.com/blog/rstudio-recap-from-the-appsilon-shiny-conference/</link><pubDate>Tue, 05 Jul 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-recap-from-the-appsilon-shiny-conference/</guid><description><p>In April 2022, our Full Service partner <a href="https://appsilon.com/" target = "_blank">Appsilon</a> hosted the first-ever <a href="https://appsilon.com/shiny-conference/" target = "_blank">Shiny Conference</a>. The conference comprised three days of free, online Shiny content ranging from tips and tricks from the experts, to fascinating community case studies, to examples of enterprise scaling solutions. It provided a shared space for Shiny developers worldwide to learn, network, and share their work.</p><p>RStudio was proud to have a presence at the Shiny conference. Barret Schloerke, Software Engineer at RStudio, demonstrated shinytest2, a new R package that facilitates the testing of Shiny applications. To close out the conference, Joe Cheng, CTO at RStudio, and Winston Chang, Software Engineer at RStudio, joined a panel with Filip Stachura, CEO and co-founder of Appsilon. Eric Nantz moderated the session, where the panelists shared reflections on Shiny’s past and excitement for Shiny’s future.</p><p>Watch the recordings from the Shiny Conference on <a href="https://www.youtube.com/channel/UC6LqpR5qBfNlQp5mVIVsthA" target = "_blank">Appsilon&rsquo;s YouTube channel</a> and read our RStudio recap below.</p><h2 id="how-to-test-shiny-applications">How to Test Shiny Applications</h2><p>We are all too familiar with this Shiny workflow: add some reactivity, click &ldquo;Run App,&rdquo; experiment in the Viewer, and repeat to make changes. A better alternative is to create unit tests. Unit testing breaks up and checks code to ensure it does what you expect. 
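</p><p>For example, a minimal unit test written with testthat might look like this (the helper function is illustrative):</p><pre class="r"><code>library(testthat)

# A small helper an app might use to rescale inputs onto [0, 1]
rescale01 &lt;- function(x) {
  rng &lt;- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

test_that(&quot;rescale01 maps values onto [0, 1]&quot;, {
  expect_equal(rescale01(c(0, 5, 10)), c(0, 0.5, 1))
  expect_equal(rescale01(c(-2, 2)), c(0, 1))
})</code></pre><p>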
This means fewer bugs, better code structure, easier restarts, and robustness. For R package developers, the <a href="https://testthat.r-lib.org/" target = "_blank">testthat package</a> provides a framework for unit testing that is easy to learn, use, and integrate with existing workflows.</p><p>Barret revealed <a href="https://rstudio.github.io/shinytest2/" target = "_blank">shinytest2</a>, a new package on CRAN that leverages the testthat library for Shiny. Shinytest2 provides regression testing for Shiny applications: testing existing behavior for consistency over time. Written entirely in R, shinytest2 is a streamlined toolkit for unit testing Shiny applications.</p><p>Shiny developers run the <code>record_test()</code> function to capture all the events happening in the app and replay them later. Developers can use this saved state to check values. Shinytest2 creates these snapshots with Chromote, a headless Chrome browser, and the <code>AppDriver</code> object. Thanks to variant support, users can test regardless of operating systems or R versions.</p><p>Barret also highlighted Shiny’s <code>exportTestValues()</code> function. <code>exportTestValues()</code> exports key-value pairs for an app’s names and reactives when it is in test mode to expose intermediate reactive values without slowing down the app in production.</p><p>Learn more about shinytest2 on the <a href="https://rstudio.github.io/shinytest2/" target = "_blank">package website</a>.</p><center><iframe width="560" height="315" src="https://www.youtube.com/embed/EOVPBN5o8F8" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center><h2 id="reflecting-on-shinys-past-present-and-future">Reflecting on Shiny&rsquo;s Past, Present, and Future</h2><p>During the Q&amp;A keynote, Joe, Winston, and Filip answered questions from community members. 
The session explored Shiny’s past and inquired about its present and future.</p><p>The panelists shared how the community adds to the Shiny ecosystem. Feedback on existing tools guides new features and functionality. External packages give developers flexibility and make it easier to follow good practices. App creators share their experiences and expand the community’s knowledge base. Over time, this means more debugging tools and better performance for Shiny apps.</p><p>The questions delved into other topics. Community members asked about alternative frameworks such as Streamlit and the use of Shiny in enterprise settings. They wondered what the developers would implement differently if they could go back (camel case, anybody?). The panelists also shared insights on the core Shiny development process and the extensive internal testing for backward compatibility.</p><p>There were a few shoutouts to <a href="https://rstudio.github.io/bslib/" target = "_blank">bslib</a>, a package that helps developers make beautiful Shiny apps with bootstrap.</p><p>What is on the roadmap? As Joe stated during the panel, it is an &ldquo;incredibly exciting time for our team.&rdquo; Barret, Joe, and Winston will present more on Shiny developments at the upcoming rstudio::conf(). 
<a href="https://www.rstudio.com/conference/" target = "_blank">Register today!</a></p><center><iframe width="560" height="315" src="https://www.youtube.com/embed/sTBDxB46LCs" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center><h2 id="learn-more">Learn More</h2><p>Many thanks to Appsilon for hosting an inspiring conference for Shiny developers worldwide.</p><ul><li>Check out the <a href="https://youtube.com/playlist?list=PLexAKolMzPcrYjGA1PULfm7-P12qjKmPb" target = "_blank">2022 Appsilon Shiny Conference playlist</a>.</li><li><a href="https://www.rstudio.com/conference/" target = "_blank">Register for rstudio::conf(2022)</a> to hear more from Barret, Joe, Winston, and others from the Shiny community.</li><li>Interested in learning Shiny or advancing your skills? Check out these upcoming workshops at rstudio::conf(2022):<ul><li><a href="https://www.rstudio.com/conference/2022/workshops/get-started-shiny/" target = "_blank">Getting Started with Shiny</a>, presented by Colin Rundel</li><li><a href="https://www.rstudio.com/conference/2022/workshops/shiny-prod-apps/" target = "_blank">Building Production-Quality Shiny Applications</a>, presented by Eric Nantz</li></ul></li></ul></description></item><item><title>10 ways you can provide value with RStudio Connect</title><link>https://www.rstudio.com/blog/10-ways-rstudio-connect/</link><pubDate>Tue, 28 Jun 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/10-ways-rstudio-connect/</guid><description><p>In speaking with the community, I hear about so many different ways that teams are using <a href="https://www.rstudio.com/products/connect/" target="_blank">RStudio Connect</a> and providing value across their companies.</p><p>On Twitter I posted <a href="https://twitter.com/_RachaelDempsey/status/1532830320516546562?s=20&amp;t=NfaSd2S66q_Zeo_a6EDvRA"><strong>10 ways RStudio Connect can 
give your data science team superpowers</strong></a> and thought it&rsquo;d be helpful to share here as well.</p><p>I&rsquo;d love to hear about all the other ways your team is using RStudio Connect and about the cool things you are building too! If you’re interested in sharing your work at a meetup as well, please consider filling out the <a href="https://docs.google.com/forms/d/e/1FAIpQLSexuUNx-_qOkPl8JXogf8TtCo6DyLyzB2YCtm5CjBgIGTFmhg/viewform?usp=send_form" target="_blank">speaker submission form</a>.</p><h3 id="so-here-we-go">So, here we go!</h3><p>How many have you done?</p><ol><li><a href="#Shiny">Start sharing Shiny applications</a></li><li><a href="#Pins">Deploy pins to share data, models, and other R objects across projects and with your colleagues</a></li><li><a href="#Blog">Build a blog to share work &amp; best practices among others at your company and/or help facilitate a meetup group</a></li><li><a href="#Parameterized">Schedule automated PDF reports with parameterized R Markdown for business stakeholders</a></li><li><a href="#Slack">Use RStudio with communication channels (like Slack) that are already in use by business stakeholders</a></li><li><a href="#Landing">Build a custom landing page for business stakeholders to make content discovery easier</a></li><li><a href="#Package">Use an internally created package to standardize dashboards across the team</a></li><li><a href="#Tableau">Use APIs to enable Tableau users at your company to leverage work done by your data science team in R and Python</a></li><li><a href="#ConnectAPI">Check out the Connect Server API to utilize usage data that helps answer questions like: Is my CEO using this app?</a></li><li><a href="#Microservices">Host microservices via Plumber APIs that connect data science work with your data engineering &amp; development teams</a></li></ol><p>You can also <a href="#Python">replace any of the examples above with Python</a> because RStudio Connect supports Jupyter, Flask, FastAPI, 
Dash, Streamlit, Bokeh, mixed R/Python content with reticulate, Python pins, etc.</p><h3 id="a-nameshinyaof-course-start-sharing-shiny-applications"><a name="Shiny"></a>Of course, start sharing Shiny applications</h3><p>Shiny makes it easy to build interactive web applications straight from R.</p><p>Here&rsquo;s an <a href="https://youtu.be/07j22d4B_hA">example of how Microsoft&rsquo;s data scientists use Shiny</a> and own the model development life cycle from data ingestion, model development, model deployment to visualization of insights with Paul Chang.</p><p><img src="images/shiny.jpeg" alt=""></p><ul><li><a href="https://youtu.be/07j22d4B_hA" target="_blank">Meetup Recording: Capacity Planning for Microsoft Azure Data Centers Using R &amp; RStudio Connect</a></li><li><a href="https://lnkd.in/gh-hGScE" target="_blank">Meetup slides</a></li><li><a href="https://www.shiny.rstudio.com" target="_blank">Shiny website</a></li></ul><h3 id="a-namepinsadeploy-pins-to-share-data-models-and-other-r-objects-across-projects-and-with-your-colleagues"><a name="Pins"></a>Deploy pins to share data, models, and other R objects across projects and with your colleagues.</h3><p>The <a href="https://pins.rstudio.com" target="_blank">pins package</a> publishes data, models, and other R objects, making it easy to share them across projects and with your colleagues.
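</p><p>A minimal sketch of that workflow (a temporary board is used here for illustration; on RStudio Connect you would create the board with <code>board_rsconnect()</code> instead):</p><pre class="r"><code>library(pins)

board &lt;- board_temp()

# Publish an object, then read it back anywhere the board is reachable
board |&gt; pin_write(mtcars, &quot;mtcars&quot;, type = &quot;rds&quot;)
board |&gt; pin_read(&quot;mtcars&quot;)</code></pre><p>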
Pins can be automatically versioned, making it straightforward to track changes, re-run analyses on historical data, and undo mistakes.</p><p>Here&rsquo;s an <a href="https://youtu.be/e2h-BVgY4VA">example of how the City of Reykjavík pins models that are used in their Shiny app</a> to show the capacity of pools with Hlynur Hallgrímsson.</p><p><img src="images/pins2.jpg" alt=""></p><ul><li><a href="https://youtu.be/e2h-BVgY4VA" target ="_blank">Meetup recording: The data you were promised&hellip;and the data that you got</a></li><li><a href="https://hlynurhallgrims.github.io/the_data_you_were_promised/#1" target = "_blank">Meetup slides</a></li><li><a href="https://rstudio.com/iceland" target="_blank">Short-film featuring the City of Reykjavík&rsquo;s data team</a></li></ul><h3 id="a-nameblogabuild-a-blog-to-share-work--best-practices-among-others-at-your-company-andor-to-help-facilitate-a-meetup-group"><a name="Blog"></a>Build a blog to share work &amp; best practices among others at your company and/or to help facilitate a meetup group</h3><p>Your internal data science community helps each other learn, collaborate, and share. However, you may not have a central place for your assets, making it difficult for members to find information from the past or know where to save their work.</p><p>A blog can be a lightweight way to consolidate resources, events, and other information for your internal data science community. 
Here&rsquo;s an <a href="https://youtu.be/MrW5XFf7aps">example of deploying a blog to RStudio Connect</a> with Isabella Velasquez.</p><p><img src="images/blog3.jpg" alt=""></p><ul><li><a href="https://youtu.be/MrW5XFf7aps" target="_blank">Meetup recording: Building a blog with R</a></li><li><a href="https://colorado.rstudio.com/rsc/building-a-blog-with-r/Building%20a%20Blog%20With%20R.html#/section" target="_blank">Meetup slides</a></li><li><a href="https://github.com/ivelasq/internal-blog-example" target="_blank">GitHub Link for Internal Blog Example</a></li></ul><h3 id="a-nameparameterizedaschedule-automated-pdf-reports-with-parameterized-r-markdown-for-business-stakeholders"><a name="Parameterized"></a>Schedule automated PDF reports with parameterized R Markdown for business stakeholders</h3><p>One of the many benefits of working with R Markdown is that you can reproduce analysis at the click of a button. This makes it very easy to update any work and alter any input parameters within the report.</p><p>Parameterized reports extend this one step further, and allow users to specify one or more parameters to customize the analysis. 
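</p><p>As a sketch, a template might declare a parameter in its YAML header and then be rendered for different inputs (the file and parameter names here are illustrative):</p><pre class="r"><code># In the YAML header of income-statement.Rmd:
# ---
# title: &quot;Income Statement&quot;
# output: pdf_document
# params:
#   branch: &quot;North&quot;
# ---

# Render the same template for another branch
rmarkdown::render(
  &quot;income-statement.Rmd&quot;,
  params = list(branch = &quot;South&quot;),
  output_file = &quot;income-statement-south.pdf&quot;
)</code></pre><p>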
This is useful if you want to create a report template that can be reused across multiple similar scenarios.</p><p>Here&rsquo;s an <a href="https://youtu.be/JsaGSrM8aZ0">example of creating an income statement for a group of theoretical office branches with parameterized R Markdown</a> with Brad Lindblad.</p><p><img src="images/finance.jpeg" alt=""></p><ul><li><a href="https://youtu.be/JsaGSrM8aZ0" target="_blank">Meetup recording: Professional financial reports with R Markdown</a></li><li><a href="https://github.com/bradlindblad/pro_reports_talk" target="_blank">Github link</a></li></ul><h3 id="a-nameslackause-rstudio-with-communication-channels-like-slack-that-are-already-in-use-by-business-stakeholders"><a name="Slack"></a>Use RStudio with communication channels (like Slack) that are already in use by business stakeholders.</h3><p>Serving stakeholders data in the moment they need it in an environment where they are already talking about business performance can help in democratizing information, especially when end-users are experiencing dashboard overload.</p><p>Here&rsquo;s an <a href="https://lnkd.in/gk6GKHSR">example of how Campaign Monitor serves individualized insights directly to stakeholders with R, Python, and RStudio Connect</a> with Matthias Mueller.</p><p><img src="images/slack2.jpg" alt=""></p><ul><li><a href="https://youtu.be/Y2zoRCXgPwk" target="_blank">Meetup recording: Serving bespoke insights through automation</a></li><li><a href="https://www.rstudio.com/blog/r-in-marketing-meetup/" target="_blank">Blog post: Democratizing Data with R, Python, and Slack</a></li></ul><h3 id="a-namelandingabuild-a-custom-landing-page-for-business-stakeholders-to-make-content-discovery-easier"><a name="Landing"></a>Build a custom landing page for business stakeholders to make content discovery easier</h3><p>How do you make sure your audience finds what they need on RStudio Connect without paging through the dashboard, remembering the right search terms, or 
bookmarking every content item you share?</p><p>After deploying many pieces of related content, how do you share them as a cohesive project?</p><p>Here&rsquo;s a <a href="https://youtu.be/GBNzhIkObyE">walkthrough of using connectwidgets to build your ideal showcase of data products</a> with Kelly O&rsquo;Briant.</p><script src="https://fast.wistia.com/embed/medias/zqgzo2fbe7.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:53.96% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_zqgzo2fbe7 videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption>Example of gallery created by connectwidgets</caption></center><br><ul><li><a href="https://youtu.be/GBNzhIkObyE" target="_blank">Meetup recording: Building your ideal showcase of data products</a></li><li><a href="https://www.rstudio.com/blog/rstudio-connect-data-showcase/" target="_blank">Blog post: Curating Your Data Science Content on RStudio Connect</a></li></ul><h3 id="a-namepackageause-an-internally-created-package-to-standardize-dashboards-across-the-team"><a name="Package"></a>Use an internally created package to standardize dashboards across the team</h3><p>Here&rsquo;s an <a href="https://youtu.be/ssmwUBSpF-8">example of how Snap Finance ensures a reproducible development workflow with a Shiny dashboard framework</a> with Alan Carlson.</p><p>Using their internal package, the team is able to build dashboards the same way, reducing tech debt and simplifying code review. This allows their team to bring on new developers and have them become almost instant contributors. 
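</p><p>As a loose sketch, an internal package might export shared building blocks so every dashboard starts from the same vetted pieces (the package and function names here are hypothetical):</p><pre class="r"><code># A shared ggplot2 theme every dashboard applies
org_theme &lt;- function() {
  ggplot2::theme_minimal(base_size = 14)
}

# Scaffold a new dashboard from a vetted template shipped in the package
new_dashboard &lt;- function(path) {
  template &lt;- system.file(&quot;templates&quot;, &quot;dashboard.Rmd&quot;, package = &quot;ourdashpkg&quot;)
  file.copy(template, file.path(path, &quot;dashboard.Rmd&quot;))
}</code></pre><p>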
With standardization, their developers are spending more time actually developing instead of spending time spinning up a Shiny framework.</p><script src="https://fast.wistia.com/embed/medias/pl819znagu.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.83% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_pl819znagu videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><ul><li><a href="https://youtu.be/ssmwUBSpF-8" target="_blank">Meetup recording: Robust, modular dashboards that minimize tech debt</a></li><li><a href="https://www.rstudio.com/about/customer-stories/snap-finance/" target="_blank">Snap Finance Customer spotlight</a></li></ul><h3 id="a-nametableauause-apis-to-enable-tableau-users-at-your-company-to-leverage-work-done-by-your-data-science-team-in-r-and-python"><a name="Tableau"></a>Use APIs to enable Tableau users at your company to leverage work done by your data science team in R and Python</h3><p>RStudio Connect includes support for Tableau Analytics Extensions, which provide a way to create calculated fields in workbooks that can execute scripts outside of the Tableau environment. This RStudio Connect integration enables you to create R or Python HTTP API extensions for use across all your Tableau workbooks.</p><p>Compared to existing methods for integrating R and/or Python in Tableau, integration via APIs hosted on RStudio Connect provides better security and dependency management. 
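</p><p>As a sketch, the R side of such an extension is simply a plumber endpoint that Tableau can post values to (the route and the <code>model</code> object are illustrative):</p><pre class="r"><code># plumber.R

#* Score values sent from a Tableau calculated field
#* @post /predict
function(req) {
  input &lt;- jsonlite::fromJSON(req$postBody)
  # &quot;model&quot; is assumed to be loaded when the API starts
  predict(model, newdata = as.data.frame(input))
}

# Run locally with:
# plumber::pr(&quot;plumber.R&quot;) |&gt; plumber::pr_run(port = 8000)</code></pre><p>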
James Blair led us through a demo of this in <a href="https://youtu.be/t25Lbi5D6kg" target = "_blank">the meetup recording here.</a></p><p><img src="images/fastapi.png" alt=""></p><ul><li><a href="https://youtu.be/t25Lbi5D6kg" target = "_blank">Meetup recording: Leveraging R &amp; Python in Tableau with RStudio Connect</a></li><li><a href="https://docs.rstudio.com/rsc/integration/tableau/" target = "_blank">Tableau Integration Documentation</a></li><li><a href="https://blog.rstudio.com/2021/10/21/embedding-shiny-apps-in-tableau-dashboards-using-shinytableau/" target = "_blank">Blog post: Embedding Shiny Apps in Tableau using shinytableau</a></li><li><a href="https://www.rstudio.com/blog/rstudio-connect-2021-09-0-tableau-analytics-extensions/" target="_blank">Blog post: Tableau Analytics Extensions</a></li></ul><h3 id="a-nameconnectapiacheck-out-the-connect-server-api-to-utilize-usage-data-that-helps-answer-questions-like-is-my-ceo-using-this-app"><a name="ConnectAPI"></a>Check out the Connect Server API to utilize usage data that helps answer questions like: Is my CEO using this app?</h3><p>The <a href="https://docs.rstudio.com/connect/api/#overview--versioning-of-the-api" target="_blank">RStudio Connect Server API</a> provides easy access to your server’s instrumentation data from when users visit your server. As a publisher or administrator, you have access to these data: who logged in, when they logged in, what they looked at, and how long they spent on that piece of content.</p><p>With this, you can extend Connect and visualize advanced usage metrics that help answer important questions and focus your data science work. 
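</p><p>Here is a sketch of what that can look like with the connectapi package (assuming the <code>CONNECT_SERVER</code> and <code>CONNECT_API_KEY</code> environment variables are set):</p><pre class="r"><code>library(connectapi)

client &lt;- connect()

# Pull Shiny usage events and count visits per piece of content
usage &lt;- get_usage_shiny(client, limit = Inf)
dplyr::count(usage, content_guid, sort = TRUE)</code></pre><p>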
Here&rsquo;s a <a href="https://youtu.be/0iljqY9j64U" target ="_blank">walk-through &amp; resources to get started</a> with Cole Arendt.</p><p><img src="images/connectapiusage.png" alt=""></p><ul><li><a href="https://youtu.be/0iljqY9j64U" target="_blank">Meetup recording: Shiny usage tracking in RStudio Connect</a></li><li><a href="https://github.com/RStudioEnterpriseMeetup/Presentations/blob/2b5db2fa109458c38ba7ac2d79a66c869e0a241a/shiny-app-usage.pdf" target="_blank">Meetup slides</a></li><li><a href="https://www.rstudio.com/blog/track-shiny-app-use-server-api/" target="_blank">Blog post with examples</a></li></ul><h3 id="a-namemicroservicesahost-microservices-via-plumber-apis-that-connect-data-science-work-with-your-data-engineering--development-teams"><a name="Microservices"></a>Host microservices via Plumber APIs that connect data science work with your data engineering &amp; development teams</h3><p>As data science teams—and their applications—grow larger, teams can experience growing pains that make applications complex, difficult to customize, or hard for large teams to collaborate on.</p><p>You can use the Plumber package to deploy APIs to RStudio Connect as part of a microservices architecture that allows your team to work with front-end development teams using their preferred framework (e.g., React, Angular, Vue).</p><p>Here&rsquo;s an <a href="https://youtu.be/k3PuGGmA7Hg">example of how KPMG uses microservices to scale R-based applications across the enterprise</a>.</p><p><img src="images/kpmg2.jpg" alt=""></p><ul><li><a href="https://youtu.be/k3PuGGmA7Hg" target="_blank">Meetup recording: Making microservices part of your data team</a></li><li><a href="https://github.com/RStudioEnterpriseMeetup/Presentations/blob/main/KPMG%20Making%20Microservices%20Part%20of%20Your%20Team%20-%20RStudio%20Meetup.pptx" target="_blank">Meetup slides</a></li></ul><h3 id="a-namepythonawhat-about-python"><a name="Python"></a>What about Python?</h3><p>You can also
replace any of the examples above with Python because RStudio Connect supports:</p><ul><li><a href="https://docs.rstudio.com/connect/user/jupyter-notebook/">Jupyter Notebooks</a></li><li><a href="https://docs.rstudio.com/connect/1.8.2/user/flask/">Flask applications</a></li><li><a href="https://docs.rstudio.com/connect/user/fastapi/">FastAPI applications</a></li><li><a href="https://docs.rstudio.com/connect/user/dash/">Plotly Dash applications</a></li><li><a href="https://docs.rstudio.com/connect/user/streamlit/">Streamlit applications</a></li><li><a href="https://docs.rstudio.com/connect/user/bokeh/">Bokeh applications</a></li><li><a href="https://solutions.rstudio.com/r/reticulate/">Mixed R/Python content with the reticulate package</a></li><li><a href="https://rstudio.github.io/pins-python/intro.html">Python pins</a></li></ul><p>We&rsquo;d love to highlight your Python use cases too! If you’re interested in sharing your work at a meetup, please consider filling out the <a href="https://docs.google.com/forms/d/e/1FAIpQLSexuUNx-_qOkPl8JXogf8TtCo6DyLyzB2YCtm5CjBgIGTFmhg/viewform?usp=send_form" target="_blank">speaker submission form</a>.</p><p><img src="images/python.jpeg" alt=""></p><ul><li><a href="https://youtu.be/o36425S1-VU" target="_blank">Meetup recording: Using Python with RStudio Team</a></li><li><a href="https://lnkd.in/gyq5jZ4i" target="_blank">Integrating RStudio Connect with Python</a></li></ul><h2 id="learn-more">Learn More</h2><p>If you haven&rsquo;t had a chance to try RStudio Connect before, you can <a href="https://www.rstudio.com/products/connect/evaluation/" target="_blank">request an evaluation here.</a></p><p>If you need help convincing your team that ️<strong>you should have these superpowers too</strong>, check out the <a href="https://www.rstudio.com/champion" target="_blank">RStudio Champion Site</a> for resources to help build a business case, grow your internal data science community, work with IT, and 
more.</p></description></item><item><title>Shiny and Arrow</title><link>https://www.rstudio.com/blog/shiny-and-arrow/</link><pubDate>Mon, 27 Jun 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-and-arrow/</guid><description><div class="lt-gray-box">This is a guest post from Michael Thomas, Chief Data Scientist at Ketchbrook Analytics. At Ketchbrook, Michael and his team help businesses gain significant competitive advantages by leveraging data effectively. Keep in touch with Michael on <a href="https://www.linkedin.com/in/michaeljthomas2/" target = "_blank">LinkedIn</a>.</div><p>Shiny apps are incredible tools for bringing data science to life. They can help communicate your analysis to non-technical stakeholders, or enable self-service data exploration for any audience. At <a href="https://www.ketchbrookanalytics.com" target = "_blank">Ketchbrook Analytics</a>, we care a lot about building <em>production-grade</em> quality Shiny apps; in other words, we strive to ensure that the apps we develop for you will run successfully inside your organization with minimal maintenance.</p><p>Building a working Shiny app that runs on your own laptop can be a tricky process itself! However, there are some additional things you need to consider when taking the next step of <em>deploying</em> your app to <strong>production</strong> so that others can reap the benefits of your hard work. One of these considerations is where to <em>store</em> the data.</p><blockquote><p>&ldquo;Where should the data live? Should we use a database or flat file(s)? Is our data small enough to fit there?&rdquo;</p></blockquote><center><img src="data_soup.png" alt="Databases and File Storage Formats Floating Together in Data Soup" width="600"></center><h2 id="storing-your-data">Storing Your Data</h2><p>There are <em>so</em> many options to choose from when it comes to how you want to store the data behind your Shiny app. 
Sometimes a traditional database doesn&rsquo;t make sense for your project. Databases can take a while to configure, and if your data isn&rsquo;t relational then a <em>data lake</em> approach might be a better option. A <strong>data lake</strong> is just a fancy term for a collection of flat files that are organized in a thoughtful way.</p><p>When you think about storing data in flat files, formats like <strong>.csv</strong> or <strong>.txt</strong> probably come to mind. However, as your data becomes <em>&ldquo;big&rdquo;</em>, transitioning your data to more modern, column-oriented file types (e.g., <strong>.parquet</strong>) can drastically reduce the size of the file containing your data and increase the speed at which other applications can read that data.</p><h3 id="the-benefits-of-parquet">The Benefits of .parquet</h3><p>First, let&rsquo;s dig a little deeper into the advantages of <strong>.parquet</strong> over <strong>.csv</strong>. The main benefits are:</p><ul><li>smaller file sizes</li><li>improved read speed</li></ul><p>The compression and columnar storage format lead to file sizes that are significantly smaller than if that same data were stored in a typical delimited file. From our experience &ndash; and also backed by <a href="https://tomaztsql.wordpress.com/2022/05/08/comparing-performances-of-csv-to-rds-parquet-and-feather-data-types/" target = "_blank">this great blog post by Tomaž Kaštrun</a> &ndash; <strong>.parquet</strong> typically comes in at a little less than half the size of a data-equivalent <strong>.csv</strong>; however, this margin widens even further as the data volume increases. 
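</p><p>You can see this for yourself in a few lines of R, assuming the arrow package is installed:</p><pre class="r"><code>library(arrow)

# Write the same data frame both ways and compare sizes on disk
df &lt;- data.frame(x = rnorm(1e6), y = sample(letters, 1e6, replace = TRUE))

write.csv(df, &quot;df.csv&quot;, row.names = FALSE)
write_parquet(df, &quot;df.parquet&quot;)

file.size(&quot;df.csv&quot;) / file.size(&quot;df.parquet&quot;)</code></pre><p>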
Included in Tomaž&rsquo;s article is this fantastic chart (below) illustrating the read, write, and file size metrics he gathered while experimenting across many different file types and sizes.</p><center><img src="read_write_comparison_chart.png" alt="Chart from Tomaž Kaštrun, comparing file type sizes and respective read/write performance" width="600"></center><p>Interestingly, <strong>.parquet</strong> files not only store your data, but they also store <em>data about your data</em> (i.e., metadata). Information such as minimum &amp; maximum values is stored for each column, which helps make aggregation blazing fast.</p><p>Still not sold? Maybe you are wondering,</p><blockquote><p>Is &ldquo;.parquet&rdquo; a sustainable file format for storing my data, or is it just a fad?</p></blockquote><p>That&rsquo;s a fair question! The last thing we want to do as data scientists is to create more technical debt for our organization. Rest assured, <strong>.parquet</strong> format is not going anywhere &ndash; many production workflows at major organizations are driven by <strong>.parquet</strong> files in a data lake.</p><p>The <strong>.parquet</strong> format is an open Apache project with first-class support in the <a href="https://arrow.apache.org/" target = "_blank">Apache Arrow</a> ecosystem, and <a href="https://voltrondata.com/" target = "_blank">Voltron Data</a>, a company founded by core Arrow developers, recently finished its Series A round by <a href="https://voltrondata.com/news/fundinglaunch/" target = "_blank">raising $110 million in funding</a> to continue developing this technology. Needless to say, we won&rsquo;t be seeing the <strong>.parquet</strong> format going away any time soon.</p><p>Lastly, unlike <strong>.RDS</strong> files, <strong>.parquet</strong> is a cross-platform file storage format, which means you can work with <strong>.parquet</strong> files from <a href="https://github.com/apache/arrow#powering-in-memory-analytics" target = "_blank">just about any programming language</a> including R.
This is where the <a href="https://github.com/apache/arrow/tree/master/r#arrow" target = "_blank">{arrow}</a> package can help.</p><h3 id="the-benefits-of-arrow">The Benefits of {arrow}</h3><p>The <strong>{arrow}</strong> package provides major benefits:</p><ol><li>It has the ability to read &amp; write <strong>.parquet</strong> files (among other file types)</li><li>You can query the data in that file <em>before</em> bringing it into an R data frame, using <strong>{dplyr}</strong> verbs, which provides for dramatic speed improvements</li></ol><p>The combination of <strong>{arrow}</strong> and <strong>{dplyr}</strong> also results in <em>lazy evaluation</em> of your data manipulation statements. This means that your {dplyr} functions build a &ldquo;recipe&rdquo; of transformation steps that will only evaluate once you are finally ready to bring the transformed data into memory (through the use of <code>dplyr::collect()</code>). Don&rsquo;t take our word for it, though; hear it <a href="https://arrow.apache.org/docs/r/articles/dataset.html" target = "_blank">straight from the Apache Arrow team</a>:</p><blockquote><p>&ldquo;&hellip;[A]ll work is pushed down to the individual data files, and depending on the file format, chunks of data within the files. As a result, you can select a subset of data from a much larger dataset by collecting the smaller slices from each file &ndash; you don’t have to load the whole dataset in memory to slice from it.&rdquo;</p></blockquote><p>The concept of lazy evaluation with <strong>{dplyr}</strong> is also paramount when performing data manipulations and summaries on data stored in relational databases. 
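</p><p>A small sketch of that lazy workflow (the dataset path and column names are illustrative):</p><pre class="r"><code>library(arrow)
library(dplyr)

# Point at a directory of .parquet files without reading them into memory
ds &lt;- open_dataset(&quot;data/measurements/&quot;)

ds |&gt;
  filter(species == &quot;Gentoo&quot;) |&gt;
  summarize(mean_mass = mean(body_mass_g, na.rm = TRUE)) |&gt;
  collect()  # nothing is pulled into an R data frame until this call</code></pre><p>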
The fact that a data science team can leverage those same principles to analyze data stored in <strong>.parquet</strong> files, without having to learn a completely new approach, is another massive benefit!</p><h2 id="how-it-all-fits-together-in-shiny-a-use-case-at-ketchbrook-analytics">How It All Fits Together in Shiny: A Use Case at Ketchbrook Analytics</h2><p>We have learned that the combination of <strong>{arrow}</strong> + <strong>{dplyr}</strong> + <strong>.parquet</strong> gives us all of the memory-saving benefits we would get from querying a database, but with the simplicity of flat files.</p><p>Ketchbrook was developing a Shiny app for a client, for which the relevant data was stored in a large, single <strong>.csv</strong> that was causing two problems:</p><ol><li>There wasn&rsquo;t enough room for the file on their <strong>shinyapps.io</strong> server</li><li>Even when run locally, applying filters and aggregations to the data from within the app was slow</li></ol><p>After converting the large <strong>.csv</strong> file into <strong>.parquet</strong> format, the data became one-sixth of the size of the original <strong>.csv</strong> &ndash; plenty of room available on the server for the <strong>.parquet</strong> data.</p><p>Further, executing <code>dplyr::filter()</code> on the already-in-memory <strong>.csv</strong> data was taking quite a few seconds for the app to respond. 
The conversion of the data to <strong>.parquet</strong> format, coupled with executing the <strong>{dplyr}</strong> functions against an <strong>{arrow}</strong> table (instead of an R data frame), drastically reduced the processing time to less than one second.</p><p>To demonstrate this powerful combination of <strong>{shiny}</strong> + <strong>{arrow}</strong>, Ketchbrook Analytics developed an <a href="https://ketchbrookanalytics.shinyapps.io/shiny_arrow/" target = "_blank">example Shiny app</a> and accompanying <a href="https://github.com/ketchbrookanalytics/shiny_arrow" target = "_blank">GitHub repository</a>.</p><p>Play around with the app, dive into the code, and try incorporating <strong>{arrow}</strong> into your next Shiny project!</p><h3 id="the-proof-is-in-the-pudding-and-the-file-size">The Proof is in the Pudding (and the File Size)</h3><p>For our <a href="https://ketchbrookanalytics.shinyapps.io/shiny_arrow/" target = "_blank">example Shiny app</a>, we created a mock dataset, and stored it in both <strong>.txt</strong> and <strong>.parquet</strong> format. 
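As an aside, a one-time conversion like the one described above can be sketched in just a few lines. The file paths below are hypothetical, and this is not the client&rsquo;s actual code; it assumes the {arrow} package is installed:

```r
library(arrow)

# Stream the large .csv in without loading it all into memory,
# then write it back out as .parquet files
open_dataset("data/big_file.csv", format = "csv") |>
  write_dataset("data/parquet", format = "parquet")
```

After this runs once, the app only ever needs to touch the much smaller <strong>.parquet</strong> files.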
You can create this data yourself by running <a href="https://github.com/ketchbrookanalytics/shiny_arrow/tree/main/data-raw" target = "_blank">these two scripts</a>.</p><p>For comparison, let&rsquo;s view the size of the data that&rsquo;s stored in tab-delimited <strong>.txt</strong> file format:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">files &lt;- fs::file_info(path = list.files(dir, full.names = TRUE)) |&gt;
  dplyr::select(path, size) |&gt;
  dplyr::mutate(
    path = fs::path_file(path),
    file_type = stringr::str_extract(
      string = path,
      pattern = &quot;[^.]+$&quot; # extract text after period
    )
  )

files |&gt;
  dplyr::filter(file_type == &quot;txt&quot;) |&gt;
  knitr::kable()</code></pre></div><table><thead><tr class="header"><th style="text-align: left;">path</th><th style="text-align: right;">size</th><th style="text-align: left;">file_type</th></tr></thead><tbody><tr class="odd"><td style="text-align: left;">half_of_the_data.txt</td><td style="text-align: right;">170M</td><td style="text-align: left;">txt</td></tr><tr class="even"><td style="text-align: left;">the_other_half_of_the_data.txt</td><td style="text-align: right;">170M</td><td style="text-align: left;">txt</td></tr></tbody></table><p>We can see that the <strong>.txt</strong> files total 339M in size.</p><p>Now let&rsquo;s look at the data when stored as <strong>.parquet</strong> file format:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">files |&gt;
  dplyr::filter(file_type == &quot;parquet&quot;) |&gt;
  knitr::kable()</code></pre></div><table><thead><tr class="header"><th style="text-align: left;">path</th><th style="text-align: right;">size</th><th style="text-align: left;">file_type</th></tr></thead><tbody><tr class="odd"><td style="text-align: left;">all_of_the_data.parquet</td><td style="text-align: right;">158M</td><td style="text-align: left;">parquet</td></tr></tbody></table><p>Wow! The same dataset is less than half the size when stored as <strong>.parquet</strong> as compared to <strong>.txt</strong>.</p><h3 id="the-need-for-speed">The Need for Speed</h3><p>We saw the storage savings in action &ndash; now let&rsquo;s take a look at the speed improvements.</p><p>As a practical example, let&rsquo;s run a sequence of <code>dplyr::filter()</code>, <code>dplyr::group_by()</code>, and <code>dplyr::summarise()</code> statements against the <strong>.txt</strong> file:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">tic &lt;- Sys.time()

vroom::vroom(
  list.files(
    path = dir,
    full.names = TRUE,
    pattern = &quot;.txt$&quot;
  ),
  delim = &quot;\t&quot;
) |&gt;
  dplyr::filter(Variable_H &gt; 50) |&gt;
  dplyr::group_by(Item_Code) |&gt;
  dplyr::summarise(Variable_A_Total = sum(Variable_A))</code></pre></div><pre><code># A tibble: 1,000 × 2
   Item_Code Variable_A_Total
   &lt;chr&gt;                &lt;dbl&gt;
 1 A1G740              49453.
 2 A1J731              49481.
 3 A1N838              51610.
 4 A1O339              52633.
 5 A1R990              47588.
 6 A2E381              50823.
 7 A2J681              51575.
 8 A2N118              49840.
 9 A2U328              51106.
10 A2W136              48013.
# … with 990 more rows</code></pre><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">toc &lt;- Sys.time()

time_txt &lt;- difftime(toc, tic)
time_txt</code></pre></div><pre><code>Time difference of 6.38838 secs</code></pre><p>When run against the <strong>.txt</strong> file, the process takes 6.39 seconds to run.</p><p>Now let&rsquo;s try the same <strong>{dplyr}</strong> query against the <strong>.parquet</strong> file:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">tic &lt;- Sys.time()

arrow::open_dataset(
  sources = list.files(
    path = dir,
    full.names = TRUE,
    pattern = &quot;.parquet$&quot;
  ),
  format = &quot;parquet&quot;
) |&gt;
  dplyr::filter(Variable_H &gt; 50) |&gt;
  dplyr::group_by(Item_Code) |&gt;
  dplyr::summarise(Variable_A_Total = sum(Variable_A)) |&gt;
  dplyr::collect()</code></pre></div><pre><code># A tibble: 1,000 × 2
   Item_Code Variable_A_Total
   &lt;chr&gt;                &lt;dbl&gt;
 1 Z8B631              49545.
 2 J8O195              52941.
 3 I5Y383              46572.
 4 O8N416              51525.
 5 I2E912              49862.
 6 D4M22               50317.
 7 L1G322              46862.
 8 C3C179              51791.
 9 N4Q977              49013.
10 L6T273              48561.
# … with 990 more rows</code></pre><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">toc &lt;- Sys.time()

time_parquet &lt;- difftime(toc, tic)
time_parquet</code></pre></div><pre><code>Time difference of 1.975953 secs</code></pre><p>Wow! It might not seem like much, but the difference between a user having to wait 6.39 seconds for your Shiny app to execute a process versus having to wait 1.98 seconds is incredibly significant from a <em>user experience</em> standpoint.</p><p>But don&rsquo;t just take our word for it. Make your next Shiny app an <strong>{arrow}</strong>-driven, high-performance experience for your own users!</p></description></item><item><title>rstudio::conf(2022) Conference Schedule</title><link>https://www.rstudio.com/blog/rstudio-2022-conf-schedule/</link><pubDate>Thu, 16 Jun 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-2022-conf-schedule/</guid><description><p>We are delighted to announce the rstudio::conf schedule!
Expect four days of amazing speakers, workshops, and events on a range of topics: data science processes, industry use cases, working with Shiny, and more!</p><p>Visit the <a href="https://rstd.io/conf-sched-2022" target = "_blank">Conference Schedule</a> to see the impressive list of events and start planning your conference experience.</p><center><a class="btn btn-primary" href="https://rstd.io/conf-sched-2022" target="_blank">Visit the Conference Schedule Page</a></center><h2 id="exploring-rstudioconf-events-and-topics">Exploring rstudio::conf events and topics</h2><p>The scheduler provides event descriptions and details. All talks will be live-streamed.</p><p>Toggle between the simple and expanded views, and browse the speakers and attendees. On the right-hand side, you can filter by track. Hover over the links to see categories of talks.</p><img src="images/img1.png" alt="Screenshot of conference schedule with list of talks on the left-hand side and filters by type on the right-hand side. Hovering over Track A shows the different events under that event."><br><br>Check out the impressive lineup!<h2 id="save-your-schedule-of-talks">Save your schedule of talks</h2><p>Sign up to save your schedule and add events to your calendar. Once you create an account, you can complete your profile, receive email announcements, and save your custom schedule.</p><p>We will have an app available for easy access to the schedule during the conference &mdash; we&rsquo;ll update the website once it is ready.</p><h2 id="register-now">Register now!</h2><p>We cannot wait to see you at rstudio::conf in July!</p><ul><li><a href="https://www.rstudio.com/conference/" target = "_blank">Register</a> to attend rstudio::conf in person.</li><li>To begin the conference, we have <a href="https://www.rstudio.com/conference/2022/2022-conf-workshops-pricing/" target = "_blank">two days of hands-on workshops</a>.
These are not offered online — be sure to sign up!</li></ul></description></item><item><title>Automated Survey Reporting With googlesheets4, pins, and R Markdown</title><link>https://www.rstudio.com/blog/automated-survey-reporting/</link><pubDate>Wed, 15 Jun 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/automated-survey-reporting/</guid><description><p>In October 2021, we released the <a href="https://www.rstudio.com/blog/announcing-the-2021-rstudio-communications-survey/" target = "_blank">RStudio Communications Survey</a>. Many thanks to all of you who replied! We received great suggestions from the community that we have incorporated into our work.</p><p>We also wanted to show you how we shared our internal survey results. This approach is similar to one that Julia Silge used last year for <a href="https://www.rstudio.com/blog/model-monitoring-with-r-markdown/" target = "_blank">monitoring a deployed model</a>. Curtis Kephart adapted this workflow to publish the Communications Survey results on a daily basis:</p><ul><li>We collected results using a Google Form.</li><li>We started our extract, transform, and load (ETL) process in an <a href="https://rmarkdown.rstudio.com/" target = "_blank">R Markdown</a> document.<ul><li>With the <a href="https://googlesheets4.tidyverse.org/" target = "_blank">googlesheets4</a> package, we imported the results from a Google Sheet into the RStudio IDE.</li><li>We then cleaned the data using the <a href="https://www.tidyverse.org/packages/" target = "_blank">tidyverse</a> and other packages.</li><li>Every day, we&rsquo;d save the latest results to a data frame using the <a href="https://pins.rstudio.com/" target = "_blank">pins</a> package.</li></ul></li><li>We created an R Markdown report using the pinned data.<ul><li>We created visualizations using <a href="https://ggplot2.tidyverse.org/" target = "_blank">ggplot2</a> and other packages.</li><li>We styled our reports using the <a href="https://rstudio.github.io/thematic/"
target = "_blank">thematic</a> and <a href="https://rstudio.github.io/bslib/" target = "_blank">bslib</a> packages.</li></ul></li><li>We published the R Markdown report to <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a>.</li><li>We scheduled RStudio Connect to run this workflow and save new data daily.</li><li>We would check out the latest results in our browser.</li></ul><img src="cycle.png" alt="Cycle of ETL, create report, publish report, and refresh data with related packages underneath"><p>Thanks to this workflow, our colleagues could easily access the survey results and see an updated report every day. We would review the responses from the community to plan out future communication strategies.</p><p>Interested in automated survey reporting? Let&rsquo;s walk through the steps using data from the <a href="https://www.ntia.gov/data/explorer#sel=internetUser&disp=map" target = "_blank">National Telecommunications and Information Administration Data Explorer</a>. We&rsquo;ve saved the project in this <a href="https://github.com/rstudio-marketing/automated-survey-reporting" target = "_blank">GitHub repository</a> if you would like access to the files.</p><p>Below, we&rsquo;ll highlight key packages and functionality, but do be aware we relied on many great tools to make this happen.</p><img src="hex1.png" alt="Wall of hexes with the tidyverse, R Markdown, and googlesheets4 hex stickers"><h2 id="extract-transform-and-load-etl-in-an-r-markdown-document">Extract, Transform, and Load (ETL) in an R Markdown Document</h2><p>First, we need to extract data from a Google Form, transform it into the proper format for analysis, and then load the data to write a report.</p><p>R Markdown is a file format for making dynamic documents with R. It&rsquo;s an easy way to integrate code, output, and text. 
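In skeleton form, an R Markdown file interleaves YAML metadata, narrative text, and code chunks. A minimal, hypothetical example might look like this:

````markdown
---
title: "A minimal R Markdown document"
output: html_document
---

Some narrative text explaining the next step.

```{r}
# An R code chunk; its output appears in the rendered document
summary(cars)
```
````

Everything that follows builds on this same structure, just with more chunks and more commentary.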
For the Communications survey, we decided to write our ETL process in an R Markdown file so that we could add headings and commentary to our code. With R Markdown, we could also schedule automatic refreshes on RStudio Connect (we will describe this a little bit later).</p><h3 id="importing-data-from-google-sheets">Importing data from Google Sheets</h3><p>Once someone fills out a Google Form, the results are saved in a Google Sheet. The googlesheets4 package allows you to access Google Sheets from R. It helps with authentication for reading private sheets or writing new data, and provides various functions to help you work with Google Sheets.</p><p>Load the survey data by running the code below:</p><pre><code>```{r}
library(googlesheets4)

survey_dat &lt;- read_sheet(
  &quot;https://docs.google.com/spreadsheets/d/1wnv6PM0YiYoSZ8HHDAGLAGw8ro-SAXblpJ-Aw0VtEGM/edit?usp=sharing&quot;
)
```</code></pre><p>Each time we run <code>read_sheet()</code>, we pull in the most recent spreadsheet from Google.</p><h3 id="tidying-and-transforming-data-using-the-tidyverse-packages-and-friends">Tidying and transforming data using the tidyverse packages (and friends)</h3><p>The Google Sheet is stored as a data frame in RStudio. Great! Now, we have to make sure it&rsquo;s in an analyzable format.
We can do this using packages like <a href="https://dplyr.tidyverse.org/" target = "_blank">dplyr</a> and <a href="https://tidyr.tidyverse.org/" target = "_blank">tidyr</a>.</p><p>Let&rsquo;s explore the dataset.</p><pre><code>```{r}
library(tidyverse)
library(lubridate)

glimpse(survey_dat)
```</code></pre><pre><code>Rows: 753
Columns: 16
$ dataset      &lt;chr&gt; &quot;Nov 1994&quot;, &quot;Nov 1994&quot;, &quot;Nov 1994&quot;, &quot;Nov 1994&quot;, &quot;O…
$ variable     &lt;chr&gt; &quot;isHouseholder&quot;, &quot;isPerson&quot;, &quot;computerAtHome&quot;, &quot;is…
$ description  &lt;chr&gt; &quot;Household Reference Person in Universe: Non-Insti…
$ universe     &lt;chr&gt; NA, NA, &quot;isHouseholder&quot;, &quot;isPerson&quot;, NA, NA, &quot;isHo…
$ usProp       &lt;dbl&gt; 1.000000, 1.000000, 0.242794, 0.810185, 1.000000, …
$ usCount      &lt;dbl&gt; 99708018, 248509799, 24208471, 201338994, 10274155…
$ age314Prop   &lt;dbl&gt; NA, 1, NA, 0, NA, 1, NA, 0, NA, 1, NA, NA, NA, NA,…
$ age314Count  &lt;dbl&gt; NA, 47170805, NA, 0, NA, 47960746, NA, 0, NA, 4828…
$ age1524Prop  &lt;dbl&gt; 1.000000, 1.000000, 0.183342, 1.000000, 1.000000, …
$ age1524Count &lt;dbl&gt; 5573616, 36254422, 1021876, 36254422, 5670471, 369…
$ age2544Prop  &lt;dbl&gt; 1.000000, 1.000000, 0.302369, 1.000000, 1.000000, …
$ age2544Count &lt;dbl&gt; 43123860, 83005904, 13039306, 83005904, 43481519, …
$ age4564Prop  &lt;dbl&gt; 1.000000, 1.000000, 0.285785, 1.000000, 1.000000, …
$ age4564Count &lt;dbl&gt; 29830455, 50934354, 8525086, 50934354, 32280721, 5…
$ age65pProp   &lt;dbl&gt; 1.000000, 1.000000, 0.076591, 1.000000, 1.000000, …
$ age65pCount  &lt;dbl&gt; 21180087, 31144313, 1622202, 31144313, 21308845, 3…</code></pre><p>The column <code>dataset</code> contains the survey administration date, but it is a character variable.
We can use <code>dplyr::mutate()</code> and <code>lubridate::my()</code> to create a new column called <code>date</code> that stores the variable in a date format.</p><pre><code>```{r}
survey_dat_mutate &lt;- survey_dat %&gt;%
  mutate(
    date = my(dataset),
    .before = 1
  )
```</code></pre><p>Notice that the data is in &ldquo;wide&rdquo; format: each survey administration is stored in a single row, and each category is in a separate column.</p><p>We&rsquo;d like our data to be in &ldquo;long&rdquo; format. In the long format, each row is one data point. Every survey administration will have data in multiple rows. This makes it easier to work with tidyverse packages. We can transform our data using <code>tidyr::pivot_longer()</code>.</p><pre><code>```{r}
survey_data_transform &lt;- survey_dat_mutate %&gt;%
  pivot_longer(
    cols = usProp:age65pCount,
    names_to = &quot;variables&quot;,
    values_to = &quot;values&quot;
  )
```</code></pre><p>Now, our data is in a format that&rsquo;s ready to use in our report.</p><p>We used many functions when working with the Google Form from the Communications Survey. Thankfully, packages like dplyr and tidyr make it easy to reproducibly clean the data and transform it to the format we need.</p><h3 id="appending-and-saving-data-to-pins">Appending and saving data to pins</h3><p>We tidied our data, but how can we ensure our report uses the latest clean dataset and not one from the previous day?</p><p>The pins package is an excellent solution to this. With pins, small R objects are published to a &lsquo;board&rsquo; so we can share them across projects or people. In this case, we pin our survey data to an <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a> server. At the end of our data cleaning script, we rewrite the pin.
Every time we run the data cleaning script, the pin contains the latest, complete dataset.</p><details><summary><b>Click here for an example of an ETL process in an R Markdown document.</b></summary><pre><code>---
title: &quot;ETL Process&quot;
output: html_document
---

```{r setup}
library(dplyr)
library(tidyr)
library(lubridate)
library(pins)

# Connect to board
board &lt;- board_rsconnect(
  server = Sys.getenv(&quot;CONNECT_SERVER&quot;),
  key = Sys.getenv(&quot;CONNECT_API_KEY&quot;)
)
```

# Extract data

```{r extract}
survey_dat &lt;- googlesheets4::read_sheet(
  &quot;https://docs.google.com/spreadsheets/d/1iIf8vsGSlKmyYSy-FbevOBmi-c0YwI46lLzNbD_RZSQ/edit?usp=sharing&quot;
)
```

```{r}
#| include = FALSE
glimpse(survey_dat)
```

# Transform data

```{r mutate}
survey_dat_mutate &lt;- survey_dat %&gt;%
  mutate(
    date = my(dataset),
    .before = 1
  )
```

```{r transform}
survey_data_transform &lt;- survey_dat_mutate %&gt;%
  pivot_longer(
    cols = usProp:age65pCount,
    names_to = &quot;variables&quot;,
    values_to = &quot;values&quot;
  )
```

# Load transformed data in a pin

```{r load}
board %&gt;%
  pin_write(survey_data_transform, &quot;survey_data_results&quot;, type = &quot;rds&quot;)
```</code></pre></details><p>We also used pins to help protect our respondents&rsquo; privacy. As part of our tidying, we filtered out names and email addresses. We would pin the filtered dataset to use in the report. This way, we used as much raw data as possible without sharing any identifiable information.</p><img src="hex2.png" alt="Wall of hexes with the R Markdown, ggplot2, and thematic hex stickers"><h3 id="writing-an-r-markdown-report-with-pinned-data">Writing an R Markdown Report with pinned data</h3><p>We loaded our survey results, cleaned them up, and stored them in a pin. Now it&rsquo;s time to write a report!</p><p>In addition to running ETL processes, R Markdown (which renders reports through the knitr package) is a powerful tool for creating reports.
It can handle Markdown text, add custom styling, and much more.</p><p>Our survey had many different sections, each with many questions. We first created separate R Markdown documents for each section to build our report. This helped organize our code so we wouldn&rsquo;t get lost in what we were doing.</p><p>Let&rsquo;s walk through how this would look with our Census data. Our first file could look something like this:</p><pre><code>---
title: &quot;Plot 1&quot;
---

```{r}
#| include = FALSE
library(tidyverse)
library(hrbrthemes)
```

## Percentage by age

-------

### Line Chart

```{r}
#| echo = FALSE
p &lt;- pinned_dat %&gt;%
  filter(
    variable == &quot;noInternetAtHome&quot;,
    str_detect(variables, &quot;Prop&quot;)
  ) %&gt;%
  ggplot(aes(x = date, y = values, color = variables)) +
  geom_line(size = 1) +
  labs(
    title = &quot;Percentage of respondents with\nno internet at home by age group&quot;,
    x = &quot;Date&quot;,
    y = &quot;Percentage&quot;
  ) +
  scale_color_ipsum() +
  theme_ipsum_ps(grid = &quot;XY&quot;, axis = &quot;xy&quot;)

p
```</code></pre><p>Our second one can create a document for a separate section:</p><pre><code>---
title: &quot;Plot 2&quot;
---

### Percentage by main reason

-------

### Bar Chart

```{r}
#| echo = FALSE
p &lt;- pinned_dat %&gt;%
  filter(
    str_detect(variable, &quot;MainReason&quot;),
    str_detect(variables, &quot;Prop&quot;),
    universe == &quot;isHouseholder&quot;,
    dataset == &quot;Nov 2021&quot;
  ) %&gt;%
  ggplot(aes(x = variables, y = values, fill = variables)) +
  geom_bar(stat = &quot;identity&quot;) +
  labs(
    title = &quot;Percentage With No Internet by Main Reason&quot;,
    subtitle = &quot;November 2021&quot;,
    x = &quot;Date&quot;,
    y = &quot;Percentage&quot;
  ) +
  facet_wrap( ~ variable) +
  scale_fill_ipsum() +
  theme_ipsum_ps(grid = &quot;XY&quot;, axis = &quot;xy&quot;) +
  theme(
    legend.position = &quot;bottom&quot;,
    axis.text.x = element_blank()
  )

p
```</code></pre><p>We have separate files for these sections.
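One way to avoid copying and pasting code across section files like these is a single parameterized template rendered in a loop with <code>knitr::knit_child()</code>. The sketch below illustrates the pattern; the template file name and the <code>group</code> variable are hypothetical:

````markdown
```{r, results='asis'}
# Render one copy of "section-template.Rmd" per demographic group;
# the template reads the `group` variable from this environment
sections <- lapply(
  c("age", "income", "education"),
  function(group) {
    knitr::knit_child(
      "section-template.Rmd",
      envir = environment(),
      quiet = TRUE
    )
  }
)

cat(unlist(sections), sep = "\n")
```
````

Each call returns the rendered Markdown for one section, and `results='asis'` lets the concatenated output flow straight into the parent document.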
We can use a third R Markdown document as the main file for our report.</p><p>The <code>knitr::knit_child()</code> function is a powerful tool to organize your R Markdown documents. You can use it to create parameterized section templates. With our survey example, we found we were creating multiple sections that broke down results by respondent demographics and summarized similar questions. These sections applied similar code to different subsets of the data. Rather than copying and pasting code, we used <code>knit_child()</code> to pass parameters into section templates. See the <a href="https://bookdown.org/yihui/rmarkdown-cookbook/child-document.html#child-document" target = "_blank">R Markdown Cookbook&rsquo;s discussion of child documents</a> for more information.</p><p>With R Markdown, we have many options to customize our report to match our organization&rsquo;s style guide. Two options are the thematic and bslib packages. Below, we&rsquo;re using the bslib package.</p><pre><code>---
title: &quot;Internet Use Survey Report&quot;
output:
  html_document:
    theme:
      bootswatch: simplex
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(warning = FALSE)

library(pins)
library(ggplot2)
library(stringr)
library(dplyr)
library(hrbrthemes)

# Connect to board
board &lt;- board_rsconnect(auth = &quot;envvar&quot;)

# Read pinned data
pinned_dat &lt;- pin_read(&quot;survey_data_results&quot;, board = board)
```

Data from the National Telecommunications and Information Administration.

```{r}
#| child=c(&quot;02-plot.Rmd&quot;)
```

```{r}
#| child=c(&quot;03-plot2.Rmd&quot;)
```</code></pre><p>When we knit this file, we have the results in a stylized report that displays output from the three <code>.Rmd</code> documents using the latest, clean survey dataset. Hooray, we created a report!</p><h2 id="publishing-r-markdown-html-output-to-rstudio-connect">Publishing R Markdown HTML Output to RStudio Connect</h2><p>After we finished our report, we wanted to make it accessible to our team!
For this, we used RStudio Connect. RStudio Connect is an enterprise-level platform from RStudio that publishes many different data science products, including the HTML output from R Markdown.</p><p>We could publish our R Markdown document directly from the RStudio IDE. Here&rsquo;s what it looks like with the survey report from above:</p><script src="https://fast.wistia.com/embed/medias/i4dtiibepy.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_i4dtiibepy videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption><i>Publishing a report to RStudio Connect</i></caption></center><p>We mentioned that our report was <em>daily</em>. With RStudio Connect, we scheduled our ETL, reporting, and publishing cycle to occur every day. When our team members opened the dashboard, they would see the results up until the last scheduled refresh.</p><img src="schedule.png" alt="Options for scheduling an R Markdown on a daily schedule on RStudio Connect"><center><caption><i>Scheduling options on RStudio Connect</i></caption></center><p>We wanted to make sure that results were available to everybody within RStudio but not shareable outside of the company. Thanks to the access settings of RStudio Connect, we could make sure that only those with permission could view our dashboard.</p><img src="access.png" alt="Options for security on an app that changes who gets to access the report on RStudio Connect"><center><caption><i>Security options on RStudio Connect</i></caption></center><h2 id="learn-more">Learn More</h2><p>Thank you again for your responses to the Communications survey. 
We&rsquo;re excited to have shown you our process for reporting the results using RStudio tools: R Markdown, the tidyverse packages, googlesheets4, RStudio Connect, and many others.</p><ul><li>Check out the complete workflow in this <a href="https://github.com/rstudio-marketing/automated-survey-reporting" target = "_blank">GitHub repo</a>.</li><li>Learn more about <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a>, the enterprise-level publishing platform from RStudio.</li><li>Read about more R Markdown features in the <a href="https://bookdown.org/yihui/rmarkdown-cookbook/child-document.html" target = "_blank">R Markdown Cookbook</a>.</li></ul></description></item><item><title>Changes (for the better) in {gt} 0.6.0</title><link>https://www.rstudio.com/blog/changes-for-the-better-in-gt-0-6-0/</link><pubDate>Fri, 10 Jun 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/changes-for-the-better-in-gt-0-6-0/</guid><description><p>A new version of the R package {gt} has been released! We are now at version <code>0.6.0</code>, and there are even more features that’ll make your display/summary tables look and work much, much better. In this post, let’s run through some of the bigger changes and see the benefits they can bring!</p><h2>Better RTF output tables for Pharma</h2><p>If you’re in the pharmaceutical industry and have a hand in generating tabular reporting for regulatory submissions, the new RTF rendering features will output tables in a more suitable format.
Some of the key changes include:</p><ol style="list-style-type: decimal"><li>default table styling with far fewer table borders (a more common look and feel)</li><li>page-layout options (<code>page.*</code>) added to the <code>tab_options()</code> function (like <code>page.orientation</code>, <code>page.numbering</code>, etc.)</li><li>the addition of pre-header text fields in <code>tab_header()</code> (the <code>preheader</code> argument)</li></ol><p>I won’t show all the code required for generating a Pharma-specific table here. For detailed examples, it’s better to look at the examples in Phil Bowsher’s <a href="https://github.com/philbowsher/Clinical-Tables-in-R-with-gt">Clinical-Tables-in-R-with-gt repository on GitHub</a>, particularly the .Rmd files beginning with <code>gt-</code>. What <em>I will</em> show you here is a screen capture of how one of those tables looks when opened in Word.</p><p><img src="pharma_table_RTF_Word.png" title="Word document containing a pharmaceutical table, the title is Summary of Demographic and Baseline Characteristics and it has data on category, labels, whether or not the subjects were placebo, number, and p-values." alt="Word document containing a pharmaceutical table, the title is Summary of Demographic and Baseline Characteristics and it has data on category, labels, whether or not the subjects were placebo, number, and p-values." width="1589" /></p><h2>New functions for substituting cell data</h2><p>We now have four new functions that allow you to make precise substitutions of cell values with perhaps something more meaningful. They all begin with <code>sub_</code> and that’s short for substitution!</p><h3><code>sub_zero()</code></h3><p>Let’s begin with the <code>sub_zero()</code> function. It allows for substituting zero values in the table body. 
Here are all the options:</p><pre class="r"><code>sub_zero(
  data,
  columns = everything(),
  rows = everything(),
  zero_text = &quot;nil&quot;
)</code></pre><p>Let’s generate a simple, single-column tibble that contains an assortment of values that could potentially undergo some substitution.</p><pre class="r"><code>tbl &lt;- dplyr::tibble(num = c(10^(-1:2), 0, 0, 10^(4:6)))

tbl</code></pre><pre><code>## # A tibble: 9 × 1
##       num
##     &lt;dbl&gt;
## 1     0.1
## 2     1
## 3    10
## 4   100
## 5     0
## 6     0
## 7 10000
## 8 100000
## 9 1000000</code></pre><p>With this table, we can format all of the numbers in the single <code>num</code> column <em>and</em> replace the zero values with <code>"nil"</code> text with a separate call of <code>sub_zero()</code>.</p><pre class="r"><code>tbl %&gt;%
  gt() %&gt;%
  fmt_number(columns = num) %&gt;%
  sub_zero()</code></pre><div id="hgytnkzicz" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><style>html {font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;}#hgytnkzicz .gt_table {display: table;border-collapse: collapse;margin-left: auto;margin-right: auto;color: #333333;font-size: 16px;font-weight: normal;font-style: normal;background-color: #FFFFFF;width: auto;border-top-style: solid;border-top-width: 2px;border-top-color: #A8A8A8;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #A8A8A8;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;}#hgytnkzicz .gt_heading {background-color: #FFFFFF;text-align: center;border-bottom-color: #FFFFFF;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#hgytnkzicz .gt_title {color: #333333;font-size: 125%;font-weight: initial;padding-top: 4px;padding-bottom: 4px;padding-left: 5px;padding-right:
5px;border-bottom-color: #FFFFFF;border-bottom-width: 0;}#hgytnkzicz .gt_subtitle {color: #333333;font-size: 85%;font-weight: initial;padding-top: 0;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;border-top-color: #FFFFFF;border-top-width: 0;}#hgytnkzicz .gt_bottom_border {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#hgytnkzicz .gt_col_headings {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#hgytnkzicz .gt_col_heading {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: normal;text-transform: inherit;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;overflow-x: hidden;}#hgytnkzicz .gt_column_spanner_outer {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: normal;text-transform: inherit;padding-top: 0;padding-bottom: 0;padding-left: 4px;padding-right: 4px;}#hgytnkzicz .gt_column_spanner_outer:first-child {padding-left: 0;}#hgytnkzicz .gt_column_spanner_outer:last-child {padding-right: 0;}#hgytnkzicz .gt_column_spanner {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 5px;overflow-x: hidden;display: inline-block;width: 100%;}#hgytnkzicz .gt_group_heading {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 
2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;}#hgytnkzicz .gt_empty_group_heading {padding: 0.5px;color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: middle;}#hgytnkzicz .gt_from_md > :first-child {margin-top: 0;}#hgytnkzicz .gt_from_md > :last-child {margin-bottom: 0;}#hgytnkzicz .gt_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;margin: 10px;border-top-style: solid;border-top-width: 1px;border-top-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;overflow-x: hidden;}#hgytnkzicz .gt_stub {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 5px;padding-right: 5px;}#hgytnkzicz .gt_stub_row_group {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 5px;padding-right: 5px;vertical-align: top;}#hgytnkzicz .gt_row_group_first td {border-top-width: 2px;}#hgytnkzicz .gt_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#hgytnkzicz .gt_first_summary_row {border-top-style: solid;border-top-color: #D3D3D3;}#hgytnkzicz .gt_first_summary_row.thick {border-top-width: 2px;}#hgytnkzicz .gt_last_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 
5px;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#hgytnkzicz .gt_grand_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#hgytnkzicz .gt_first_grand_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: double;border-top-width: 6px;border-top-color: #D3D3D3;}#hgytnkzicz .gt_striped {background-color: rgba(128, 128, 128, 0.05);}#hgytnkzicz .gt_table_body {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#hgytnkzicz .gt_footnotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#hgytnkzicz .gt_footnote {margin: 0px;font-size: 90%;padding-left: 4px;padding-right: 4px;padding-left: 5px;padding-right: 5px;}#hgytnkzicz .gt_sourcenotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#hgytnkzicz .gt_sourcenote {font-size: 90%;padding-top: 4px;padding-bottom: 4px;padding-left: 5px;padding-right: 5px;}#hgytnkzicz .gt_left {text-align: left;}#hgytnkzicz .gt_center {text-align: center;}#hgytnkzicz .gt_right {text-align: right;font-variant-numeric: tabular-nums;}#hgytnkzicz .gt_font_normal {font-weight: normal;}#hgytnkzicz .gt_font_bold {font-weight: bold;}#hgytnkzicz .gt_font_italic {font-style: italic;}#hgytnkzicz .gt_super {font-size: 65%;}#hgytnkzicz .gt_two_val_uncert {display: inline-block;line-height: 1em;text-align: right;font-size: 
60%;vertical-align: -0.25em;margin-left: 0.1em;}#hgytnkzicz .gt_footnote_marks {font-style: italic;font-weight: normal;font-size: 75%;vertical-align: 0.4em;}#hgytnkzicz .gt_asterisk {font-size: 100%;vertical-align: 0;}#hgytnkzicz .gt_slash_mark {font-size: 0.7em;line-height: 0.7em;vertical-align: 0.15em;}#hgytnkzicz .gt_fraction_numerator {font-size: 0.6em;line-height: 0.6em;vertical-align: 0.45em;}#hgytnkzicz .gt_fraction_denominator {font-size: 0.6em;line-height: 0.6em;vertical-align: -0.05em;}</style><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">num</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_right">0.10</td></tr><tr><td class="gt_row gt_right">1.00</td></tr><tr><td class="gt_row gt_right">10.00</td></tr><tr><td class="gt_row gt_right">100.00</td></tr><tr><td class="gt_row gt_right">nil</td></tr><tr><td class="gt_row gt_right">nil</td></tr><tr><td class="gt_row gt_right">10,000.00</td></tr><tr><td class="gt_row gt_right">100,000.00</td></tr><tr><td class="gt_row gt_right">1,000,000.00</td></tr></tbody></table></div><p><br /></p><h3><code>sub_missing()</code> (formerly known as <code>fmt_missing()</code>)</h3><p>Here’s something that’s both old and new. The <code>sub_missing()</code> function (for replacing <code>NA</code>s with… something) is <strong>new</strong>, but it’s essentially replacing a function that is <strong>old</strong> (<code>fmt_missing()</code>). Let’s have a look at this function anyway!</p><pre class="r"><code>sub_missing(
  data,
  columns = everything(),
  rows = everything(),
  missing_text = &quot;---&quot;
)</code></pre><p>The <code>missing_text</code> replacement of <code>"---"</code> is actually an em dash (the longest of the dash family). This can be downgraded to an en dash with <code>"--"</code> or we can go further with <code>"-"</code>, giving us a hyphen replacement. Or, you can use another piece of text.
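For instance, getting the en-dash version could look something like this (a minimal sketch, not a full worked example):</p><pre class="r"><code># a hedged sketch: en dash instead of the default em dash
exibble %&gt;%
  gt() %&gt;%
  sub_missing(missing_text = &quot;--&quot;)</code></pre><p>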
Let’s get to an example of that. The <code>exibble</code> dataset (included in {gt}) has quite a few <code>NA</code>s, and we’ll replace them with either the text <code>"missing"</code> (in columns 1 and 2) or <code>"nothing"</code> (in the remaining columns).</p><pre class="r"><code>exibble %&gt;%
  dplyr::select(-row, -group) %&gt;%
  gt() %&gt;%
  sub_missing(columns = 1:2, missing_text = &quot;missing&quot;) %&gt;%
  sub_missing(columns = 4:7, missing_text = &quot;nothing&quot;)</code></pre><div id="fzpharlmfa" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><style>html {font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;}#fzpharlmfa .gt_table {display: table;border-collapse: collapse;margin-left: auto;margin-right: auto;color: #333333;font-size: 16px;font-weight: normal;font-style: normal;background-color: #FFFFFF;width: auto;border-top-style: solid;border-top-width: 2px;border-top-color: #A8A8A8;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #A8A8A8;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;}#fzpharlmfa .gt_heading {background-color: #FFFFFF;text-align: center;border-bottom-color: #FFFFFF;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#fzpharlmfa .gt_title {color: #333333;font-size: 125%;font-weight: initial;padding-top: 4px;padding-bottom: 4px;padding-left: 5px;padding-right: 5px;border-bottom-color: #FFFFFF;border-bottom-width: 0;}#fzpharlmfa .gt_subtitle {color: #333333;font-size: 85%;font-weight: initial;padding-top: 0;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;border-top-color: #FFFFFF;border-top-width: 0;}#fzpharlmfa .gt_bottom_border {border-bottom-style: solid;border-bottom-width:
2px;border-bottom-color: #D3D3D3;}#fzpharlmfa .gt_col_headings {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#fzpharlmfa .gt_col_heading {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: normal;text-transform: inherit;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;overflow-x: hidden;}#fzpharlmfa .gt_column_spanner_outer {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: normal;text-transform: inherit;padding-top: 0;padding-bottom: 0;padding-left: 4px;padding-right: 4px;}#fzpharlmfa .gt_column_spanner_outer:first-child {padding-left: 0;}#fzpharlmfa .gt_column_spanner_outer:last-child {padding-right: 0;}#fzpharlmfa .gt_column_spanner {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 5px;overflow-x: hidden;display: inline-block;width: 100%;}#fzpharlmfa .gt_group_heading {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;}#fzpharlmfa .gt_empty_group_heading {padding: 0.5px;color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: 
initial;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: middle;}#fzpharlmfa .gt_from_md > :first-child {margin-top: 0;}#fzpharlmfa .gt_from_md > :last-child {margin-bottom: 0;}#fzpharlmfa .gt_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;margin: 10px;border-top-style: solid;border-top-width: 1px;border-top-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;overflow-x: hidden;}#fzpharlmfa .gt_stub {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 5px;padding-right: 5px;}#fzpharlmfa .gt_stub_row_group {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 5px;padding-right: 5px;vertical-align: top;}#fzpharlmfa .gt_row_group_first td {border-top-width: 2px;}#fzpharlmfa .gt_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#fzpharlmfa .gt_first_summary_row {border-top-style: solid;border-top-color: #D3D3D3;}#fzpharlmfa .gt_first_summary_row.thick {border-top-width: 2px;}#fzpharlmfa .gt_last_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#fzpharlmfa .gt_grand_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#fzpharlmfa .gt_first_grand_summary_row {padding-top: 8px;padding-bottom: 
8px;padding-left: 5px;padding-right: 5px;border-top-style: double;border-top-width: 6px;border-top-color: #D3D3D3;}#fzpharlmfa .gt_striped {background-color: rgba(128, 128, 128, 0.05);}#fzpharlmfa .gt_table_body {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#fzpharlmfa .gt_footnotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#fzpharlmfa .gt_footnote {margin: 0px;font-size: 90%;padding-left: 4px;padding-right: 4px;padding-left: 5px;padding-right: 5px;}#fzpharlmfa .gt_sourcenotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#fzpharlmfa .gt_sourcenote {font-size: 90%;padding-top: 4px;padding-bottom: 4px;padding-left: 5px;padding-right: 5px;}#fzpharlmfa .gt_left {text-align: left;}#fzpharlmfa .gt_center {text-align: center;}#fzpharlmfa .gt_right {text-align: right;font-variant-numeric: tabular-nums;}#fzpharlmfa .gt_font_normal {font-weight: normal;}#fzpharlmfa .gt_font_bold {font-weight: bold;}#fzpharlmfa .gt_font_italic {font-style: italic;}#fzpharlmfa .gt_super {font-size: 65%;}#fzpharlmfa .gt_two_val_uncert {display: inline-block;line-height: 1em;text-align: right;font-size: 60%;vertical-align: -0.25em;margin-left: 0.1em;}#fzpharlmfa .gt_footnote_marks {font-style: italic;font-weight: normal;font-size: 75%;vertical-align: 0.4em;}#fzpharlmfa .gt_asterisk {font-size: 100%;vertical-align: 0;}#fzpharlmfa .gt_slash_mark {font-size: 0.7em;line-height: 0.7em;vertical-align: 0.15em;}#fzpharlmfa 
.gt_fraction_numerator {font-size: 0.6em;line-height: 0.6em;vertical-align: 0.45em;}#fzpharlmfa .gt_fraction_denominator {font-size: 0.6em;line-height: 0.6em;vertical-align: -0.05em;}</style><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">num</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">char</th><th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1">fctr</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">date</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">time</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">datetime</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">currency</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_right">1.111e-01</td><td class="gt_row gt_left">apricot</td><td class="gt_row gt_center">one</td><td class="gt_row gt_left">2015-01-15</td><td class="gt_row gt_left">13:35</td><td class="gt_row gt_left">2018-01-01 02:22</td><td class="gt_row gt_right">49.950</td></tr><tr><td class="gt_row gt_right">2.222e+00</td><td class="gt_row gt_left">banana</td><td class="gt_row gt_center">two</td><td class="gt_row gt_left">2015-02-15</td><td class="gt_row gt_left">14:40</td><td class="gt_row gt_left">2018-02-02 14:33</td><td class="gt_row gt_right">17.950</td></tr><tr><td class="gt_row gt_right">3.333e+01</td><td class="gt_row gt_left">coconut</td><td class="gt_row gt_center">three</td><td class="gt_row gt_left">2015-03-15</td><td class="gt_row gt_left">15:45</td><td class="gt_row gt_left">2018-03-03 03:44</td><td class="gt_row gt_right">1.390</td></tr><tr><td class="gt_row gt_right">4.444e+02</td><td class="gt_row gt_left">durian</td><td class="gt_row gt_center">four</td><td class="gt_row 
gt_left">2015-04-15</td><td class="gt_row gt_left">16:50</td><td class="gt_row gt_left">2018-04-04 15:55</td><td class="gt_row gt_right">65100.000</td></tr><tr><td class="gt_row gt_right">5.550e+03</td><td class="gt_row gt_left">missing</td><td class="gt_row gt_center">five</td><td class="gt_row gt_left">2015-05-15</td><td class="gt_row gt_left">17:55</td><td class="gt_row gt_left">2018-05-05 04:00</td><td class="gt_row gt_right">1325.810</td></tr><tr><td class="gt_row gt_right">missing</td><td class="gt_row gt_left">fig</td><td class="gt_row gt_center">six</td><td class="gt_row gt_left">2015-06-15</td><td class="gt_row gt_left">nothing</td><td class="gt_row gt_left">2018-06-06 16:11</td><td class="gt_row gt_right">13.255</td></tr><tr><td class="gt_row gt_right">7.770e+05</td><td class="gt_row gt_left">grapefruit</td><td class="gt_row gt_center">seven</td><td class="gt_row gt_left">nothing</td><td class="gt_row gt_left">19:10</td><td class="gt_row gt_left">2018-07-07 05:22</td><td class="gt_row gt_right">nothing</td></tr><tr><td class="gt_row gt_right">8.880e+06</td><td class="gt_row gt_left">honeydew</td><td class="gt_row gt_center">eight</td><td class="gt_row gt_left">2015-08-15</td><td class="gt_row gt_left">20:20</td><td class="gt_row gt_left">nothing</td><td class="gt_row gt_right">0.440</td></tr></tbody></table></div><p><br />If you’re using and loving <code>fmt_missing()</code>, it’s okay! You’ll probably receive a warning about it when you upgrade to {gt} <code>0.6.0</code> though. Best to just substitute <code>fmt_missing()</code> with <code>sub_missing()</code> anyway!</p><h3><code>sub_small_vals()</code></h3><p>Next up is the <code>sub_small_vals()</code> function. Ever have really, really small values and really just want to say they are small? You can do this in multiple ways with this new function. 
Here are all the options:</p><pre class="r"><code>sub_small_vals(
  data,
  columns = everything(),
  rows = everything(),
  threshold = 0.01,
  small_pattern = if (sign == &quot;+&quot;) &quot;&lt;{x}&quot; else md(&quot;&lt;*abs*(-{x})&quot;),
  sign = &quot;+&quot;
)</code></pre><p>Whoa! That’s a lot of options. We can unpack all this though, and we’ll do it with a few examples. First, we need a table, so let’s generate a simple, single-column tibble that contains an assortment of values that could potentially undergo some substitution.</p><pre class="r"><code>tbl &lt;- dplyr::tibble(num = c(10^(-4:2), 0, NA))

tbl</code></pre><pre><code>## # A tibble: 9 × 1
##       num
##     &lt;dbl&gt;
## 1  0.0001
## 2  0.001
## 3  0.01
## 4  0.1
## 5  1
## 6 10
## 7 100
## 8  0
## 9 NA</code></pre><p>The <code>tbl</code> contains a variety of smaller numbers and some might be small enough to reformat with a threshold value. With <code>sub_small_vals()</code> we can do just that with the default <code>threshold</code> of <code>0.01</code>, and you’ll see that the targeted cells read <code>&lt;0.01</code>.</p><pre class="r"><code>tbl %&gt;%
  gt() %&gt;%
  fmt_number(columns = num) %&gt;%
  sub_small_vals()</code></pre><div id="yfmfynicpc" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><style>html {font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;}#yfmfynicpc .gt_table {display: table;border-collapse: collapse;margin-left: auto;margin-right: auto;color: #333333;font-size: 16px;font-weight: normal;font-style: normal;background-color: #FFFFFF;width: auto;border-top-style: solid;border-top-width: 2px;border-top-color: #A8A8A8;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #A8A8A8;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;}#yfmfynicpc .gt_heading {background-color:
#FFFFFF;text-align: center;border-bottom-color: #FFFFFF;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#yfmfynicpc .gt_title {color: #333333;font-size: 125%;font-weight: initial;padding-top: 4px;padding-bottom: 4px;padding-left: 5px;padding-right: 5px;border-bottom-color: #FFFFFF;border-bottom-width: 0;}#yfmfynicpc .gt_subtitle {color: #333333;font-size: 85%;font-weight: initial;padding-top: 0;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;border-top-color: #FFFFFF;border-top-width: 0;}#yfmfynicpc .gt_bottom_border {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#yfmfynicpc .gt_col_headings {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#yfmfynicpc .gt_col_heading {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: normal;text-transform: inherit;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;overflow-x: hidden;}#yfmfynicpc .gt_column_spanner_outer {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: normal;text-transform: inherit;padding-top: 0;padding-bottom: 0;padding-left: 4px;padding-right: 4px;}#yfmfynicpc .gt_column_spanner_outer:first-child {padding-left: 0;}#yfmfynicpc .gt_column_spanner_outer:last-child {padding-right: 0;}#yfmfynicpc .gt_column_spanner {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 5px;overflow-x: hidden;display: 
inline-block;width: 100%;}#yfmfynicpc .gt_group_heading {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;}#yfmfynicpc .gt_empty_group_heading {padding: 0.5px;color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: middle;}#yfmfynicpc .gt_from_md > :first-child {margin-top: 0;}#yfmfynicpc .gt_from_md > :last-child {margin-bottom: 0;}#yfmfynicpc .gt_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;margin: 10px;border-top-style: solid;border-top-width: 1px;border-top-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;overflow-x: hidden;}#yfmfynicpc .gt_stub {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 5px;padding-right: 5px;}#yfmfynicpc .gt_stub_row_group {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 5px;padding-right: 5px;vertical-align: top;}#yfmfynicpc .gt_row_group_first td {border-top-width: 2px;}#yfmfynicpc .gt_summary_row {color: #333333;background-color: 
#FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#yfmfynicpc .gt_first_summary_row {border-top-style: solid;border-top-color: #D3D3D3;}#yfmfynicpc .gt_first_summary_row.thick {border-top-width: 2px;}#yfmfynicpc .gt_last_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#yfmfynicpc .gt_grand_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#yfmfynicpc .gt_first_grand_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: double;border-top-width: 6px;border-top-color: #D3D3D3;}#yfmfynicpc .gt_striped {background-color: rgba(128, 128, 128, 0.05);}#yfmfynicpc .gt_table_body {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#yfmfynicpc .gt_footnotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#yfmfynicpc .gt_footnote {margin: 0px;font-size: 90%;padding-left: 4px;padding-right: 4px;padding-left: 5px;padding-right: 5px;}#yfmfynicpc .gt_sourcenotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#yfmfynicpc .gt_sourcenote {font-size: 90%;padding-top: 4px;padding-bottom: 4px;padding-left: 5px;padding-right: 5px;}#yfmfynicpc .gt_left {text-align: left;}#yfmfynicpc .gt_center {text-align: 
center;}#yfmfynicpc .gt_right {text-align: right;font-variant-numeric: tabular-nums;}#yfmfynicpc .gt_font_normal {font-weight: normal;}#yfmfynicpc .gt_font_bold {font-weight: bold;}#yfmfynicpc .gt_font_italic {font-style: italic;}#yfmfynicpc .gt_super {font-size: 65%;}#yfmfynicpc .gt_two_val_uncert {display: inline-block;line-height: 1em;text-align: right;font-size: 60%;vertical-align: -0.25em;margin-left: 0.1em;}#yfmfynicpc .gt_footnote_marks {font-style: italic;font-weight: normal;font-size: 75%;vertical-align: 0.4em;}#yfmfynicpc .gt_asterisk {font-size: 100%;vertical-align: 0;}#yfmfynicpc .gt_slash_mark {font-size: 0.7em;line-height: 0.7em;vertical-align: 0.15em;}#yfmfynicpc .gt_fraction_numerator {font-size: 0.6em;line-height: 0.6em;vertical-align: 0.45em;}#yfmfynicpc .gt_fraction_denominator {font-size: 0.6em;line-height: 0.6em;vertical-align: -0.05em;}</style><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">num</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_right">&lt;0.01</td></tr><tr><td class="gt_row gt_right">&lt;0.01</td></tr><tr><td class="gt_row gt_right">0.01</td></tr><tr><td class="gt_row gt_right">0.10</td></tr><tr><td class="gt_row gt_right">1.00</td></tr><tr><td class="gt_row gt_right">10.00</td></tr><tr><td class="gt_row gt_right">100.00</td></tr><tr><td class="gt_row gt_right">0.00</td></tr><tr><td class="gt_row gt_right">NA</td></tr></tbody></table></div><p><br />The <code>small_pattern</code> combines the threshold value and other literal text to generate an informative and accurate label. 
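The <code>sign</code> option seen in the signature above should, in principle, flip the targeting to small values below zero; a hedged sketch of that (using a hypothetical tibble of negative values) might look like:</p><pre class="r"><code># hypothetical data, just to illustrate sign = &quot;-&quot;
neg_tbl &lt;- dplyr::tibble(num = c(-0.0001, -0.5, -20))

neg_tbl %&gt;%
  gt() %&gt;%
  fmt_number(columns = num) %&gt;%
  sub_small_vals(sign = &quot;-&quot;)</code></pre><p>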
Here’s a more concrete example that shows how the <code>threshold</code> and <code>small_pattern</code> work together (it’s also Markdownified with <code>md()</code>, for <em>extra fun</em>).</p><pre class="r"><code>tbl %&gt;%
  gt() %&gt;%
  fmt_number(columns = num) %&gt;%
  sub_small_vals(
    threshold = 0.1,
    small_pattern = md(&quot;**Smaller** than {x}&quot;)
  )</code></pre><div id="jjmpermqwz" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><style>html {font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;}#jjmpermqwz .gt_table {display: table;border-collapse: collapse;margin-left: auto;margin-right: auto;color: #333333;font-size: 16px;font-weight: normal;font-style: normal;background-color: #FFFFFF;width: auto;border-top-style: solid;border-top-width: 2px;border-top-color: #A8A8A8;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #A8A8A8;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;}#jjmpermqwz .gt_heading {background-color: #FFFFFF;text-align: center;border-bottom-color: #FFFFFF;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#jjmpermqwz .gt_title {color: #333333;font-size: 125%;font-weight: initial;padding-top: 4px;padding-bottom: 4px;padding-left: 5px;padding-right: 5px;border-bottom-color: #FFFFFF;border-bottom-width: 0;}#jjmpermqwz .gt_subtitle {color: #333333;font-size: 85%;font-weight: initial;padding-top: 0;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;border-top-color: #FFFFFF;border-top-width: 0;}#jjmpermqwz .gt_bottom_border {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#jjmpermqwz .gt_col_headings {border-top-style: solid;border-top-width: 2px;border-top-color:
#D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#jjmpermqwz .gt_col_heading {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: normal;text-transform: inherit;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;overflow-x: hidden;}#jjmpermqwz .gt_column_spanner_outer {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: normal;text-transform: inherit;padding-top: 0;padding-bottom: 0;padding-left: 4px;padding-right: 4px;}#jjmpermqwz .gt_column_spanner_outer:first-child {padding-left: 0;}#jjmpermqwz .gt_column_spanner_outer:last-child {padding-right: 0;}#jjmpermqwz .gt_column_spanner {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 5px;overflow-x: hidden;display: inline-block;width: 100%;}#jjmpermqwz .gt_group_heading {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;}#jjmpermqwz .gt_empty_group_heading {padding: 0.5px;color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 
2px;border-bottom-color: #D3D3D3;vertical-align: middle;}#jjmpermqwz .gt_from_md > :first-child {margin-top: 0;}#jjmpermqwz .gt_from_md > :last-child {margin-bottom: 0;}#jjmpermqwz .gt_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;margin: 10px;border-top-style: solid;border-top-width: 1px;border-top-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;overflow-x: hidden;}#jjmpermqwz .gt_stub {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 5px;padding-right: 5px;}#jjmpermqwz .gt_stub_row_group {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 5px;padding-right: 5px;vertical-align: top;}#jjmpermqwz .gt_row_group_first td {border-top-width: 2px;}#jjmpermqwz .gt_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#jjmpermqwz .gt_first_summary_row {border-top-style: solid;border-top-color: #D3D3D3;}#jjmpermqwz .gt_first_summary_row.thick {border-top-width: 2px;}#jjmpermqwz .gt_last_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#jjmpermqwz .gt_grand_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#jjmpermqwz .gt_first_grand_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: double;border-top-width: 6px;border-top-color: #D3D3D3;}#jjmpermqwz 
.gt_striped {background-color: rgba(128, 128, 128, 0.05);}#jjmpermqwz .gt_table_body {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#jjmpermqwz .gt_footnotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#jjmpermqwz .gt_footnote {margin: 0px;font-size: 90%;padding-left: 4px;padding-right: 4px;padding-left: 5px;padding-right: 5px;}#jjmpermqwz .gt_sourcenotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#jjmpermqwz .gt_sourcenote {font-size: 90%;padding-top: 4px;padding-bottom: 4px;padding-left: 5px;padding-right: 5px;}#jjmpermqwz .gt_left {text-align: left;}#jjmpermqwz .gt_center {text-align: center;}#jjmpermqwz .gt_right {text-align: right;font-variant-numeric: tabular-nums;}#jjmpermqwz .gt_font_normal {font-weight: normal;}#jjmpermqwz .gt_font_bold {font-weight: bold;}#jjmpermqwz .gt_font_italic {font-style: italic;}#jjmpermqwz .gt_super {font-size: 65%;}#jjmpermqwz .gt_two_val_uncert {display: inline-block;line-height: 1em;text-align: right;font-size: 60%;vertical-align: -0.25em;margin-left: 0.1em;}#jjmpermqwz .gt_footnote_marks {font-style: italic;font-weight: normal;font-size: 75%;vertical-align: 0.4em;}#jjmpermqwz .gt_asterisk {font-size: 100%;vertical-align: 0;}#jjmpermqwz .gt_slash_mark {font-size: 0.7em;line-height: 0.7em;vertical-align: 0.15em;}#jjmpermqwz .gt_fraction_numerator {font-size: 0.6em;line-height: 0.6em;vertical-align: 0.45em;}#jjmpermqwz .gt_fraction_denominator {font-size: 
0.6em;line-height: 0.6em;vertical-align: -0.05em;}</style><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">num</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_right"><strong>Smaller</strong> than 0.1</td></tr><tr><td class="gt_row gt_right"><strong>Smaller</strong> than 0.1</td></tr><tr><td class="gt_row gt_right"><strong>Smaller</strong> than 0.1</td></tr><tr><td class="gt_row gt_right">0.10</td></tr><tr><td class="gt_row gt_right">1.00</td></tr><tr><td class="gt_row gt_right">10.00</td></tr><tr><td class="gt_row gt_right">100.00</td></tr><tr><td class="gt_row gt_right">0.00</td></tr><tr><td class="gt_row gt_right">NA</td></tr></tbody></table></div><p><br />Small and negative values can also be handled but they are handled specially by the <code>sign</code>parameter. Setting that to <code>"-"</code> will format only the small, negative values.</p><pre class="r"><code>tbl %&gt;%dplyr::mutate(num = -num) %&gt;%gt() %&gt;%fmt_number(columns = num) %&gt;%sub_small_vals(sign = &quot;-&quot;)</code></pre><div id="zejzmornes" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><style>html {font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;}#zejzmornes .gt_table {display: table;border-collapse: collapse;margin-left: auto;margin-right: auto;color: #333333;font-size: 16px;font-weight: normal;font-style: normal;background-color: #FFFFFF;width: auto;border-top-style: solid;border-top-width: 2px;border-top-color: #A8A8A8;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #A8A8A8;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;}#zejzmornes .gt_heading {background-color: #FFFFFF;text-align: center;border-bottom-color: 
#FFFFFF;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#zejzmornes .gt_title {color: #333333;font-size: 125%;font-weight: initial;padding-top: 4px;padding-bottom: 4px;padding-left: 5px;padding-right: 5px;border-bottom-color: #FFFFFF;border-bottom-width: 0;}#zejzmornes .gt_subtitle {color: #333333;font-size: 85%;font-weight: initial;padding-top: 0;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;border-top-color: #FFFFFF;border-top-width: 0;}#zejzmornes .gt_bottom_border {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#zejzmornes .gt_col_headings {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#zejzmornes .gt_col_heading {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: normal;text-transform: inherit;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;overflow-x: hidden;}#zejzmornes .gt_column_spanner_outer {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: normal;text-transform: inherit;padding-top: 0;padding-bottom: 0;padding-left: 4px;padding-right: 4px;}#zejzmornes .gt_column_spanner_outer:first-child {padding-left: 0;}#zejzmornes .gt_column_spanner_outer:last-child {padding-right: 0;}#zejzmornes .gt_column_spanner {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 5px;overflow-x: hidden;display: inline-block;width: 100%;}#zejzmornes 
.gt_group_heading {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;}#zejzmornes .gt_empty_group_heading {padding: 0.5px;color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: middle;}#zejzmornes .gt_from_md > :first-child {margin-top: 0;}#zejzmornes .gt_from_md > :last-child {margin-bottom: 0;}#zejzmornes .gt_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;margin: 10px;border-top-style: solid;border-top-width: 1px;border-top-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;overflow-x: hidden;}#zejzmornes .gt_stub {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 5px;padding-right: 5px;}#zejzmornes .gt_stub_row_group {color: #333333;background-color: #FFFFFF;font-size: 100%;font-weight: initial;text-transform: inherit;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 5px;padding-right: 5px;vertical-align: top;}#zejzmornes .gt_row_group_first td {border-top-width: 2px;}#zejzmornes .gt_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 
8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#zejzmornes .gt_first_summary_row {border-top-style: solid;border-top-color: #D3D3D3;}#zejzmornes .gt_first_summary_row.thick {border-top-width: 2px;}#zejzmornes .gt_last_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#zejzmornes .gt_grand_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#zejzmornes .gt_first_grand_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: double;border-top-width: 6px;border-top-color: #D3D3D3;}#zejzmornes .gt_striped {background-color: rgba(128, 128, 128, 0.05);}#zejzmornes .gt_table_body {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#zejzmornes .gt_footnotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#zejzmornes .gt_footnote {margin: 0px;font-size: 90%;padding-left: 4px;padding-right: 4px;padding-left: 5px;padding-right: 5px;}#zejzmornes .gt_sourcenotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#zejzmornes .gt_sourcenote {font-size: 90%;padding-top: 4px;padding-bottom: 4px;padding-left: 5px;padding-right: 5px;}#zejzmornes .gt_left {text-align: left;}#zejzmornes .gt_center {text-align: center;}#zejzmornes .gt_right {text-align: 
right;font-variant-numeric: tabular-nums;}#zejzmornes .gt_font_normal {font-weight: normal;}#zejzmornes .gt_font_bold {font-weight: bold;}#zejzmornes .gt_font_italic {font-style: italic;}#zejzmornes .gt_super {font-size: 65%;}#zejzmornes .gt_two_val_uncert {display: inline-block;line-height: 1em;text-align: right;font-size: 60%;vertical-align: -0.25em;margin-left: 0.1em;}#zejzmornes .gt_footnote_marks {font-style: italic;font-weight: normal;font-size: 75%;vertical-align: 0.4em;}#zejzmornes .gt_asterisk {font-size: 100%;vertical-align: 0;}#zejzmornes .gt_slash_mark {font-size: 0.7em;line-height: 0.7em;vertical-align: 0.15em;}#zejzmornes .gt_fraction_numerator {font-size: 0.6em;line-height: 0.6em;vertical-align: 0.45em;}#zejzmornes .gt_fraction_denominator {font-size: 0.6em;line-height: 0.6em;vertical-align: -0.05em;}</style><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">num</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_right">&lt;<em>abs</em>(-0.01)</td></tr><tr><td class="gt_row gt_right">&lt;<em>abs</em>(-0.01)</td></tr><tr><td class="gt_row gt_right">&minus;0.01</td></tr><tr><td class="gt_row gt_right">&minus;0.10</td></tr><tr><td class="gt_row gt_right">&minus;1.00</td></tr><tr><td class="gt_row gt_right">&minus;10.00</td></tr><tr><td class="gt_row gt_right">&minus;100.00</td></tr><tr><td class="gt_row gt_right">0.00</td></tr><tr><td class="gt_row gt_right">NA</td></tr></tbody></table></div><p><br />You don’t have to settle with the default <code>threshold</code> value or the default replacement pattern(in <code>small_pattern</code>). 
This can be changed, and the <code>"{x}"</code> in <code>small_pattern</code> (which uses the <code>threshold</code> value) can even be omitted.</p><pre class="r"><code>tbl %&gt;%
  gt() %&gt;%
  fmt_number(columns = num) %&gt;%
  sub_small_vals(
    threshold = 0.0005,
    small_pattern = &quot;smol&quot;
  )</code></pre><div id="libfdovpmi" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table
class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">num</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_right">smol</td></tr><tr><td class="gt_row gt_right">0.00</td></tr><tr><td class="gt_row gt_right">0.01</td></tr><tr><td class="gt_row gt_right">0.10</td></tr><tr><td class="gt_row gt_right">1.00</td></tr><tr><td class="gt_row gt_right">10.00</td></tr><tr><td class="gt_row gt_right">100.00</td></tr><tr><td class="gt_row gt_right">0.00</td></tr><tr><td class="gt_row gt_right">NA</td></tr></tbody></table></div><p><br /></p><h3><code>sub_large_vals()</code></h3><p>Okay, there’s one more substitution function to cover, and this one’s for all the large values in your table: <code>sub_large_vals()</code>. With it you can substitute values in the table body that you might consider <em>too large</em>.</p><pre class="r"><code>sub_large_vals(
  data,
  columns = everything(),
  rows = everything(),
  threshold = 1E12,
  large_pattern = &quot;&gt;={x}&quot;,
  sign = &quot;+&quot;
)</code></pre><p>Let’s generate a simple, single-column tibble that contains an assortment of values that could potentially undergo some substitution.</p><pre class="r"><code>tbl &lt;- dplyr::tibble(num = c(0, NA, 10^(8:14)))
tbl</code></pre><pre><code>## # A tibble: 9 × 1
##     num
##   &lt;dbl&gt;
## 1  0
## 2 NA
## 3  1e 8
## 4  1e 9
## 5  1e10
## 6  1e11
## 7  1e12
## 8  1e13
## 9  1e14</code></pre><p>The <code>tbl</code> contains some really large numbers, and some might be big enough to reformat against a threshold value (the default <code>threshold</code> is <code>1E12</code>). Here’s how it’s done with <code>sub_large_vals()</code>.</p><pre class="r"><code>tbl %&gt;%
  gt() %&gt;%
  fmt_number(columns = num) %&gt;%
  sub_large_vals()</code></pre><div id="ivmqrctmyo" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1"
colspan="1">num</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_right">0.00</td></tr><tr><td class="gt_row gt_right">NA</td></tr><tr><td class="gt_row gt_right">100,000,000.00</td></tr><tr><td class="gt_row gt_right">1,000,000,000.00</td></tr><tr><td class="gt_row gt_right">10,000,000,000.00</td></tr><tr><td class="gt_row gt_right">100,000,000,000.00</td></tr><tr><td class="gt_row gt_right">≥1e+12</td></tr><tr><td class="gt_row gt_right">≥1e+12</td></tr><tr><td class="gt_row gt_right">≥1e+12</td></tr></tbody></table></div><p><br />Large negative values can also be handled, but they are treated specially by the <code>sign</code> parameter. Setting that to <code>"-"</code> will format only the large values that are negative. Notice that with the default <code>large_pattern</code> value of <code>"&gt;={x}"</code> the <code>"&gt;="</code> is automatically changed to <code>"&lt;="</code>.</p><pre class="r"><code>tbl %&gt;%
  dplyr::mutate(num = -num) %&gt;%
  gt() %&gt;%
  fmt_number(columns = num) %&gt;%
  sub_large_vals(sign = &quot;-&quot;)</code></pre><div id="wvcxxyputp" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">num</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_right">0.00</td></tr><tr><td class="gt_row gt_right">NA</td></tr><tr><td class="gt_row gt_right">&minus;100,000,000.00</td></tr><tr><td class="gt_row gt_right">&minus;1,000,000,000.00</td></tr><tr><td class="gt_row gt_right">&minus;10,000,000,000.00</td></tr><tr><td class="gt_row gt_right">&minus;100,000,000,000.00</td></tr><tr><td class="gt_row gt_right">≤-1e+12</td></tr><tr><td class="gt_row gt_right">≤-1e+12</td></tr><tr><td class="gt_row gt_right">≤-1e+12</td></tr></tbody></table></div><p><br />You don’t have to settle for the default <code>threshold</code> value or the default replacement pattern (in <code>large_pattern</code>).
Both can be changed, and the <code>"{x}"</code> in <code>large_pattern</code> (which inserts the <code>threshold</code> value) can even be omitted.</p><pre class="r"><code>tbl %&gt;%
  gt() %&gt;%
  fmt_number(columns = num) %&gt;%
  sub_large_vals(threshold = 5E10, large_pattern = &quot;hugemongous&quot;)</code></pre><div id="zksdokqaop" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table
class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">num</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_right">0.00</td></tr><tr><td class="gt_row gt_right">NA</td></tr><tr><td class="gt_row gt_right">100,000,000.00</td></tr><tr><td class="gt_row gt_right">1,000,000,000.00</td></tr><tr><td class="gt_row gt_right">10,000,000,000.00</td></tr><tr><td class="gt_row gt_right">hugemongous</td></tr><tr><td class="gt_row gt_right">hugemongous</td></tr><tr><td class="gt_row gt_right">hugemongous</td></tr><tr><td class="gt_row gt_right">hugemongous</td></tr></tbody></table></div><p><br /></p><h2>Wrapping Up</h2><p>We are always working to improve the {gt} package with a mix of big features (for example, improved rendering and new families of functions) and numerous small refinements (improving existing functions, clarifying documentation, and so on). We hope the additions in {gt} <code>0.6.0</code> improve how you create and present summary tables in R. If there are features you <em>really</em> want, feel free to <a href="https://github.com/rstudio/gt/issues">file an issue</a> or share your ideas on the <a href="https://github.com/rstudio/gt/discussions"><em>Discussions</em> page</a>!</p></description></item><item><title>Announcing vetiver for MLOps in R and Python</title><link>https://www.rstudio.com/blog/announce-vetiver/</link><pubDate>Thu, 09 Jun 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announce-vetiver/</guid><description><p>We are thrilled to announce the release of <a href="https://vetiver.rstudio.com/">vetiver</a>, a framework for MLOps tasks in R and Python.
The goal of vetiver is to provide fluent tooling to <strong>version</strong>, <strong>share</strong>, <strong>deploy</strong>, and <strong>monitor</strong> a trained model.</p><p>Data scientists have open source tools that they love using to prepare data for modeling and train models, but there is a lack of fluent open source tooling for MLOps tasks like putting a model in production or monitoring model performance. Using vetiver for MLOps lets you use the tools you are comfortable with for exploratory data analysis and model training/tuning, and provides a flexible framework for the parts of a model lifecycle not served as well by current approaches.</p><p>As of today, the vetiver framework supports models trained via <a href="https://scikit-learn.org/">scikit-learn</a>, <a href="https://pytorch.org/">PyTorch</a>, <a href="https://www.tidymodels.org/">tidymodels</a>, <a href="https://topepo.github.io/caret/">caret</a>, <a href="https://mlr3.mlr-org.com/">mlr3</a>, <a href="https://xgboost.readthedocs.io/en/latest/R-package/">XGBoost</a>, <a href="https://cran.r-project.org/package=ranger">ranger</a>, <a href="https://stat.ethz.ch/R-manual/R-patched/library/stats/html/lm.html"><code>lm()</code></a>, and <a href="https://stat.ethz.ch/R-manual/R-patched/library/stats/html/glm.html"><code>glm()</code></a>. 
We are interested in which other modeling frameworks we should support, so please let us know what you would like to use vetiver with!</p><h2>Getting started</h2><p>You can install the released version of vetiver for R from <a href="https://cran.r-project.org/package=vetiver">CRAN</a>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">vetiver&#34;</span>)</code></pre></div><p>You can install the released version of vetiver for Python from <a href="https://pypi.org/project/vetiver/">PyPI</a>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-python" data-lang="python">pip install vetiver</code></pre></div><p>See our documentation for more on how to:</p><ul><li><a href="https://vetiver.rstudio.com/get-started/">create a deployable vetiver model</a></li><li><a href="https://vetiver.rstudio.com/get-started/version.html">publish and version your model</a></li><li><a href="https://vetiver.rstudio.com/get-started/deploy.html">deploy your model as a REST API</a></li></ul><h2>Why use vetiver?</h2><p>The vetiver framework for MLOps tasks is built for data science teams that use R and/or Python, with a native, fluent experience for both. We especially had &ldquo;bilingual&rdquo; data science teams in mind as we designed vetiver&rsquo;s approach, enabling teams that use both languages (or an individual who uses both) to deploy models with consistent and unified practices.</p><p>The vetiver framework provides data scientists with a first deployment experience that is as painless as possible, while being flexible and extensible for more advanced users.
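</p><p>As a minimal sketch of what that workflow can look like in R (the model, pin name, and temporary board below are illustrative, not from the announcement; see the linked documentation for complete guides):</p><pre class="r"><code>library(vetiver)
library(pins)
library(plumber)

# train a model with the tools you already use
cars_lm &lt;- lm(mpg ~ wt + disp, data = mtcars)

# bundle it as a deployable vetiver model (records the input data prototype)
v &lt;- vetiver_model(cars_lm, &quot;cars_mpg&quot;)

# version and share the model on a pins board
board &lt;- board_temp(versioned = TRUE)
vetiver_pin_write(board, v)

# serve the model as a REST API; pipe into pr_run() to start serving
pr() %&gt;% vetiver_api(v)

# once deployed, predict against the endpoint as if the model were local:
# endpoint &lt;- vetiver_endpoint(&quot;http://127.0.0.1:8080/predict&quot;)
# predict(endpoint, mtcars[1:3, c(&quot;wt&quot;, &quot;disp&quot;)])</code></pre><p>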
At RStudio, we have experienced how tools that are built to help beginners succeed and do the &ldquo;right thing&rdquo; are also typically good tools for data practitioners as they mature and advance. In vetiver specifically, functions handle both recording and checking the model’s input data prototype, to avoid common failure modes when deploying models. Other functions support predicting from a remote API endpoint so that you can treat a deployed model much the same as a local R or Python model in memory.</p><h2>Get in touch</h2><p>We are so happy to release vetiver for R and Python, and we want to know how to make it better. Join our discussion on RStudio Community to chat with us about deploying your models, and let us know what you would like to see from vetiver!</p></description></item><item><title>The Critical Shift to Data in the Finance Industry</title><link>https://www.rstudio.com/blog/the-critical-shift-to-data-in-the-finance-industry/</link><pubDate>Wed, 08 Jun 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/the-critical-shift-to-data-in-the-finance-industry/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@seanpollock?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Sean Pollock</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup></p><div class="lt-gray-box">This is a guest post from Jo Ann Stadtmueller, SVP, Commercial & Product Strategy at Data Society.
<a href="https://datasociety.com/?utm_source=RStudio&utm_medium=blog&utm_campaign=Data+Science+in+Finance&utm_id=Data+Shift+Finance+RStudio" target = "_blank">Data Society</a> provides customized, industry-tailored data science training programs and AI/ML solutions for enterprise and government agencies to educate, equip, and empower their workforce.</div><p><a href="https://datasociety.com/?utm_source=RStudio&utm_medium=blog&utm_campaign=Data+Science+in+Finance&utm_id=Data+Shift+Finance+RStudio" target = "_blank">Data Society</a> was delighted by RStudio’s invitation to participate in the <a href="https://www.youtube.com/watch?v=zikxpOoEcLk&feature=emb_imp_woyt" target = "_blank">February meetup series</a>. Our company’s co-founders, Chief Executive Officer Merav Yuravlivker and Chief Solutions Officer Dmitri Adler, enjoyed sharing their insights into industry trends for the RStudio Finance Meetup.</p><p>This subject is dear to Data Society’s team, which has worked extensively on data science solutions and training of particular significance to the financial services industry. Our founders’ presentation highlighted some of the growing challenges that we have helped our clients in financial services navigate with data science and that we believe will continue to demand attention.</p><h2 id="fraud-detection">Fraud Detection</h2><p>Financial crime remains an escalating threat. Failure to effectively monitor transactions, detect potentially fraudulent activity, and report accordingly can be detrimental both commercially and as a compliance concern. Therefore, as new forms of fraud evolve, so must technologies capable of capturing activities that warrant alerts. R and Python are essential tools to facilitate the <a href="https://datasociety.com/the-critical-role-of-data-science-in-detecting-potential-financial-crime/" target = "_blank">optimization of transaction monitoring systems</a> that generate tailored alerts based on specified parameters. 
In addition, they enable data scientists to perform independent analysis of transaction data for anomalies that may indicate fraudulent activity.</p><h2 id="risk-assessment">Risk Assessment</h2><p>Risk assessment is an additional challenge in the financial services industry—which, like fraud, also has compliance implications. Technology offers powerful solutions, like RStudio’s toolchain, to ensure your work is scalable, secure, and auditable. At the same time, data science-based tools that can automate the complex risk evaluation process also demand a thorough understanding of the data and mechanisms that drive the output. For example, we developed a machine learning application in R for the <a href="https://datasociety.com/case-study/idb-infrastructure-risk-mitigation/" target = "_blank">Inter-American Development Bank</a> to assess risk associated with infrastructure projects.</p><p>However, financial services professionals can be <a href="https://datasociety.com/rethinking-risk-management-in-financial-services-with-data-science/" target = "_blank">vulnerable to misleading reporting and inaccurate conclusions</a> without proper training. We encourage clients to develop a thorough understanding of the analytical processes and input that produce the output of automated risk assessments, as well as proficiency in the key data science tools used to produce them.</p><p>We also recently introduced <a href="https://dsmarketing.datasociety.com/camelsback-download" target = "_blank">Camelsback</a>, a financial risk assessment tool based on the <a href="https://www.fdic.gov/news/press-releases/2021/pr21091.html" target = "_blank">award-winning risk evaluation framework</a> we developed in partnership with Google for the FDIC’s Resilience Tech Sprint.
Named after the CAMELS (Capital adequacy, Asset quality, Management, Earnings, Liquidity, and Sensitivity) system, which banking regulators use for rating financial institutions, this Python-based AI engine incorporates data from internal and external sources. It enables managers to calibrate risk weights to generate risk scores from a broad base of information and context.</p><h2 id="non-traditional-data">Non-Traditional Data</h2><p>The introduction of external data in tools such as Camelsback points to an additional industry trend: the increasing reliance on data from non-traditional sources to improve outcomes including risk assessment and fraud detection. Applications that leverage documentation such as market reports, news articles, and social media posts introduce new analytical dimensions. R and Python provide extensive libraries for natural language processing, enabling financial institutions to access a wealth of insights from text data through techniques such as sentiment analysis, topic modeling, and text classification. The capacity to gather and process less structured, more qualitative data seems likely only to gain importance in many areas of financial services analytics.</p><h2 id="environmental-social-and-governance-reporting">Environmental, Social, and Governance Reporting</h2><p>In this climate, data science has become a valuable tool for financial institutions striving to exploit previously untapped reserves of informative data. An additional trend related to the mining of non-traditional data is the <a href="https://www.thomsonreuters.com/en-us/posts/news-and-media/esg-regulations-financial-firms/" target = "_blank">movement toward Environmental, Social, and Governance (ESG) disclosure</a>. Although ESG regulations are still nascent in the US, there is momentum behind financial institutions accounting for variables such as ecological impact, workforce practices, and equity when assessing risk and making lending decisions.
Like emerging risk analysis and fraud detection processes, measuring ESG requires understanding of the types and sources of information relevant to such outcomes and how to quantify the data to be mined from them.</p><h2 id="conclusion">Conclusion</h2><p>In a world in constant motion, the financial services industry must be ever-responsive to fresh demands related to monitoring, analytics, assessment, compliance, and reporting. Data science tools and methodologies play critical roles in helping financial institutions access and leverage the data most relevant to these evolving challenges. However, increased reliance on data science solutions should be accompanied by increased investment in data science training to ensure their output’s integrity, accuracy, and interpretability.</p><p>Watch Dmitri Adler and Merav Yuravlivker&rsquo;s webinar, The Shift to Data: Industry Trends in Finance, below:</p><center><iframe width="560" height="315" src="https://www.youtube.com/embed/zikxpOoEcLk" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center></description></item><item><title>Grow Your Data Science Skills With Academy</title><link>https://www.rstudio.com/blog/grow-your-data-science-skills-with-academy/</link><pubDate>Mon, 06 Jun 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/grow-your-data-science-skills-with-academy/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@corina?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">corina ardeleanu</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup></p><p>We recently announced Academy, a new educational experience that my colleagues and I have been building for teams to learn data science in the most efficient way possible. 
If you&rsquo;ve been meaning to learn R, you can try Academy for yourself as an individual enrollee through this summer&rsquo;s <a href="https://www.rstudio.com/conference/2022/workshops/intro-to-tidyverse/" target = "_blank">rstudio::conf workshop, Introduction to the Tidyverse</a>.</p><p>Academy gets all learners up and running with data science quickly – whether you&rsquo;re new to R or new to programming. I&rsquo;m a data science educator on the team, and for the past couple of years we&rsquo;ve been working on designing custom curricula, mentoring data scientists, and building a premier learning platform. I&rsquo;d like to share what I think makes this training experience the way learning <em>should be</em>.</p><p>Consider learning a language, whether spoken or coded. To master it, you need:</p><ul><li><p><strong>Real-world practice.</strong> This has to happen from the start. You need to <em>apply</em> facts to develop a lasting skill. This means working on tasks that encourage you to experiment, make mistakes, and then course-correct.</p></li><li><p>For a practice regimen to be successful, you need a system in place that allows you to get <strong>frequent, meaningful feedback</strong> so that the challenges you tackle are fruitful and not overwhelming.</p></li><li><p>And lastly, if you want any of the data science skills that you learn to stick, you have to ensure that the cycle of real-world <strong>practice and feedback happens repeatedly and consistently</strong> over a long period of time.
If you don&rsquo;t use it, you lose it!</p></li></ul><p>Academy brings the best of these elements together into a single training experience that more closely resembles a data science <strong>apprenticeship.</strong> Here are the key pillars:</p><ol><li><strong>Projects</strong></li></ol><center><img src="images/image01.png" alt="Tidy analysis workflow with an envelope drawing for import, a table drawing for wrangle, a graph drawing for visualize, and a report drawing for publish" width="50%"></center><p>Everything in Academy is centered on applying new skills to a data science <strong>project</strong>. As a learner, your project is built to be relevant to your specific industry, and it&rsquo;s designed to teach you the very skills that you&rsquo;ll need in your day-to-day work. Each project walks you through a full cycle of data analysis – from importing the data to publishing a final report of your findings.</p><p>How do you learn what you need to complete the project? This brings us to…</p><ol start="2"><li><strong>Lessons</strong></li></ol><p>We have written a library of interactive lessons that cover data science fundamentals. You&rsquo;ll work through a set of lessons each week asynchronously, completing dynamic exercises along the way. Each exercise has customized feedback, specific to your answers. Going through Academy lessons feels more like a conversation – you explore, experiment with code, and then get explanations and next steps to try.</p><ol start="3"><li><strong>Project milestones</strong></li></ol><p>After completing lessons, you&rsquo;re ready to take on the project milestone. Milestones are bite-sized pieces of a larger data science project. They contain an artifact, like a plot or table that&rsquo;s been made using the project data.</p><p>You&rsquo;ll get one week to apply the skills you&rsquo;ve learned in lessons and recreate the milestone. 
But you have a second milestone task, too – you&rsquo;ll extend the project milestone by modifying it or adding to it, often using something new about R that you&rsquo;ve explored on your own – i.e., using something <em>not</em> in the lessons. This part is especially important because it&rsquo;s how you practice teaching yourself new R skills – a skill in and of itself that must be learned to become a competent coder.</p><p>Each week you&rsquo;ll receive new milestones that build on each other and add to your data science repertoire.</p><p>Milestones can be open-ended assignments, and you might think that this could run the risk of overwhelming the beginner data scientist. To remedy this, we provide the next piece of Academy.</p><ol start="4"><li><strong>Mentor access</strong></li></ol><center><img src="images/image02.png" alt="A person with their arms in front of their body holding a heart with the RStudio logo on it" width="50%"></center><p>In order to ensure that you feel well-supported and guided throughout the open-ended aspects of the project (and throughout the apprenticeship, in general), you&rsquo;ll have the opportunity to meet weekly with an Academy mentor over a Zoom call. The mentor provides tailored advice about what you&rsquo;re doing well or what you could improve &mdash; much like a personal trainer would.</p><p>But this is only one of the sources of support you receive…</p><ol start="5"><li><strong>Group sessions</strong></li></ol><img src="images/image03.png" alt="A person with a dog next to a person sitting cross legged" style="width:100%"><p>You&rsquo;ll also have accountability and social support from peer groups at group sessions. As an Academy student, you go through the apprenticeship in small groups, usually 5-7 peers from your company.</p><p>Each week, you all meet to take turns presenting your project work and customizations from the week. This is an opportunity to learn new things and get feedback from one another. 
This small-group setting makes the learning experience more personal, social, and unique.</p><ol start="6"><li><strong>Daily Practice</strong></li></ol><center><img src="images/image04.png" alt="A drawing of a laptop with the RStudio logo on the screen" style="width:50%"></center><p>The final pillar of Academy is daily practice. To keep up the regular cycle of practice and feedback to build long-lasting skills, we are developing an extensive library of practice drills. And we make it easy to squeeze in some practice and regularly drill the concepts you most need to review.</p><p>All of these pieces are what make Academy a tailored and effective training experience for learning data science.</p><p>How can you try out Academy? If you&rsquo;re part of a corporate team, you can get in touch by visiting our <a href="https://www.rstudio.com/academy/" target = "_blank">homepage</a>. But if you&rsquo;re an individual who wants to skill up, there&rsquo;s currently only one way to participate: this year, we&rsquo;re offering a special opportunity to enroll through the <a href="https://www.rstudio.com/conference/2022/workshops/intro-to-tidyverse/" target = "_blank">rstudio::conf(2022) workshops</a>. The Introduction to the Tidyverse workshop will be taught as a 6-week Academy course, with two in-person days during the conference.</p><p>It&rsquo;s an exciting time to be learning data science and we&rsquo;re excited that Academy can now be a part of your coding journey! 
Hope to work with you soon!</p></description></item><item><title>Announcing pins for Python</title><link>https://www.rstudio.com/blog/pins-for-python/</link><pubDate>Thu, 02 Jun 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pins-for-python/</guid><description><p>We’re excited to announce the release of <a href="https://rstudio.github.io/pins-python/">pins</a> for Python!</p><p>pins removes the hassle of managing data across projects, colleagues, and teams by providing a central place for people to store, version and retrieve data. If you’ve ever chased a CSV through a series of email exchanges, or had to decide between <code>data-final.csv</code> and <code>data-final-final.csv</code>, then pins is for you.</p><p>pins stores data on a <strong>board</strong>, which can be a local folder, or on RStudio Connect or a cloud provider like Amazon S3. Each individual object (such as a dataframe, model, or another pickle-able Python object), together with some metadata, is called a pin.</p><p>The Python pins library works with its <a href="https://pins.rstudio.com/">R counterpart</a>, so that teams working across R and Python have a unified strategy for sharing data. This work emerged as part of RStudio’s investment in Python open source, in order to support bilingual data science teams.</p><h2>Getting Started</h2><p>The first step to using pins is installing it from PyPI.</p><pre class="shell"><code>python -m pip install pins</code></pre><p>In the examples below, I’ll walk through the basics of pins using a temporary directory for a board, with <code>board_temp()</code>. This gets deleted after you close Python, so it is not ideal for collaboration! 
You can use other boards, like <code>board_rsconnect()</code>, <code>board_folder()</code>, and <code>board_s3()</code>, in more realistic settings.</p><pre class="python"><code>import pins
from pins.data import mtcars

board = pins.board_temp()</code></pre><p>You can “pin” (save) data to a board with the <code>.pin_write()</code> method. It requires three arguments: an object, a name, and a pin type:</p><pre class="python"><code>board.pin_write(mtcars.head(), &quot;mtcars&quot;, type=&quot;csv&quot;)
#&gt; Meta(title=&#39;mtcars: a pinned 5 x 11 DataFrame&#39;, description=None, created=&#39;20220601T175057Z&#39;, pin_hash=&#39;120a54f7e0818041&#39;, file=&#39;mtcars.csv&#39;, file_size=249, type=&#39;csv&#39;, api_version=1, version=Version(created=datetime.datetime(2022, 6, 1, 17, 50, 57, 80318), hash=&#39;120a54f7e0818041&#39;), name=&#39;mtcars&#39;, user={})
#&gt;
#&gt; Writing to pin &#39;mtcars&#39;</code></pre><p>Above, we saved the data as a CSV, but depending on what you’re saving and who else you want to read it, you might use the <code>type</code> argument to instead save it as a <code>feather</code>, <code>parquet</code>, or <code>joblib</code> file.</p><p>You can later retrieve the pinned data with <code>.pin_read()</code>:</p><pre class="python"><code>board.pin_read(&quot;mtcars&quot;)
#&gt;     mpg  cyl   disp   hp  drat     wt   qsec  vs  am  gear  carb
#&gt; 0  21.0    6  160.0  110  3.90  2.620  16.46   0   1     4     4
#&gt; 1  21.0    6  160.0  110  3.90  2.875  17.02   0   1     4     4
#&gt; 2  22.8    4  108.0   93  3.85  2.320  18.61   1   1     4     1
#&gt; 3  21.4    6  258.0  110  3.08  3.215  19.44   1   0     3     1
#&gt; 4  18.7    8  360.0  175  3.15  3.440  17.02   0   0     3     2</code></pre><p>You can search for data using <code>.pin_search()</code> and <code>.pin_list()</code>.</p><pre class="python"><code># prints out a list of all pins
# board.pin_list()

# searches for pins containing &quot;cars&quot;
board.pin_search(&quot;cars&quot;)
#&gt;      name type  ...  file_size  meta
#&gt; 0  mtcars  csv  ...        249  Meta(title=&#39;mtcars: a pinned 5 x 11 DataFrame&#39;...
#&gt;
#&gt; [1 rows x 6 columns]</code></pre><p>Two more pieces of important functionality exist:</p><ul><li><code>.pin_write()</code> won’t delete existing data, but versions your data.</li><li><code>.pin_read()</code> caches your data, so subsequent reads are much faster.</li></ul><p>See <a href="https://rstudio.github.io/pins-python/getting_started.html">getting started</a> in the pins documentation for more information.</p><h2>Interoperability with R pins</h2><p>Pins stored with Python can be read with R, and vice-versa.</p><p>For example, here is R code that reads the <code>mtcars</code> pin we wrote to the board above. Note that <code>TEMP_PATH</code> refers to the temporary directory we created in this blog post for our Python board.</p><pre class="r"><code>library(pins)
board &lt;- board_folder(TEMP_PATH)

board %&gt;% pin_read(&quot;mtcars&quot;)
#&gt;    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#&gt; 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#&gt; 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#&gt; 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#&gt; 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#&gt; 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2</code></pre><p>This is especially useful when colleagues prefer one language over the other. 
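<p>The versioning behavior noted earlier (that <code>.pin_write()</code> never deletes existing data) deserves a closer look. Below is a minimal stdlib sketch of the underlying idea: every write lands in its own version directory, named by a counter plus a short content hash, so earlier copies survive. The <code>ToyBoard</code> class and its file layout are invented purely for illustration; this is not the actual pins implementation, which records timestamps and richer metadata.</p>

```python
import hashlib
import json
import pathlib
import tempfile

class ToyBoard:
    """A toy, versioned pin board (illustration only, not the pins implementation)."""

    def __init__(self, root):
        self.root = pathlib.Path(root)

    def pin_write(self, name, text):
        # Each write creates a fresh version directory named by a running
        # counter plus a short content hash, so nothing is overwritten.
        digest = hashlib.sha256(text.encode()).hexdigest()[:16]
        pin_dir = self.root / name
        pin_dir.mkdir(parents=True, exist_ok=True)
        n = len(list(pin_dir.iterdir()))
        version_dir = pin_dir / f"{n:05d}-{digest}"
        version_dir.mkdir()
        (version_dir / "data.txt").write_text(text)
        (version_dir / "meta.json").write_text(json.dumps({"pin_hash": digest}))

    def pin_versions(self, name):
        # All versions remain on disk, oldest first.
        return sorted(p.name for p in (self.root / name).iterdir())

    def pin_read(self, name):
        # Reads the latest version; older versions stay available.
        latest = sorted((self.root / name).iterdir())[-1]
        return (latest / "data.txt").read_text()

with tempfile.TemporaryDirectory() as tmp:
    board = ToyBoard(tmp)
    board.pin_write("mtcars", "v1 of the data")
    board.pin_write("mtcars", "v2 of the data")
    n_versions = len(board.pin_versions("mtcars"))
    latest = board.pin_read("mtcars")

print(n_versions, latest)  # 2 v2 of the data
```

<p>pins itself exposes similar functionality for listing and retrieving older versions of a pin; see the pins documentation for the actual API rather than treating this sketch as its interface.</p>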
For real cross-language collaboration, you would use a board like <code>board_rsconnect()</code> or <code>board_s3()</code>.</p><h2>Going further</h2><p>The real power of pins comes when you share a board with multiple people. To get started, you can use <code>board_folder()</code> with a directory on a shared drive or in Dropbox, or if you use <a href="https://www.rstudio.com/products/connect/">RStudio Connect</a> you can use <code>board_rsconnect()</code>:</p><pre class="python"><code>board = pins.board_rsconnect()
board.pin_write(tidy_sales_data, &quot;michael/sales-summary&quot;, type=&quot;csv&quot;)</code></pre><p>Then, someone else (or an automated report) can read and use your pin:</p><pre class="python"><code>board = pins.board_rsconnect()
board.pin_read(&quot;michael/sales-summary&quot;)</code></pre><p>The pins package also includes boards that allow you to share data on services like Amazon’s S3 (<code>board_s3()</code>), with plans to support other backends such as Google Cloud Storage and Azure’s blob storage.</p><h2>Get in touch</h2><p>We are so happy to release pins for Python, and we want to make sure it supports your workflow. Join our discussion on RStudio Community to let us know what you’re working on, and how pins could help!</p></description></item><item><title>RStudio Community Monthly Events Roundup - June 2022</title><link>https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-june-2022/</link><pubDate>Wed, 01 Jun 2022 03:00:00 -0600</pubDate><guid>https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-june-2022/</guid><description><p>Welcome to RStudio Community Monthly Events Roundup, where we update you on upcoming virtual events happening at RStudio this month. Missed the great talks and presentations from last month? 
Find them listed under <a href="#icymi-may-2022-events">ICYMI: May 2022 Events</a>.</p><p>You can <a href="https://www.addevent.com/calendar/wT379734" target = "_blank">subscribe</a> to the Community Events Calendar so that new events will automatically appear on your calendar. Please note that by subscribing, you will add all of the events in the calendar to your own calendar. If you wish to add individual events instead, please use the links below.</p><p>We can’t wait to see you there!</p><h2 id="save-the-date">Save the Date</h2><ul><li>June 2, 2022 at 12 ET: Data Science Hangout with Travis Gerke, Director of Data Science at PCCTC (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>June 7, 2022 at 12 ET: Making microservices a part of your data science team | Led by Tom Schenk &amp; Bejan Sadeghian at KPMG (<a href="https://evt.to/aeshgidow" target = "_blank">add to calendar</a>)</li><li>June 9, 2022 at 12 ET: Data Science Hangout with Tanya Cashorali, CEO and Founder at TCB Analytics (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>June 14, 2022 at 12 ET: RStudio Healthcare Meetup: Translating facts into insights at Children&rsquo;s Hospital of Philadelphia | Led by Jake Riley (<a href="https://www.addevent.com/event/Du13258557" target = "_blank">add to calendar</a>)</li><li>June 16, 2022 at 12 ET: Data Science Hangout with David Meza, AIML R&amp;D Lead, People Analytics at NASA (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>June 21, 2022 at 12 ET: Enabling Citizen Data Scientists with RStudio Academy | Led by James Wade, Dow Chemical (<a href="https://www.addevent.com/event/Yc13364359" target = "_blank">add to calendar</a>)</li><li>June 23, 2022 at 12 ET: Data Science Hangout with Alec Campanini, Senior Manager II, Omni MerchOps Innovation: Assortment &amp; Space Analytics at Walmart (<a 
href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>June 28, 2022 at 12 ET: RStudio Sports Analytics Meetup: SportsDataverse Initiative | Led by Saiem Gilani, Houston Rockets (<a href="http://rstd.io/sports-meetup" target = "_blank">add to calendar</a>)</li><li>June 30, 2022 at 12 ET: Data Science Hangout with Rebecca Hadi, Head of Data Science at Lyn Health (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li></ul><h2 id="recurring-events">Recurring Events</h2><h3 id="data-science-hangout">Data Science Hangout</h3><p>The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders to connect and chat about some of the more human-centric questions around data science leadership. These happen every Thursday at 12 ET with a different leader featured each week.</p><p>The conversation is all audience-based and you can jump on whenever it fits your schedule. Add the weekly hangouts <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">to your calendar</a> and check out the <a href="https://www.rstudio.com/data-science-hangout/" target = "_blank">website</a> with all the recordings.</p><p>A few other things:</p><ul><li>All are welcome - no matter your industry/experience</li><li>No need to register for anything</li><li>It&rsquo;s always okay to join for part of a session</li><li>You can just listen-in if you want</li><li>You can ask anonymous questions too!</li></ul><h3 id="rstudio-enterprise-community-meetups">RStudio Enterprise Community Meetups</h3><p>We also host industry meetups for teams to share the work they are doing within their organizations, teach lessons learned, and network with others. 
Join the group on <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target = "_blank">Meetup</a>.</p><h2 id="icymi-may-2022-events">ICYMI: May 2022 Events</h2><ul><li>May 3, 2022 at 4 ET: <a href="https://youtu.be/eKWXvXf0kwo" target = "_blank">Shiny modularization, Leaflet for R and Leaflet JS extensions with Epi-interactive</a> | Led by Dr Uli Muellner and Nick Snellgrove at Epi-interactive (coming soon)</li><li>May 5, 2022 at 12 ET: <a href="https://youtu.be/KubOBhiRfIY" target = "_blank">Data Science Hangout with Michael Chow</a>, Data Scientist &amp; Software Engineer at RStudio</li><li>May 12, 2022 at 12 ET: <a href="https://youtu.be/iKj6SK9Wvos" target = "_blank">Data Science Hangout with Wayne Jones</a>, Principal Data Scientist at Shell</li><li>May 17, 2022 at 12 ET: <a href="https://youtu.be/RBVqKi3FV30" target = "_blank">R for Clinical Study Reports &amp; Submission</a> | Led by Yilong Zhang, PhD at Meta</li><li>May 19, 2022 at 12 ET: <a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oTu3bUoyYknD-vpR7Uq6bsR" target = "_blank">Data Science Hangout with Lindsey Clark</a>, Director of Data Science at Healthcare Bluebook</li><li>May 25, 2022 at 12 ET: <a href="https://youtu.be/mgCQZmJdQaI" target = "_blank">Optimizing Shiny for enterprise-grade apps</a> | Led by Veerle Van Leemput at Analytic Health</li><li>May 26, 2022 at 12 ET: <a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oTu3bUoyYknD-vpR7Uq6bsR" target = "_blank">Data Science Hangout with Alice Walsh</a>, VP, Translational Research at Pathos</li><li>June 1, 2022 at 12 ET: <a href="https://youtu.be/o36425S1-VU" target = "_blank">Using Python with RStudio Team</a> | Led by David Aja, RStudio</li></ul><h2 id="call-for-speakers">Call for Speakers</h2><p>If you’re interested in sharing your work at a Meetup (or just starting to consider it for a future date down the road!), <a href="https://forms.gle/EtXMpSoTfhpGopiS8" target = "_blank">please fill out the speaker 
submission form</a>. We’re always looking for a diverse set of speakers — if you are a member of a group that is underrepresented in data science, including people of color, those with disabilities, elders/older adults, LGBTQ folks, and women/minority genders, we highly encourage you to submit!</p></description></item><item><title>Announcing RStudio for Microsoft Azure ML</title><link>https://www.rstudio.com/blog/announcing-rstudio-for-azure-ml/</link><pubDate>Wed, 01 Jun 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rstudio-for-azure-ml/</guid><description><h2 id="machine-learning-development-in-the-cloud">Machine Learning Development in the Cloud</h2><p>Cloud platforms enable the machine learning lifecycle with variable scaling, lower start-up costs, and centralized data lakes. Data scientists who use these resources in conjunction with their favorite tools can more efficiently build high-quality models at scale.</p><p>RStudio supports cloud strategies in <a href="https://www.rstudio.com/solutions/rstudio-in-the-cloud/" target = "_blank">various ways</a>. Our partnerships allow data scientists to operationalize machine learning on their preferred cloud platform using our professional products, built to be the best tools for open source data science.</p><h2 id="announcing-rstudio-for-azure-ml">Announcing RStudio for Azure ML</h2><p>We’re excited to announce a new partnership with <a href="https://azure.microsoft.com/en-us/services/machine-learning/#product-overview" target = "_blank">Azure ML</a> to deliver <a href="https://www.rstudio.com/products/workbench/" target = "_blank">RStudio Workbench</a> on the Azure platform. 
Data scientists can use the RStudio Workbench they know and love in conjunction with their Azure data sources and other Azure ML capabilities.</p><blockquote><p>RStudio is very pleased to work with the Azure Machine Learning team on this release, as we collaborate to make it easier for organizations to move their open-source data science workloads to the cloud. We are committed to helping our joint customers use our commercial offerings to bring their production workloads to their preferred cloud platforms.</p><p>— Tareef Kawaf, President, RStudio PBC</p></blockquote><h2 id="access-rstudio-workbench-within-your-azure-ml-cloud-environment">Access RStudio Workbench Within Your Azure ML Cloud Environment</h2><p>RStudio Workbench is the ideal platform for code-first data science development. With this offering, data scientists can start a single-user instance of RStudio Workbench from within their Azure ML environment as part of the machine learning lifecycle.</p><p>With RStudio Workbench, data scientists can:</p><ul><li>Program in R or Python in your preferred IDE (RStudio, VSCode, JupyterLab, Jupyter Notebook)</li><li>Build models with your favorite open-source tools such as the tidyverse and Shiny</li><li>Access pre-installed packages to support your data science work</li><li>Improve data connectivity via the RStudio Pro Drivers</li><li>Run scripts in the background as local launcher jobs</li><li>Select multiple versions of R and Python</li><li>Open multiple R and Python sessions</li><li>Receive end-user support</li></ul><p><img src="azure-image.png" alt="Laptop with a Workbench IDE open in Azure"></p><h2 id="get-started-with-rstudio-for-azure-ml">Get Started With RStudio for Azure ML</h2><p>We look forward to further supporting open-source data science in the cloud. 
Check out the <a href="https://www.rstudio.com/azure-ml" target = "_blank">RStudio for Azure ML product page</a> for more information and to purchase a license.</p></description></item><item><title>Designing the Data Science Classroom Workshop at rstudio::conf(2022)</title><link>https://www.rstudio.com/blog/designing-the-data-science-classroom/</link><pubDate>Tue, 31 May 2022 01:00:00 -0600</pubDate><guid>https://www.rstudio.com/blog/designing-the-data-science-classroom/</guid><description><p>The data science ecosystem is constantly evolving. Packages are updated, new software is released, and fresh strategies are developed. For educators, the speed of the changes can be dizzying. They must stay proficient in their skills and informed on the advancements in teaching statistics and data science.</p><p>We’ve been busy here at RStudio and wanted to share some exciting new tools and features for those teaching data science. If these tools pique your interest, we are hosting a workshop on <a href="https://www.rstudio.com/conference/2022/workshops/teach-ds/" target = "_blank">Designing the Data Science Classroom at rstudio::conf()</a>. You will acquire concrete guidance on content, workflows, and infrastructure to employ these tools in your teaching. 
Watch a video on three reasons why you might be interested, or read more below:</p><script src="https://fast.wistia.com/embed/medias/o8orjrabxc.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_o8orjrabxc videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/o8orjrabxc/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><h2 id="learn-whats-new-in-data-science">Learn what’s new in data science</h2><p>The tidyverse and tidymodels packages provide users with consistent design philosophy, grammar, and data structures for data science. There have been <a href="https://www.tidyverse.org/blog/" target = "_blank">many updates in the past year</a>, from <a href="https://www.tidyverse.org/blog/2022/02/tidyr-1-2-0/" target = "_blank">new ways to wrangle data</a> to the <a href="https://www.tidyverse.org/blog/2022/05/case-weights/" target = "_blank">ability to use case weights</a>.</p><p>In addition to package releases, it’s helpful to stay current with the tools. <a href="https://quarto.org/" target = "_blank">Quarto</a> is a new publishing system for scientific and technical writing. Educators can create documents, web pages, blog posts, and books to share and publish their work. Quarto offers seamless support for Python, R, Julia, and Observable JavaScript. There’s a lot to learn. 
In this workshop, you will learn both how to teach Quarto to your students and how to use Quarto to create your teaching materials such as slides, course websites, etc.</p><p>With so many updates to relevant packages and tools, educators have to determine how to prioritize tweaking their course content. In our workshop, we’ll walk through what is important to incorporate into your curriculum.</p><h2 id="design-a-computational-infrastructure">Design a computational infrastructure</h2><p>In addition to content, educators need to think about the computational infrastructure they will use in their teaching. Will students locally install software or use cloud resources? How will they interact with version control tools? And how will they get feedback on their work?</p><p>In this workshop, we will highlight <a href="https://www.rstudio.com/products/cloud/" target = "_blank">RStudio Cloud</a> as the tool of choice for the computational infrastructure of data science courses. It integrates with Git/GitHub, runs an RStudio IDE environment in the cloud (so that students do not have to run installations on their local machine), and allows educators to access, view, and edit projects. Recently, we introduced <a href="https://rstudio.cloud/learn/guide#project-collaborative-editing" target = "_blank">collaborative editing</a> so that you can see students’ edits in real-time.</p><p>We will also introduce learnr, an R package that allows you to turn your R Markdown documents into interactive tutorials with automated feedback. These tutorials can be used for summative or formative feedback and can be highly enjoyable learning experiences for students.</p><p>Finally, in this workshop, you’ll also get a mini-module on R package development, specifically, making data packages for teaching purposes.</p><p>As you consider what concepts to teach in your classroom, a thorough knowledge of available tools and how to use them is key for a great classroom experience. 
Over two days of this workshop, you’ll get both a deep and broad coverage of content and tooling for teaching data science with R.</p><h2 id="engage-with-others">Engage with others</h2><p>The data science community comprises members who aid and support each other. As you can imagine, many educators ask questions like, “Should I teach the base R pipe now?” or “How do I teach my students to find help online when they get stuck?”. By meeting and learning from like-minded people, you can find answers and inspiration to bring back to your classroom.</p><h2 id="learn-more">Learn more</h2><p>We are excited to share more about the new tools and ideas in the data science space. We’d love to see you at the <a href="https://www.rstudio.com/conference/2022/workshops/teach-ds/" target = "_blank">Designing the Data Science Classroom Workshop</a> in July. Sign up today!</p></description></item><item><title>Deep Learning with R, Second Edition Book Launch</title><link>https://www.rstudio.com/blog/deep-learning-with-r-second-edition-book-launch/</link><pubDate>Tue, 31 May 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/deep-learning-with-r-second-edition-book-launch/</guid><description><p>We are excited to announce the <a href="https://rstd.io/dlwr-2e" target = "_blank">MEAP release of Deep Learning with R, Second Edition</a>! As part of the Manning Early Access Program (MEAP), you have early access to the book while it is being written. We plan to release the complete version of the book next month.</p><h2 id="get-started-with-deep-learning-with-r">Get started with deep learning with R</h2><p>This book is a hands-on guide to deep learning using Keras and R. Tomasz Kalinowski, the maintainer of the Keras and Tensorflow R packages at RStudio, shows you how to get started. 
No background in mathematics or data science is required.</p><h2 id="discover-the-latest-innovations-in-the-deep-learning-space">Discover the latest innovations in the deep learning space</h2><p>With deep learning, data scientists can create more accurate and efficient models, sometimes even outperforming human cognition. Recent innovations have unlocked exciting new capabilities in this space.</p><p>The latest edition of Deep Learning with R contains over 75% new content and significant updates on topics such as:</p><ul><li>Deep learning from first principles</li><li>Image classification and image segmentation</li><li>Time series forecasting</li><li>Text classification and machine translation</li><li>Text generation, neural style transfer, and image generation</li></ul><p>You will learn the latest in deep learning through intuitive explanations, crisp illustrations, and clear examples.</p><h2 id="learn-more">Learn more</h2><p>Deep learning with R allows you to write in your preferred programming language while taking full advantage of the deep learning methods.</p><ul><li>Find out more about the second edition of Deep Learning with R on the <a href="https://blogs.rstudio.com/ai/posts/2022-05-31-deep-learning-with-r-2e/" target = "_blank">RStudio AI Blog</a>.</li><li>Purchase the MEAP version of Deep Learning with R, Second Edition on the <a href="https://rstd.io/dlwr-2e" target = "_blank">Manning website</a>. Use the code <strong>mlallaire2</strong> for 40% off.</li></ul></description></item><item><title>Real-Time Collaborative Editing on RStudio Cloud</title><link>https://www.rstudio.com/blog/real-time-collaborative-editing-on-rstudio-cloud/</link><pubDate>Mon, 23 May 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/real-time-collaborative-editing-on-rstudio-cloud/</guid><description><p>We are excited to announce real-time collaborative editing on RStudio Cloud! 
Administrators or moderators of a space can open a project that another user is working on to join that user in the project. Users can edit code and immediately see changes made by others.</p><script src="https://fast.wistia.com/embed/medias/irh36tnh77.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:50.0% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_irh36tnh77 videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><i><caption>Pair Programming <a href="https://bookdown.org/roy_schumacher/r4ds/pipes.html#use-the-pipe" target = "_blank">magrittr pipes</a> Example in RStudio Cloud</caption></i></center><p>Collaborative editing is currently a beta feature. If you are a Premium, Instructor, or Organization account holder, you can enable collaborative editing for your account.</p><h2 id="access-and-edit-an-rstudio-cloud-project-at-the-same-time">Access and edit an RStudio Cloud project at the same time</h2><p>Up to five users can now access and edit an RStudio Cloud project in real-time. Entering a project allows users to see who is in the same space. Once a file is saved in the project, users can concurrently edit the script (similar to Google Docs).</p><p>As part of the roll out of collaborative editing, the RStudio Cloud team changed the way user files are stored within projects. Specifically, each user that opens a project will be provided with their own private home directory, as well as their own R session state. This was necessary to support multi-user collaborative editing, as well as to enhance the security of user-specific files.</p><p>If collaborative editing is enabled, all collaborators on a project will continue to have access to files in the project directory. 
However, each user will now have their own R session data, which includes console and R command histories and R environment state. As a consequence of this change, Admins or Moderators opening projects belonging to another user in a space will not see the session data from that user.</p><p>If collaborative editing is not enabled, users accessing the project will see the same behavior as before. Only a single user can access a project at a time, and all users share the same R session, home directory, history, and environment. Learn more in this <a href="https://community.rstudio.com/t/important-changes-related-to-r-sessions-in-rstudio-projects/136423" target = "_blank">community thread</a>.</p><h2 id="work-together-review-code-and-resolve-issues">Work together, review code, and resolve issues</h2><p>Collaborative editing creates a supportive environment for active learning and social interaction. Working on the same project encourages users to work with each other. They have the opportunity to demonstrate skills, brainstorm new techniques, and share best practices.</p><p>During a session, users have the opportunity to continuously review code and resolve issues. 
Programmers can work together to inspect the problem, walk through the debugging process, and try out solutions.</p><script src="https://fast.wistia.com/embed/medias/8e87w3z7hb.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:50.0% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_8e87w3z7hb videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><i><caption>Instructor-Student Project Interaction in RStudio Cloud</caption></i></center><h2 id="learn-more">Learn more</h2><p>We hope that RStudio Cloud’s collaborative editing feature is helpful to your work.</p><ul><li>Learn more about <a href="https://www.rstudio.com/products/cloud/" target = "_blank">RStudio Cloud</a>.</li><li>Read <a href="https://rstudio.cloud/learn/whats-new" target = "_blank">what’s new in RStudio Cloud</a> and read more on <a href="https://rstudio.cloud/learn/guide#project-collaborative-editing" target = "_blank">Collaborative Editing</a>.</li><li>Interested in using RStudio Cloud in the classroom? Join Mine Çetinkaya-Rundel &amp; Maria Tackett on July 25-26 at rstudio::conf for <strong>Designing the Data Science Classroom</strong>. This workshop equips educators with concrete information on content, workflows, and infrastructure for introducing modern computation with R and RStudio. 
Learn more on the <a href="https://www.rstudio.com/conference/2022/workshops/teach-ds/" target = "_blank">workshop page</a>.</li></ul></description></item><item><title>R Markdown Tips and Tricks #3: Time-savers & Trouble-shooters</title><link>https://www.rstudio.com/blog/r-markdown-tips-and-tricks-3-time-savers/</link><pubDate>Wed, 18 May 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-markdown-tips-and-tricks-3-time-savers/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@jeremybezanger?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Jeremy Bezanger</a> on <a href="https://unsplash.com/s/photos/knit?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup></p><p>The R Markdown file format combines R programming and the markdown language to create dynamic, reproducible documents. Authors use R Markdown for reports, slide shows, blogs, books — even <a href="https://bookdown.org/yihui/rmarkdown/shiny-start.html" target = "_blank">Shiny apps</a>! Since users can do so much with R Markdown, it&rsquo;s important to be efficient with time and resources.</p><p>We asked our Twitter friends <a href="https://twitter.com/_bcullen/status/1333878752741191680" target = "_blank">the tips and tricks that they have picked up</a> along their R Markdown journey. There was a flurry of insightful responses ranging from organizing files to working with YAML, and we wanted to highlight some of the responses so that you can apply them to your work, as well.</p><p>This is the third of a four-part series to help you on your path to R Markdown success, where we discuss <strong>features and functions that save you time and help you troubleshoot</strong>. You can find many of these tips and tricks in the <a href="https://bookdown.org/yihui/rmarkdown-cookbook/" target = "_blank">R Markdown Cookbook</a>. We&rsquo;ve included the link to the relevant chapter (and other resources) in each section.</p><p><strong>1. 
Convert an R script into an R Markdown document with <code>knitr::spin()</code></strong></p><p>Have you ever wished you could transform an R script into an R Markdown document without having to copy and paste your code? The function <code>knitr::spin()</code> lets you do just that. Pass your R script to <code>spin()</code> and watch the transformation happen.</p><center><img src="img/img1.png" alt="File icons representing the conversion of an R script into an R Markdown document" width="70%"></center><p>You can quickly move from coding your analysis to writing your reports. In addition, you can keep your workflow reproducible by including <code>knitr::spin()</code> at the end of your R script. Rerun it any time you update your analysis so that your source code and R Markdown report are synced.</p><ul><li>R Markdown Cookbook chapter: <a href="https://bookdown.org/yihui/rmarkdown-cookbook/spin.html" target = "_blank">Render an R script to a report</a>.</li><li>You can also transform an R script into a report using <code>#'</code> comments. The <a href="https://happygitwithr.com/r-test-drive.html#write-a-render-ready-r-script" target = "_blank">Render an R script</a> chapter of Happy Git and GitHub for the useR walks through how to create a render-ready R script.</li></ul><p><strong>2. Convert an R Markdown document into an R script with <code>knitr::purl()</code></strong></p><p>Now let&rsquo;s flip it around! What if you want to extract only the R code from your R Markdown report?
For this, use the function <code>knitr::purl()</code>.</p><center><img src="img/img2.png" alt="File icons representing the conversion of an R Markdown document into an R script" width="70%"></center><p>The output from <code>purl()</code> can show no text, all text, or just the chunk options from your <code>.Rmd</code> file depending on the <code>documentation</code> argument.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># Extracts only pure R code</span>
knitr<span style="color:#666">::</span><span style="color:#06287e">purl</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">report.Rmd&#34;</span>, documentation <span style="color:#666">=</span> <span style="color:#40a070">0L</span>)
<span style="color:#60a0b0;font-style:italic"># Extracts R code and chunk options</span>
knitr<span style="color:#666">::</span><span style="color:#06287e">purl</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">report.Rmd&#34;</span>, documentation <span style="color:#666">=</span> <span style="color:#40a070">1L</span>)
<span style="color:#60a0b0;font-style:italic"># Extracts all text</span>
knitr<span style="color:#666">::</span><span style="color:#06287e">purl</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">report.Rmd&#34;</span>, documentation <span style="color:#666">=</span> <span style="color:#40a070">2L</span>)</code></pre></div><p>If you do not want certain code chunks to be extracted, you can set the chunk option <code>purl = FALSE</code>.</p><pre><code>```{r ignored}
#| purl = FALSE
x = rnorm(1000)
```</code></pre><ul><li>R Markdown Cookbook chapter: <a href="https://bookdown.org/yihui/rmarkdown-cookbook/purl.html" target = "_blank">Convert R Markdown to R script</a>.</li></ul><p><strong>3.
Reuse code chunks throughout your document</strong></p><p>The knitr package provides several options to avoid copying and pasting your code. One way is to use reference labels with the chunk option <code>ref.label</code>. Let&rsquo;s use the example from the R Markdown Cookbook. Say you have these two chunks:</p><pre><code>```{r chunk-b}
# this is the chunk b
1 + 1
```</code></pre><pre><code>```{r chunk-c}
# this is the chunk c
2 + 2
```</code></pre><p>You can write a chunk that combines <code>chunk-c</code> and <code>chunk-b</code>:</p><pre><code>```{r chunk-a}
#| ref.label = c(&quot;chunk-c&quot;, &quot;chunk-b&quot;)
```</code></pre><p>Your <code>chunk-a</code> will render like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"><span style="color:#60a0b0;font-style:italic"># this is the chunk c</span>
<span style="color:#40a070">2</span> <span style="color:#666">+</span> <span style="color:#40a070">2</span></code></pre></div><pre><code>## [1] 4</code></pre><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"><span style="color:#60a0b0;font-style:italic"># this is the chunk b</span>
<span style="color:#40a070">1</span> <span style="color:#666">+</span> <span style="color:#40a070">1</span></code></pre></div><pre><code>## [1] 2</code></pre><p>Please note that any code inside of <code>chunk-a</code> will <em>not</em> be evaluated.</p><p>One application of <code>ref.label</code> puts all of your code in an appendix.
The code output will show up in the document&rsquo;s main body, and the code chunks will appear only at the end.</p><pre><code># Appendix

```{r}
#| ref.label=knitr::all_labels(),
#| echo = TRUE,
#| eval = FALSE
```</code></pre><ul><li>R Markdown Cookbook chapters: <a href="https://bookdown.org/yihui/rmarkdown-cookbook/reuse-chunks.html" target = "_blank">Reuse code chunks</a> and <a href="https://bookdown.org/yihui/rmarkdown-cookbook/code-appendix.html" target = "_blank">Put together all code in the appendix</a>.</li><li>Another application of <code>ref.label</code> allows you to show code in one place and then display the output somewhere else. Garrick Aden-Buie walks through an example of how to do this in xaringan slides in his blog post, <a href="https://www.garrickadenbuie.com/blog/decouple-code-and-output-in-xaringan-slides/" target = "_blank">Decouple Code and Output in xaringan slides</a>.</li></ul><p><strong>4. Cache your chunks (with dependencies)</strong></p><p>If there is a chunk in your R Markdown file that takes a while to run, you can set <code>cache = TRUE</code> to pre-save the results for the future. The next time you knit the document, the code will call the cached object rather than rerun (provided that nothing in the cached chunk has changed). This can save a lot of time when loading big files or running intensive processes.</p><pre><code>```{r load-data}
#| cache = TRUE
dat &lt;- read.csv(&quot;HUGE_FILE_THAT_TAKES_FOREVER_TO_LOAD.csv&quot;)
```</code></pre><p>If one of your later chunks depends on the output from a cached chunk, include the <code>dependson</code> option.
The chunk will rerun if something has changed in the cached chunk.</p><p>For example, say you compute a value in one chunk and use the result in another chunk:</p><pre><code>```{r cached-chunk}
#| cache = TRUE
x &lt;- 500
x
```</code></pre><pre><code>```{r dependent-chunk}
#| cache = TRUE,
#| dependson = &quot;cached-chunk&quot;
x + 5
```</code></pre><p>This will result in <code>505</code>.</p><p>Now, you edit the cached chunk:</p><pre><code>```{r cached-chunk}
#| cache = TRUE
x &lt;- 600
x
```</code></pre><p>With the <code>dependson</code> option, your dependent chunk will update when you edit your cached chunk. In this case, your dependent chunk will now output <code>605</code>.</p><p>Without the <code>dependson</code> option, your dependent chunk will use the previously cached result and output <code>505</code>, even though the cached chunk now says <code>x &lt;- 600</code>.</p><p>Too much caching going on? You can reset all your caches by using a global chunk option in the first code chunk of your document, e.g., <code>knitr::opts_chunk$set(cache.extra = 1)</code>. This chunk option name can be arbitrary, but we recommend that you do not use an existing option name in <code>knitr::opts_chunk$get()</code> (e.g., <code>cache.extra</code> is not a built-in option). If you want to reset the caches again, you can set the option to a different value.</p><ul><li>R Markdown Cookbook chapter: <a href="https://bookdown.org/yihui/rmarkdown-cookbook/cache.html" target = "_blank">Cache time-consuming code chunks</a>.</li><li>R for Data Science chapter: <a href="https://r4ds.had.co.nz/r-markdown.html#caching" target = "_blank">R Markdown - Caching</a></li></ul><p><strong>5. Save the content of a chunk elsewhere with the <code>cat</code> engine</strong></p><p>You may want to write the content of a code chunk to an external file to use later in the document. The <code>cat</code> engine makes this possible.
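Returning briefly to the caching tip above: <code>cache.extra</code> is an ordinary chunk option, so you can set and inspect it from any R session. A minimal sketch, assuming knitr is installed (the value <code>2</code> is arbitrary, exactly as the post describes):

```r
library(knitr)

# The cache-reset trick: "cache.extra" is not a built-in knitr option,
# so changing its value invalidates every cached chunk on the next knit
knitr::opts_chunk$set(cache.extra = 2)

# Confirm the option is now part of the global chunk defaults
knitr::opts_chunk$get("cache.extra")
```

In a real document, this call would live in the first code chunk, so the new value applies before any cached chunk is evaluated.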
The files do not just have to be <code>.R</code> files either &mdash; they can be <code>.txt</code>, <code>.sql</code>, etc.</p><pre><code>```{cat}
#| engine.opts = list(file = &quot;example.txt&quot;)
This text will be saved as &quot;example.txt&quot; in your project directory.
```</code></pre><p>One application is to use the <code>cat</code> engine to save a <code>.sql</code> file, then execute that file in a code chunk later in the document.</p><p>Create this SQL script in a chunk and save it in a file:</p><pre><code>```{cat}
#| engine.opts = list(file = &quot;tbl.sql&quot;, lang = &quot;sql&quot;)
SELECT episode, title
FROM &quot;database/elements&quot;.&quot;elements&quot;
LIMIT 3
```</code></pre><p>Read it in your <code>.Rmd</code> file:</p><pre><code>```{sql}
#| connection = con,
#| code = readLines(&quot;tbl.sql&quot;)
```</code></pre><p>The <code>cat</code> engine is helpful when you want to create a <a href="https://reprex.tidyverse.org/index.html" target = "_blank">reprex</a> for your R Markdown documents. Everything is contained within your document rather than in attached external files, so others can reproduce your work.</p><ul><li>R Markdown Cookbook chapter: <a href="https://bookdown.org/yihui/rmarkdown-cookbook/eng-cat.html" target = "_blank">Write the chunk content to a file via the <code>cat</code> engine</a>.</li></ul><p><strong>6. Include parameters to easily change values</strong></p><p>Include parameters to set values throughout your report. When you need to rerun the report with new values, you will not have to manually change them throughout your document.
For example, if you want to display data for a particular class of cars, set the parameter <code>my_class</code>:</p><pre><code>---
title: &quot;Daily Report&quot;
output: &quot;html_document&quot;
params:
  my_class: &quot;fuel economy&quot;
---</code></pre><p>In the document, reference the parameter with <code>params$</code>:</p><pre><code>```{r setup}
#| include = FALSE
library(dplyr)
library(ggplot2)

mtcars_df &lt;- mtcars %&gt;%
  mutate(class = case_when(mpg &gt; 15 ~ &quot;fuel economy&quot;,
                           TRUE ~ &quot;not fuel economy&quot;))

class &lt;- mtcars_df %&gt;%
  filter(class == params$my_class)
```

# Wt vs mpg for `r params$my_class` cars

```{r}
ggplot(class, aes(wt, mpg)) +
  geom_point() +
  geom_smooth(se = FALSE)
```</code></pre><p>To create a report that uses a new set of parameter values, you can use the <code>rmarkdown::render()</code> function. Add the <code>params</code> argument to <code>render()</code> with your updated value:</p><pre><code>rmarkdown::render(&quot;paramDoc.Rmd&quot;, params = list(my_class = &quot;not fuel economy&quot;))</code></pre><p>The report will now output the values for &lsquo;not fuel economy&rsquo; cars.</p><ul><li>R Markdown Cookbook chapter: <a href="https://bookdown.org/yihui/rmarkdown-cookbook/parameterized-reports.html#parameterized-reports" target = "_blank">Parameterized reports</a>.</li><li>R for Data Science chapter: <a href="https://r4ds.had.co.nz/r-markdown.html#parameters" target = "_blank">R Markdown - Parameters</a></li></ul><p><strong>7. Create templates with <code>knit_expand()</code></strong></p><p>With <code>knitr::knit_expand()</code>, you can replace expressions in <code>{{}}</code> with their values.</p><p>For example,</p><pre><code>```{r}
knit_expand(text = 'The value of pi is {{pi}}.')
```</code></pre><pre><code>```
[1] &quot;The value of pi is 3.14159265358979.&quot;
```</code></pre><p>You can also create templates for your R Markdown files with <code>knit_expand()</code>. Create a file with the tabs or headings that you would like.
Create a second file that loops through the data you would like to output.</p><p>Take this example from the R Markdown Cookbook. Create a <code>template.Rmd</code> file:</p><pre><code># Regression on {{i}}

```{r lm-{{i}}}
lm(mpg ~ {{i}}, data = mtcars)
```</code></pre><p>Then create another file that loops through each of the variables in <code>mtcars</code> except <code>mpg</code>:</p><pre><code>```{r}
#| echo = FALSE,
#| results = &quot;asis&quot;
src = lapply(setdiff(names(mtcars), 'mpg'), function(i) {
  knitr::knit_expand('template.Rmd')
})
res = knitr::knit_child(text = unlist(src), quiet = TRUE)
cat(res, sep = '\n')
```</code></pre><p>Knit this file to create a report that applies the template to the non-<code>mpg</code> variables:</p><script src="https://fast.wistia.com/embed/medias/wmhllooxbv.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.42% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_wmhllooxbv videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><p>In our original Twitter thread, Felipe Mattioni Maturana shows us an example from his work:</p><center><blockquote class="twitter-tweet"><p lang="en" dir="ltr">sure!
that is basically what I described here: <a href="https://t.co/y4EMVTfb9y">https://t.co/y4EMVTfb9y</a> if you go to <a href="https://t.co/7CZPBEJwoD">https://t.co/7CZPBEJwoD</a> you can see the two templates used, and this is the result: <a href="https://t.co/aRo0EdJJUT">pic.twitter.com/aRo0EdJJUT</a></p>&mdash; Felipe Mattioni Maturana (@felipe_mattioni) <a href="https://twitter.com/felipe_mattioni/status/1334204239695011841?ref_src=twsrc%5Etfw">December 2, 2020</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></center><ul><li>R Markdown Cookbook chapter: <a href="https://bookdown.org/yihui/rmarkdown-cookbook/knit-expand.html" target = "_blank">Use <code>knitr::knit_expand()</code> to generate Rmd source</a>.</li></ul><p><strong>8. Exit knitting early with <code>knit_exit()</code></strong></p><p>Exit the knitting process before the end of the document with <code>knit_exit()</code>. Knitr will write out the results up to that point and ignore the remainder of the document.</p><p>You can use <code>knit_exit()</code> either inline or in a code chunk.</p><pre><code>```{r chunk-one}
x &lt;- 100
```

`r knitr::knit_exit()`

```{r chunk-two}
y
```</code></pre><p>The rendered document will only show <code>chunk-one</code>. This is helpful if you run into errors and want to find where they are by splitting up your document.</p><ul><li>R Markdown Cookbook chapter: <a href="https://bookdown.org/yihui/rmarkdown-cookbook/knit-exit.html" target = "_blank">Exit knitting early</a>.</li><li>Other <code>.Rmd</code> troubleshooting tips can be found in the <a href="https://happygitwithr.com/rmd-test-drive.html?q=knit_exit#rmd-troubleshooting" target = "_blank">Test Drive R Markdown</a> chapter of Happy Git and GitHub for the useR.</li></ul><h2 id="continue-the-journey">Continue the Journey</h2><p>We hope that these tips &amp; tricks help you save time and troubleshoot in R Markdown.
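As a quick recap of tips 1 and 2, <code>spin()</code> and <code>purl()</code> round-trip nicely, and both accept a <code>text</code> argument so you can experiment without touching any files. A minimal sketch, assuming knitr is installed (the script content is illustrative, not from the post):

```r
library(knitr)

# A plain R script as a string: "#' " lines become prose, the rest stays code
script <- "#' # A tiny report\n#' The chunk below adds two numbers.\nx <- 1 + 1\nx"

# Tip 1: spin() converts the script to R Markdown source
# (knit = FALSE returns the .Rmd text instead of rendering it)
rmd <- knitr::spin(text = script, knit = FALSE)

# Tip 2: purl() extracts only the R code back out of that R Markdown source
code <- knitr::purl(text = paste(rmd, collapse = "\n"), documentation = 0L)

cat(code)
```

Running both directions on the same text is a handy way to check that nothing is lost between your script and your report.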
Thank you to everybody who shared advice, workflows, and features!</p><p>Stay tuned for the last post in this four-part series: <strong>Looks better, works better.</strong></p><h2 id="resources">Resources</h2><ul><li>Peruse the <a href="https://bookdown.org/yihui/rmarkdown-cookbook/" target = "_blank">R Markdown Cookbook</a> for more tips and tricks.</li><li><a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a> is an enterprise-level product from RStudio to publish and schedule reports, enable self-service customization, and distribute beautiful emails.</li></ul></description></item><item><title>Software Development Resources for Data Scientists</title><link>https://www.rstudio.com/blog/software-development-resources-for-data-scientists/</link><pubDate>Mon, 16 May 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/software-development-resources-for-data-scientists/</guid><description><p>Data scientists concentrate on making sense of data through exploratory analysis, statistics, and models. Software developers apply a separate set of knowledge with different tools. Although their focus may seem unrelated, data science teams can benefit from adopting software development best practices. Version control, automated testing, and other dev skills help create reproducible, production-ready code and tools.</p><p>Rachael Dempsey recently <a href="https://twitter.com/_RachaelDempsey/status/1518578015978266625?s=20&t=niXllzayd4AoZ8GWGV8Pqw" target = "_blank">asked the Twitter community</a> for suggestions on resources that data scientists can use to improve their software development skill set.</p><p>We received so many great recommendations that we wanted to summarize and share them here. 
This blog post walks through software development best practices that your team may want to adopt and where to find out more.</p><p>The areas discussed below are:</p><ul><li><a href="#structure-your-data-science-projects-so-that-everything-just-runs-the-first-time">Project structure</a></li><li><a href="#test-functions-so-that-they-do-what-you-expect-them-to-do">Automatic testing</a></li><li><a href="#create-reproducible-environments-so-that-results-are-consistent">Reproducible environments</a></li><li><a href="#use-version-control-to-track-and-control-changes-across-a-team">Version control</a></li></ul><h2 id="structure-your-data-science-projects-so-that-everything-just-runs-the-first-time">Structure your data science projects so that everything &ldquo;just runs the first time&rdquo;</h2><p>It&rsquo;d be great if we could open an R script, click Run, and have the output we need. However, code doesn&rsquo;t always &ldquo;just work.&rdquo; As our projects get bigger and more complex, we need structure so that they are easy to manage and understand.</p><p>As Daniel Chen says in his <a href="https://youtu.be/UQHz38s3DyA" target = "_blank">Structuring Your Data Science Projects</a> talk, a proper project structure gets us to the happy place where our code runs.
Existing principles, templates, and tools can guide you:</p><ul><li>Make a project for your work so that file directories are easy to work with</li><li>Organize work into subfolders so that anybody can open up your project and understand what is going on</li><li>Create functions to organize what your code is doing</li><li>Write reports that show your stakeholders what they care about (and hide what they don&rsquo;t)</li><li>Use workflow orchestrators to see which tasks depend on each other and to rebuild your analysis (<a href="https://books.ropensci.org/targets/" target = "_blank">the targets package</a> is an option for R users)</li></ul><p><img src="images/workflow.png" alt="An example workflow diagram created from the targets package"></p><center><i><caption>A workflow diagram showing a project's activities and dependencies<br>Source: <a href="https://wlandau.github.io/targets-tutorial/#1" target = "_blank">Reproducible computation at scale in R, Will Landau</a></caption></i></center><br>Organized projects allow data scientists to remain productive as their projects grow.<h3 id="project-structure-resources">Project Structure Resources</h3><ul><li><a href="https://rstats.wtf/save-source.html" target = "_blank">What They Forgot to Teach You About R - A Holistic Workflow</a></li><li><a href="https://slides.djnavarro.net/project-structure/#1" target = "_blank">Project Structure Slides by Danielle Navarro</a></li><li><a href="https://goodresearch.dev/" target = "_blank">Good Research Code Handbook</a></li><li><a href="https://workflowr.github.io/workflowr/" target = "_blank">workflowr package</a></li><li><a href="https://swcarpentry.github.io/make-novice/" target = "_blank">Data Carpentries Automation and Make Course</a></li></ul><h2 id="test-functions-so-that-they-do-what-you-expect-them-to-do">Test functions so that they do what you expect them to do</h2><p>A data scientist can test a function by inputting some values, seeing if the output is what you expect,
modifying the code if there are issues, and rerunning to check the values again. However, this process leaves a lot of room for error. It&rsquo;s easy to forget what changed and cause something else to break.</p><p>Automated testing offers a better option, such as with the <a href="https://docs.pytest.org/en/latest/" target = "_blank">pytest package</a> or <a href="https://testthat.r-lib.org/" target = "_blank">testthat package</a>. Automated testing focuses on small, well-defined pieces of code. Data scientists write out explicit expectations. Tests are saved in a single location, making them easy to rerun. When a test fails, it&rsquo;s clear where to look for the problem.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-python" data-lang="python"><span style="color:#007020;font-weight:bold">import</span> <span style="color:#0e84b5;font-weight:bold">pytest</span>
<span style="color:#007020;font-weight:bold">from</span> <span style="color:#0e84b5;font-weight:bold">foobar</span> <span style="color:#007020;font-weight:bold">import</span> foo, bar

<span style="color:#007020;font-weight:bold">def</span> <span style="color:#06287e">test_foo_values</span>():
    <span style="color:#007020;font-weight:bold">assert</span> foo(<span style="color:#40a070">4</span>) <span style="color:#666">==</span> <span style="color:#40a070">8</span>
    <span style="color:#007020;font-weight:bold">assert</span> foo(<span style="color:#40a070">2.2</span>) <span style="color:#666">==</span> <span style="color:#40a070">1.9</span>

<span style="color:#007020;font-weight:bold">def</span> <span style="color:#06287e">test_bar_limits</span>():
    <span style="color:#007020;font-weight:bold">assert</span> bar(<span style="color:#40a070">4</span>, [<span style="color:#40a070">1</span>, <span style="color:#40a070">90</span>], option<span style="color:#666">=</span>True) <span style="color:#666">&lt;</span> <span style="color:#40a070">8</span>

<span style="color:#60a0b0;font-style:italic"># If you want to test that bar() raises an exception when called with certain</span>
<span style="color:#60a0b0;font-style:italic"># arguments, e.g. if bar() should raise an error when its argument is negative:</span>
<span style="color:#007020;font-weight:bold">def</span> <span style="color:#06287e">test_bar_errors</span>():
    <span style="color:#007020;font-weight:bold">with</span> pytest<span style="color:#666">.</span>raises(<span style="color:#007020">ValueError</span>):
        bar(<span style="color:#666">-</span><span style="color:#40a070">4</span>, [<span style="color:#40a070">2</span>, <span style="color:#40a070">10</span>]) <span style="color:#60a0b0;font-style:italic"># test passes if bar raises a ValueError</span></code></pre></div><center><i><caption>Example Python Test<br>Source: <a href="https://36-750.github.io/practices/unit-testing/" target = "_blank">Statistical Computing Course, Alex Reinhart and Christopher R. Genovese</a></caption></i></center><p>Incorporating automated testing in a workflow exposes problems early and makes it easier to alter code.</p><h3 id="testing-resources">Testing Resources</h3><ul><li><a href="https://goodresearch.dev/" target = "_blank">Good Research Code Handbook</a></li><li><a href="https://36-750.github.io/practices/unit-testing/" target = "_blank">Statistical Computing - Unit Testing</a></li><li><a href="https://r-pkgs.org/tests.html" target = "_blank">R Packages - Testing</a></li><li><a href="https://testthat.r-lib.org/" target = "_blank">testthat package</a></li></ul><h2 id="create-reproducible-environments-so-that-results-are-consistent">Create reproducible environments so that results are consistent</h2><p>Have you ever had a script that worked great, but now you can&rsquo;t reproduce the results on a new laptop? This may happen because of changes in the operating system, package versions, or other factors. You have to spend time figuring out why the output is suddenly different.</p><p>A reproducible environment ensures workflows run as they did in the past.
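For R users, the pytest example above translates directly; the post's resources point to testthat, but the core idea works even in base R with <code>stopifnot()</code>. A minimal sketch (the function and expected values are illustrative, not from the post):

```r
# A function under test
fahrenheit_to_celsius <- function(f) (f - 32) * 5 / 9

# Explicit expectations, saved alongside the code and rerun after every change;
# testthat::test_that() offers the same idea with richer reporting
stopifnot(
  fahrenheit_to_celsius(32) == 0,
  fahrenheit_to_celsius(212) == 100,
  abs(fahrenheit_to_celsius(98.6) - 37) < 1e-8
)
```

If any expectation fails, <code>stopifnot()</code> raises an error naming the failing condition, which is exactly the early warning automated tests are meant to provide.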
Your team controls everything needed to run a project in a reproducible environment. Each person running the code can expect the same behavior since everything is standardized.</p><p>Virtual environments for Python or the <a href="https://rstudio.github.io/renv/articles/renv.html" target = "_blank">renv package</a> for R are examples of the tools that can help a data science team reproduce their work. They record the version of loaded packages in a project and can re-install the declared versions of those packages.</p><p>For example, say one of your projects uses <code>dplyr::group_map()</code>, introduced in dplyr v0.8.0. If a team member runs the code with an older version of dplyr, they will run into an error. The renv package captures the state of the environment in which the code was written and shares that environment with others so that they can run the code without issue. (Try out <a href="https://github.com/JosiahParry/renv-example" target = "_blank">Josiah Parry&rsquo;s example</a> on GitHub.)</p><p>With reproducible environments, data scientists can collaborate on work and validate results regardless of when and where they are running code.</p><h3 id="reproducible-environments-resources">Reproducible Environments Resources</h3><ul><li><a href="https://rstats.wtf/get-to-know-your-r-installation.html" target = "_blank">What They Forgot to Teach You About R - Personal R Admin</a></li><li><a href="https://environments.rstudio.com/" target = "_blank">RStudio - Reproducible Environments</a></li><li><a href="https://solutions.rstudio.com/python/minimum-viable-python/" target = "_blank">Minimal Viable Python</a></li><li><a href="https://education.molssi.org/python-package-best-practices/" target = "_blank">Python Package Best Practices</a></li><li><a href="https://docs.python.org/3/tutorial/venv.html" target = "_blank">Python virtual environments</a>, <a href="https://rstudio.github.io/renv/articles/renv.html" target = "_blank">renv package</a>, and <a 
href="https://youtu.be/yjlEbIDevOs" target = "_blank">renv webinar</a></li></ul><h2 id="use-version-control-to-track-and-control-changes-across-a-team">Use version control to track and control changes across a team</h2><p>Data scientists produce many files. As these files evolve throughout a project, keeping track of the latest version becomes more challenging. If the team collaborates on the same file, someone may use an outdated version and have to spend time reconciling mismatched lines of code.</p><p>Version control with tools like Git and GitHub can alleviate these pains. Teams can manage asynchronous work and avoid conflict or confusion. It&rsquo;s easy to track the evolution of files, find (and revert) changes, and resolve differences between versions.</p><p><img src="images/diff.png" alt="Differences between two versions of a file from the diff window available in RStudio&rsquo;s Git GUI"></p><center><i><caption>Quick check of differences between two versions of a file<br>(current version uses <code>mtcars$hp</code> instead of <code>mtcars$am</code>)</caption></i></center><p>Using version control in data science projects makes collaboration and maintenance more manageable.</p><h3 id="version-control-resources">Version Control Resources</h3><ul><li><a href="https://happygitwithr.com/" target = "_blank">Happy Git and GitHub for the useR</a></li><li><a href="https://peerj.com/preprints/3159v2/" target = "_blank">Excuse me, do you have a moment to talk about version control?</a></li><li><a href="https://lab.github.com/" target = "_blank">GitHub Learning Lab</a></li><li><a href="https://support.rstudio.com/hc/en-us/articles/200532077-Version-Control-with-Git-and-SVN" target = "_blank">Git GUI in RStudio IDE</a></li><li>Other Git GUIs: <a href="https://www.gitkraken.com/" target = "_blank">GitKraken</a>, <a href="https://www.sourcetreeapp.com/" target = "_blank">Sourcetree</a></li></ul><h2 id="learn-more">Learn More</h2><p>These are just a few areas that data
science teams should concentrate on to improve their software development skill set. We hope that you find them helpful and are excited to learn more!</p><ul><li>Read the <a href="https://twitter.com/_RachaelDempsey/status/1518578015978266625?s=20&t=SfEXfXBBkuzeKzUOgh70KA" target = "_blank">original Twitter thread</a> for more links.</li><li>Watch helpful <a href="https://youtube.com/playlist?list=PLXKlQEvIRus_oupGJ3rxMC0wtXunhz5N0" target = "_blank">YouTube videos</a> on this topic.</li></ul><p><strong>Interested in developing holistic workflows, improving debugging processes, and writing non-repetitive code? Register for What They Forgot to Teach You About R at rstudio::conf(2022), a two-day session led by Shannon McClintock Pileggi, Jenny Bryan, and David Aja. Learn more on the rstudio::conf <a href="https://www.rstudio.com/conference/2022/workshops/wtf-rstats/" target = "_blank">workshop page</a>.</strong></p></description></item><item><title>Exploring RStudio's Visual Markdown Editor</title><link>https://www.rstudio.com/blog/exploring-rstudio-visual-markdown-editor/</link><pubDate>Wed, 11 May 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/exploring-rstudio-visual-markdown-editor/</guid><description><p><a href="https://www.rstudio.com/blog/announcing-rstudio-1-4/" target = "_blank">RStudio 1.4</a> was released in January 2021 and switched to calendar-based versioning as of 2021.09. Starting with 1.4, the IDE includes a visual markdown editor that works on any markdown-based document, such as <code>.md</code> or <code>.Rmd</code> files. Visual editing mode provides a better experience when writing reports and analyses. 
You can see changes in real-time and preview what your document looks like without re-knitting.</p><p>In addition, the visual markdown editor provides extensive support for citations, scientific and technical writing features, outline navigation, and more.</p><p>In this blog post, we’d like to show how to use the visual markdown editor and highlight some features that you should know.</p><ul><li><a href="#get-started-with-the-visual-markdown-editor">How to get started with the visual markdown editor</a></li><li><a href="#create-content-in-the-visual-markdown-editor">How to create content in the visual markdown editor</a></li><li><a href="#edit-content-in-visual-editing-mode">How to edit content in visual editing mode</a></li></ul><h2 id="get-started-with-the-visual-markdown-editor">Get Started With the Visual Markdown Editor</h2><p>To switch into the visual mode for a markdown document:</p><ul><li>In RStudio v2022.02, click the &ldquo;Visual&rdquo; button located on the left side of the editor toolbar.</li><li>In earlier versions, click the compass icon located on the right side of the editor toolbar.</li></ul><p>Alternatively, use the <kbd>⇧</kbd> + <kbd>⌘</kbd> + <kbd>F4</kbd> keyboard shortcut.</p><script src="https://fast.wistia.com/embed/medias/am9337b81b.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:57.29% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_am9337b81b videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><i><caption>Switching into the visual markdown editor mode</caption></i></center><p>You can customize various editor settings. 
Go to Tools -&gt; Global Options -&gt; R Markdown -&gt; Visual to choose your options, such as:</p><ul><li>Whether your new documents use the visual markdown editor by default.</li><li>Whether the document outline shows by default.</li><li>How to wrap text in the document.</li></ul><center><img src="images/image1.png" alt="RStudio options in the visual editing mode dialogue box" width="60%" ></center><center><i><caption>Editor options</caption></i></center><h2 id="create-content-in-the-visual-markdown-editor">Create Content in the Visual Markdown Editor</h2><h3 id="how-to-embed-rich-media">How to embed rich media</h3><p>Once in visual editing mode, you can paste rich content (tables, images, some formatting, etc.) into a document, and the matching markdown is written to the source document for you.</p><script src="https://fast.wistia.com/embed/medias/opxbr5wnre.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:33.54% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_opxbr5wnre videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><i><caption>Dragging and dropping an image into a document in visual editing mode</caption></i></center><h3 id="how-to-use-keyboard-and-markdown-shortcuts">How to use keyboard and markdown shortcuts</h3><p>The visual mode supports both standard keyboard shortcuts (e.g., <kbd>⌘</kbd> + <kbd>B</kbd> for bold) and markdown syntax.</p><script src="https://fast.wistia.com/embed/medias/yvzc3efjr2.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:50.42% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed 
wistia_async_yvzc3efjr2 videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><i><caption>Italicizing and bolding text with markdown and keyboard shortcuts</caption></i></center><p>Here are some commonly used shortcuts:</p><table><thead><tr><th>Command</th><th>Keyboard Shortcut</th><th>Markdown Shortcut</th></tr></thead><tbody><tr><td>Bold</td><td>⌘ B</td><td><strong>bold</strong></td></tr><tr><td>Italic</td><td>⌘ I</td><td><em>italic</em></td></tr><tr><td>Code</td><td>⌘ D</td><td><code>code</code></td></tr><tr><td>Link</td><td>⌘ K</td><td><href></td></tr><tr><td>Heading 1</td><td>⌥⌘ 1</td><td>#</td></tr><tr><td>Heading 2</td><td>⌥⌘ 2</td><td>##</td></tr><tr><td>Heading 3</td><td>⌥⌘ 3</td><td>###</td></tr><tr><td>R Code Chunk</td><td>⌥⌘ I</td><td>```{r}</td></tr></tbody></table><p>Check out more on the <a href="https://rstudio.github.io/visual-markdown-editing/shortcuts.html" target = "_blank">Editing Shortcuts page</a> of the Visual R Markdown site.</p><h3 id="how-to-use-the-catch-all-shortcut-to-insert-anything">How to use the catch-all shortcut to insert anything</h3><p>You can also use the catch-all <kbd>⌘</kbd> + <kbd>/</kbd> shortcut to insert headings, comments, symbols, and more.</p><script src="https://fast.wistia.com/embed/medias/0ytilauj0j.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:57.08% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_0ytilauj0j videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><h3 id="how-to-use-the-editor-toolbar">How to use the editor toolbar</h3><p>The editor toolbar includes buttons for common commands to edit formatting or insert content.</p><script src="https://fast.wistia.com/embed/medias/pw2tuib05h.jsonp" 
async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:50.63% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_pw2tuib05h videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><i><caption>Inserting a table using the editor toolbar</caption></i></center><h2 id="edit-content-in-visual-editing-mode">Edit content in visual editing mode</h2><p>The visual markdown editor also provides efficient tools to edit and customize content.</p><h3 id="how-to-edit-a-tables-rows-columns-and-alignment">How to edit a table&rsquo;s rows, columns, and alignment</h3><p>Right-click on a table in your document. You can set the alignment, add or delete rows and columns, add a header, and more.</p><script src="https://fast.wistia.com/embed/medias/p1zdt2pggj.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:50.63% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_p1zdt2pggj videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><i><caption>Adding a row and aligning a column</caption></i></center><h3 id="how-to-change-an-images-height-width-and-aspect-ratio">How to change an image&rsquo;s height, width, and aspect ratio</h3><p>Once you have an image in your document, you can edit the size and aspect ratio with the visual markdown editor. Click on the image and a small window will appear. 
Edit the numbers to change the height, width, and aspect ratio.</p><script src="https://fast.wistia.com/embed/medias/yagn8fs2vi.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:61.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_yagn8fs2vi videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><h2 id="learn-more">Learn More</h2><p>The visual markdown editor has many other features for efficient report writing.</p><ul><li>Check out more information on the <a href="https://rstudio.github.io/visual-markdown-editing/" target = "_blank">Visual R Markdown website</a>.</li><li>Download the <a href="https://rstudio.com/products/rstudio/download/" target = "_blank">latest RStudio IDE version</a> to try out the editor.</li><li>Watch Tom Mock live code in visual editing mode in his webinar, <a href="https://youtu.be/WkF7nqEYF1E" target = "_blank">R Markdown Advanced Tips to Become a Better Data Scientist &amp; RStudio Connect</a>.</li></ul></description></item><item><title>RStudio Community Monthly Events Roundup - May 2022</title><link>https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-may-2022/</link><pubDate>Wed, 04 May 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-may-2022/</guid><description><sup>Photo by <a href="https://unsplash.com/@nickmorrison?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Nick Morrison</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p>Welcome to RStudio Community Monthly Events Roundup, where we update you on upcoming virtual events happening at RStudio this month. Missed the great talks and presentations from last month? 
Find them listed under <a href="#icymi-april-2022-events">ICYMI: April 2022 Events</a>.</p><p>You can subscribe to the Community Events Calendar so that new events will automatically appear on your calendar. Please note that by subscribing, all of the events in the calendar will appear on your own calendar. If you wish to add individual events instead, please use the links below.</p><p>We can’t wait to see you there!</p><h2 id="save-the-date">Save the Date</h2><ul><li>May 5, 2022 at 12 ET: Data Science Hangout with Rachel Poulsen, Cloud Systems Data Science Lead at Google Cloud (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>May 12, 2022 at 12 ET: Data Science Hangout with Wayne Jones, Principal Data Scientist at Shell (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>May 17, 2022 at 12 ET: R for Clinical Study Reports &amp; Submission | Led by Yilong Zhang, PhD at Merck (<a href="https://rstd.io/pharma-meetup" target = "_blank">add to calendar</a>)</li><li>May 19, 2022 at 12 ET: Data Science Hangout with Lindsey Clark, Director of Data Science at Healthcare Bluebook (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>May 25, 2022 at 12 ET: Optimizing Shiny for enterprise-grade apps | Led by Veerle Van Leemput at Analytic Health (<a href="https://evt.to/aeioeimaw" target = "_blank">add to calendar</a>)</li><li>May 26, 2022 at 12 ET: Data Science Hangout with Alice Walsh, VP, Translational Research at Pathos (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>June 1, 2022 at 12 ET: Using Python with RStudio Team | Led by David Aja at RStudio (<a href="https://www.addevent.com/event/gN13719438" target = "_blank">add to calendar</a>)</li><li>June 2, 2022 at 12 ET: Data Science Hangout with Travis Gerke, Director of Data Science at PCCTC (<a 
href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li><li>June 7, 2022 at 12 ET: Making microservices a part of your data science team | Led by Tom Schenk &amp; Bejan Sadeghian at KPMG (<a href="https://evt.to/aeshgidow" target = "_blank">add to calendar</a>)</li><li>June 9, 2022 at 12 ET: Data Science Hangout with…you?! (<a href="https://www.linkedin.com/in/rachaeldempsey/" target = "_blank">reach out to Rachael Dempsey</a>)</li><li>June 14, 2022 at 12 ET: RStudio Healthcare Meetup: Translating facts into insights at Children&rsquo;s Hospital of Philadelphia | Led by Jake Riley (<a href="https://www.addevent.com/event/Du13258557" target = "_blank">add to calendar</a>)</li><li>June 16, 2022 at 12 ET: Data Science Hangout with David Meza, AIML R&amp;D Lead, People Analytics at NASA (<a href="https://www.addevent.com/event/Qv9211919" target = "_blank">add to calendar</a>)</li></ul><h2 id="recurring-events">Recurring Events</h2><h3 id="data-science-hangout">Data Science Hangout</h3><p>The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders to connect and chat about some of the more human-centric questions around data science leadership. These happen every Thursday at 12 ET with a different leader featured each week.</p><p>The conversation is all audience-based and you can jump on whenever it fits your schedule. 
Add the weekly hangouts to your calendar on <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">AddEvent</a> and check out the <a href="https://www.rstudio.com/data-science-hangout/" target = "_blank">website</a> with all the recordings.</p><p>A few other things:</p><ul><li>All are welcome - no matter your industry/experience</li><li>No need to register for anything</li><li>It&rsquo;s always okay to join for part of a session</li><li>You can just listen-in if you want</li><li>You can ask anonymous questions too!</li></ul><h2 id="rstudio-enterprise-community-meetups">RStudio Enterprise Community Meetups</h2><p>We also host industry meetups for teams to share the work they are doing within their organizations, teach lessons learned, and network with others. Join the group on <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target = "_blank">Meetup</a>.</p><h2 id="icymi-april-2022-events">ICYMI: April 2022 Events</h2><ul><li>April 7, 2022 at 12 ET: <a href="https://youtu.be/s9m2mxNHZaY" target = "_blank">Data Science Hangout with Jenny Listman</a>, Director of Research at Statespace</li><li>April 12, 2022 at 12 ET: RStudio Finance Meetup: <a href="https://youtu.be/ssmwUBSpF-8" target = "_blank">Robust, modular dashboards that minimize tech debt</a> | Led by Alan Carlson at Snap Finance</li><li>April 14, 2022 at 12 ET: <a href="https://youtu.be/8m5J5UXhyhI" target = "_blank">Data Science Hangout with Joseph Korszun</a>, Manager of Data Science at ProCogia</li><li>April 21, 2022 at 12 ET: <a href="https://youtu.be/OtxU4rc9lUY" target = "_blank">Data Science Hangout with Tegan Ashby</a>, Senior Developer, Basketball Systems at the Brooklyn Nets and Co-Founder of Women in Sports Data</li><li>April 26, 2022 at 12 ET: <a href="https://youtu.be/5NQnwHXKVj8" target = "_blank">Championing Analytic Infrastructure</a> | Led by Kelly O’Briant at RStudio</li><li>April 28, 2022 at 12 ET: <a href="https://youtu.be/XcjDYIVn9j0" target = 
"_blank">Data Science Hangout with Daren Eiri</a>, Director of Data Science at Arrowhead General Insurance Agency</li><li>May 3, 2022 at 4 ET: <a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oRKK9ByULWulAOO5jN70eXv" target = "_blank">Shiny modularization, Leaflet for R and Leaflet JS extensions with Epi-interactive</a> | Led by Dr Uli Muellner and Nick Snellgrove at Epi-interactive (coming soon)</li></ul><h2 id="call-for-speakers">Call for Speakers</h2><p>If you’re interested in sharing your work at a Meetup (or just starting to consider it for a future date down the road!), <a href="https://forms.gle/EtXMpSoTfhpGopiS8" target = "_blank">please fill out the speaker submission form</a>. We’re always looking for a diverse set of speakers — if you are a member of a group that is underrepresented in data science, including people of color, those with disabilities, elders/older adults, LGBTQ folks, and women/minority genders, we highly encourage you to submit!</p></description></item><item><title>Announcing Sitewide Search on rstudio.com</title><link>https://www.rstudio.com/blog/announcing-sitewide-search-on-rstudio-com/</link><pubDate>Mon, 02 May 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-sitewide-search-on-rstudio-com/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@markuswinkler?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Markus Winkler</a> on <a href="https://unsplash.com/s/photos/magnifying-glass?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup></p><div class="lt-gray-box">Sarah Lin is RStudio's Enterprise Information Manager.</div><p>As many RStudio customers &amp; open-source users know, we have a lot of websites. Historically, as the company grew, each department created its own website(s) to convey information about a specific topic. As a result, it can be hard to know which site contains the information you’re looking for. 
Until now, the search box on rstudio.com searched only 21% of the RStudio websites.</p><p>The Enterprise Information Management team is pleased to launch Sitewide Search on rstudio.com. Using the search icon on rstudio.com, Sitewide Search will now return content from all of the following sites &amp; subdomains:</p><ul><li>rstudio.com</li><li>rstudio.cloud</li><li>shinyapps.io</li><li>rstudio.com/blog</li><li>blogs.rstudio.com/ai</li><li>db.rstudio.com</li><li>docs.rstudio.com</li><li>education.rstudio.com</li><li>environments.rstudio.com</li><li>keras.rstudio.com</li><li>packagemanager.rstudio.com</li><li>pkgs.rstudio.com</li><li>rmarkdown.rstudio.com</li><li>rviews.rstudio.com</li><li>shiny.rstudio.com</li><li>solutions.rstudio.com</li><li>spark.rstudio.com</li><li>support.rstudio.com</li><li>team-admin.rstudio.com</li><li>tensorflow.rstudio.com</li></ul><p>Based on an analysis of search term data from the previous search engine as well as Google Search Console, Sitewide Search results will be broken down into 4 content areas, or &ldquo;buckets&rdquo;:</p><ul><li>Product<ul><li>Product includes relevant content from professional product pages on rstudio.com as well as RStudio Cloud and Shinyapps.io.</li></ul></li><li>Blogs<ul><li>The Blog results will include content from several of the blogs produced by RStudio: the main RStudio.com Blog, R Views, the RStudio AI Blog, and the historical posts from the Education Blog.</li></ul></li><li>Customer Information &amp; Open Source<ul><li>Customer Information and Open Source include results relevant to each type of RStudio user. There is a small amount of content that overlaps between both types of content.</li></ul></li></ul><p>Additionally, users can filter by website name, tags, and categories along the left side of the search results page to refine their search. 
For Blogs, Sitewide Search has a toggle button on the left menu to further restrict the search results to the most current 6 months.</p><p>Users are likely to experience a more comprehensive and robust set of search results than with previous searches, due to the sheer increase in the number of sites included. Sitewide Search results are also relevancy ranked based on search term matching and where the term(s) appear on the website.</p><p>Two sites, status.rstudio.com and dailies.rstudio.com, are not included in Sitewide Search due to the timeline nature of updates to those sites. Instead, we have included links to both sites on the bottom of the search results pages.</p><p>For this phase, Sitewide Search is only present in the search box in the header on rstudio.com. We will be rolling it out to other sites in the future, though the sites and timeline have not been decided.</p></description></item><item><title>rstudio::conf(2022) Workshops</title><link>https://www.rstudio.com/blog/rstudio-conf-2022-workshops/</link><pubDate>Thu, 28 Apr 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2022-workshops/</guid><description><p>We are so excited for rstudio::conf! To start off the conference in July, we have an amazing line-up of workshops. There&rsquo;s a session for you wherever you are on your data science journey. Get inspired and learn something new.</p><p>You can find all workshop information on the <a href="https://www.rstudio.com/conference/2022/2022-conf-workshops-pricing/#workshops" target = "_blank">rstudio::conf(2022) website</a>. 
Please note that while there will be virtual options for the conference, the workshops are only offered in person.</p><h3 id="introduction-to-the-tidyverse-">Introduction to the tidyverse 🌌</h3><p><em>Presented by the RStudio Academy Team</em></p><p>A unique 6-week data science apprenticeship where you’ll develop the skills necessary to do data science with the R language.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/intro-to-tidyverse/" target = "_blank">Introduction to the tidyverse</a>.</p><h3 id="getting-started-with-shiny-">Getting Started with Shiny ✨</h3><p><em>Presented by Colin Rundel</em></p><p>Shiny is an R package that makes it easy to build interactive web apps straight from R. This workshop will start at the beginning.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/get-started-shiny/" target = "_blank">Getting Started with Shiny</a>.</p><h3 id="getting-started-with-quarto-">Getting Started with Quarto 🔵</h3><p><em>Presented by Tom Mock</em></p><p>This workshop is designed for those who have no or little prior experience with R Markdown and who want to learn Quarto.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/get-started-quarto/" target = "_blank">Getting Started with Quarto</a>.</p><h3 id="from-r-markdown-to-quarto-">From R Markdown to Quarto 📝</h3><p><em>Presented by Andrew Bray</em></p><p>This workshop is designed for those who want to take their R Markdown skills and expertise and apply them in Quarto, the next generation of R Markdown.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/rmd-to-quarto/" target = "_blank">From R Markdown to Quarto</a>.</p><h3 id="making-art-from-code-">Making Art from Code 🎨</h3><p><em>Presented by Danielle Navarro</em></p><p>This workshop provides a hands-on introduction to generative art in R. 
You’ll learn artistic techniques that generative artists use regularly in their work including flow fields, iterative function systems, tilings, and more.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/art-from-code/" target = "_blank">Making Art from Code</a>.</p><h3 id="designing-the-data-science-classroom-">Designing the Data Science Classroom 🧑🏫</h3><p><em>Presented by Mine Çetinkaya-Rundel and Maria Tackett</em></p><p>The goal of this workshop is to equip educators with concrete information on content, workflows, and infrastructure for painlessly introducing modern computation with R and RStudio within a data science curriculum.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/teach-ds/" target = "_blank">Designing the Data Science Classroom</a>.</p><h3 id="building-tidy-tools-">Building Tidy Tools 🧰</h3><p><em>Presented by Emma Rand and Ian Lyttle</em></p><p>This is a two-day, hands-on workshop for those who have embraced the tidyverse and want to build their own packages.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/build-tidy-tools/" target = "_blank">Building Tidy Tools</a>.</p><h3 id="r-for-people-analytics-">R for People Analytics 🏢</h3><p><em>Presented by Keith McNulty, Alex LoPilato, and Liz Romero</em></p><p>The course will cover some of the most commonly used methods of analysis and inference when working with data related to people, such as survey data and organizational network data.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/people-analytics-rstats/" target = "_blank">R for People Analytics</a>.</p><h3 id="how-data-science-with-r-works-for-systems-administrators-">How Data Science with R Works for Systems Administrators 💻</h3><p><em>Presented by Alex Gold</em></p><p>In this workshop, you&rsquo;ll learn to use the capabilities of RStudio Team to enable your organization&rsquo;s R and Python users, including 
topics like package and environment management, performance and scaling, external data connections, and integrating RStudio Team with CI/CD pipelines.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/ds-for-sysadmins/" target = "_blank">How Data Science with R Works for Systems Administrators</a>.</p><h3 id="graphic-design-with-ggplot2-how-to-create-engaging-and-complex-visualizations-in-r-">Graphic Design with ggplot2: How to Create Engaging and Complex Visualizations in R 📊</h3><p><em>Presented by Cédric Scherer</em></p><p>The workshop covers the most important steps and helpful tips to create visually appealing, engaging and complex graphics with ggplot2.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/ggplot2-graphic-design/" target = "_blank">Graphic Design with ggplot2</a>.</p><h3 id="what-they-forgot-to-teach-you-about-r-">What They Forgot to Teach You About R 🔥</h3><p><em>Presented by Shannon McClintock Pileggi, Jenny Bryan, and David Aja</em></p><p>This is a two-day hands on workshop designed for experienced R and RStudio users who want to (re)design their R lifestyle.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/wtf-rstats/" target = "_blank">What They Forgot to Teach You About R</a>.</p><h3 id="building-production-quality-shiny-applications-almost-full-">Building Production-Quality Shiny Applications (almost full!) 🛠️</h3><p><em>Presented by Eric Nantz</em></p><p>This workshop is for the Shiny developer who has entered this stage of their application development journey.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/shiny-prod-apps/" target = "_blank">Building Production-Quality Shiny Applications</a>.</p><h3 id="machine-learning-with-tidymodels-almost-full-">Machine Learning with tidymodels (almost full!) 
🧁</h3><p><em>Presented by Julia Silge, Max Kuhn, and David Robinson</em></p><p>This workshop provides an introduction to machine learning with R.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/tidymodels-ml/" target = "_blank">Machine Learning with tidymodels</a>.</p><h3 id="causal-inference-in-r-almost-full-">Causal Inference in R (almost full!) ➡️</h3><p><em>Presented by Lucy D&rsquo;Agostino McGowan and Malcolm Barrett</em></p><p>In this workshop, we’ll teach the essential elements of answering causal questions in R through causal diagrams, and causal modeling techniques such as propensity scores and inverse probability weighting.</p><p>Learn more about <a href="https://www.rstudio.com/conference/2022/workshops/causal-inference-rstats/" target = "_blank">Causal Inference in R</a>.</p><p>Are you as excited as we are?</p><center><a class="btn btn-primary" href="https://www.rstudio.com/conference/2022/2022-conf-workshops-pricing/#workshops" target="_blank">Register for a workshop</a></center></description></item><item><title>Speed Up Data Analytics and Wrangling With Parquet Files</title><link>https://www.rstudio.com/blog/speed-up-data-analytics-with-parquet-files/</link><pubDate>Tue, 26 Apr 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/speed-up-data-analytics-with-parquet-files/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@jakegivens?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Jake Givens</a> on <a href="https://unsplash.com/s/photos/fast?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup></p><div class="lt-gray-box">This is a guest post from Ryan Garnett, Ray Wong, and Dan Reed from Green Shield Canada. 
<a href="https://www.greenshield.ca/en-ca/" target = "_blank">Green Shield Canada</a>, a social enterprise and one of the country’s largest health benefits carriers, currently serves over 4.5 million Canadians across health and dental benefits and pharmacy benefits management. GSC also provides clients with an integrated experience that includes health care delivery via an ever-expanding digital health ecosystem and full benefits administration support.</div><link href="style.css" rel="stylesheet"></link><h2 id="the-challenge">The Challenge</h2><p>Data is increasing in value for many organizations &mdash; an asset leveraged to help make informed business decisions. Unfortunately, this sentiment has not been the norm throughout the life of most organizations, with vast amounts of data locked in legacy data management systems and designs. The majority of organizations use relational database management systems (RDBMS) like Oracle, Postgres, Microsoft SQL, or MySQL to store and manage their enterprise data. Typically these systems were designed to collect and process data quickly within a transactional data model. While these models are excellent for applications, they can pose challenges for performing business intelligence, data analytics, or predictive analysis. Many organizations are realizing their legacy systems are not sufficient for data analytics initiatives, providing an opportunity for analytics teams to present tangible options to improve their organization’s data analytics infrastructure.</p><p>Regardless of whether you are engineering data for others to consume for analysis or performing the analytics yourself, reducing the time to perform data processing is critically important. 
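</p><p>To make the comparison concrete before diving in: from R, the two formats differ only in the functions used to write and read them. A minimal sketch, assuming the readr and arrow packages are installed, with a hypothetical three-row data frame standing in for real claims data:</p>

```r
library(readr)  # row-based CSV I/O
library(arrow)  # columnar parquet I/O

# Hypothetical stand-in for a claims extract
df <- data.frame(claim_id = 1:3, amount = c(100.50, 20.00, 7.25))

# Row-based storage: CSV
write_csv(df, "claims.csv")
csv_df <- read_csv("claims.csv", show_col_types = FALSE)

# Columnar storage: parquet
write_parquet(df, "claims.parquet")
pq_df <- read_parquet("claims.parquet")
```

<p>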
Within this post, we are going to evaluate the performance of two distinct data storage formats: row-based (CSV) and columnar (parquet). CSV is a tried and tested standard data format used within the data analytics field, while parquet is becoming a viable alternative in many data platforms.</p><h2 id="setup">Setup</h2><p>We performed the analysis for the post on Green Shield Canada’s analytics workstation. Our workstation is a shared resource for our analytics team that is running RStudio Workbench with the following configurations:</p><table><tbody><tr class="odd"><td><em>Operating system</em></td><td>Ubuntu 20</td></tr><tr class="even"><td><em>Cores</em></td><td>16</td></tr><tr class="odd"><td><em>CPU speed</em></td><td>2.30GHz</td></tr><tr class="even"><td><em>RAM</em></td><td>1TB</td></tr></tbody></table><h3 id="load-packages">Load Packages</h3><p>We use the following packages throughout the post:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"><span style="color:#60a0b0;font-style:italic"># Importing data</span>
<span style="color:#06287e">library</span>(arrow)
<span style="color:#06287e">library</span>(readr)

<span style="color:#60a0b0;font-style:italic"># Data analysis and wrangling</span>
<span style="color:#06287e">library</span>(dplyr)

<span style="color:#60a0b0;font-style:italic"># Visualization and styling</span>
<span style="color:#06287e">library</span>(ggplot2)
<span style="color:#06287e">library</span>(gt)</code></pre></div><h3 id="data-sources">Data Sources</h3><p>We store Green Shield Canada’s source data in a transactional data model within an Oracle database. The purpose of the transaction model within Oracle is to quickly adjudicate medical claims within Green Shield’s Advantage application, and it has been performing exceptionally well. 
While a transactional data model provides great performance for transactional applications, the data model design is less than optimal for data analytics uses. Green Shield Canada, like many organizations, is undergoing a significant digital transformation with a high emphasis on data analytics. During the digital transformation, an analytical data model will be developed, built from many of the source tables currently stored in Oracle database tables, with the need to perform numerous data wrangling tasks.</p><p>Within Green Shield Canada, data is sized based on the following four groups:</p><ul><li>x-small dataset &lt; 1M rows (day)</li><li>small dataset 1-10M rows (month)</li><li>medium dataset 10-100M rows (year)</li><li>large dataset &gt; 100M-1B rows (decade)</li></ul><p>The main dataset used within the analysis is Green Shield Canada’s claim history data. This dataset includes various data elements related to the transactional claims submitted by Green Shield’s clients. This dataset is critically important to the organization, providing valuable insights into how the company operates and the service provided to our customers. 
The following is a table with the characteristics related to the claim history dataset.</p><div id="ktsbzxnuih" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table class="gt_table"><thead class="gt_header"><tr><th colspan="6" class="gt_heading gt_title gt_font_normal" style="background-color: #005028; color: #FFFFFF; font-size: x-large; text-align: left; vertical-align: top; font-weight: bold; border-top-width: 3px; border-top-style: solid; border-top-color: #B6B6B6;">Dataset Characteristics</th></tr><tr><th colspan="6" class="gt_heading gt_subtitle gt_font_normal gt_bottom_border" style="background-color: #005028; color: #FFFFFF; font-size: small; text-align: left; vertical-align: top; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Claim History Data</th></tr></thead><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Dataset Size Group</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Dataset Name</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Number of Rows</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; 
border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Number of Columns</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">CSV File Size</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Parquet File Size</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">x-small</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">claim_history_day</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">317,617</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">201</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">281.8 MB</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: 
#B6B6B6;">38.1 MB</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">small</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">claim_history_month</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">5,548,609</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">202</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">4.8 GB</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">711.9 MB</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">medium</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">claim_history_year</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">66,001,292</td><td class="gt_row gt_right" style="color: #414042; 
text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">201</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">57.3 GB</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">7.5 GB</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">large</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">claim_history</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">408,197,137</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">201</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">351.5 GB</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">45.1 GB</td></tr></tbody></table></div><p>The second dataset used within the analysis is Green Shield Canada’s provider data. This dataset includes various data elements related to the provider network that provides medical services for Green Shield Canada’s customers. The following is a table with the characteristics associated with the provider dataset.</p><div id="pbnzuofvii" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table class="gt_table"><thead class="gt_header"><tr><th colspan="5" class="gt_heading gt_title gt_font_normal" style="background-color: #005028; color: #FFFFFF; font-size: x-large; text-align: left; vertical-align: top; font-weight: bold; border-top-width: 3px; border-top-style: solid; border-top-color: #B6B6B6;">Dataset Characteristics</th></tr><tr><th colspan="5" class="gt_heading gt_subtitle gt_font_normal gt_bottom_border" style="background-color: #005028; color: #FFFFFF; font-size: small; text-align: left; vertical-align: top; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Provider Data</th></tr></thead><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Dataset Name</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Number of Rows</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Number of Columns</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">CSV File Size</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Parquet File Size</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">provider</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">1,077,046</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">18</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">146.1 MB</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">31 MB</td></tr></tbody></table></div><h2 id="the-solution">The Solution</h2><p>Green Shield Canada has decided to convert data sources used for analytics from row-based sources to a columnar format, specifically <a href="https://parquet.apache.org/" target = "_blank">Apache Parquet</a>.</p><blockquote><p><em>Apache Parquet is an open source, column-oriented data file 
format designed for efficient data storage and retrieval. It provides efficient data compression and encoding schemes with enhanced performance to handle complex data in bulk.</em> <br> <em>&mdash; Apache Foundation</em></p></blockquote><p>We leverage the <a href="https://arrow.apache.org/docs/r/" target = "_blank">arrow R package</a> to convert our row-based datasets into parquet files. Parquet partitions data into smaller chunks and enables improved performance when filtering against columns that have partitions.</p><p>Parquet file formats have three main benefits for analytical usage:</p><ul><li><strong>Compression</strong>: low storage consumption</li><li><strong>Speed</strong>: efficiently reads data in less time</li><li><strong>Interoperability</strong>: can be read by many different languages</li></ul><p>Converting our datasets from row-based (CSV) to columnar (parquet) has significantly reduced the file size. The CSV files range from <strong>4.7</strong> to <strong>7.8</strong> times larger than the parquet files.</p><p>We will explore computationally expensive tasks in both data engineering and data analysis processes. We will perform four specific tasks on all four of the data size groups (x-small, small, medium, and large) produced from our claim history dataset:</p><ol><li>join provider information to claim history</li><li>processed claims volume by benefit type per time interval (i.e., day, month, and/or year)</li><li>processed claims statistics by benefit type per time interval (i.e., day, month, and/or year)</li><li>provider information with processed claims statistics by benefit type per time interval (i.e., day, month, and/or year)</li></ol><h3 id="x-small-data">X-Small Data</h3><p>The x-small data consists of data collected on a single day in January 2021. 
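</p><p>The conversion step itself is not shown in the tasks below. As a minimal sketch, a one-time conversion of a CSV extract into a partitioned parquet dataset with the arrow package could look roughly like this; the paths mirror those used in the tasks, but the choice of partitioning column is our assumption for illustration:</p><pre><code># Illustrative sketch - convert a CSV extract to a partitioned parquet dataset
# (the partitioning column is an assumption, not necessarily what we used)
library(arrow)
library(dplyr)

open_dataset(sources = &quot;/home/data/CLAIM_HISTORY_DAY.csv&quot;, format = &quot;csv&quot;) %&gt;%
  write_dataset(path = &quot;/home/data/CLAIM_HISTORY_DAY&quot;,
                format = &quot;parquet&quot;,
                partitioning = &quot;BNFT_TYPE_CD&quot;)
</code></pre><p>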
The dataset has <strong>317,617</strong> rows of data.</p><h4 id="csv-data">CSV Data</h4><p>The CSV file used in this section was <strong>281.8 MB</strong> in size.</p><html><div id="tab" class="tab"><button class="tablinks" onclick="clickHandle(event, 'Task 1_1', 'Task 1')">Task 1</button><button class="tablinks" onclick="clickHandle(event, 'Task 1_2', 'Task 1')">Task 2</button><button class="tablinks" onclick="clickHandle(event, 'Task 1_3', 'Task 1')">Task 3</button><button class="tablinks" onclick="clickHandle(event, 'Task 1_4', 'Task 1')">Task 4</button></div><div id="Task 1_1" class="tabcontent" style="display:block;"><pre><code># Task 1 - join
start &lt;- Sys.time()

claims_columns &lt;- cols_only(
  CLAIM_STATUS_TYPE_CD = col_character(),
  CH_SUBM_PROVIDER_ID = col_double(),
  BNFT_TYPE_CD = col_character(),
  CH_REND_AMT = col_double()
)

provider_columns &lt;- cols_only(
  provider_id = col_double(),
  provider_type = col_character(),
  benefit_description = col_character()
)

left_join(
  read_csv(&quot;/home/data/CLAIM_HISTORY_DAY.csv&quot;, col_types = claims_columns) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;),
  read_csv(&quot;/home/data/PROVIDER.csv&quot;, col_types = provider_columns),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
)

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>16.006 secs</strong> to execute.</p></div><div id="Task 1_2" class="tabcontent" style="display:none"><pre><code># Task 2 - group_by + count
start &lt;- Sys.time()

claims_columns &lt;- cols_only(
  CLAIM_STATUS_TYPE_CD = col_character(),
  CH_SUBM_PROVIDER_ID = col_double(),
  BNFT_TYPE_CD = col_character(),
  CH_REND_AMT = col_double()
)

read_csv(&quot;/home/data/CLAIM_HISTORY_DAY.csv&quot;, col_types = claims_columns) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  group_by(BNFT_TYPE_CD) %&gt;%
  count() %&gt;%
  ungroup() %&gt;%
  arrange(desc(n))

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>10.84989 secs</strong> to execute.</p></div><div id="Task 1_3" class="tabcontent"><pre><code># Task 3 - group_by + summarize
start &lt;- Sys.time()

claims_columns &lt;- cols_only(
  CLAIM_STATUS_TYPE_CD = col_character(),
  CH_SUBM_PROVIDER_ID = col_double(),
  BNFT_TYPE_CD = col_character(),
  CH_REND_AMT = col_double()
)

provider_columns &lt;- cols_only(
  provider_id = col_double(),
  provider_type = col_character(),
  benefit_description = col_character()
)

read_csv(&quot;/home/data/CLAIM_HISTORY_DAY.csv&quot;, col_types = claims_columns) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  group_by(BNFT_TYPE_CD) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup()

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>11.8559 secs</strong> to execute.</p></div><div id="Task 1_4" class="tabcontent"><pre><code># Task 4 - join + group_by + summarize
start &lt;- Sys.time()

claims_columns &lt;- cols_only(
  CLAIM_STATUS_TYPE_CD = col_character(),
  CH_SUBM_PROVIDER_ID = col_double(),
  BNFT_TYPE_CD = col_character(),
  CH_REND_AMT = col_double()
)

provider_columns &lt;- cols_only(
  provider_id = col_double(),
  provider_type = col_character(),
  benefit_description = col_character()
)

left_join(
  read_csv(&quot;/home/data/CLAIM_HISTORY_DAY.csv&quot;, col_types = claims_columns) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;),
  read_csv(&quot;/home/data/PROVIDER.csv&quot;, col_types = provider_columns),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
) %&gt;%
  group_by(benefit_description, BNFT_TYPE_CD) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup()

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>16.02928 secs</strong> to execute.</p></div><script>
function clickHandle(evt, tableNum, groupID) {
  let i, tabcontent, tablinks;
  // This is to clear the previous clicked content.
  tabcontent = document.getElementsByClassName("tabcontent");
  for (i = 0; i < tabcontent.length; i++) {
    if (tabcontent[i].id.startsWith(groupID))
      tabcontent[i].style.display = "none";
  }
  // Set the tab to be "active".
  tablinks = document.getElementsByClassName("tablinks");
  for (i = 0; i < tablinks.length; i++) {
    if (tabcontent[i].id.startsWith(groupID))
      tablinks[i].className = tablinks[i].className.replace(" active", "");
  }
  // Display the clicked tab and set it to active.
  document.getElementById(tableNum).style.display = "block";
  evt.currentTarget.className += " active";
}
</script></html><h4 id="parquet-data">Parquet Data</h4><p>The parquet file used in this section was <strong>38.1 MB</strong> in size.</p><html><div id="tab2" class="tab"><button class="tablinks" onclick="clickHandle(event, 'Task 2_1', 'Task 2')">Task 1</button><button class="tablinks" onclick="clickHandle(event, 'Task 2_2', 'Task 2')">Task 2</button><button class="tablinks" onclick="clickHandle(event, 'Task 2_3', 'Task 2')">Task 3</button><button class="tablinks" onclick="clickHandle(event, 'Task 2_4', 'Task 2')">Task 4</button></div><div id="Task 2_1" class="tabcontent" style="display:block"><pre><code># Task 1 - join
start &lt;- Sys.time()

left_join(
  open_dataset(sources = &quot;/home/data/CLAIM_HISTORY_DAY&quot;) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
    select(CH_SUBM_PROVIDER_ID, BNFT_TYPE_CD, CH_REND_AMT),
  open_dataset(sources = &quot;/home/data/Provider&quot;) %&gt;%
    select(provider_id, provider_type, benefit_description),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
) %&gt;%
  collect()

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>1.776429 secs</strong> to execute.</p></div><div id="Task 2_2" class="tabcontent"><pre><code># Task 2 - group_by + count
start &lt;- Sys.time()

open_dataset(sources = &quot;/home/data/CLAIM_HISTORY_DAY&quot;) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  select(BNFT_TYPE_CD) %&gt;%
  group_by(BNFT_TYPE_CD) %&gt;%
  count() %&gt;%
  ungroup() %&gt;%
  arrange(desc(n)) %&gt;%
  collect()

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>0.7456837 secs</strong> to execute.</p></div><div id="Task 2_3" class="tabcontent"><pre><code># Task 3 - group_by + summarize
start &lt;- Sys.time()

open_dataset(sources = &quot;/home/data/CLAIM_HISTORY_DAY&quot;) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  select(BNFT_TYPE_CD, CH_REND_AMT) %&gt;%
  group_by(BNFT_TYPE_CD) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup() %&gt;%
  collect()

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>0.2979383 secs</strong> to execute.</p></div><div id="Task 2_4" class="tabcontent"><pre><code># Task 4 - join + group_by + summarize
start &lt;- Sys.time()

left_join(
  open_dataset(sources = &quot;/home/data/CLAIM_HISTORY_DAY&quot;) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
    select(CH_SUBM_PROVIDER_ID, BNFT_TYPE_CD, CH_REND_AMT),
  open_dataset(sources = &quot;/home/data/Provider&quot;) %&gt;%
    select(provider_id, provider_type, benefit_description),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
) %&gt;%
  group_by(benefit_description, BNFT_TYPE_CD) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup() %&gt;%
  collect()

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>1.359842 secs</strong> to execute.</p></div></html><h3 id="small-data">Small Data</h3><p>The small data consists of data collected in January 2021. 
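</p><p>A large part of parquet&rsquo;s speed advantage in these tasks comes from arrow reading only the columns referenced in select() and, where a dataset is partitioned, skipping files whose partition values are filtered out. As a minimal sketch (assuming a dataset partitioned by a PROCESS_DAY column, which is our assumption for illustration), a filter on the partition column prunes whole directories before any rows are read:</p><pre><code># Illustrative sketch - partition pruning on a filtered read
library(arrow)
library(dplyr)

open_dataset(sources = &quot;/home/data/CLAIM_HISTORY_MONTH&quot;) %&gt;%
  filter(PROCESS_DAY == 15) %&gt;%  # only files under PROCESS_DAY=15 are touched
  select(BNFT_TYPE_CD, CH_REND_AMT) %&gt;%
  collect()
</code></pre><p>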
The dataset has <strong>5,548,609</strong> rows of data.</p><h4 id="csv-data-1">CSV Data</h4><p>The CSV file used in this section was <strong>4.8 GB</strong> in size.</p><html><div id="tab3" class="tab"><button class="tablinks" onclick="clickHandle(event, 'Task 3_1', 'Task 3')">Task 1</button><button class="tablinks" onclick="clickHandle(event, 'Task 3_2', 'Task 3')">Task 2</button><button class="tablinks" onclick="clickHandle(event, 'Task 3_3', 'Task 3')">Task 3</button><button class="tablinks" onclick="clickHandle(event, 'Task 3_4', 'Task 3')">Task 4</button></div><div id="Task 3_1" class="tabcontent" style="display:block"><pre><code># Task 1 - join
start &lt;- Sys.time()

claims_columns &lt;- cols_only(
  CLAIM_STATUS_TYPE_CD = col_character(),
  CH_SUBM_PROVIDER_ID = col_double(),
  BNFT_TYPE_CD = col_character(),
  CH_REND_AMT = col_double()
)

provider_columns &lt;- cols_only(
  provider_id = col_double(),
  provider_type = col_character(),
  benefit_description = col_character()
)

left_join(
  read_csv(&quot;/home/data/CLAIM_HISTORY_MONTH.csv&quot;, col_types = claims_columns) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;),
  read_csv(&quot;/home/data/PROVIDER.csv&quot;, col_types = provider_columns),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
)

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>3.677011 mins</strong> to execute.</p></div><div id="Task 3_2" class="tabcontent" style="display:none"><pre><code># Task 2 - group_by + count
start &lt;- Sys.time()

claims_columns &lt;- cols_only(
  CLAIM_STATUS_TYPE_CD = col_character(),
  CH_SUBM_PROVIDER_ID = col_double(),
  BNFT_TYPE_CD = col_character(),
  CH_REND_AMT = col_double(),
  PROCESS_DAY = col_double()
)

read_csv(&quot;/home/data/CLAIM_HISTORY_MONTH.csv&quot;, col_types = claims_columns) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  group_by(BNFT_TYPE_CD, PROCESS_DAY) %&gt;%
  count() %&gt;%
  ungroup() %&gt;%
  arrange(desc(n))

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>3.161771 mins</strong> to execute.</p></div><div id="Task 3_3" class="tabcontent" style="display:none"><pre><code># Task 3 - group_by + summarize
start &lt;- Sys.time()

claims_columns &lt;- cols_only(
  CLAIM_STATUS_TYPE_CD = col_character(),
  CH_SUBM_PROVIDER_ID = col_double(),
  BNFT_TYPE_CD = col_character(),
  CH_REND_AMT = col_double(),
  PROCESS_DAY = col_double()
)

provider_columns &lt;- cols_only(
  provider_id = col_double(),
  provider_type = col_character(),
  benefit_description = col_character()
)

read_csv(&quot;/home/data/CLAIM_HISTORY_MONTH.csv&quot;, col_types = claims_columns) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  group_by(BNFT_TYPE_CD, PROCESS_DAY) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup()

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>3.095256 mins</strong> to execute.</p></div><div id="Task 3_4" class="tabcontent" style="display:none"><pre><code># Task 4 - join + group_by + summarize
start &lt;- Sys.time()

claims_columns &lt;- cols_only(
  CLAIM_STATUS_TYPE_CD = col_character(),
  CH_SUBM_PROVIDER_ID = col_double(),
  BNFT_TYPE_CD = col_character(),
  CH_REND_AMT = col_double(),
  PROCESS_DAY = col_double()
)

provider_columns &lt;- cols_only(
  provider_id = col_double(),
  provider_type = col_character(),
  benefit_description = col_character()
)

left_join(
  read_csv(&quot;/home/data/CLAIM_HISTORY_MONTH.csv&quot;, col_types = claims_columns) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;),
  read_csv(&quot;/home/data/PROVIDER.csv&quot;, col_types = provider_columns),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
) %&gt;%
  group_by(benefit_description, BNFT_TYPE_CD, PROCESS_DAY) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup()

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>3.44803 mins</strong> to execute.</p></div></html><h4 id="parquet-data-1">Parquet Data</h4><p>The parquet file used in this section was <strong>711.9 MB</strong> in size.</p><html><div id="tab4" class="tab"><button class="tablinks" onclick="clickHandle(event, 'Task 4_1', 'Task 4')">Task 1</button><button class="tablinks" onclick="clickHandle(event, 'Task 4_2', 'Task 4')">Task 2</button><button class="tablinks" onclick="clickHandle(event, 'Task 4_3', 'Task 4')">Task 3</button><button class="tablinks" onclick="clickHandle(event, 'Task 4_4', 'Task 4')">Task 4</button></div><div id="Task 4_1" class="tabcontent" style="display:block"><pre><code># Task 1 - join
start &lt;- Sys.time()

left_join(
  open_dataset(sources = &quot;/home/data/CLAIM_HISTORY_MONTH&quot;) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
    select(CH_SUBM_PROVIDER_ID, BNFT_TYPE_CD, CH_REND_AMT),
  open_dataset(sources = &quot;/home/data/Provider&quot;) %&gt;%
    select(provider_id, provider_type, benefit_description),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
) %&gt;%
  collect()

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>1.604066 secs</strong> to execute.</p></div><div id="Task 4_2" class="tabcontent" style="display:none"><pre><code># Task 2 - group_by + count
start &lt;- Sys.time()

open_dataset(sources = &quot;/home/data/CLAIM_HISTORY_MONTH&quot;) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  select(BNFT_TYPE_CD) %&gt;%
  group_by(BNFT_TYPE_CD) %&gt;%
  count() %&gt;%
  ungroup() %&gt;%
  arrange(desc(n)) %&gt;%
  collect()

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>0.3016093 secs</strong> to execute.</p></div><div id="Task 4_3" class="tabcontent" style="display:none"><pre><code># Task 3 - group_by + summarize
start &lt;- Sys.time()

open_dataset(sources = &quot;/home/data/CLAIM_HISTORY_MONTH&quot;) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  select(BNFT_TYPE_CD, CH_REND_AMT, PROCESS_DAY) %&gt;%
  group_by(BNFT_TYPE_CD, PROCESS_DAY) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup() %&gt;%
  collect()

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>0.5149045 secs</strong> to execute.</p></div><div id="Task 4_4" class="tabcontent" style="display:none"><pre><code># Task 4 - join + group_by + summarize
start &lt;- Sys.time()

left_join(
  open_dataset(sources = &quot;/home/data/CLAIM_HISTORY_MONTH&quot;) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
    select(CH_SUBM_PROVIDER_ID, BNFT_TYPE_CD, CH_REND_AMT, PROCESS_DAY),
  open_dataset(sources = &quot;/home/data/Provider&quot;) %&gt;%
    select(provider_id, provider_type, benefit_description),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
) %&gt;%
  group_by(benefit_description, BNFT_TYPE_CD, PROCESS_DAY) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup() %&gt;%
  collect()

end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>1.12566 secs</strong> to execute.</p></div></html><h3 id="medium-data">Medium Data</h3><p>The medium data consists of data collected over 2021. 
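</p><p>Throughout this post we time each task with a simple Sys.time() pair, which measures a single run. For more robust numbers, the bench package (our suggestion, not what is used in the tasks here) repeats an expression and reports minimum and median times along with memory allocations; set check = FALSE since a single expression&rsquo;s repeated results need no cross-checking:</p><pre><code># Illustrative sketch - timing a task with bench::mark instead of Sys.time()
library(arrow)
library(dplyr)
library(bench)

bench::mark(
  parquet = open_dataset(sources = &quot;/home/data/CLAIM_HISTORY_YEAR&quot;) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
    group_by(BNFT_TYPE_CD) %&gt;%
    count() %&gt;%
    collect(),
  check = FALSE,
  iterations = 5
)
</code></pre><p>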
The dataset has<strong>66,001,292</strong> rows of data.</p><h4 id="csv-data-2">CSV Data</h4><p>The CSV file used in this section was <strong>57.3 GB</strong> in size.</p><html><div id="tab5" class="tab"><button class="tablinks" onclick="clickHandle(event, 'Task 5_1', 'Task 5')">Task 1</button><button class="tablinks" onclick="clickHandle(event, 'Task 5_2', 'Task 5')">Task 2</button><button class="tablinks" onclick="clickHandle(event, 'Task 5_3', 'Task 5')">Task 3</button><button class="tablinks" onclick="clickHandle(event, 'Task 5_4', 'Task 5')">Task 4</button></div><div id="Task 5_1" class="tabcontent" style="display:block"><pre><code># Task 1 - joinstart &lt;- Sys.time()claims_columns &lt;-cols_only(CLAIM_STATUS_TYPE_CD = col_character(),CH_SUBM_PROVIDER_ID = col_double(),BNFT_TYPE_CD = col_character(),CH_REND_AMT = col_double())provider_columns &lt;-cols_only(provider_id = col_double(),provider_type = col_character(),benefit_description = col_character())left_join(read_csv(&quot;/home/data/CLAIM_HISTORY_YEAR.csv&quot;,col_types = claims_columns) %&gt;%filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;),read_csv(&quot;/home/data/PROVIDER.csv&quot;,col_types = provider_columns),by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;))end &lt;- Sys.time()end - start</code></pre><p>The task took <strong>40.19741 mins</strong> to execute.</p></div><div id="Task 5_2" class="tabcontent" style="display:none"><pre><code># Task 2 - group_by + countstart &lt;- Sys.time()claims_columns &lt;-cols_only(CLAIM_STATUS_TYPE_CD = col_character(),CH_SUBM_PROVIDER_ID = col_double(),BNFT_TYPE_CD = col_character(),CH_REND_AMT = col_double(),PROCESS_MONTH = col_double())read_csv(&quot;/home/data/CLAIM_HISTORY_YEAR.csv&quot;,col_types = claims_columns) %&gt;%filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%group_by(BNFT_TYPE_CD, PROCESS_MONTH) %&gt;%count() %&gt;%ungroup() %&gt;%arrange(desc(n))end &lt;- Sys.time()end - start</code></pre><p>The task took <strong>38.88081 
mins</strong> to execute.</p></div><div id="Task 5_3" class="tabcontent" style="display:none"><pre><code># Task 3 - group_by + summarizestart &lt;- Sys.time()claims_columns &lt;-cols_only(CLAIM_STATUS_TYPE_CD = col_character(),CH_SUBM_PROVIDER_ID = col_double(),BNFT_TYPE_CD = col_character(),CH_REND_AMT = col_double(),PROCESS_MONTH = col_double())provider_columns &lt;-cols_only(provider_id = col_double(),provider_type = col_character(),benefit_description = col_character())read_csv(&quot;/home/data/CLAIM_HISTORY_YEAR.csv&quot;,col_types = claims_columns) %&gt;%filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%group_by(BNFT_TYPE_CD, PROCESS_MONTH) %&gt;%summarize(minimum_amount =min(CH_REND_AMT, na.rm = TRUE),mean_amount =mean(CH_REND_AMT, na.rm = TRUE),max_amount =max(CH_REND_AMT, na.rm = TRUE)) %&gt;%ungroup()end &lt;- Sys.time()end - start</code></pre><p>The task took <strong>37.73755 mins</strong> to execute.</p></div><div id="Task 5_4" class="tabcontent" style="display:none"><pre><code># Task 4 - join + group_by + summarizestart &lt;- Sys.time()claims_columns &lt;-cols_only(CLAIM_STATUS_TYPE_CD = col_character(),CH_SUBM_PROVIDER_ID = col_double(),BNFT_TYPE_CD = col_character(),CH_REND_AMT = col_double(),PROCESS_MONTH = col_double())provider_columns &lt;-cols_only(provider_id = col_double(),provider_type = col_character(),benefit_description = col_character())left_join(read_csv(&quot;/home/data/CLAIM_HISTORY_YEAR.csv&quot;,col_types = claims_columns) %&gt;%filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;),read_csv(&quot;/home/data/PROVIDER.csv&quot;,col_types = provider_columns),by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)) %&gt;%group_by(benefit_description,BNFT_TYPE_CD,PROCESS_MONTH) %&gt;%summarize(minimum_amount =min(CH_REND_AMT, na.rm = TRUE),mean_amount =mean(CH_REND_AMT, na.rm = TRUE),max_amount =max(CH_REND_AMT, na.rm = TRUE)) %&gt;%ungroup()end &lt;- Sys.time()end - start</code></pre><p>The task took <strong>40.0343 mins</strong> to 
execute.</p></div></html><h4 id="parquet-data-2">Parquet Data</h4><p>The parquet file used in this section was <strong>7.5 GB</strong> in size.</p><html><div id="tab6" class="tab"><button class="tablinks" onclick="clickHandle(event, 'Task 6_1', 'Task 6')">Task 1</button><button class="tablinks" onclick="clickHandle(event, 'Task 6_2', 'Task 6')">Task 2</button><button class="tablinks" onclick="clickHandle(event, 'Task 6_3', 'Task 6')">Task 3</button><button class="tablinks" onclick="clickHandle(event, 'Task 6_4', 'Task 6')">Task 4</button></div><div id="Task 6_1" class="tabcontent" style="display:block"><pre><code># Task 1 - join
start &lt;- Sys.time()
left_join(
  open_dataset(source = &quot;/home/data/CLAIM_HISTORY_YEAR&quot;) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
    select(CH_SUBM_PROVIDER_ID, BNFT_TYPE_CD, CH_REND_AMT),
  open_dataset(sources = &quot;/home/data/Provider&quot;) %&gt;%
    select(provider_id, provider_type, benefit_description),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
) %&gt;%
  collect()
end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>4.153103 secs</strong> to execute.</p></div><div id="Task 6_2" class="tabcontent" style="display:none"><pre><code># Task 2 - group_by + count
start &lt;- Sys.time()
open_dataset(source = &quot;/home/data/CLAIM_HISTORY_YEAR&quot;) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  select(BNFT_TYPE_CD, PROCESS_MONTH) %&gt;%
  group_by(BNFT_TYPE_CD, PROCESS_MONTH) %&gt;%
  count() %&gt;%
  ungroup() %&gt;%
  arrange(desc(n)) %&gt;%
  collect()
end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>0.844259 secs</strong> to execute.</p></div><div id="Task 6_3" class="tabcontent" style="display:none"><pre><code># Task 3 - group_by + summarize
start &lt;- Sys.time()
open_dataset(source = &quot;/home/data/CLAIM_HISTORY_YEAR&quot;) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  select(BNFT_TYPE_CD, CH_REND_AMT, PROCESS_MONTH) %&gt;%
  group_by(BNFT_TYPE_CD, PROCESS_MONTH) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup() %&gt;%
  collect()
end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>1.010546 secs</strong> to execute.</p></div><div id="Task 6_4" class="tabcontent" style="display:none"><pre><code># Task 4 - join + group_by + summarize
start &lt;- Sys.time()
left_join(
  open_dataset(source = &quot;/home/data/CLAIM_HISTORY_YEAR&quot;) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
    select(CH_SUBM_PROVIDER_ID, BNFT_TYPE_CD, CH_REND_AMT, PROCESS_MONTH),
  open_dataset(sources = &quot;/home/data/Provider&quot;) %&gt;%
    select(provider_id, provider_type, benefit_description),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
) %&gt;%
  group_by(benefit_description, BNFT_TYPE_CD, PROCESS_MONTH) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup() %&gt;%
  collect()
end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>3.062172 secs</strong> to execute.</p></div></html><h3 id="large-data">Large Data</h3><p>The large data consists of data collected between 2014 and 2022. 
The dataset has <strong>408,197,137</strong> rows of data.</p><h4 id="csv-data-3">CSV Data</h4><p>The CSV file used in this section was <strong>351.5 GB</strong> in size.</p><html><div id="tab7" class="tab"><button class="tablinks" onclick="clickHandle(event, 'Task 7_1', 'Task 7')">Task 1</button><button class="tablinks" onclick="clickHandle(event, 'Task 7_2', 'Task 7')">Task 2</button><button class="tablinks" onclick="clickHandle(event, 'Task 7_3', 'Task 7')">Task 3</button><button class="tablinks" onclick="clickHandle(event, 'Task 7_4', 'Task 7')">Task 4</button></div><div id="Task 7_1" class="tabcontent" style="display:block"><pre><code># Task 1 - join
start &lt;- Sys.time()
claims_columns &lt;-
  cols_only(
    CLAIM_STATUS_TYPE_CD = col_character(),
    CH_SUBM_PROVIDER_ID = col_double(),
    BNFT_TYPE_CD = col_character(),
    CH_REND_AMT = col_double()
  )
provider_columns &lt;-
  cols_only(
    provider_id = col_double(),
    provider_type = col_character(),
    benefit_description = col_character()
  )
left_join(
  read_csv(&quot;/home/data/CLAIM_HISTORY_DECADE.csv&quot;, col_types = claims_columns) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;),
  read_csv(&quot;/home/data/PROVIDER.csv&quot;, col_types = provider_columns),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
)
end &lt;- Sys.time()
end - start</code></pre><p>The task did not complete, producing <strong>Error: std::bad_alloc</strong>.</p></div><div id="Task 7_2" class="tabcontent" style="display:none"><pre><code># Task 2 - group_by + count
start &lt;- Sys.time()
claims_columns &lt;-
  cols_only(
    CLAIM_STATUS_TYPE_CD = col_character(),
    CH_SUBM_PROVIDER_ID = col_double(),
    BNFT_TYPE_CD = col_character(),
    CH_REND_AMT = col_double(),
    PROCESS_YEAR = col_double(),
    PROCESS_MONTH = col_double()
  )
read_csv(&quot;/home/data/CLAIM_HISTORY_DECADE.csv&quot;, col_types = claims_columns) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  group_by(BNFT_TYPE_CD, PROCESS_YEAR, PROCESS_MONTH) %&gt;%
  count() %&gt;%
  ungroup() %&gt;%
  arrange(desc(n))
end &lt;- Sys.time()
end - start</code></pre><p>The task did not complete, producing <strong>Error: std::bad_alloc</strong>.</p></div><div id="Task 7_3" class="tabcontent" style="display:none"><pre><code># Task 3 - group_by + summarize
start &lt;- Sys.time()
claims_columns &lt;-
  cols_only(
    CLAIM_STATUS_TYPE_CD = col_character(),
    CH_SUBM_PROVIDER_ID = col_double(),
    BNFT_TYPE_CD = col_character(),
    CH_REND_AMT = col_double(),
    PROCESS_YEAR = col_double(),
    PROCESS_MONTH = col_double()
  )
provider_columns &lt;-
  cols_only(
    provider_id = col_double(),
    provider_type = col_character(),
    benefit_description = col_character()
  )
read_csv(&quot;/home/data/CLAIM_HISTORY_DECADE.csv&quot;, col_types = claims_columns) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  group_by(BNFT_TYPE_CD, PROCESS_YEAR, PROCESS_MONTH) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup()
end &lt;- Sys.time()
end - start</code></pre><p>The task did not complete, producing <strong>Error: std::bad_alloc</strong>.</p></div><div id="Task 7_4" class="tabcontent" style="display:none"><pre><code># Task 4 - join + group_by + summarize
start &lt;- Sys.time()
claims_columns &lt;-
  cols_only(
    CLAIM_STATUS_TYPE_CD = col_character(),
    CH_SUBM_PROVIDER_ID = col_double(),
    BNFT_TYPE_CD = col_character(),
    CH_REND_AMT = col_double(),
    PROCESS_YEAR = col_double(),
    PROCESS_MONTH = col_double()
  )
provider_columns &lt;-
  cols_only(
    provider_id = col_double(),
    provider_type = col_character(),
    benefit_description = col_character()
  )
left_join(
  read_csv(&quot;/home/data/CLAIM_HISTORY_DECADE.csv&quot;, col_types = claims_columns) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;),
  read_csv(&quot;/home/data/PROVIDER.csv&quot;, col_types = provider_columns),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
) %&gt;%
  group_by(benefit_description, BNFT_TYPE_CD, PROCESS_YEAR, PROCESS_MONTH) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup()
end &lt;- Sys.time()
end - start</code></pre><p>The task did not complete, producing <strong>Error: std::bad_alloc</strong>.</p></div></html><h4 id="parquet-data-3">Parquet Data</h4><p>The parquet file used in this section was <strong>45.1 GB</strong> in size.</p><html><div id="tab8" class="tab"><button class="tablinks" onclick="clickHandle(event, 'Task 8_1', 'Task 8')">Task 1</button><button class="tablinks" onclick="clickHandle(event, 'Task 8_2', 'Task 8')">Task 2</button><button class="tablinks" onclick="clickHandle(event, 'Task 8_3', 'Task 8')">Task 3</button><button class="tablinks" onclick="clickHandle(event, 'Task 8_4', 'Task 8')">Task 4</button></div><div id="Task 8_1" class="tabcontent" style="display:block"><pre><code># Task 1 - join
start &lt;- Sys.time()
left_join(
  open_dataset(source = &quot;/home/data/CLAIM_HISTORY_DECADE&quot;) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
    select(CH_SUBM_PROVIDER_ID, BNFT_TYPE_CD, CH_REND_AMT),
  open_dataset(sources = &quot;/home/data/Provider&quot;) %&gt;%
    select(provider_id, provider_type, benefit_description),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
) %&gt;%
  collect()
end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>16.42989 secs</strong> to execute.</p></div><div id="Task 8_2" class="tabcontent" style="display:none"><pre><code># Task 2 - group_by + count
start &lt;- Sys.time()
open_dataset(source = &quot;/home/data/CLAIM_HISTORY_DECADE&quot;) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  select(BNFT_TYPE_CD) %&gt;%
  group_by(BNFT_TYPE_CD) %&gt;%
  count() %&gt;%
  ungroup() %&gt;%
  arrange(desc(n)) %&gt;%
  collect()
end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>4.389257 secs</strong> to execute.</p></div><div id="Task 8_3" class="tabcontent" style="display:none"><pre><code># Task 3 - group_by + summarize
start &lt;- Sys.time()
open_dataset(source = &quot;/home/data/CLAIM_HISTORY_DECADE&quot;) %&gt;%
  filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
  select(BNFT_TYPE_CD, CH_REND_AMT) %&gt;%
  group_by(BNFT_TYPE_CD) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup() %&gt;%
  collect()
end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>4.441824 secs</strong> to execute.</p></div><div id="Task 8_4" class="tabcontent" style="display:none"><pre><code># Task 4 - join + group_by + summarize
start &lt;- Sys.time()
left_join(
  open_dataset(source = &quot;/home/data/CLAIM_HISTORY_DECADE&quot;) %&gt;%
    filter(CLAIM_STATUS_TYPE_CD == &quot;PC&quot;) %&gt;%
    select(CH_SUBM_PROVIDER_ID, BNFT_TYPE_CD, CH_REND_AMT),
  open_dataset(sources = &quot;/home/data/Provider&quot;) %&gt;%
    select(provider_id, provider_type, benefit_description),
  by = c(&quot;CH_SUBM_PROVIDER_ID&quot; = &quot;provider_id&quot;)
) %&gt;%
  group_by(benefit_description, BNFT_TYPE_CD) %&gt;%
  summarize(
    minimum_amount = min(CH_REND_AMT, na.rm = TRUE),
    mean_amount = mean(CH_REND_AMT, na.rm = TRUE),
    max_amount = max(CH_REND_AMT, na.rm = TRUE)
  ) %&gt;%
  ungroup() %&gt;%
  collect()
end &lt;- Sys.time()
end - start</code></pre><p>The task took <strong>14.93252 secs</strong> to execute.</p></div></html><h2 id="our-findings">Our Findings</h2><p>The results from our analysis were remarkable. Converting our data from row-based storage to the columnar parquet format significantly improved processing time. Processes that used to take tens of minutes to an hour now complete within seconds… a <strong>game changer</strong>! 
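</p><p>As a rough sketch of the conversion step itself (the exact call we used is not shown here, and the <code>PROCESS_YEAR</code> partitioning column is an illustrative assumption), a large CSV file can be rewritten as a partitioned parquet dataset with arrow, which is designed to scan and write in batches rather than loading the full table into memory:</p><pre><code># A sketch, not our exact conversion script; partitioning column is assumed
library(arrow)
library(dplyr)

open_dataset(&quot;/home/data/CLAIM_HISTORY_DECADE.csv&quot;, format = &quot;csv&quot;) %&gt;%
  write_dataset(&quot;/home/data/CLAIM_HISTORY_DECADE&quot;,
                format = &quot;parquet&quot;,
                partitioning = &quot;PROCESS_YEAR&quot;)</code></pre><p>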
The parquet format is a low/no-cost solution that provides immediate analytical improvements for both our data engineering and data analytics teams.</p><h3 id="processing-time">Processing Time</h3><p>CSV processing time varied from <strong>10.85</strong> seconds to <strong>2,411.84</strong> seconds (40.2 minutes), whereas parquet processing time ranged from <strong>0.3</strong> seconds to <strong>16.43</strong> seconds across all four dataset size groups. Note that the CSV large dataset errored (<em>Error: std::bad_alloc</em>) and did not complete. <em>Error: std::bad_alloc</em> is synonymous with running out of memory: yes, insufficient memory even on our 1 TB workstation!</p><html><div id="tab9" class="tab"><button class="tablinks" onclick="clickHandle(event, 'Task 9_1', 'Task 9')">Task 1</button><button class="tablinks" onclick="clickHandle(event, 'Task 9_2', 'Task 9')">Task 2</button><button class="tablinks" onclick="clickHandle(event, 'Task 9_3', 'Task 9')">Task 3</button><button class="tablinks" onclick="clickHandle(event, 'Task 9_4', 'Task 9')">Task 4</button></div><div id="Task 9_1" class="tabcontent" style="display:block"><p><img src="images/unnamed-chunk-39-1.png" alt=""></p></div><div id="Task 9_2" class="tabcontent" style="display:none"><p><img src="images/unnamed-chunk-40-1.png" alt=""></p></div><div id="Task 9_3" class="tabcontent" style="display:none"><p><img src="images/unnamed-chunk-41-1.png" alt=""></p></div><div id="Task 9_4" class="tabcontent" style="display:none"><p><img src="images/unnamed-chunk-42-1.png" alt=""></p></div></html><h3 id="improvement-factor">Improvement Factor</h3><p>Not only did our processing efficiency improve across all dataset size categories, but the storage efficiency of the same datasets is also not to be overlooked. Being able to run common analytical queries faster and with a smaller footprint is an irrefutable win. 
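</p><p>The improvement factor is simply the CSV processing time divided by the parquet processing time. A quick sketch using the x-small join timings reported above:</p><pre><code># Improvement factor = CSV time / parquet time
csv_seconds     &lt;- 16.01
parquet_seconds &lt;- 1.78
round(csv_seconds / parquet_seconds)
#&gt; 9</code></pre><p>(The published factors were computed from the unrounded timings, so recomputing them from the rounded values shown can differ slightly for the larger datasets.)</p><p>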
Optimization in both size and speed is an attainable innovation for any Data Engineer/Analyst that is quantifiable and beneficial for any organization.</p><p>The following illustrates the improvement factor (i.e., the number of times faster parquet is than CSV) for each of the four tasks, as well as the storage size improvements obtained by using columnar storage.</p><html><div id="tab0" class="tab"><button class="tablinks" onclick="clickHandle(event, 'Task 0_1', 'Task 0')">Processing Improvement</button><button class="tablinks" onclick="clickHandle(event, 'Task 0_2', 'Task 0')">Improvement Distribution</button><button class="tablinks" onclick="clickHandle(event, 'Task 0_3', 'Task 0')">Average Improvement</button><button class="tablinks" onclick="clickHandle(event, 'Task 0_4', 'Task 0')">File Storage Improvement</button></div><div id="Task 0_1" class="tabcontent" style="display:block"><div id="pfwhzcnkop" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table class="gt_table"><thead class="gt_header"><tr><th colspan="5" class="gt_heading gt_title gt_font_normal gt_bottom_border" style="background-color: #005028; color: #FFFFFF; font-size: x-large; text-align: left; vertical-align: top; font-weight: bold; border-top-width: 3px; border-top-style: solid; border-top-color: #B6B6B6;">Processing Improvements with Parquet Files</th></tr></thead><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Dataset Size Group</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: 
#B6B6B6;">Task</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">CSV Processing Time (in seconds)</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Parquet Processing Time (in seconds)</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Parquet Improvement Factor</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">x-small</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">join</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">16.01</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">1.78</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; 
border-bottom-style: solid; border-bottom-color: #B6B6B6;">9</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">x-small</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">group_by + count</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">10.85</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">0.75</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">15</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">x-small</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">group_by + summarize</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">11.86</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">0.30</td><td class="gt_row 
gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">40</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">x-small</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">join + group_by + summarize</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">16.03</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">1.36</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">12</td></tr><tr><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">small</td><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">join</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">220.62</td><td class="gt_row gt_right" 
style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">1.60</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">138</td></tr><tr><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">small</td><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">group_by + count</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">189.71</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">0.30</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">629</td></tr><tr><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">small</td><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; 
white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">group_by + summarize</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">185.72</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">0.51</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">361</td></tr><tr><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">small</td><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">join + group_by + summarize</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">206.88</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">1.13</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; 
border-bottom-color: #B6B6B6;">184</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">medium</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">join</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">2,411.84</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">4.15</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">581</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">medium</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">group_by + count</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">2,332.85</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">0.84</td><td class="gt_row gt_right" style="color: #414042; 
text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">2,763</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">medium</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">group_by + summarize</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">2,264.25</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">1.01</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">2,241</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">medium</td><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">join + group_by + summarize</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">2,402.06</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; 
border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">3.06</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">784</td></tr><tr><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">large</td><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">join</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">NA</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">16.43</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">NA</td></tr><tr><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">large</td><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">group_by + count</td><td class="gt_row gt_right" 
style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">NA</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">4.39</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">NA</td></tr><tr><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">large</td><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">group_by + summarize</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">NA</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">4.44</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">NA</td></tr><tr><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; 
white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">large</td><td class="gt_row gt_left" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">join + group_by + summarize</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">NA</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">14.93</td><td class="gt_row gt_right" style="background-color: #E4E4E5; color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">NA</td></tr></tbody><tfoot class="gt_sourcenotes"><tr><td class="gt_sourcenote" style="background-color: #FFFFFF; color: #414042; font-size: x-small; text-align: left; vertical-align: middle;" colspan="5">CSV large dataset did not complete <br>Producing Error: std::bad_alloc</td></tr></tfoot></table></div></div><div id="Task 0_2" class="tabcontent" style="display:none"><p><img src="images/unnamed-chunk-44-1.png" alt=""></p></div><div id="Task 0_3" class="tabcontent" style="display:none"><div id="xqnbigurhc" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table class="gt_table"><thead class="gt_header"><tr><th colspan="2" class="gt_heading gt_title gt_font_normal" style="background-color: #005028; color: #FFFFFF; font-size: x-large; text-align: left; vertical-align: top; font-weight: bold; border-top-width: 3px; border-top-style: solid; border-top-color: #B6B6B6;">Average Processing Improvements with Parquet 
Files</th></tr><tr><th colspan="2" class="gt_heading gt_subtitle gt_font_normal gt_bottom_border" style="background-color: #005028; color: #FFFFFF; font-size: small; text-align: left; vertical-align: top; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">By Dataset Group Size</th></tr></thead><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Dataset Size Group</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1" style="background-color: #8DC63F; color: #414042; font-size: medium; text-align: center; vertical-align: middle; border-bottom-width: 3px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">Average Parquet Improvement Factor</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">x-small</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">19</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">small</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">328</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; 
border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">medium</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">1,592</td></tr><tr><td class="gt_row gt_left" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">large</td><td class="gt_row gt_right" style="color: #414042; text-align: center; vertical-align: middle; white-space: pre; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #B6B6B6;">NA</td></tr></tbody><tfoot class="gt_sourcenotes"><tr><td class="gt_sourcenote" style="background-color: #FFFFFF; color: #414042; font-size: x-small; text-align: left; vertical-align: middle;" colspan="2">CSV large dataset did not complete <br>Producing Error: std::bad_alloc</td></tr></tfoot></table></div></div><div id="Task 0_4" class="tabcontent" style="display:none"><p><img src="images/unnamed-chunk-46-1.png" alt=""></p></div></html><h2 id="closing-remarks">Closing Remarks</h2><p>The time it takes to process data impacts all users: data engineers, data analysts, data scientists, decision makers, business users, and clients. Reducing processing time improves the experience for everyone along the data journey. Parquet files allow analytical teams to cut their processing time significantly, whether in data engineering, modelling, or data analytics. Because Parquet does not require all the data to be read into memory prior to analysis, the format is an option for any organization, regardless of its existing data infrastructure investment.</p><p>Analytics looks to provide value to the business; often the focus is on improving model efficiency or adding new technology.
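The point about not reading everything into memory is the crux of the speed-up. As a minimal sketch (the {arrow}, {readr}, and {dplyr} packages are real, but the flights file names and columns here are purely illustrative), the same query can be run against both formats:

```r
library(arrow)   # Parquet reader/writer
library(readr)   # CSV reader
library(dplyr)

# CSV: the entire file is parsed into memory before any filtering;
# supplying col_types avoids a costly second pass to guess them
flights_csv <- read_csv("flights.csv", col_types = cols())

# Parquet: open_dataset() only scans metadata; rows and columns are
# materialized lazily, when collect() runs
flights_pq <- open_dataset("flights.parquet")

result <- flights_pq |>
  filter(origin == "JFK") |>
  group_by(carrier) |>
  summarize(mean_delay = mean(dep_delay, na.rm = TRUE)) |>
  collect()
```

With {arrow}, the filter and aggregation are pushed down into the scan, so only the needed columns and row groups ever touch memory.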
Sometimes we can get significant improvements that pay value to the business with simple solutions, like changing data storage formats. Boring, yes, but <strong>1,500</strong> times faster processing is super!</p><div class="lt-gray-box"><b>Correction:</b> The original version of this post was missing <code>col_types</code> in <code>read_csv()</code>.</div></description></item><item><title>RStudio Community Table Gallery</title><link>https://www.rstudio.com/blog/rstudio-community-table-gallery/</link><pubDate>Mon, 25 Apr 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-community-table-gallery/</guid><description><p>Tables are an excellent way to organize your data, whether the medium is an R Markdown document, poster, or a Shiny app. We&rsquo;ve collected many community contributions that shine in this regard, and they are showcased in the new <a href="https://community.rstudio.com/c/table-gallery/64" target = "_blank">RStudio Community Table Gallery</a>.</p><p>We&rsquo;re starting off by featuring over 30 different tables. They use a multitude of table packages such as {DT}, {reactable}, and {gt}. Some are interactive, others are static, <em>all</em> are excellent. Each entry shows the finished product and always includes a full description and reusable code. This collective fount of information will certainly get your imagination going about all the goodness that&rsquo;s to be had from tables!</p><p>And, hey, we&rsquo;re just getting started. We will add more examples and keep evolving this site so it will continue to be fresh and inspiring in the coming years.
There are always new innovations in tabular design and you can bet that they&rsquo;ll be featured in the RStudio Community Table Gallery.</p><script src="https://fast.wistia.com/embed/medias/6auoddudxw.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.42% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_6auoddudxw videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><p>We’d like to thank all of the first group of 20 contributors who made this resource possible:</p><p><em>Abdoul ISSA BIDA</em></p><ul><li><a href="https://community.rstudio.com/t/121492">From Vines to Wines: the most exceptional wines from all over the world</a></li></ul><p><em>Agustin Calatroni, Becca Krouse, Stephanie Lussier</em></p><ul><li><a href="https://community.rstudio.com/t/121483">dataxray: an interactive table interface for data summaries</a></li></ul><p><em>Benjamin Nowak</em></p><ul><li><a href="https://community.rstudio.com/t/117744">One Farm - Creating Our Cultivated Planet &amp; The Big Barnyard Tables</a></li><li><a href="https://community.rstudio.com/t/117184">Riding tables with {gt} and {gtExtras}</a></li></ul><p><em>Bill Schmid</em></p><ul><li><a href="https://community.rstudio.com/t/121601">How often per season do NBA teams attempt more 3 than 2-point shots? 
{reactable} {reactablefmtr}</a></li><li><a href="https://community.rstudio.com/t/121602">Imperial March - NCAA Basketball {reactable} {reactablefmtr}</a></li></ul><p><em>David Smale</em></p><ul><li><a href="https://community.rstudio.com/t/86113">Top of the Class: Public Spending on Education</a></li></ul><p><em>Edgar Zamora</em></p><ul><li><a href="https://community.rstudio.com/t/120527">MLB&rsquo;s Biggest Teams</a></li></ul><p><em>Etienne Bacher</em></p><ul><li><a href="https://community.rstudio.com/t/85701">Reproducing the Periodic Table of Elements</a></li></ul><p><em>Georgios Karamanis</em></p><ul><li><a href="https://community.rstudio.com/t/86399">Beyoncé and Taylor Swift Albums</a></li></ul><p><em>Greg Lin</em></p><ul><li><a href="https://community.rstudio.com/t/129317">Spotify Chart Toppers</a></li><li><a href="https://community.rstudio.com/t/129316">Top CRAN Packages with expandable row details.</a></li><li><a href="https://community.rstudio.com/t/129282">Twitter Followers in an interactive HTML table</a></li><li><a href="https://community.rstudio.com/t/129315">Women&rsquo;s World Cup Predictions</a></li></ul><p><em>Greta Gasparac</em></p><ul><li><a href="https://community.rstudio.com/t/121418">Interactive Shiny Table - Premier League standings 2021</a></li></ul><p><em>Jack Davison</em></p><ul><li><a href="https://community.rstudio.com/t/119603">Using gt and openair to present air quality data</a></li></ul><p><em>Juan Cruz Rodriguez</em></p><ul><li><a href="https://community.rstudio.com/t/86442">EmojiSweeper, remember MineSweeper? He&rsquo;s back! 
In {DT} form.</a></li></ul><p><em>Kaustav Sen</em></p><ul><li><a href="https://community.rstudio.com/t/85840">A first stroll through the {gt} package</a></li></ul><p><em>Kyle Cuilla</em></p><ul><li><a href="https://community.rstudio.com/t/81205">Interactive HTML Table of NFL Team Ratings</a></li><li><a href="https://community.rstudio.com/t/82995">Interactive HTML with crosstalk filters - Table of NFL Team Ratings</a></li><li><a href="https://community.rstudio.com/t/120665">Interactive Sparklines with {reactablefmtr}</a></li><li><a href="https://community.rstudio.com/t/86430">Interactive table with drill down - Fantasy Football Receiving Stats &amp; Opportunities</a></li><li><a href="https://community.rstudio.com/t/87145">The NFL Analytics Say &ldquo;Go For It!&quot;</a></li></ul><p><em>Michael Thomas (Ketchbrook Analytics)</em></p><ul><li><a href="https://community.rstudio.com/t/120336">Conditionally Formatted State Transition Matrices</a></li></ul><p><em>Niels van der Velden</em></p><ul><li><a href="https://community.rstudio.com/t/81401">Editable DataTables in R shiny using SQL</a></li><li><a href="https://community.rstudio.com/t/81403">Employee Directory editable DT</a></li></ul><p><em>Ryo Nakagawara</em></p><ul><li><a href="https://community.rstudio.com/t/86449">Expected Goals (xG) &amp; Shot Timeline for Soccer/Football with {gt}</a></li></ul><p><em>Ryszard Szymański</em></p><ul><li><a href="https://community.rstudio.com/t/121358">Fast Big Data Tables in Shiny</a></li></ul><p><em>Stephan Teodosescu</em></p><ul><li><a href="https://community.rstudio.com/t/119430">2021-22 English Premier League Standings</a></li></ul><p><em>Vladislav Fridkin</em></p><ul><li><a href="https://community.rstudio.com/t/120539">Satellites currently orbiting the Earth</a></li></ul></description></item><item><title>Successfully Putting Shiny in Production</title><link>https://www.rstudio.com/blog/successfully-putting-shiny-in-production/</link><pubDate>Thu, 21 Apr 2022 00:00:00 
+0000</pubDate><guid>https://www.rstudio.com/blog/successfully-putting-shiny-in-production/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@wesleypribadi?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Wesley Pribadi</a> on <a href="https://unsplash.com/">Unsplash</a></sup></p><p>Have you ever created a Shiny app that works great on your computer, but had issues once you shared it with others? This is a common experience for many data scientists. To successfully put an app into &ldquo;production&rdquo; — where others need it to be accessible, speedy, safe, and accurate — there are several challenges that we may have to overcome. With the right environment and approach, our Shiny app can smoothly deliver data-driven insights for all users.</p><p>This post leans heavily on great talks by <a href="https://www.youtube.com/watch?v=Wy3TY0gOmJw" target = "_blank">Joe Cheng</a> and <a href="https://www.youtube.com/watch?v=dQAyASaH-Jo" target = "_blank">Kelly O’Briant</a> and the book <a href="https://mastering-shiny.org/index.html" target = "_blank">Mastering Shiny</a> by Hadley Wickham. If you want to see more on the topic, please check them out.</p><h2 id="what-is-production">What is Production?</h2><p>Drawing from Joe Cheng’s talk, when we say putting a Shiny app in production, we mean that users are accessing, running, and relying on your app &ldquo;with real consequences if things go wrong&rdquo;. For a Shiny app to be successful, it must meet certain criteria:</p><ul><li>It must be &ldquo;up&rdquo;: our users must be able to access and run the app.</li><li>It must be safe: we do not want unauthorized access to the app or its contents.</li><li>It must be correct: the insights shown must be accurate.</li><li>It must be snappy: even if an app meets the other criteria, users will not use it if they have to wait a long time to generate insights. 
Our app must be able to handle traffic and use.</li></ul><p>For example, California&rsquo;s COVID Assessment Tool (CalCAT) is built on Shiny. The CalCAT team created a public-facing app to serve millions of Californian residents. The team also maintains an internal Shiny app that requires authentication to view confidential data. With RStudio Connect and RStudio Package Manager, they can manage any influx of traffic to keep the app running smoothly.</p><script src="https://fast.wistia.com/embed/medias/zrun8ktqx6.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.63% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_zrun8ktqx6 videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption><a href="https://calcat.covid19.ca.gov/cacovidmodels/" target = "_blank">California COVID Assessment Tool</a></center><p>However, even with a proper production environment, various challenges may arise. Let’s explore what these could be.</p><h2 id="challenges-in-putting-shiny-in-production">Challenges in Putting Shiny in Production</h2><h3 id="cultural">Cultural</h3><p>Shiny apps are created by R users. An R user can quickly and iteratively create a Shiny app with no knowledge of HTML, CSS, or JavaScript.</p><p>However, since R users are not necessarily software engineers, we may only realize we’re creating a production app when others need to access it regularly. We may not be aware of best practices for putting things in production. Without this awareness, we may not know to ask for the necessary resources or time to improve the performance of our apps.</p><h3 id="organizational">Organizational</h3><p>IT and management may have questions when we try to put Shiny apps into production. 
Perhaps they had never heard of Shiny before and are unaware of what resources and environment it requires. Related to the challenge listed above, they may not want data scientists to create production artifacts due to security concerns. Regardless of the reasons, we need to make the case for R and Shiny when communicating with these teams.</p><h3 id="technical">Technical</h3><p>We mentioned that apps need to be up, safe, correct, and snappy. To accomplish this, we need to develop our apps carefully. Data scientists need a process for optimizing their code. This means identifying what needs improvement, understanding what to do next, and testing thoroughly.</p><h2 id="tools-for-overcoming-these-challenges">Tools for Overcoming These Challenges</h2><p>These challenges play out differently from organization to organization. However, we do have certain tools that can help address cultural, organizational, and technical barriers to putting Shiny in production.</p><h3 id="a-sandbox-publishing-environment-for-staging">A sandbox publishing environment for staging</h3><p>A &ldquo;sandbox&rdquo; should be part of our development infrastructure. The sandbox is a place to stage our work that is identical to our production environment. As opposed to the production environment, this is a spot where we expect (and want!) things to break. This lets us find and work out bugs before putting our app out in the real world.</p><p>A sandbox not only provides a low-risk way for developers to see how things would work in production, but it also gives us a way to showcase our skills to management. 
Once we publish our app to our sandbox, we can demonstrate the Shiny app’s functionality to get approval on the tool.</p><p>To highlight an example, the Dutch National Institute for Public Health and the Environment (RIVM) deployed the &ldquo;Clusterbuster&rdquo; Shiny app to <a href="https://www.youtube.com/watch?v=9Nn9yjpivlE" target = "_blank">help hundreds of doctors and epidemiologists battle COVID-19 in the Netherlands</a>.</p><p>First, everybody needed to agree on a tool. Using a prototype like the one below, the RIVM team showed that Shiny could create an aesthetically pleasing app with a positive user experience. This convinced the IT and management teams to move forward with Shiny.</p><script src="https://fast.wistia.com/embed/medias/aj1bckrly2.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.63% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_aj1bckrly2 videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption><a href="https://rivm.shinyapps.io/clusterbuster/" target = "_blank">Clusterbuster proof of concept example</a> showing synthetic data</caption></center><br><p><a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a>, RStudio’s enterprise-level publishing platform, can provide data scientists with a sandbox. Data scientists can create a staged version of their app and decide who has access to test it out. Then, we can publish to a separate Connect environment for production.</p><h3 id="workflow-for-developing-production-ready-shiny-apps">Workflow for developing production-ready Shiny apps</h3><p>RStudio has various R packages that support the development of production-ready Shiny apps. 
Data scientists can implement a workflow with these packages to draw on best practices from the software engineering world.</p><center><img src="images/image1.png" alt="The shinyloadtest optimization loop showing benchmark, analyze, recommend, and optimize"></center><center><caption><a href="https://rstudio.github.io/shinyloadtest/articles/case-study-scaling.html" target = "_blank">The shinyloadtest optimization loop</a></caption></center><p>The optimization loop consists of:</p><ul><li><strong>Benchmarking with shinyloadtest</strong>: With shinyloadtest, we generate realistic but synthetic data to record how our app runs for multiple users at the same time.</li><li><strong>Analyzing with shinyloadtest and profvis</strong>: Our intuition is not very good at guessing what is slow. Profilers and reports help us find out.</li><li><strong>Making recommendations based on results:</strong> Based on the analysis, we propose a way to increase the capacity of the app.</li><li><strong>Optimizing:</strong> Now, we can work on changing our code. This can take many different forms.<ul><li>We can move work out of Shiny. One option is to divide labor by using R Markdown as an ETL process so we are not processing data in the Shiny code itself.</li><li>We can use other tools at our disposal, such as using <a href="https://arrow.apache.org/docs/python/feather.html" target = "_blank">feather</a> files to read data rather than CSVs.</li><li>We can cache certain parts of our code. For example, plots are often one of the slowest parts of our app. With plot caching, we can dramatically speed them up.</li></ul></li></ul><p>Once we have completed the loop, we benchmark again to determine whether our app is fast enough for our needs.</p><h3 id="metrics-that-quantify-impact">Metrics that quantify impact</h3><p>Ultimately, our goal as data scientists is to communicate insights that drive value to our organization. Once our Shiny app is out there, how do we show that it’s making an impact?
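The optimization loop can be sketched end to end (a sketch, not a drop-in script: the app URL, run directory, and user count are placeholders, and shinycannon is shinyloadtest's companion load-generation CLI):

```r
library(shinyloadtest)

# 1. Benchmark: record one typical user session against the live app
record_session("http://localhost:3838/myapp")  # writes recording.log

# ...then replay it as simulated concurrent users from a terminal:
#   shinycannon recording.log http://localhost:3838/myapp \
#     --workers 10 --loaded-duration-minutes 5 --output-dir run-10

# 2. Analyze: turn the run logs into an HTML report of latencies
runs <- load_runs("10 users" = "run-10")
shinyloadtest_report(runs, "report.html")

# 3-4. Recommend and optimize, e.g. cache a slow plot in the app:
#   output$plot <- renderPlot({ expensive_plot(input$x) }) |>
#     bindCache(input$x)
```

profvis::profvis() can then pinpoint which reactive or render expression the remaining time is spent in.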
We can put numbers behind our work to demonstrate how effectively we&rsquo;re reaching others.</p><p>The <a href="https://www.rstudio.com/blog/track-shiny-app-use-server-api/" target = "_blank">RStudio Connect Server API</a> tracks our app’s user activity. We can access data on visits, session duration, and more. Not only that, but we can specify goals that send us an alert when we aren’t doing as expected. Evaluating our app helps us make decisions on how to improve and tailor our users&rsquo; experience.</p><p>We can create a dashboard for stakeholders to explore the API data in real-time. This one shows the most popular apps and most active viewers:</p><script src="https://fast.wistia.com/embed/medias/03wpixim0r.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_03wpixim0r videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption><a href="https://www.rstudio.com/blog/track-shiny-app-use-server-api/" target = "_blank">Learn more about tracking Shiny app usage</a></caption></center><p>Metrics like these help us demonstrate the impact that our app is making. 
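Pulling the raw visit data behind such a dashboard takes only a short script (a sketch: it assumes CONNECT_SERVER and CONNECT_API_KEY environment variables pointing at your server, and uses the Shiny usage endpoint from the Connect Server API):

```r
library(httr)

server  <- Sys.getenv("CONNECT_SERVER")   # e.g. "https://connect.example.com"
api_key <- Sys.getenv("CONNECT_API_KEY")

resp <- GET(
  paste0(server, "/__api__/v1/instrumentation/shiny/usage"),
  add_headers(Authorization = paste("Key", api_key))
)
visits <- content(resp)$results

# Each record carries the content GUID, user, and session start/end,
# which is enough to compute visit counts and session durations
length(visits)
```

The response is paged, so a full report would follow the `paging$next` links until exhausted.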
This helps management understand the value of Shiny apps in production.</p><h2 id="learn-more">Learn More</h2><p>Want to learn more about how to successfully put Shiny in production?</p><ul><li>Watch the talks from <a href="https://www.youtube.com/watch?v=Wy3TY0gOmJw" target = "_blank">Joe Cheng</a> and <a href="https://www.youtube.com/watch?v=dQAyASaH-Jo" target = "_blank">Kelly O’Briant</a>.</li><li>Read more about the optimization loop in the <a href="https://mastering-shiny.org/performance.html" target = "_blank">Performance chapter of Mastering Shiny</a>.</li><li>Check out the recent webinar presented by Cole Arendt on <a href="https://www.youtube.com/watch?v=0iljqY9j64U" target = "_blank">Shiny usage tracking in RStudio Connect</a>.</li></ul><p>Want to see other examples of Shiny apps in the real world?</p><ul><li>Read how the <a href="https://www.rstudio.com/blog/using-shiny-in-production-to-monitor-covid-19/" target = "_blank">California Department of Public Health created a Shiny app to quickly share data with millions of Californians</a>.</li><li>Explore lessons learned from the <a href="https://www.rstudio.com/blog/how-do-you-use-shiny-to-communicate-to-8-million-people/" target = "_blank">Georgia Institute of Technology on building the COVID-19 Event Risk Assessment Planning Tool</a>.</li><li>Find out how <a href="https://www.rstudio.com/about/customer-stories/janssen-story/" target = "_blank">Janssen Pharmaceuticals built an R-based platform for drug discovery</a>.</li><li>Watch how <a href="https://www.rstudio.com/collections/community/r-in-insurance/" target = "_blank">Aetna/CVS built Shiny apps to compute patient clustering and detect comorbidities</a>.</li></ul></description></item><item><title>Advocate for Data Science at Your Organization</title><link>https://www.rstudio.com/blog/advocating-for-data-science-at-your-organization/</link><pubDate>Tue, 19 Apr 2022 00:00:00 
+0000</pubDate><guid>https://www.rstudio.com/blog/advocating-for-data-science-at-your-organization/</guid><description><script src="https://fast.wistia.com/embed/medias/c7duppaa3y.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_c7duppaa3y videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/c7duppaa3y/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><h2 id="introducing-rstudiocomchampion">Introducing rstudio.com/champion</h2><p>Let&rsquo;s admit it. Getting to use the tools you want sometimes needs a little convincing.</p><p>Since coming onboard in 2017, I&rsquo;ve had the opportunity to meet so many wonderful people from the community. People that: organize meetups in their own communities, spend time after-hours teaching their co-workers, solve business problems at work with Shiny, write blog posts to help others, and so much more.</p><p>They might be titled data scientists, but don&rsquo;t have to be. They are clinicians, psychologists, economists, marketers, consultants, environmentalists, investors, social workers…the list goes on.</p><p>People are doing amazing things with data science, yet so many are still in a position where they are not able to make the case for it at their own organizations. So, what can we all do to help?</p><p>As a starting point, we&rsquo;ve created a new site: <a href="https://www.rstudio.com/champion" target = "_blank">rstudio.com/champion</a>. 
We&rsquo;ve gathered resources from community members who have been through this before and have so many great tips to share. Whether you&rsquo;re getting pushback about using open source, being told to use a BI tool instead, or just unable to find the other data scientists at your company, we want to make this process less frustrating.</p><p>The new site covers four main sections:</p><ul><li><a href="https://www.rstudio.com/champion/business-case" target = "_blank">Building a business case for code-first data science tools</a></li><li><a href="https://www.rstudio.com/champion/community-building" target = "_blank">Growing your data science community</a></li><li><a href="https://www.rstudio.com/champion/use-cases" target = "_blank">Use cases gathered from various industries</a></li><li><a href="https://www.rstudio.com/champion/working-with-it" target = "_blank">Tips for starting the conversation with IT</a></li></ul><p>A lot of this content was out there already; it just needed a bit of detective work to gather it all. We hope these tips and industry examples from webinars, meetups, data science hangouts, conference talks, and conversations with the community help as you champion data science at your own organizations.</p><p><img src="industry.png" alt="Screenshot of industry page of Champion site showing examples from a variety of industries like Life Sciences and Public Sector"></p><center><sup>And more!</sup></center><p>This is just the beginning, though.</p><p>This will be an evolving resource, so <a href="http://rstd.io/champion-site-feedback" target = "_blank">we&rsquo;d love to hear</a> what has helped you or what you may still need to see. A special thank you to so many from the community who have shared tips and examples on this first launch of the site today.</p><p>We love chatting with people about advocating for data science.
You can always schedule a time to <a href="https://rstudio.chilipiper.com/book/champions-hub" target = "_blank">speak with our team</a>.</p><p><b><a href="https://www.addevent.com/event/dM11812539" target = "_blank">Join us for a Champion Meetup on Tuesday, April 26th at 12 ET</a></b>. Kelly O’Briant will share her own experience of advocating for analytic infrastructure, and we&rsquo;ll open it up to a discussion with others who are going through the same. <em>Please note, the presentation portion of this meetup will be recorded, but the open discussion following the talk will not.</em></p><p><a class="btn-sm-block btn btn-primary btn-block pl-3 pr-3 mt-1" href="https://www.rstudio.com/champion/" target="_blank">Visit rstudio.com/champion</a></p></description></item><item><title>RStudio Connect Python Minimum Version Update</title><link>https://www.rstudio.com/blog/rstudio-connect-minimum-python-version-update/</link><pubDate>Mon, 11 Apr 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-minimum-python-version-update/</guid><description><h1 id="what-administrators-and-publishers-should-know">What Administrators and Publishers Should Know</h1><blockquote><p>The March 2022 RStudio Connect release removes support for Python 2 and raises the minimum supported version to Python 3.5.</p></blockquote><h2 id="why-now">Why now?</h2><p>Python 2.7 has reached end-of-life (EOL) maintenance status. On January 1, 2020, the Python language governing body ended support for this version, and they are no longer providing security patches.
RStudio Connect continued to support Python 2.7 beyond its EOL status announcement, but several factors have influenced our decision to end support:</p><ul><li>Python 3 is now widely adopted and is the actively-developed version of the Python language.</li><li>In January 2021, the <code>pip</code> 21.0 release officially dropped support for Python 2.</li><li>A large number of projects pledged to drop support for Python 2 in 2020 including TensorFlow, scikit-learn, Apache Spark, pandas, XGBoost, NumPy, Bokeh, Matplotlib, IPython, and Jupyter Notebook.</li></ul><h2 id="how-does-this-affect-my-connect-installation-and-published-content-using-an-older-version-of-python">How does this affect my Connect installation and published content using an older version of Python?</h2><p>The March 2022 release of RStudio Connect introduces a breaking change for installations and content that use a Python version below 3.5.</p><ul><li>If an older version of Python is listed as an available option in the Connect configuration file, Connect will fail to start.</li><li>Published R Markdown reports and Jupyter Notebooks that use older Python versions can still be viewed. However, they cannot be re-rendered. Scheduled reports that continue to run will send error message emails.</li><li>Existing applications and APIs that use older Python versions will no longer run. An HTTP 502 error will be returned for all requests to these applications.</li></ul><h2 id="what-action-needs-to-be-taken">What action needs to be taken?</h2><p>Connect Administrators need to remove older versions of Python from the Connect installation. Publishers need to update their deployed content to use Python version 3.5 or higher.</p><h3 id="update-the-connect-configuration-file">Update the Connect configuration file</h3><p>In order to upgrade RStudio Connect, verify that your configuration file does not include Python 2 or Python 3 versions prior to 3.5. 
If it does, and you do not remove those configuration settings, the Connect service will throw an error during start-up. This is a breaking change.</p><p>The configuration file should only contain Python 3 versions that meet the minimum requirements (Example):</p><pre><code>; /etc/rstudio-connect/rstudio-connect.gcfg
[Python]
Enabled = true
Executable = /shared/Python/3.7.6/bin/python3
Executable = /shared/Python/3.8.1/bin/python3</code></pre><p>In addition to the new minimum version requirements, Python installations no longer require the <code>virtualenv</code> package to be installed. Python content will now use the <code>venv</code> module included with Python 3.</p><p><em>Note: RStudio Workbench does not document minimum version requirements for Python, but you may want to schedule time to check or update the versions available there as well to avoid publishing errors due to environment mismatches.</em></p><h3 id="update-content-that-uses-an-older-version-of-python">Update content that uses an older version of Python</h3><p>Content owners need to update their code to use Python version 3.5 or higher. Content can be re-published to the same location, preserving existing settings like custom URLs, environment variables, access permissions, or runtime settings.</p><p>If published apps or APIs using an older version of Python are not updated, they will fail to run. Static R Markdown reports and Jupyter Notebooks using an older version of Python can still be viewed, but they will fail to re-render. Scheduled reports will send error message emails.</p><h2 id="how-can-you-identify-content-that-will-break">How can you identify content that will break?</h2><p>Conduct a Python runtime audit of your server and the deployed content. 
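For example, a first pass might pull the content list from the Connect Server API (<code>GET /__api__/v1/content</code>) and flag every item pinned to an interpreter below the new minimum. A minimal sketch, assuming the JSON has already been fetched and parsed; the sample records and the handling of the <code>py_version</code> field are illustrative, not output from a live server:</p>

```python
# Illustrative records shaped like entries from Connect's GET /__api__/v1/content.
# The titles and version strings here are sample data, not from a real server.
records = [
    {"title": "sales-report", "py_version": "2.7.16"},
    {"title": "etl-api",      "py_version": "3.8.1"},
    {"title": "static-doc",   "py_version": None},   # content that does not use Python
]

def needs_update(py_version, minimum=(3, 5)):
    """Return True when content runs a Python interpreter older than `minimum`."""
    if py_version is None:
        return False
    major_minor = tuple(int(part) for part in py_version.split(".")[:2])
    return major_minor < minimum

stale = [r["title"] for r in records if needs_update(r["py_version"])]
print(stale)  # -> ['sales-report']
```

<p>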
This <a href="https://youtu.be/GLJucEndOgo">video overview</a> provides tips for auditing Python usage on RStudio Connect depending on the level of detail you need.</p><ul><li>Start by executing the script in <code>/opt/rstudio-connect/scripts/find-python-envs</code> which will list all Python virtual environments and the applications that use them.</li><li>If you require a more detailed audit, use the RStudio Connect Server API to create a custom report:<ul><li>This <a href="https://docs.rstudio.com/connect/cookbook/runtimes/">basic example</a> includes a summary report without links to content items.</li><li>This <a href="https://solutions.rstudio.com/data-science-admin/connect-apis/python-audit-report/">advanced example</a> includes links to each piece of content with owner contact information.</li></ul></li></ul><h1 id="upgrade-rstudio-connect">Upgrade RStudio Connect</h1><p>Before upgrading, please review the <a href="http://docs.rstudio.com/connect/news">full release notes</a>.</p><p>Upgrading RStudio Connect typically requires less than five minutes. This release contains a breaking change which may require an update to the Connect configuration file.</p><p>The <code>rstudio-connect</code> service will fail to start if the configuration file includes Python 2 or Python 3 versions prior to 3.5. 
<a href="https://docs.rstudio.com/connect/admin/getting-started/#editing-config">Edit the configuration</a> to remove any <code>Python.Executable</code> properties that do not meet the new minimum version requirements.</p><p>If you are upgrading from a version earlier than the February 2022 edition, be sure to consult the release notes for the intermediate releases, as well:</p><ul><li>February 2022 contained a breaking change for certain LDAP configurations, and <a href="https://www.rstudio.com/about/platform-support/">Platform Support</a> updates.</li><li>November 2021 contained a breaking change for stricter permissions on the creation of several <code>Server.DataDir</code> sub-directories.</li><li>Review breaking changes from earlier versions in the <a href="http://docs.rstudio.com/connect/news">release notes</a>.</li></ul><p>To perform an RStudio Connect upgrade, download and run the installation script. The script installs a new version of Connect on top of the earlier one. Existing configuration settings are respected.</p><pre><code># Download the installation script
curl -Lo rsc-installer.sh https://cdn.rstudio.com/connect/installer/installer-v1.10.0.sh

# Run the installation script
sudo -E bash ./rsc-installer.sh 2022.03.2</code></pre><p><a href="https://docs.rstudio.com/rsc/upgrade/">Standard upgrade documentation can be found here</a>.</p><h3 align="center"><a href="https://rstudio.com/about/subscription-management/">Sign up for RStudio Professional Product Updates</a></h3></description></item><item><title>Make robust, modular dashboards with golem and graveler</title><link>https://www.rstudio.com/blog/make-robust-modular-dashboards-with-golem-and-graveler/</link><pubDate>Thu, 07 Apr 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/make-robust-modular-dashboards-with-golem-and-graveler/</guid><description><p><sup>Photo courtesy of <a href="https://pixabay.com/photos/job-office-team-business-internet-5382501/">Pixabay</a></sup></p><div 
class="lt-gray-box">This is a guest post from Alan Carlson at Snap Finance. As the Tech Lead for the Business Intelligence (BI) team, Alan's primary focus at Snap is researching, creating, and maintaining methods that help the rest of Snap’s BI Team in their work. From dashboards to visualizations to R code in general, he has built multiple packages and bookdowns that make BI easier to train and to use within the RStudio environment.<p>Since 2012, Snap has been on a mission to bring flexible, pay-over-time financing options to all consumers. For more information, visit <a href="https://snapfinance.com/" target = "_blank">snapfinance.com</a>.</p></div><p>Has this ever happened to you?</p><p>You are assigned to maintain a dashboard someone on your team has graciously left you, and the users of this dashboard ask you to add a new feature. Fine, you think. &ldquo;The data for this dashboard doesn’t look terribly complex. I’ll just step in where so-and-so left off and add that feature easily!&rdquo;</p><p>Minutes quickly turn into hours as you realize that the last developer coded an entirely different way than you do. It will take days of refactoring just to understand what their code is trying to do, let alone add a new feature. Eventually, you scrap their entire dashboard and decide to rebuild it.</p><p>Or perhaps you have incoming hires of fresh developers who have never worked with {shiny} before. Can you trust that the training you have (if any) is sufficient and will prepare them to build production-ready dashboards independently?</p><p>This was the two-pronged challenge facing our team a few years ago. We needed to completely overhaul old dashboards while also training staff on how to build tools that help drive insights for the company. 
We quickly realized that we needed to decrease the time spent building dashboard frameworks for our new members and ensure that anyone taking over another dashboard could easily modify them.</p><p><a target="_blank" rel="noopener noreferrer" class="btn btn-primary pl-5 pr-5 mt-4" href="https://www.addevent.com/event/xZ12108850">Join us live on April 12th to hear directly from the Snap Finance team about how they put robust, modular dashboards into practice and ask questions about how you can do the same!</a></p><h2 id="our-journey-towards-a-reproducible-development-workflow">Our journey towards a reproducible development workflow</h2><p>One of the major advantages of open source is the vast community that enables the widespread sharing of knowledge. Community members collaborate to develop tools that help solve problems. Sharing those tools allows others to benefit from their contributions.</p><p>Our journey started with <a href="https://engineering-shiny.org/index.html" target = "_blank">{golem}</a>, an open-source package built by the great team at <a href="https://thinkr.fr/" target = "_blank">ThinkR</a>. At a high level, {golem} turns your Shiny dashboards into a package framework and allows you to develop and deploy them exactly as you would an R package. This allows for better documentation, testing, robustness, etc. It’s a wonderful framework for engineering dashboards. However, the concepts themselves can be complex and technical. At the time, we knew this would be hard to explain (and implement) with our new developers.</p><p>Building upon the amazing {golem} package, we created a wrapper that abstracts away its technical side and sets defaults for our development workflow. We also set the defaults to include our company branding. 
We named this new internal package {graveler}.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p><h2 id="using-graveler-to-set-up-a-shiny-dashboard-framework">Using {graveler} to set up a Shiny dashboard framework</h2><p>Information on {graveler} can be found in the <a href="https://github.com/ghcarlalan/graveler" target = "_blank">GitHub repository</a>. Install the package using devtools:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R">devtools<span style="color:#666">::</span><span style="color:#06287e">install_github</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">ghcarlalan/graveler&#34;</span>)</code></pre></div><p>A new option appears when you open a new project. You can set your package name, username, and display title in this dialogue box.</p><p>Creating this project will include all the files necessary to create your initial dashboard and theming. You will see three open files: <code>01_dev.R</code>, <code>run_dev.R</code>, and <code>02_deploy.R</code>.</p><script src="https://fast.wistia.com/embed/medias/qns1frhlv2.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:62.5% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_qns1frhlv2 videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><p>The first file sets up the dependencies for deployment. These include the libraries you need to run your dashboard, a <code>golem.config</code> system file, an <code>app.R</code> file to deploy on RStudio Connect, and a manifest file to use git-backed content within RStudio Connect. 
You can adjust these to fit your workflow.</p><p>Once those steps are taken care of, execute the <code>run_dev.R</code> file, create the <code>golem-config.yml</code> file, and you will have your very own Shiny dashboard skeleton set up with minimal effort!</p><script src="https://fast.wistia.com/embed/medias/own33y25ws.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:62.5% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_own33y25ws videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><p>The third file has helper functions we use to add <a href="https://docs.rstudio.com/connect/user/content-settings/#content-vars" target = "_blank">environment variables</a> programmatically to our published work.</p><p>To add a new tab / module to your dashboard, you run <code>graveler::level_up(name = &quot;foo&quot;)</code>. This creates a module for your dashboard that contains sections for your UI and server code. At the bottom of each module, you will see three lines of code that you will copy and paste in their respective system files: <code>body.R</code>, <code>app_server.R</code>, and <code>sidebar.R</code>. 
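For readers who haven&rsquo;t used Shiny modules before, the generated UI/server pair follows Shiny&rsquo;s standard module pattern; a generic skeleton of the kind <code>level_up()</code> scaffolds looks roughly like this (the names below are illustrative, not {graveler}&rsquo;s exact output):</p>

```r
# Illustrative Shiny module skeleton -- not {graveler}'s exact output.
library(shiny)

mod_foo_ui <- function(id) {
  ns <- NS(id)  # namespace the IDs so multiple module instances don't collide
  tagList(
    selectInput(ns("var"), "Variable", choices = names(mtcars)),
    plotOutput(ns("hist"))
  )
}

mod_foo_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    output$hist <- renderPlot(hist(mtcars[[input$var]], main = input$var))
  })
}
```

<p>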
The <code>level_up()</code> function also creates a <code>fct</code> file per {golem}’s recommendation, which incentivizes you to build functions for your dashboard modules to streamline debugging and testing.</p><p>From there, it’s just a matter of building modules and publishing with whatever workflow your team uses.</p><script src="https://fast.wistia.com/embed/medias/pl819znagu.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.83% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_pl819znagu videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><h2 id="sharing-and-publishing-dashboards-with-rstudio-tools">Sharing and publishing dashboards with RStudio tools</h2><p>Here at Snap, we use <a href="https://docs.rstudio.com/connect/user/git-backed/" target = "_blank">RStudio’s Git-backed publishing feature</a> to seamlessly modify dashboards across our team and publish to RStudio Connect. We have two RStudio Connect servers: production and development, which allows us to test features or coding methods without messing with what our stakeholders are seeing. <a href="https://www.rstudio.com/products/package-manager/" target = "_blank">RStudio Package Manager</a> helps us share internally created packages (like {graveler}) and ensure that we all use the same packages/functions for our work.</p><h2 id="ultimately-were-spending-more-time-_actually-developing_-now">Ultimately, we’re spending more time <em>actually developing</em> now</h2><p>With {graveler}, RStudio Package Manager, and RStudio Connect, developers are spending more time <em>actually developing</em> instead of spending time trying to spin up a Shiny framework. 
More importantly, they are all building dashboards <em>the same way</em>, reducing tech debt and simplifying code review. The ability to bring on new developers, integrate them into our workflow, and have them become almost instant contributors to our BI work with {golem} and {graveler} has been a great advantage to our team, our codebase, and our Sr. Developers’ sanities.</p><section class="footnotes" role="doc-endnotes"><hr><ol><li id="fn:1" role="doc-endnote"><p>But if {golem} is the platonic ideal of how to engineer dashboards, then what comes before it?</p><p>If you know your folklore, golems are animated beings made from inanimate objects. This sounds like what we do as developers: breathe life into our dashboards. However, Golem is also a creature from Nintendo’s Pokémon game franchise. If you are unfamiliar, Pokémon get stronger via evolution and, in Golem’s case, it reaches its final form once you evolve it from a Graveler.</p><p>And with the name finally settled, we could now start on our journey of standardization and replication with {golem}’s devolution: {graveler}. 
<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p></li></ol></section></description></item><item><title>Mark Your Calendar for the Appsilon Shiny Conference </title><link>https://www.rstudio.com/blog/mark-your-calendar-for-the-appsilon-shiny-conference/</link><pubDate>Wed, 06 Apr 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/mark-your-calendar-for-the-appsilon-shiny-conference/</guid><description><p>We’re excited to team up with our Full Service Partner, <a href="https://appsilon.com/" target = "_blank">Appsilon</a>, as they host their first virtual <a href="https://appsilon.com/2022-appsilon-shiny-conference/" target = "_blank">Shiny conference</a>.</p><h2 id="what-should-i-expect">What should I expect?</h2><p>Three days of free, online Shiny content ranging from tips and tricks from the experts, to fascinating community case studies, and examples of enterprise scaling solutions.</p><p>Live presentations are scheduled to overlap appropriate business hours for time zones across Europe, Africa, and the Americas (Start: April 27th, 11:00 CEST/05:00 CDT, End: April 29th, 19:00 CEST/13:00 CDT).</p><p>Our own Joe Cheng and Winston Chang will be joining the keynote panel to answer your questions and discuss what they’re most excited about on the Shiny horizon. They’ll be chatting with Appsilon’s CEO, Filip Stachura, and the panel will be moderated by Eric Nantz, who you may recognize from the <a href="https://shinydevseries.com/" target = "_blank">Shiny Developer Series</a>.</p><p>The panelists will be taking live questions from the audience using slido. Drop in to ask your question and upvote others. 
The goal is to facilitate a lively conversation that speaks to the wider Shiny community, and everyone is encouraged to participate, regardless of level of experience.</p><br><center><a href="https://rstd.io/appsilon-shiny-conf" style="padding: 12px; border: none; font-size: 18px; border-radius: 3px; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Submit a question here!</a></center><h2 id="how-can-i-sign-up">How can I sign up?</h2><p>Registration is free and open to everyone around the world. Go ahead and reserve your spot here: <a href="https://shinyconf.com" target = "_blank">shinyconf.com</a></p><p>The team is using Hopin to deliver a seamless conference experience where participants can connect with each other while watching talks and tutorials. Talks range from 5 to 20 minutes, so there should be a variety of content interesting to everyone. You can also see an early sneak peek of the speaker lineup and talk titles on the conference website.<br></p><center><a href="https://shinyconf.com" style="padding: 12px; border: none; font-size: 18px; border-radius: 3px; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Register now!</a></center><h2 id="where-can-i-learn-more">Where can I learn more?</h2><p>Visit <a href="https://appsilon.com/2022-appsilon-shiny-conference/" target = "_blank">shinyconf.com</a> to learn more about the types of talks, the organizing committee, and to see the Code of Conduct and adherence to accessibility standards.</p><center><a href="https://appsilon.com/2022-appsilon-shiny-conference/" style="padding: 12px; border: none; font-size: 18px; border-radius: 3px; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Learn more!</a></center><br>We look forward to seeing you online!</description></item><item><title>Teaching Data Science in the 
Cloud</title><link>https://www.rstudio.com/blog/teaching-data-science-in-the-cloud/</link><pubDate>Mon, 04 Apr 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/teaching-data-science-in-the-cloud/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@cwmonty?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Chris Montgomery</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup></p><p>Data science and programming languages like R and Python are some of the most in-demand skills in the world. <a href="https://www.rstudio.com/products/cloud/" target = "_blank">RStudio Cloud</a> is a simple but powerful solution for teaching and learning analytics at scale. RStudio Cloud solves many of the technical and financial challenges associated with teaching data science. It’s also a joy to use for professors, students, and IT administrators.</p><p>Pete Knast hosted the <a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oS7cksN3VyS7LszzqAFZsOg" target = "_blank">RStudio Cloud Live Series</a> to discuss approaches and tools for teaching in the cloud. These webinars were presented by Dr. Brian Anderson, Dr. Patricia Menéndez, and Dr. Mine Çetinkaya-Rundel.</p><p>We received many great pedagogical questions during the sessions. We also received inquiries about RStudio Cloud’s functionality and implementation. 
Below, we share insights from our presenters.</p><ul><li><a href="#implementing-teaching-in-the-cloud">Implementing teaching in the cloud</a></li><li><a href="#teaching-data-science">Teaching data science</a></li><li><a href="#using-rstudio-cloud">Using RStudio Cloud</a></li></ul><p>We also provide more information on RStudio Cloud for you to explore.</p><ul><li><a href="#learn-more-about-rstudio-cloud">Learn more about RStudio Cloud</a></li><li><a href="#resources-and-links">Resources and links</a></li></ul><h2 id="implementing-teaching-in-the-cloud">Implementing Teaching in the Cloud</h2><p><strong>How do I get started with talking to my college dean and IT about the possibility of hosting RStudio Cloud? What was the budget approval process?</strong></p><p><strong>Dr. Menéndez:</strong> First, I wanted to see what RStudio Cloud could offer to the units I am teaching. I wrote a pros/cons document with the reasons and why it was worth investing in it. I discussed the functionality of the tool with the RStudio team. We also had a conversation with the IT team and we decided to use the server from RStudio Cloud.</p><p>I discussed with the RStudio team to see if the budget was aligned to what I needed, and then with the department head and the department manager. Finally, the department manager discussed the budget with the RStudio team.</p><p><strong>I wonder about the partnership between administration, IT, and faculty. Our IT has limited ability to implement remote and cloud resources.</strong></p><p><strong>Dr. Anderson:</strong> This is where cloud solutions can be helpful. By minimizing the direct IT support on student machines, or in a lab space, cloud solutions should not add significantly to IT workload. 
Of course, how the school deploys the cloud solution is a variable in that workload, as is the extent to which the cloud solution integrates well with other IT platforms used for instruction.</p><p>Ideally, a cloud solution would be familiar to both faculty and IT, in the sense that the cloud version of the tool has a similar look and functionality as its desktop version, if applicable. Support agreements tied to licensing can also be a possibility, along with a robust help area for students to seek solutions to common software challenges. It is, though, critical for a new initiative to be either led by faculty or otherwise have significant faculty support.</p><p><strong>Can RStudio Cloud integrate with my school’s existing authentication setup?</strong></p><p><strong>Pete Knast:</strong> Yes, depending on your school’s Single Sign-On setup, we can likely integrate with it. We’ve integrated with popular ones like Shibboleth, Google Auth, and SAML. If you’re using something else, let us know.</p><p><strong>Could we see a demo of the rscloud package?</strong></p><p><strong>Pete Knast:</strong> The <a href="https://GitHub.com/rstudio/rscloud" target = "_blank">rscloud</a> package is an API wrapper for the rstudio.cloud service. The initial release includes APIs for managing space memberships and listing projects within a space. You can see a demo of the package on our <a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oSn7tBLDiSt4Vnyk3yB6ipA" target = "_blank">RStudio Cloud YouTube playlist</a>.</p><h2 id="teaching-data-science">Teaching Data Science</h2><p><strong>Do students generally have any prior programming experience? Do you do a primer first with GitHub?</strong></p><p><strong>Dr. Menéndez:</strong> The students generally need to learn both R and GitHub. I teach them to use Git through both the command line interface and through RStudio. 
If you learn through the command-line interface, then you can use Git with other programming languages like Matlab or Python. As part of the course, students learn how to create a repository and how to store it in GitHub. I use RStudio Cloud for the first few weeks to get them familiar with R and RStudio. Once we start learning Git, I transition them to using RStudio installed locally on their machines.</p><p><strong>I have found that teaching Markdown is the most difficult for students. How do you approach this?</strong></p><p><strong>Dr. Menéndez:</strong> Students struggle at the beginning with Markdown. First, I teach them about R and reproducibility. Then, we talk about integrating code with text to create a report. I introduce this sequentially so they understand why they need markdown and what the benefits are. In the reproducibility unit, we follow the same structure so that students without a lot of R knowledge are able to create reproducible reports using R and R Markdown while learning how to use Git and GitHub.</p><p><strong>In the past, I’ve had students pick some dataset and by the end of the semester create a report based on their chosen dataset. Do you use the same dataset for your entire course?</strong></p><p><strong>Dr. Menéndez:</strong> For each lecture and assignment, we use different datasets. We may use the same dataset in different contexts, but in general, we use different real-life examples each time.</p><p><strong>Can you speak more to teaching business students analytics? We are developing similar programs and need to differentiate them from data science.</strong></p><p><strong>Dr. Anderson:</strong> From my perspective, what differentiates business analytics from data science is context and workflow. When we think about using analytics to solve a business problem or inform a decision, communication is paramount to this process. 
What makes analytics useful in business is the ability to influence and persuade, which requires the analyst to have a deep understanding of the business context and the ability to communicate key insights in a clear and compelling way. As such, communication exists as coequal in workflow importance to, for example, modeling and data generation. Further, because of the nature of how we communicate in a business context, data visualization skills take on greater salience. To be clear, data science workflows also include context, communication, and visualizations. My argument is that a business analytics curriculum will place different emphases on workflow elements than would, potentially, a data science curriculum.</p><p><strong>On the financial constraints of adopting private solutions, would you think it possible to teach how to do, e.g., financial analysis with R? What about employment opportunities with this route?</strong></p><p><strong>Dr. Anderson:</strong> Yes, I think R, particularly with additional tools like RStudio, could be extended to teach business fundamentals that are typically taught using MS Excel. The challenge, however, will be ensuring that business graduates, particularly in finance and accounting, also have excellent MS Excel skills. In that sense, leveraging R is best positioned as a complement to, and not a replacement of, MS Excel.</p><h2 id="using-rstudio-cloud">Using RStudio Cloud</h2><p><strong>Does it make sense to use RStudio Cloud for the entirety of the course or to have students transition to local machines?</strong></p><p><strong>Dr. Çetinkaya-Rundel:</strong> That is a good question and it depends on the goals of your course. As someone who generally teaches introductory courses, I want students to understand that the “cloud” is not an esoteric thing — that it is actually somebody else’s machine. 
We want students to understand what it means for things to live on the cloud so if they start working on a project with sensitive data, they know they shouldn’t just upload it to RStudio Cloud without first making sure that it is okay.</p><p>If the goals of your course include software installation, then it makes sense to transition. One benefit of teaching installation later is that students are not both new to R and installation, so they can distinguish between an R error and an error due to their setup.</p><p>When people go on to do things past their university life, chances are they will continue to work in the cloud. A lot of academic computing happens on computing clusters. This notion of doing things on the cloud eases onboarding, but it is not just an unrealistic “baby steps” solution.</p><p><strong>Dr. Menéndez:</strong> The students with no coding experience tend to be very nervous at the beginning. In addition to learning R, RStudio, and reproducibility, they also need to learn the command line interface and Git/GitHub. Using RStudio Cloud in the first four weeks makes it easy because students feel safe opening projects, running code, etc. They learn about version control and installing packages. After a few weeks, they feel very confident. Then, we slowly transition to the desktop.</p><p><strong>Can you speak to your use of the functionality that allows you as the instructor to access the projects of the students?</strong></p><p><strong>Dr. Menéndez:</strong> This is a great feature from RStudio Cloud. I can click into a feature that lets me see all of the students’ projects. I can search for a specific student in the members&rsquo; space and then click on their project to open it. 
I’d prefer students create a reproducible example when they need help, but this gives me the flexibility to see what’s happening if a student is really stuck.</p><p><img src="slide.png" alt="Slide showing the RStudio Cloud Assignment space"></p><center><caption><a href="https://GitHub.com/okayama1/Rstudio-talk/blob/master/slides/PMenendez_Rstudio.pdf?utm_source=PMenendez_talk" target = "_blank">Slide 14 from Dr. Menéndez’s slide deck</a></caption></center><p><strong>Is there a way to distribute data files to students without setting up a project?</strong></p><p><strong>Dr. Çetinkaya-Rundel:</strong> If you are not setting up a project, there isn&rsquo;t necessarily an RStudio Cloud-specific solution. If you have access to a place to host your dataset, such as a GitHub repository, then you can read the files using the URL. If the goal is to teach students how to move files, then I provide instructions on how to download a dataset onto their computer and then upload it into RStudio Cloud.</p><p><strong>How does RStudio Cloud scale (say, when you have a class of 30 students versus a class of 300)?</strong></p><p><strong>Dr. Menéndez:</strong> RStudio Cloud scales very well. You share the RStudio Cloud space for a unit with a link, so it doesn’t matter if you have 30 students or 300 students. They click the link to enter the workspace. The issue is when you are getting questions from 300 students, but that is more about content than the technology. For that, we open several Zoom channels where the students can seek help for technical issues if they arise. I communicate with my teaching associates via Slack so that the teaching team is all connected.</p><p><strong>How do you handle grading?</strong></p><p><strong>Dr. Çetinkaya-Rundel:</strong> There isn’t a grading feature in RStudio Cloud but it is a feature request. 
If you&rsquo;re not having students submit their work elsewhere (e.g., GitHub, or your school&rsquo;s learning management system), I would recommend going into each project, creating a file for the feedback or leaving it inline, and recording grades in a separate CSV file you can upload to wherever you store grades.</p><p><strong>Dr. Menéndez:</strong> I do not use something that integrates RStudio Cloud with LMSs. Students work on their assignments using RStudio Cloud and then they download their RStudio projects into a ZIP folder. Afterwards, they upload the folder into the LMS system and share their project link. We mark the RStudio project and the knitted files, and I provide students feedback via the LMS system.</p><h2 id="learn-more-about-rstudio-cloud">Learn More About RStudio Cloud</h2><p><strong>Is it true you can access Jupyter Notebooks in RStudio Cloud?</strong></p><p>Yes, this is still in Beta but you can request access if you have a paid plan. We will be transitioning to general availability later this year. Even if you do not have a paid plan and are interested in trying it out, send us an email and we can set you up.</p><p><strong>How does RStudio Cloud differ from RStudio Server? Are there any additional benefits that RStudio Cloud may provide?</strong></p><p>RStudio Cloud has a few features that don’t exist in other RStudio offerings, like the ability to create workspaces or assignments. RStudio Cloud is also a hosted SaaS offering so RStudio handles the infrastructure for you.</p><p><strong>Where can one go to start experimenting with RStudio Cloud in our classrooms? Who should I email for practical steps?</strong></p><p>You can send questions to <a href="mailto:sales@rstudio.cloud">sales@rstudio.cloud</a>. To start experimenting, sign up for a free account here: <a href="https://rstudio.cloud/plans/free" target = "_blank">https://rstudio.cloud/plans/free</a>.</p><p><strong>Is there an educator plan? 
I&rsquo;m wondering how a large course (100+) can use RStudio Cloud given the limitations on workspaces and hours per month.</strong></p><p>The various paid tiers, which are heavily discounted for educators, don&rsquo;t have any limitations on workspaces or hours. We have numerous degree-granting institutions that use RStudio Cloud for courses that cover hundreds and even thousands of students.</p><p><strong>I&rsquo;ve seen discounted academic licenses, but not free. Is this accurate?</strong></p><p>There is a free license that anyone can use, even non-academics. There are also multiple types of discounts for academics depending on your use case, such as if you are an instructor at a degree-granting institution, academic researcher, TA, or student.</p><p><strong>I was told last year by someone from RStudio that they were working on a collaboration feature in RStudio Cloud. Is this feature going to be released soon?</strong></p><p>Yes, we are hoping to have true collaborative editing, much like you will find in a Google Doc or other RStudio offerings, by the end of Q1 2022.</p><p><strong>How does RStudio Cloud differ from RStudio Server? Are there limitations on the use of libraries or any other add-in that you would normally use on the desktop or server version? Are there any additional benefits that RStudio Cloud may provide over RStudio Server?</strong></p><p>RStudio Cloud is different in that it offers additional capabilities, so there are no real limitations.
You can read more here:</p><ul><li><a href="https://www.rstudio.com/assets/img/RStudio-Cloud-FAQ-1-Oct-2021.pdf" target = "_blank">FAQ for Leadership and IT</a></li><li><a href="https://www.rstudio.com/assets/img/RStudio-Cloud-vs-RStudio-Workbench-2021June14.pdf" target = "_blank">RStudio Cloud &amp; RStudio Workbench: Academic Teaching/Research IT Overview</a></li></ul><h2 id="resources-and-links">Resources and Links</h2><p>We have a lot more to share:</p><ul><li>Visit the <a href="https://www.rstudio.com/products/cloud/" target = "_blank">RStudio Cloud product page</a>.</li><li>Sign up for a <a href="https://rstudio.cloud/plans/free" target = "_blank">free account</a>.</li><li><a href="https://www.rstudio.com/products/team/ytl-schedule/" target = "_blank">Book a call with an RStudio Cloud Expert</a>.</li></ul><h3 id="recordings">Recordings</h3><p>Watch the presenters’ full recordings:</p><ul><li><a href="https://www.youtube.com/watch?v=-kDO_Y8SctU" target = "_blank">Leveraging the Cloud for Analytics Instruction at Scale: Challenges and Opportunities</a> by Dr. Brian Anderson</li><li><a href="https://www.youtube.com/watch?v=DQSFOaFLI0M" target = "_blank">An inclusive solution for teaching and learning R during the COVID pandemic</a> by Dr. Patricia Menéndez.<ul><li>Find Dr. Menéndez&rsquo;s slides on <a href="https://bit.ly/PMenendez_RstudioCloud" target = "_blank">GitHub</a>.</li></ul></li><li><a href="https://youtu.be/liyJparRz2c" target = "_blank">RStudio Cloud Demo</a> with Dr.
Mine Çetinkaya-Rundel.</li></ul><h3 id="educational-resources">Educational resources</h3><p>Find resources for the use of RStudio in education:</p><ul><li><a href="https://rstudio.cloud/learn/primers" target = "_blank">RStudio Cloud Primers</a> with interactive courses in data science basics.</li><li><a href="https://www.rstudio.com/resources/cheatsheets/" target = "_blank">RStudio Cloud cheat sheets</a> for learning and using your favorite R packages and the RStudio IDE.</li><li><a href="https://youtube.com/playlist?list=PL9HYL-VRX0oSn7tBLDiSt4Vnyk3yB6ipA" target = "_blank">RStudio Cloud YouTube playlist</a> with how-to’s for creating a shared space, project types, and more.</li></ul><h3 id="blog-posts">Blog posts</h3><p>Read past blog posts on RStudio Cloud:</p><ul><li><a href="https://www.rstudio.com/blog/rstudio-cloud-announcement/" target = "_blank">Do, Share, Teach, and Learn Data Science with RStudio Cloud</a></li><li><a href="https://www.rstudio.com/blog/rstudio-cloud-a-student-perspective/" target = "_blank">Learning Data Science with RStudio Cloud: A Student’s Perspective</a></li></ul></description></item><item><title>RStudio Community Monthly Events Roundup - April 2022</title><link>https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-april-2022/</link><pubDate>Fri, 01 Apr 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-april-2022/</guid><description><sup>Photo by <a href="https://unsplash.com/@nickmorrison?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Nick Morrison</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p>Welcome to RStudio Community Monthly Events Roundup, where we update you on upcoming virtual events happening at RStudio this month. Missed the great talks and presentations from last month? 
Find them listed under <a href="#icymi-march-2022-events">ICYMI: March 2022 Events</a>.</p><p>You can subscribe to the Community Events Calendar so that new events will automatically appear on your calendar. Please note that subscribing will add all of the events in the calendar to your own calendar. If you wish to add individual events instead, please use the links below.</p><p>We can’t wait to see you there!</p><h2 id="save-the-date">Save the Date</h2><ul><li>April 7, 2022 at 12 ET: Data Science Hangout with Jenny Listman, Director of Research at Statespace <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>April 12, 2022 at 12 ET: RStudio Finance Meetup: Robust, modular dashboards that minimize tech debt | Led by Alan Carlson at Snap Finance <a href="https://www.addevent.com/event/xZ12108850" target = "_blank">(add to calendar)</a></li><li>April 14, 2022 at 12 ET: Data Science Hangout with Joseph Korszun, Manager of Data Science at ProCogia <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>April 21, 2022 at 12 ET: Data Science Hangout with Tegan Ashby, Senior Developer, Basketball Systems at the Brooklyn Nets and Co-Founder of Women in Sports Data <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>April 26, 2022 at 12 ET: Championing Analytic Infrastructure | Led by Kelly O’Briant at RStudio <a href="https://www.addevent.com/event/dM11812539" target = "_blank">(add to calendar)</a></li><li>April 28, 2022 at 12 ET: Data Science Hangout with Daren Eiri, Director of Data Science at Arrowhead General Insurance Agency <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>May 3, 2022 at 4 ET: Under the Hood of the Aotearoa Road Trip App with Epi-interactive | Led by Dr Uli Muellner and Nick Snellgrove at Epi-interactive <a href="https://evt.to/aemdisoiw" target =
"_blank">(add to calendar)</a></li><li>May 11, 2022 at 12 ET: Optimizing Shiny for enterprise-grade apps | Led by Veerle Van Leemput at Analytic Health <a href="https://evt.to/aeioeimaw" target = "_blank">(add to calendar)</a></li><li>May 17, 2022 at 12 ET: R for Clinical Study Reports &amp; Submission | Led by Yilong Zhang, PhD at Merck <a href="https://rstd.io/pharma-meetup" target = "_blank">(add to calendar)</a></li></ul><h2 id="recurring-events">Recurring Events</h2><h3 id="data-science-hangout">Data Science Hangout</h3><p>The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders to connect and chat about some of the more human-centric questions around data science leadership. These happen every Thursday at 12 ET with a different leader featured each week.</p><p>The conversation is all audience-based and you can jump on whenever it fits your schedule. Add the weekly hangouts to your calendar on <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">AddEvent</a> and check out the <a href="https://www.rstudio.com/data-science-hangout/" target = "_blank">website</a> with all the recordings.</p><p>A few other things:</p><ul><li>All are welcome - no matter your industry/experience</li><li>No need to register for anything</li><li>It&rsquo;s always okay to join for part of a session</li><li>You can just listen in if you want</li><li>You can ask anonymous questions too!</li></ul><h3 id="rstudio-enterprise-community-meetups">RStudio Enterprise Community Meetups</h3><p>We also host industry meetups for teams to share the work they are doing within their organizations, teach lessons learned, and network with others. 
Join the group on <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target = "_blank">Meetup</a>.</p><h2 id="icymi-march-2022-events">ICYMI: March 2022 Events</h2><ul><li>March 2, 2022 at 12 ET: <a href="https://youtu.be/0iljqY9j64U" target = "_blank">Advanced Usage Tracking of Shiny Applications in RStudio Connect</a> | Led by Cole Arendt</li><li>March 3, 2022 at 12 ET: <a href="https://youtu.be/z3j_JPsyKvk" target = "_blank">Data Science Hangout with Stephen Bailey</a>, Data Engineer at Whatnot</li><li>March 9, 2022: <a href="https://youtu.be/l_U3hQ6mm60" target = "_blank">Data Visualization Accessibility</a> | Led by Mara Averick and Maya Gans</li><li>March 10, 2022 at 12 ET: <a href="https://youtu.be/FHNp9IFak6E" target = "_blank">Data Science Hangout with Kristi Angel</a>, Experimentation at Stitch Fix</li><li>March 15, 2022 at 12 ET: <a href="https://youtu.be/hyvClyUOY04" target = "_blank">R for Excel Users - First Steps</a> | Led by George Mount</li><li>March 17, 2022 at 12 ET: <a href="https://youtu.be/xkMt6aK0ZjE" target = "_blank">Data Science Hangout with Joe Gibson</a>, Senior Project Director at de Beaumont Foundation</li><li>March 23, 2022 at 12 ET: <a href="https://youtu.be/nA9fVOCD8yM" target = "_blank">RStudio Energy Meetup - Introduction to functional data analysis</a> | Led by Santiago Rodriguez</li><li>March 24, 2022 at 12 ET: <a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oTu3bUoyYknD-vpR7Uq6bsR" target = "_blank">Data Science Hangout with Erin Pierson</a>, Senior Manager of Trading Operations at Charles Schwab (coming soon)</li><li>March 31, 2022 at 12 ET: <a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oTu3bUoyYknD-vpR7Uq6bsR" target = "_blank">Data Science Hangout with Mike Smith</a>, Senior Director Statistics at Pfizer (coming soon)</li></ul><h2 id="call-for-speakers">Call for Speakers</h2><p>If you’re interested in sharing your work at a Meetup (or just starting to consider it for a future date 
down the road!), <a href="https://forms.gle/EtXMpSoTfhpGopiS8" target = "_blank">please fill out the speaker submission form</a>. We’re always looking for a diverse set of speakers — if you are a member of a group that is underrepresented in data science, including people of color, those with disabilities, elders/older adults, LGBTQ folks, and women/minority genders, we highly encourage you to submit!</p></description></item><item><title>Track Shiny App User Activity With the RStudio Connect Server API</title><link>https://www.rstudio.com/blog/track-shiny-app-use-server-api/</link><pubDate>Wed, 30 Mar 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/track-shiny-app-use-server-api/</guid><description><p>Data scientists spend a lot of time creating apps, dashboards, and reports. All of this effort is often hampered by siloed workflows between coworkers and across teams, which leads to delays in presenting your insights to stakeholders and clients.</p><p>After all that time and effort, are you even sure what you&rsquo;re sharing is relevant to your audience? You may start to wonder: who looked at this recently? What content is most popular? Having the numbers to back up your work can help you determine your next move and justify your efforts to your stakeholders.</p><p><a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a> is designed to help you share and understand your data products in real-time. Connect is RStudio&rsquo;s enterprise publishing platform for data science products like R Markdown documents, Shiny apps, Flask APIs, and more. Using the <a href="https://docs.rstudio.com/connect/api" target = "_blank">RStudio Connect Server API</a>, you can extend Connect to see advanced usage metrics to answer important questions and focus your data science work.</p><p>Cole Arendt from RStudio presented on the topic during a <a href="https://www.youtube.com/watch?v=0iljqY9j64U" target = "_blank">YouTube Live event</a>. 
Watch the webinar here:</p><center><iframe width="640" height="360" src="https://www.youtube.com/embed/0iljqY9j64U" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center><h2 id="collecting-instrumentation-data-from-rstudio-connect">Collecting Instrumentation Data From RStudio Connect</h2><p>RStudio Connect automatically records “instrumentation data”, or data from when users visit your server. As a publisher or administrator, you have access to these data: who logged in, when they logged in, what they looked at, and how long they spent on that piece of content.</p><p>Here is an example of Shiny instrumentation data from Connect:</p><p><img src="images/image1.png" alt="Shiny usage instrumentation data"></p><p>You can use instrumentation data to answer questions like:</p><ul><li>How many Shiny apps are on this server?</li><li>Who looked at this app recently and for how long?</li><li>Which users have access to which app?</li><li>Is viewership increasing?</li></ul><p>With this information, you can track progress against your goals and efficiently set up your next steps. This will cut down on overhead and free up time to allocate towards other efforts.</p><p>If you are interested in ways of boosting viewership, you can create a custom gallery of your content using the <a href="https://rstudio.github.io/connectwidgets/" target = "_blank">connectwidgets</a> package. This makes it easy for your audience to discover your work without scrolling through your entire Connect server. Have some restricted apps in your gallery? The permissions feature directs non-authenticated viewers to request access. 
Read more about <a href="https://www.rstudio.com/blog/rstudio-connect-1-9-0/" target = "_blank">RStudio Connect&rsquo;s content creation features</a>.</p><h2 id="extending-connect-with-the-rstudio-connect-server-api">Extending Connect With the RStudio Connect Server API</h2><p>The RStudio Connect Server API provides easy access to your server&rsquo;s instrumentation data. With the <a href="https://pkgs.rstudio.com/connectapi/" target = "_blank">R</a> and <a href="https://github.com/rstudio/rsconnect-python/" target = "_blank">Python</a> clients, you can load your server&rsquo;s instrumentation data into your IDE.</p><p>To install connectapi in R, run the following:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R">remotes<span style="color:#666">::</span><span style="color:#06287e">install_github</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">rstudio/connectapi&#39;</span>)</code></pre></div><p>Create a client:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"><span style="color:#06287e">library</span>(connectapi)
client <span style="color:#666">&lt;-</span> <span style="color:#06287e">connect</span>(server <span style="color:#666">=</span> <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">https://connect.example.com&#39;</span>,
                  api_key <span style="color:#666">=</span> <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">&lt;SUPER SECRET API KEY&gt;&#39;</span>)

<span style="color:#60a0b0;font-style:italic"># If your server is defined by your environment variables, you can just run:</span>
client <span style="color:#666">&lt;-</span> <span style="color:#06287e">connect</span>()</code></pre></div><p>Once you set up your client, you can use it to interact with RStudio Connect.
Say you want to retrieve the instrumentation data from all of your Shiny apps:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R">usage_shiny <span style="color:#666">&lt;-</span> <span style="color:#06287e">get_usage_shiny</span>(client)</code></pre></div><p>This results in a data frame with the instrumentation data mentioned above.</p><h2 id="creating-a-report-with-connect-server-api-data">Creating a Report With Connect Server API Data</h2><p>Once you have access to the Connect server data, you can make custom informative reports for your stakeholders. For example, <a href="https://colorado.rstudio.com/rsc/usage/rsc-usage.html" target = "_blank">this flexdashboard</a> shows the most popular Shiny applications and static content from RStudio&rsquo;s demo server:</p><script src="https://fast.wistia.com/embed/medias/c8xjq88vc7.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.63% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_c8xjq88vc7 videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><p>Since this dashboard is built with R Markdown, you can schedule emails with RStudio Connect and the <a href="https://pkgs.rstudio.com/blastula/" target = "_blank">blastula</a> package. Send an email to your team anytime the dashboard is refreshed. Find out how on the <a href="https://solutions.rstudio.com/r/blastula/" target = "_blank">Solutions website</a>.</p><p><img src="images/image2.png" alt="Example email from flexdashboard showing Shiny usage"></p><p>Need something more interactive? 
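</p><p>Before reaching for anything fancier, note that a quick base-R summary of the usage data frame can already answer &ldquo;who, and for how long&rdquo;. Here is a sketch; the <code>user_guid</code>, <code>started</code>, and <code>ended</code> column names are assumptions about the output of <code>get_usage_shiny()</code>, and the data frame below is a toy stand-in for it:</p>

```r
# Toy stand-in for the data frame returned by get_usage_shiny()
# (user_guid / started / ended column names are assumptions)
usage_shiny <- data.frame(
  user_guid = c("alice", "alice", "bob"),
  started = as.POSIXct(c("2022-03-01 09:00", "2022-03-02 09:00", "2022-03-01 10:00")),
  ended   = as.POSIXct(c("2022-03-01 09:30", "2022-03-02 09:05", "2022-03-01 10:45"))
)

# Minutes per session, then total minutes per user
usage_shiny$minutes <- as.numeric(difftime(usage_shiny$ended, usage_shiny$started, units = "mins"))
totals <- aggregate(minutes ~ user_guid, data = usage_shiny, FUN = sum)
totals
```

<p>Run against the real output of <code>get_usage_shiny()</code>, the same two lines give total viewing time per user on your own server.</p><p>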
You can create a <a href="https://colorado.rstudio.com/rsc/usage-interactive/" target = "_blank">Shiny app</a> for users to explore the data in real-time. This one shows the most popular apps and most active viewers over time.</p><script src="https://fast.wistia.com/embed/medias/03wpixim0r.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.63% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_03wpixim0r videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><p>Want to try these out with your RStudio Connect server? You can download the code from our <a href="https://github.com/sol-eng/connect-usage" target = "_blank">GitHub repository</a>, add your environment variables, and display your own data.</p><h2 id="learn-more">Learn More</h2><p>With the RStudio Connect API, you can access data that helps quantify your work&rsquo;s reach and make content more relevant to your stakeholders.</p><ul><li>Watch the full webinar on <a href="https://www.youtube.com/watch?v=0iljqY9j64U" target = "_blank">YouTube</a> and review Cole&rsquo;s <a href="https://github.com/RStudioEnterpriseMeetup/Presentations/blob/main/shiny-app-usage.pdf" target = "_blank">slides</a>.</li><li>Check out the <a href="https://docs.rstudio.com/connect/cookbook/" target = "_blank">RStudio Connect Server API Cookbook</a> for useful recipes when accessing your usage data.</li><li>Read the Solutions Engineering team&rsquo;s documentation on <a href="https://solutions.rstudio.com/data-science-admin/tracking/" target = "_blank">how to create the dashboard shown above</a> and <a href="https://solutions.rstudio.com/data-science-admin/connect-apis/" target = "_blank">a showcase of different types of Connect API reports</a>.</li><li>Access the <a
href="https://pkgs.rstudio.com/connectapi/" target = "_blank">R client</a> and <a href="https://github.com/rstudio/rsconnect-python/" target = "_blank">Python client</a> for the Connect Server API.</li></ul><p>Want to make the RStudio Connect Server API bigger and better? We have a variety of spots to check out.</p><ul><li>Join the discussion on <a href="https://community.rstudio.com/t/rstudio-connect-usage-data-thread-to-discuss-ideas-improvements-and-share-how-youve-done-this/130581" target = "_blank">RStudio Community</a>.</li><li>Contribute to <a href="https://github.com/sol-eng/connect-usage" target = "_blank">open-source packages and tools</a>. We welcome pull requests, examples, feature requests, issues, and comments.</li></ul></description></item><item><title>Shiny Wordle Word Journey</title><link>https://www.rstudio.com/blog/shiny-wordle-journey/</link><pubDate>Mon, 28 Mar 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-wordle-journey/</guid><description><div class="lt-gray-box">A couple of weeks ago, Winston Chang showed how to create a Wordle app in Shiny in a four-part video series. Go watch the video series to <a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oQnWIeY_ydYBdU76iQ-tchU" target = "_blank">see how he did it</a>! Also, follow along with the first steps in this tutorial to get started with Shiny.</div><p>In our house, we like puzzles and Wordle. My family noticed while playing the deployed version that there were some words that, while amusing to younger children, were not necessarily what we were expecting to be included. So, with Winston&rsquo;s support, we decided to take a look at the word list and see if we could make some updates.</p><p>I&rsquo;ve used GitHub and R, so I could start from there.
I started with a few steps:</p><ul><li>Fork the <a href="https://github.com/wch/shiny-wordle" target = "_blank">GitHub repository</a></li><li>Clone the forked repository to my computer</li><li>Open the file &lsquo;wordlist.R&rsquo; in the RStudio IDE</li></ul><p>I was expecting to just modify a word list, basically by parsing and modifying the text file &lsquo;wordlist.R&rsquo;, putting in a pull request, and waiting for it to be merged by Winston to see our changes. But, of course, we didn&rsquo;t want to wait that long, so we decided to try deploying the Shiny app on our own instead.</p><p>First, we wanted to see if we could deploy it as it is without any modifications. I saw that Winston&rsquo;s was deployed at <a href="https://www.shinyapps.io/" target = "_blank">shinyapps.io</a>, so I too went to shinyapps.io to get set up.</p><ul><li>Go to <a href="https://www.shinyapps.io/" target = "_blank">shinyapps.io</a></li><li>Register an account (I just used a Gmail account)</li><li>Once logged in, follow the <a href="https://shiny.rstudio.com/articles/shinyapps.html" target = "_blank">instructions</a><ul><li>Install <a href="https://rstudio.github.io/rsconnect/" target = "_blank">rsconnect</a></li><li>Configure rsconnect</li></ul></li></ul><p>Then, I went back to my RStudio IDE to try to deploy a Shiny app.</p><p>I thought I&rsquo;d try the demo one first to make sure things were working, so I did <strong>File</strong> -&gt; <strong>New File</strong> -&gt; <strong>Shiny Web App</strong>:</p><center><img src="images/image1.png" alt="RStudio's file menu opening up a Shiny Web App" width="70%"></center><p>That brought up a &lsquo;New Shiny Web Application&rsquo; window.
I named it &lsquo;shiny-demo&rsquo; and clicked &lsquo;Create&rsquo;.</p><center><img src="images/image2.png" alt="New Shiny Web App window in RStudio with shiny-demo as the name of the new app" width="70%"></center><p>That opened a window with the file &lsquo;app.R&rsquo; that told me to click the button &lsquo;Run App&rsquo;, so I did!</p><center><img src="images/image3.png" alt="New app.R file letting you know that this is a Shiny app that you can run by clicking Run App" width="70%"></center><p>And up opened a Shiny app on Old Faithful!</p><center><img src="images/image4.png" alt="Old Faithful Geyser Data app showing a histogram of the data and a slider for number of bins" width="70%"></center><p>Great, so we had it working locally. We&rsquo;re basically the coolest now; we did something interactive! Now, could we deploy it to shinyapps.io to share this exciting app with the world?</p><p>I went down to the console, first loaded the rsconnect package, and then deployed the app.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">rsconnect&#39;</span>)</code></pre></div><p>(If you forget to load the library, fear not, it won&rsquo;t work, so you&rsquo;ll be reminded that you need to do it).</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">deployApp</span>()</code></pre></div><center><img src="images/image5.png" alt="Text shown when deploying an app letting you know the process of being uploaded to shinyapps.io"></center><p>Then 🎉! There it was at <a href="https://tracy-teal.shinyapps.io/shiny-demo/">https://tracy-teal.shinyapps.io/shiny-demo/</a>.</p><p>Empowered by deploying something, we wanted to go on to deploy the Shiny Wordle app.
If we could, it would give us all the power to create whatever word lists we wanted!</p><p>I went back to my &lsquo;shiny-wordle&rsquo; directory:</p><ul><li>Set the working directory to that folder using <code>setwd()</code></li><li>Open the file &lsquo;app-final.R&rsquo; and click &lsquo;Run App&rsquo;</li></ul><p>It opened locally! It was working!</p><p>Now to deploy.</p><p>I tried <code>deployApp('app-final.R')</code>.</p><p>That didn&rsquo;t work. It said <code>Error in deployApp(&quot;app-final.R&quot;)</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash"><span style="color:#4070a0">&#34;/Users/tracyteal/Documents/git/fork/shiny-wordle/app-final.R must be a directory, an R Markdown document, or an HTML document.&#34;</span></code></pre></div><p>So, I renamed <code>app-final.R</code> to <code>app.R</code> and ran:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">deployApp</span>()</code></pre></div><p>(But I could have done <code>shiny::shinyAppFile(&quot;app-final.R&quot;)</code>).</p><p>Then, it started, took a while, and then: success!</p><p>There my app was at <a href="https://tracy-teal.shinyapps.io/shiny-wordle/">https://tracy-teal.shinyapps.io/shiny-wordle/</a>.</p><center><img src="images/image6.png" alt="Blank Shiny Wordle app" width="70%"></center><p>Now to try editing the files.</p><p>We edited &lsquo;wordlist.R&rsquo;. The vector <code>words_common</code> contains the set of words that are chosen to be the correct ones.
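</p><p>As a hypothetical excerpt of what an edited &lsquo;wordlist.R&rsquo; might look like (the words below are made up, and the real vectors are far longer):</p>

```r
# Hypothetical excerpt of an edited wordlist.R (made-up entries)
words_common <- c("shiny", "geese", "plumb")  # the possible answers
words_all    <- c("aback", "abase", "abate")  # every guessable word (truncated here)

# Custom answers, including non-dictionary words like names,
# must also appear in words_all so the app accepts them as guesses
words_all <- unique(c(words_all, words_common))
```

<p>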
So, rather than a whole list, we put in just a few words that we wanted to have as the answers.</p><center><img src="images/image7.png" alt="wordlist R file open to show the words common vector of correct words" width="70%"></center><p>If you want to add new words that aren&rsquo;t necessarily actual words, like people&rsquo;s names, you need to add them to both the <code>words_common</code> and <code>words_all</code> lists.</p><center><img src="images/image8.png" alt="wordlist R file open to show the words all vector of correct words" width="70%"></center><p>Then, we redeployed by running <code>deployApp()</code> again.</p><p>We had to hit &lsquo;y&rsquo; when it asked if we wanted to &lsquo;Update application currently deployed&rsquo;.</p><p>It ran and then, 🎉! We had our very own Wordle with our very own words!</p><p>A customized word list could be a very good gift for family or friends. ☺️</p><p>We went on to modify the whole word list, and put in that PR for Winston.</p><p>We learned a lot and are excited to try more. We were super excited to get an interactive app up and running so quickly.</p><p>So, with the Shiny app, we&rsquo;ll keep working on the words, and maybe even a version with a different number of letters, as long as we can find a wordlist…</p></description></item><item><title>Announcing A Stroke of Innovation</title><link>https://www.rstudio.com/blog/announcing-a-stroke-of-innovation/</link><pubDate>Fri, 25 Mar 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-a-stroke-of-innovation/</guid><description><p>It all started with a call booked on the RStudio website.</p><blockquote><p>My name is Óli Páll, and I am the Chief Data Officer at the City of Reykjavík in Iceland. My team and I have been tasked with the development of a data warehouse from scratch - and serving a data science team in the city…</p></blockquote><p>Óli had to provide value — and quickly. His team needed to feel empowered to experiment and learn. 
His stakeholders wanted to see tangible outputs from their discussions. The higher-ups had to trust in the data science team&rsquo;s work.</p><p>Open source made it possible. The team could meet with a department head on Monday and show them an interactive prototype on Wednesday, all thanks to their skills with R. Óli wanted to know, what <em>else</em> was possible? What could they accomplish with a full suite of open-source tools?</p><p>In the end, Óli&rsquo;s team transformed the digital landscape for the residents of Reykjavik. A group from RStudio flew to Iceland to capture their story.</p><p>Watch the trailer now:</p><script src="https://fast.wistia.com/embed/medias/uetbywo8b1.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:52.71% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_uetbywo8b1 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/uetbywo8b1/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p><a class="btn-sm-block btn btn-primary btn-block pl-3 pr-3 mt-4" href="https://www.rstudio.com/iceland" target="_blank">Dive in here</a></p></description></item><item><title>Creating APIs for Data Science With plumber</title><link>https://www.rstudio.com/blog/creating-apis-for-data-science-with-plumber/</link><pubDate>Tue, 22 Mar 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/creating-apis-for-data-science-with-plumber/</guid><description><p><sup>Photo by <a 
href="https://unsplash.com/@kharaoke?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Khara Woods</a> on <a href="https://unsplash.com/">Unsplash</a></sup></p><p>Whether it&rsquo;s pulling <a href="https://developer.twitter.com/en/docs/twitter-api" target = "_blank">data from Twitter</a>, accessing <a href="https://openweathermap.org/api" target = "_blank">the most recent weather information</a>, or tracking <a href="https://openflights.org/data.html" target = "_blank">where a particular plane is going</a>, application programming interfaces (APIs) are often part of our data science pipeline. But why would you want to <em>create</em> an API? And how difficult is it to do?</p><p>APIs make it easy to scale the reach of your work. They allow your data science results to be responsive, accessible, and automated. And thanks to the <a href="https://www.rplumber.io/" target = "_blank">plumber</a> package, you can convert your R functions into API endpoints using just a few special comments.</p><h2 id="what-is-an-api">What is an API?</h2><p>APIs are messenger systems that allow applications to communicate with one another. You send a request to the API. The API takes your request to the server and receives a response. Then, the API delivers the response back to you.</p><p>You may already use APIs to retrieve data as part of your data science pipeline. For example, the <a href="https://cran.r-project.org/web/packages/rtweet/index.html" target = "_blank">rtweet</a> package allows R users to interact with Twitter&rsquo;s API. You request data through the package and then receive the API&rsquo;s data as a response.</p><p><img src="images/image1.png" alt="Graphic of a laptop sending a request to an API, the API getting information from a server, then responding to the request back to the computer"></p><p>APIs communicate via &ldquo;endpoints&rdquo;. The endpoint receives a request to take an action. 
For example, when you run <code>usrs &lt;- search_users(&quot;#rstats&quot;, n = 1000)</code> from rtweet, you are interacting with an endpoint that returns a list of users.</p><p>Since APIs allow different systems to interact when they wouldn&rsquo;t be able to otherwise, they are incredibly powerful tools to increase interactivity and reach.</p><h2 id="why-would-a-data-scientist-want-to-create-an-api">Why would a data scientist want to create an API?</h2><p>At some point, you may want to share your R output with others. If the other person is not an R user, they may not be able to use your work without translating it into their language of choice.</p><p>If your results are available in the form of an API, then anybody can import your results without this difficult translation step. API responses are readable across platforms and applications. Just as you use R to interact with the Twitter API, others can access the Twitter API with other tools.</p><p>Let&rsquo;s say you are working with a website developer who uses Javascript. You just developed a model in R and you&rsquo;d like to share the results. You can send the developer an API so that they can display the results on a website without reconstructing your model in another language. The website can show updated results because it is communicating with your API in real-time. You do not have to manually refresh your code each time there&rsquo;s a change in the data. For example, RStudio&rsquo;s <a href="https://www.rstudio.com/pricing/" target = "_blank">pricing calculator</a> uses an API created from a backend R model to feed the results into our website!</p><p>Making your data science work available through an API reduces the handoff between R and other tools or technologies. 
More people can access your results and use them to make data-driven decisions.</p><div class="lt-gray-box">We recommend reading James Blair's post on how APIs increase the impact of your analyses, <a href="https://www.rstudio.com/blog/rstudio-and-apis/" target = "_blank">RStudio and APIs</a>.</div><h2 id="creating-an-api-with-plumber">Creating an API with plumber</h2><p>The plumber package allows you to create APIs from your R code. It does this through special comments that give instructions on how to turn the functions in your script into API endpoints. It&rsquo;s pretty amazing — with this package, your R code is easily accessible from other tools and frameworks.</p><p>Here&rsquo;s an example plumber script. Notice how familiar it looks:</p><p><img src="images/image2.png" alt="Screenshot of an R script decorated with plumber comments"></p><p>Let&rsquo;s walk through how to convert this R function into an API.</p><p><strong>1. Write standard R code</strong></p><p>Let&rsquo;s say we want to randomly choose 100 numbers and create a histogram. We write out a function in R:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">function</span>() {
  rand <span style="color:#666">&lt;-</span> <span style="color:#06287e">rnorm</span>(<span style="color:#40a070">100</span>)
  <span style="color:#06287e">hist</span>(rand)
}</code></pre></div><p>Notice that the function is not assigned to an object.
We can test it out by running the code below:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">test <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>() {
  rand <span style="color:#666">&lt;-</span> <span style="color:#06287e">rnorm</span>(<span style="color:#40a070">100</span>)
  <span style="color:#06287e">hist</span>(rand)
}
<span style="color:#06287e">test</span>()</code></pre></div><p><strong>2. Add special comments</strong></p><p>Now, we instruct plumber on how to turn the function into an API endpoint. Plumber parses your script to identify special comments beginning with the <code>#*</code> or <code>@</code> symbols. It uses them to convert your script into an API.</p><p>Let&rsquo;s give our function a description using <code>#*</code>. Here, we&rsquo;re telling plumber to call this function &ldquo;Plot a histogram&rdquo;:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#* Plot a histogram</span></code></pre></div><p>Now, let&rsquo;s tell plumber to execute this function and return the plot whenever it receives a GET request at <code>/plot</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#* @get /plot</span></code></pre></div><p>By default, plumber will turn your response into JSON format. You can adjust the type of response if that is not the output you would like. For example, our function outputs an image. It doesn&rsquo;t make sense to return an image in JSON format.
We can &ldquo;serialize&rdquo; our result so that the API returns a PNG rather than JSON.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#* @serializer png</span></code></pre></div><p>This is just one example of what an API can do. To learn more, check out the <a href="https://www.rplumber.io/articles/rendering-output.html" target = "_blank">plumber documentation on rendering output</a>.</p><p>Now, our script looks like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># plumber.R</span>
<span style="color:#06287e">library</span>(plumber)

<span style="color:#60a0b0;font-style:italic">#* Plot a histogram</span>
<span style="color:#60a0b0;font-style:italic">#* @serializer png</span>
<span style="color:#60a0b0;font-style:italic">#* @get /plot</span>
<span style="color:#06287e">function</span>() {
  rand <span style="color:#666">&lt;-</span> <span style="color:#06287e">rnorm</span>(<span style="color:#40a070">100</span>)
  <span style="color:#06287e">hist</span>(rand)
}</code></pre></div><p>Congratulations! We wrote an API using R.</p><p><strong>3.
Plumb it</strong></p><p>Now that we&rsquo;ve created an API, it&rsquo;s time to &ldquo;plumb&rdquo; (run) it!</p><p>After we write our plumber script in the RStudio IDE, a special button appears that allows us to &ldquo;Run API&rdquo;:</p><p><img src="images/image3.png" alt="Screenshot of R console highlighting the button where it says we can run the API"></p><p>Running the API generates an interface for our API.</p><script src="https://fast.wistia.com/embed/medias/ioibdyrv6e.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.63% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_ioibdyrv6e videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption>Plumbing an API in RStudio</caption></center><p>The interface provides a way to interact with our API&rsquo;s endpoints. 
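</p><p>The interface isn&rsquo;t the only way to exercise the endpoints; any HTTP client works. Below is a sketch (not from the original post) of calling the API from another R session with the httr package. It assumes the API is running locally on port 8000, e.g. started with <code>plumber::pr(&quot;plumber.R&quot;) %&gt;% plumber::pr_run(port = 8000)</code>:</p><pre><code># Sketch: consuming the /plot endpoint with httr.
# The port (8000) and output filename are assumptions for illustration.
library(httr)

resp &lt;- GET(&quot;http://localhost:8000/plot&quot;)
status_code(resp)                 # 200 when the request succeeds
headers(resp)[[&quot;content-type&quot;]]   # the png serializer returns image/png
writeBin(content(resp, &quot;raw&quot;), &quot;histogram.png&quot;)</code></pre><p>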
We can test out different calls to make sure that everything runs as expected.</p><p><img src="images/image4.png" alt="Button to get request from the API interface generated by plumber"></p><center><caption>Endpoint in our code and the interface</caption></center><p>Run &lsquo;try it out&rsquo; and then &lsquo;execute&rsquo; to see what the API returns (in our case, an image of a histogram):</p><script src="https://fast.wistia.com/embed/medias/b0d0wt7ji7.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.83% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_b0d0wt7ji7 videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption>Testing out our API through the interface</caption></center><p>Notice that you never left RStudio to create, run, and test your API!</p><p><strong>4. Deploy the API</strong></p><p>We can develop and test an API on our laptop, but how do we share it with others (for example, the website developer we mentioned previously)? We do not want our laptop to be serving the requests for a variety of reasons, including maintenance and security concerns.</p><p><a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a> is an enterprise publishing platform that deploys APIs created by plumber with versioning, dependency management, and authentication. RStudio Connect also supports the deployment of many other data product formats, including Python APIs developed using frameworks such as Flask, FastAPI, Quart, Falcon, and Sanic. 
See the <a href="https://www.rstudio.com/blog/rstudio-connect-2021-08-python-updates/" target = "_blank">RStudio Connect Python Updates blog post</a> for more info on deploying Python APIs on Connect.</p><script src="https://fast.wistia.com/embed/medias/znvyytmc1u.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.83% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_znvyytmc1u videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption>Editing access settings in RStudio Connect</caption></center><p>RStudio Connect also ensures that you are not consuming more system resources than is necessary. It automatically manages the processes necessary to handle the current load and balances incoming traffic across all available processes. It will also shut down idle processes when they&rsquo;re not in use.</p><p>Learn more about <a href="https://www.rplumber.io/articles/hosting.html" target = "_blank">hosting Plumber APIs</a>.</p><p>Now that our API is hosted, anybody can use it in their application! Access it on RStudio Connect: <a href="https://colorado.rstudio.com/rsc/plumber-histogram-example/" target = "_blank">https://colorado.rstudio.com/rsc/plumber-histogram-example/</a>.</p><h2 id="learn-more">Learn More</h2><p>APIs increase the impact of your data science work by making your code accessible to a larger audience. Thanks to plumber, you can create them by providing a few special comments in your R code.</p><ul><li>Read the <a href="https://www.rplumber.io/index.html" target = "_blank">plumber documentation</a>.</li><li>Want a more in-depth example?
Watch James Blair convert a data science model into an API in his excellent talk, <a href="https://www.rstudio.com/resources/webinars/expanding-r-horizons-integrating-r-with-plumber-apis/" target = "_blank">Expanding R Horizons: Integrating R with Plumber APIs</a>.</li><li>Want to see more plumber examples? The Solutions Engineering team shares some in their <a href="https://github.com/sol-eng/plumberExamples" target = "_blank">GitHub repository</a>.</li><li>Learn more about <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a>.</li></ul></description></item><item><title>Call for talks deadline extended!</title><link>https://www.rstudio.com/blog/new-deadline/</link><pubDate>Fri, 18 Mar 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/new-deadline/</guid><description><p>The call for talks at <a href="https://www.rstudio.com/conference/">rstudio::conf(2022)</a> has been extended to March 28!</p><p>We&rsquo;d love to hear about:</p><ul><li>How you&rsquo;ve used R (by itself or with other tools) to solve a challenging problem.</li><li>Projects and teams where R and Python live together in harmony.</li><li>Your favorite R package and how it makes life easier or unlocks new capabilities.</li><li>Your techniques for teaching data science to help reach new domains and new audiences.</li><li>Your broad reflections on data science, packages, code, and community.</li><li>Anything else you think the RStudio community would love to hear about!</li></ul><h2 id="what-do-talks-look-like">What do talks look like?</h2><p>Talks are 20 minutes long and can be either live and in-person or pre-recorded and remote. If you&rsquo;re interested in giving a pre-recorded talk, it doesn&rsquo;t need to be in English.</p><p>As the shape of the event becomes clearer, we may expand the program to include pre-recorded lightning talks. If this happens, lightning talks will be drawn from the main pool of talk proposals, i.e.
there is no separate application process.</p><h2 id="why-give-a-talk">Why give a talk?</h2><ul><li>All speakers will receive coaching from <a href="https://www.articulationinc.com/">Articulation Inc</a>. Speakers for rstudio::global(2021) found the coaching both fun and very worthwhile; it&rsquo;s a great opportunity to polish your speaking skills while getting to know the other speakers.</li><li>You&rsquo;ll get free registration to the conference.</li><li>We don&rsquo;t have all the details yet, but we will have some travel support available for those who need it.</li></ul><h2 id="how-do-i-submit">How do I submit?</h2><p>To submit a talk, you&rsquo;ll need to create a 60-second video that introduces you and your proposed topic. In the video, you should tell us who you are, why your topic is important, and what attendees will take away from it. If you&rsquo;re worried about creating a video, Jesse Mostipak put together a <a href="https://twitter.com/kierisi/status/1503781247461560334">twitter thread</a> with a bunch of advice. One tip: if you&rsquo;ve spent a bunch of time on Zoom, Google Meet, etc. during the pandemic, an easy way to create your video is to just meet with yourself and record it!</p><p>If you&rsquo;re interested, please submit a proposal at: <a href="https://rstd.io/conf-talks-2022">https://rstd.io/conf-talks-2022</a>.</p><p><a href="https://rstd.io/conf-talks-2022"><strong>Apply now!</strong></a></p></description></item><item><title>Announcing RStudio Academy</title><link>https://www.rstudio.com/blog/announcing-rstudio-academy/</link><pubDate>Wed, 16 Mar 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rstudio-academy/</guid><description><p>RStudio is excited to announce the public release of RStudio Academy. Academy is a mentor-led data science apprenticeship for professional teams.
RStudio experts provide training that focuses on learning and applying data science skills — not just broadcasting facts.</p><p>Academy has been in beta for over a year and has helped many organizations raise the data science skill level of their teams.</p><h2 id="move-beyond-facts-to-acquire-real-life-skills">Move beyond facts to acquire real-life skills</h2><p>Facts, such as statistical function names, make up the foundation of most learning experiences. Facts can be learned by reading, watching a video, or working through a simple example. However, data science requires both facts and <em>skills</em> — applying facts to solve real-life problems. Learners need relevant practice and ongoing feedback to improve their data science skills.</p><p>With this in mind, we designed Academy to provide a space for practice as well as an environment for feedback. Academy consists of:</p><ul><li>An online platform deployed as a dedicated learning environment for each team.</li><li>A cohort-based, mentor-led data science apprenticeship model, where learners get ongoing assistance and guidance from their mentor and fellow learners.</li></ul><p>Academy spans 10 weeks and is divided into clear, progressive milestones. Each week, learners complete interactive tutorials and practice adaptive exercises that teach the skills needed to complete each milestone. Mentors provide strategies for self-directed learning to use throughout the apprenticeship and beyond.</p><h2 id="a-learning-platform-tailored-to-your-data-science-teams-needs">A learning platform tailored to your data science team’s needs</h2><p>For those who work with data day-to-day, it is difficult to carve out time to improve their data science skillset. Learning resources are generic and cannot always be applied to the task at hand. Leaders need to prioritize targeted learning for their teams to grow in their capabilities.</p><p>Academy provides opportunities tailored to specific use cases and learning goals. 
RStudio works with you to develop projects that are highly relevant to your organization&rsquo;s daily work.</p><p>With Academy, your team receives:</p><ul><li><strong>Immersive Content:</strong> The learning experience is tailored to reflect the daily work and goals of the team, with real-life use cases, data, and best practices.</li><li><strong>Adaptive Practice:</strong> The platform delivers adaptive practice and spaced repetition so that the individual learner can master the relevant skills.</li><li><strong>Timely Feedback:</strong> Mentors regularly provide direct feedback on progress, both individually and to the group. Tutorials contain interactive exercises, which provide immediate feedback tailored to students&rsquo; specific answers.</li><li><strong>Social Accountability:</strong> Interactions with the mentor and cohort of fellow learners help motivate students to stay on top of their work and reach beyond the core lesson plan.</li></ul><p>This approach ensures your team gains the specific skills they need to drive business value.</p><h2 id="learn-more">Learn more</h2><p>RStudio Academy delivers a turn-key data science learning platform to help professionals increase the level of their data science skills. 
We are very excited to offer it to your team.</p><ul><li>Check out the <a href="https://www.rstudio.com/academy" target = "_blank">new Academy site</a>.</li><li>Contact <a href="https://rstudio.chilipiper.com/book/academy-demo" target = "_blank">RStudio sales</a> or reach out to your customer service representative.</li></ul></description></item><item><title>Curating Your Data Science Content on RStudio Connect</title><link>https://www.rstudio.com/blog/rstudio-connect-data-showcase/</link><pubDate>Tue, 15 Mar 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-data-showcase/</guid><description><p><a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a> is RStudio&rsquo;s publishing platform that hosts data science content created in R or Python, such as R Markdown documents, Shiny apps, Jupyter Notebooks, and more.</p><p>As you publish to RStudio Connect, you will want your audience to have a great experience looking through your work. Released in July 2021, the <a href="https://rstudio.github.io/connectwidgets/" target = "_blank">connectwidgets</a> package helps create a custom view of your content. Your projects will be easier to organize, distribute, and discover.</p><p>The connectwidgets package queries your Connect server to get information about your published content items. 
It turns them into HTML widget components that can be displayed either in an R Markdown document or Shiny app.</p><script src="https://fast.wistia.com/embed/medias/zqgzo2fbe7.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:53.96% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_zqgzo2fbe7 videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption>Example of gallery created by connectwidgets</caption></center><p>The Marketing team at RStudio has several pieces of content published on RStudio&rsquo;s Connect server. In this blog post, we&rsquo;ll walk through how we created a project gallery using connectwidgets and R Markdown. View our gallery on <a href="https://colorado.rstudio.com/rsc/rstudio-marketing-content-showcase/" target = "_blank">colorado.rstudio.com</a>!</p><p>Want to see connectwidgets in action? Check out Kelly O&rsquo;Briant&rsquo;s webinar, <a href="https://www.youtube.com/watch?v=GBNzhIkObyE" target = "_blank">Build Your Ideal Showcase of Data Products</a>.</p><p>Interested in trying out connectwidgets but don&rsquo;t have RStudio Connect? 
Log into this <a href="https://beta.rstudioconnect.com/connect/" target = "_blank">evaluation environment</a> to get access to a test server.</p><h2 id="curate-projects-in-r-markdown">Curate Projects in R Markdown</h2><p>Let&rsquo;s begin by loading the connectwidgets package:</p><pre><code>```{r}
install.packages(&quot;connectwidgets&quot;)
library(connectwidgets)
```</code></pre><p>Open up a template by going to File, New File, R Markdown, From Template, and then selecting the <code>connectwidgets(HTML)</code> example.</p><script src="https://fast.wistia.com/embed/medias/l1jgjtt0qw.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:62.5% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_l1jgjtt0qw videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption>Opening a connectwidgets template</caption></center><p>To establish the connection with your RStudio Connect server, add your credentials to your <code>.Renviron</code> file:</p><pre><code>CONNECT_SERVER=&quot;https://rsc.company.com/&quot;
CONNECT_API_KEY=&quot;mysupersecretapikey&quot;</code></pre><p>In your R Markdown document, create a client with the <code>connect()</code> function, which reads those environment variables by default (the commented arguments below show how you could pass them explicitly):</p><pre><code>```{r}
client &lt;- connect(
  # server = Sys.getenv(&quot;CONNECT_SERVER&quot;),
  # api_key = Sys.getenv(&quot;CONNECT_API_KEY&quot;)
)
```</code></pre><p>The <code>content()</code> function retrieves items from your RStudio Connect server and stores them in a tibble.</p><pre><code>```{r}
all_content &lt;- client %&gt;%
  content()
```</code></pre><p>You can work with the resulting data frame using your usual R tools. For example, you can select items using built-in helper functions or the dplyr package.
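</p><p>As a sketch of the built-in helpers (hedged: the helper names <code>by_owner()</code> and <code>by_tag()</code> come from the connectwidgets documentation, and the values below are placeholders, not real accounts or tags):</p><pre><code>```{r}
# Sketch: narrow the content tibble with connectwidgets helpers.
# &quot;username1&quot; and &quot;Marketing&quot; are placeholder values.
tagged_content &lt;- all_content %&gt;%
  by_owner(&quot;username1&quot;) %&gt;%
  by_tag(&quot;Marketing&quot;)
```</code></pre><p>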
In this case, we want to filter the dataset to content by certain users:</p><pre><code>```{r}
marketing_content &lt;- all_content %&gt;%
  filter(owner_username %in% c(&quot;username1&quot;, &quot;username2&quot;))
```</code></pre><h2 id="organize-your-work-with-html-widgets">Organize Your Work With HTML Widgets</h2><p>The package provides organizational components for card, grid, and table views. Each component links to the associated content item in RStudio Connect.</p><p>For each chunk, you can select the components that you want to showcase.</p><h3 id="card-view">Card View</h3><p>Cards display your content and associated metadata. You can set the image and the description of your project from the RStudio Connect dashboard.</p><pre><code>```{r card}
marketing_content %&gt;%
  filter(name == &quot;specific_item_name&quot;) %&gt;%
  rsc_card()
```</code></pre><script src="https://fast.wistia.com/embed/medias/ilg7u3ftdp.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.42% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_ilg7u3ftdp videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption>Connectwidgets card view</caption></center><h3 id="grid-view">Grid View</h3><p>Grids allow you to place content in side-by-side tiles.
Similar to cards, you can display certain pieces of metadata.</p><pre><code>```{r grid-shiny}
marketing_content %&gt;%
  filter(app_mode == &quot;shiny&quot;) %&gt;%
  rsc_grid()
```</code></pre><script src="https://fast.wistia.com/embed/medias/65z81jgnu3.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:43.96% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_65z81jgnu3 videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption>Connectwidgets grid view</caption></center><h3 id="table-view">Table View</h3><p>Would you rather share text? The table component allows you to create a table that shows a fixed set of metadata.</p><pre><code>```{r table-plumbertableau}
marketing_content %&gt;%
  filter(stringr::str_detect(name, &quot;seattle_parking&quot;)) %&gt;%
  rsc_table()
```</code></pre><script src="https://fast.wistia.com/embed/medias/12ww0iswau.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.42% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_12ww0iswau videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption>Connectwidgets table view</caption></center><h2 id="customize-your-data-science-gallerys-look">Customize Your Data Science Gallery&rsquo;s Look</h2><p>Since connectwidgets is built on R Markdown, you can style your document&rsquo;s colors and fonts.
Here, we use the <a href="https://rstudio.github.io/bslib/" target = "_blank">bslib</a> package to customize our gallery&rsquo;s appearance:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-yaml" data-lang="yaml">---
title: &#34;RStudio Marketing Content Gallery&#34;
output:
  html_document:
    theme:
      bg: &#34;#FFFFFF&#34;
      fg: &#34;#404040&#34;
      primary: &#34;#4D8DC9&#34;
      heading_font:
        google: &#34;Source Serif Pro&#34;
      base_font:
        google: &#34;Source Sans Pro&#34;
      code_font:
        google: &#34;JetBrains Mono&#34;
---</code></pre></div><p>Add narrative or additional code chunks to give context to your projects.</p><h2 id="publish-to-rstudio-connect-for-easy-access">Publish to RStudio Connect for Easy Access</h2><p>Once you&rsquo;re ready, you can publish your document to RStudio Connect.</p><script src="https://fast.wistia.com/embed/medias/k5tteubt5z.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_k5tteubt5z videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><center><caption>Publishing to RStudio Connect</caption></center><p>RStudio Connect allows you to update the access settings so that you can share your gallery with others.
You can also create a custom vanity URL for your work.</p><h2 id="learn-more">Learn More</h2><p>With RStudio Connect, you can publish your data science projects and also deliver a great experience to your audience.</p><ul><li>Learn more about <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a> and <a href="https://rstudio.github.io/connectwidgets/" target = "_blank">connectwidgets</a>.</li><li>View Kelly O&rsquo;Briant&rsquo;s webinar, <a href="https://www.youtube.com/watch?v=GBNzhIkObyE" target = "_blank">Build Your Ideal Showcase of Data Products</a>, for an in-depth example of connectwidgets.</li><li>Find out more about tracking the use of your RStudio Connect work by watching Cole Arendt&rsquo;s presentation, <a href="https://www.youtube.com/watch?v=0iljqY9j64U" target = "_blank">Shiny Usage Tracking in RStudio Connect</a>.</li></ul></description></item><item><title>rstudio::conf(2022) is open for registration!</title><link>https://www.rstudio.com/blog/rstudio-conf-2022-is-open-for-registration/</link><pubDate>Mon, 07 Mar 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2022-is-open-for-registration/</guid><description><p>rstudio::conf, the conference for all things R and RStudio, will take place July 25-28 in <a href="https://www.nationalharbor.com/meetings-groups/event-spaces/gaylord-national/">National Harbor, DC</a>! As usual, we&rsquo;ll have two days of workshops followed by two days of talks. If you&rsquo;ve attended before and already know you want to attend, <a href="https://na.eventscloud.com/rstudioconf2022">register now</a>!
Otherwise, read on to learn about the conference program, workshops, and our diversity scholarship program.</p><p>Check out highlights from a previous conf:</p><div><script src="https://fast.wistia.com/embed/medias/g712g9kse6.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_g712g9kse6 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><p><img src="https://fast.wistia.com/embed/medias/g712g9kse6/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></p></div></div></div></div></div><div style="text-align: center; margin: 2em 0 2em 0;"><p><a href="https://www.rstudio.com/conference/" style="padding: 12px; border: none; font-size: 18px; border-radius: 3px; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Visit the conference website!</a></p></div><h1 id="conference">Conference</h1><p>We&rsquo;re thrilled to announce our keynote speakers:</p><ul><li><a href="https://mine-cr.com/">Mine Çetinkaya-Rundel</a> (Duke University and RStudio) and <a href="https://jules32.github.io/">Julia Stewart Lowndes</a> (Openscapes and NCEAS, University of California, Santa Barbara) will talk about publishing and collaboration with <a href="https://quarto.org">Quarto</a>, the next generation of R Markdown.</li><li><a href="https://jtleek.com/">Jeff Leek</a> (Fred Hutchinson Cancer Research Center) will talk about the <a href="https://www.datatrail.org">DataTrail program</a> for community-based data science training in communities with limited
technology training opportunities.</li><li><a href="https://juliasilge.com/">Julia Silge</a> (RStudio) and <a href="https://www.rstudio.com/authors/max-kuhn/">Max Kuhn</a> (RStudio) will talk about good practices for applied machine learning, from model development to model deployment.</li></ul><p>We&rsquo;ll also have over 80 talks in four parallel tracks. You&rsquo;ll learn from outstanding speakers drawn from the wider R and data-science communities, as well as RStudio data scientists and engineers. Check out the <a href="https://www.rstudio.com/resources/rstudioglobal-2021/">talks from last year</a> to get a sense of the breadth and depth of our typical content.</p><p>If you&rsquo;d like to speak, our <a href="https://www.rstudio.com/blog/save-the-date/#call-for-talks">call for talks</a> is open, and we&rsquo;ve extended the deadline to March 21. Talks can be live and in-person, or remote and pre-recorded, and all speakers will receive coaching from <a href="https://www.articulationinc.com/">Articulation Inc</a>.</p><p>All in-person attendees will be required to be vaccinated and boosted. Read the details of how we plan to keep everyone safe in our <a href="https://www.rstudio.com/conference/2022/2022-conf-code-of-conduct/">code of conduct</a>.</p><p>If you can&rsquo;t make it in person, don&rsquo;t worry! All keynotes and talks will be live-streamed, you&rsquo;ll be able to ask questions in the same way as in-person participants, and we&rsquo;re working on ways to virtually connect you with other attendees.
Virtual registration will be free and available closer to the date.</p><div style="text-align: center; margin: 2em 0 2em 0;"><p><a href="https://na.eventscloud.com/rstudioconf2022" style="padding: 12px; border: none; font-size: 18px; border-radius: 3px; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Register now!</a></p></div><h1 id="workshops">Workshops</h1><p>There will be two days of optional, in-person training before the conference. This year, we have 15 exciting workshops, ranging from an introduction to the tidyverse to a masterclass in package development. This is your opportunity to go deep in topics like graphic design for <a href="http://ggplot2.tidyverse.org/">ggplot2</a>, publishing with <a href="https://quarto.org">Quarto</a>, interactive apps with <a href="http://shiny.rstudio.com/">shiny</a>, machine learning with <a href="http://tidymodels.org/">tidymodels</a>, causal inference, making art with code, people analytics, and much more! For those new to R and the tidyverse, we&rsquo;re debuting a new <a href="https://www.rstudio.com/academy/">hybrid experience</a> that includes online learning before and after the in-person workshop. See the complete list of workshops, who&rsquo;s teaching, and the details of what you&rsquo;ll learn on the <a href="https://www.rstudio.com/conference/2022/2022-conf-workshops-pricing/">workshops page</a>.</p><p>Please note that rstudio::conf() workshops are very popular and do sell out quickly.
If you want to attend a specific workshop, we highly recommend signing up well before the conference.</p><div style="text-align: center; margin: 2em 0 2em 0;"><p><a href="https://na.eventscloud.com/rstudioconf2022" style="padding: 12px; border: none; font-size: 18px; border-radius: 3px; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Register now!</a></p></div><h1 id="diversity-scholarships">Diversity scholarships</h1><p>We will provide 50 diversity scholarships to individuals around the world who are members of a group underrepresented at rstudio::conf(). These groups include people of color, those with disabilities, elders/older adults, LGBTQ folks, and women/minority genders. Because we know there is so much uncertainty around travel, especially for people outside the United States, diversity scholars can choose to participate virtually or in person.</p><p>Diversity scholarships will include:</p><ul><li>conference registration,</li><li>a workshop (either virtual or in person),</li><li>practical support, if needed, to enable participation in the virtual conference (such as an accessibility aid, a resource for internet access, or childcare),</li><li>up to $1500 in travel support for in-person participants,</li><li>and, of course, swag!</li></ul><p>The workshop for diversity scholars participating virtually will cover how to get started with Quarto. Diversity scholars participating in person will choose among the available in-person workshops. Where travel restrictions allow, travel support for diversity scholars participating in person will be available.
We&rsquo;ll know more details closer to the date, and we&rsquo;ll do our best to help all diversity scholars.</p><p>Applications for the Diversity Scholarship program close on March 21, 2022 <a href="https://en.wikipedia.org/wiki/Anywhere_on_Earth">AoE</a>.</p><div style="text-align: center; margin: 2em 0 0 0;"><p><a href="https://docs.google.com/forms/d/e/1FAIpQLSfaTTB9pPKctM3rCsahcbfns7T-8M7rUMh8VjC7-JWZ3_taqw/viewform" style="padding: 12px; font-size: 18px; border-radius: 3px; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Apply now!</a></p></div></description></item><item><title>RStudio Community Monthly Events Roundup - March 2022</title><link>https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-march-2022/</link><pubDate>Tue, 01 Mar 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-march-2022/</guid><description><sup>Photo by <a href="https://unsplash.com/@nickmorrison?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Nick Morrison</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p>Welcome to RStudio Community Monthly Events Roundup, where we update you on upcoming events happening at RStudio this month. Missed the great talks and presentations from last month? Find them listed under <a href="#icymi-february-2022-events">ICYMI: February 2022 Events</a>.</p><p>You can <a href="https://www.addevent.com/calendar/wT379734" target = "_blank">subscribe</a> to the Community Events Calendar so that new events will automatically appear on your calendar. Please note that by subscribing, all of the events in the calendar will appear on your own calendar.
If you wish to add individual events instead, please use the links below.</p><p>We can’t wait to see you there!</p><h2 id="save-the-date">Save the Date</h2><ul><li>March 2, 2022 at 12 ET: Advanced Usage Tracking of Shiny Applications in RStudio Connect | Presented by Cole Arendt <a href="https://www.addevent.com/event/aW12082208" target = "_blank">(add to calendar)</a></li><li>March 3, 2022 at 12 ET: Data Science Hangout with Stephen Bailey, Data Engineer at Whatnot <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>March 9, 2022: Data Visualization Accessibility | Presented by Mara Averick and Maya Gans <a href="https://www.addevent.com/event/jJ11782140" target = "_blank">(add to calendar)</a></li><li>March 10, 2022 at 12 ET: Data Science Hangout with Kristi Angel, Senior Data Scientist at Grubhub <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>March 15, 2022 at 12 ET: R for Excel Users - First Steps | Presented by George Mount <a href="https://rstd.io/excel-meetup" target = "_blank">(add to calendar)</a></li><li>March 17, 2022 at 12 ET: Data Science Hangout with Joe Gibson, Senior Project Director at de Beaumont Foundation <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>March 23, 2022 at 12 ET: RStudio Energy Meetup - Introduction to functional data analysis | Presented by Santiago Rodriguez <a href="https://rstd.io/energy-meetup" target = "_blank">(add to calendar)</a></li><li>March 24, 2022 at 12 ET: Data Science Hangout with Erin Pierson, Senior Manager of Trading Operations at Charles Schwab <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>March 31, 2022 at 12 ET: Data Science Hangout with Mike Smith, Senior Director Statistics at Pfizer <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>April 5, 2022 at 12 ET:
Championing Analytic Infrastructure | Presented by Kelly O’Briant <a href="https://www.addevent.com/event/dM11812539/" target = "_blank">(add to calendar)</a></li></ul><h2 id="recurring-events">Recurring Events</h2><h3 id="data-science-hangout">Data Science Hangout</h3><p>Last year, we started an informal &ldquo;data science hangout&rdquo; at RStudio for the data science community to connect and chat about some of the more human-centric questions around data science leadership. These happen every Thursday at 12 ET with a different leader featured each week.</p><p>The conversation is all audience-based and there&rsquo;s no registration needed, so you can jump on whenever it fits your schedule. Add the weekly hangouts to your calendar on <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">AddEvent</a> and check out the new <a href="https://www.rstudio.com/data-science-hangout/" target = "_blank">website</a> with all the recordings.</p><h3 id="rstudio-enterprise-community-meetups">RStudio Enterprise Community Meetups</h3><p>We also host industry meetups for teams to share the work they are doing within their organizations, teach lessons learned, and network with others. 
Join the group on <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target = "_blank">Meetup</a>.</p><h2 id="icymi-february-2022-events">ICYMI: February 2022 Events</h2><ul><li>February 1, 2022: <a href="https://www.youtube.com/watch?v=07j22d4B_hA" target = "_blank">Capacity Planning for Microsoft Azure Data Centers - Using R &amp; RStudio Connect</a> | Presented by Paul Chang</li><li>February 3, 2022: <a href="https://youtu.be/PQsIOR6oH5o" target = "_blank">Data Science Hangout with Katie Schafer</a>, VP of Advanced Analytics at Beam Dental</li><li>February 7, 2022: <a href="https://youtu.be/zikxpOoEcLk" target = "_blank">RStudio Finance Meetup: Industry trends from banks to hedge funds to federal agencies</a> | Presented by Merav Yuravlivker and Dmitri Adler</li><li>February 10, 2022: <a href="https://youtu.be/vj70GHhYtc8" target = "_blank">Data Science Hangout with Matthias Mueller</a>, Sr Director of Analytics at Campaign Monitor</li><li>February 16, 2022: <a href="https://youtu.be/op4Q_z5juZc" target = "_blank">RStudio Sports Analytics Meetup: Using RStudio &amp; Google Cloud to Scale Sports Analytics</a> | Presented by Alok Pattani</li><li>February 17, 2022: <a href="https://www.rstudio.com/data-science-hangout/29-mike-miller" target = "_blank">Data Science Hangout with Mike Miller</a>, Vice President and Data Science Team Leader at Engine</li></ul><h2 id="call-for-speakers">Call for Speakers</h2><p>If you’re interested in sharing your work at a Meetup (or just starting to consider it for a future date down the road!), <a href="https://forms.gle/EtXMpSoTfhpGopiS8" target = "_blank">please fill out the speaker submission form</a>.
We’re always looking for a diverse set of speakers — if you are a member of a group that is underrepresented in data science, including people of color, those with disabilities, elders/older adults, LGBTQ folks, and women/minority genders, we highly encourage you to submit!</p></description></item><item><title>Challenges in Package Management</title><link>https://www.rstudio.com/blog/challenges-in-package-management/</link><pubDate>Wed, 23 Feb 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/challenges-in-package-management/</guid><description><center><p><font size="2">Photo courtesy of <a href="https://www.pexels.com/photo/business-cargo-cargo-container-city-262353/">Pexels</a></font></p></center><p>Installing software packages from public repositories like <a href="https://cran.r-project.org/">CRAN</a> or <a href="https://pypi.org/">PyPI</a> is easy until it isn&rsquo;t. New developers and veterans reading this are likely familiar with the frustration of a lost afternoon from a package failing to install accompanied by an indecipherable error message—spending hours scouring the web only to find that the instructions were missing a crucial step. Did the package maintainers miss this, or could there be more going on?</p><p>From the outside, it might feel like it should be easier than it is, but digging a little deeper can lead to a sea of complexity and confusion. So let&rsquo;s look at some of the common challenges associated with package management, understand why it&rsquo;s difficult, and how we can do better.</p><h2 id="what-is-package-management">What is Package Management</h2><p>Package management is the entire ecosystem of tools and processes that install, upgrade, delete, and generally manage software programs for a computer. 
Here, we&rsquo;ll focus on the challenges related to the systems that manage software dependencies, also called software packages, for data scientists and software engineers.</p><p>Software packages are shared and installed to extend or enhance existing language functionality. For example, there are packages to serve HTTP requests, plot complex graphs, perform sophisticated statistical functions, and many more.</p><p>Most programming languages will have public repositories that host these packages like CRAN for R, or PyPI and Anaconda for Python. If you&rsquo;re familiar with R, then anytime you&rsquo;ve run the <code>install.packages</code> command, you&rsquo;ve probably downloaded a package from the CRAN repository. Similarly, if you&rsquo;ve used <code>pip</code> for Python, you&rsquo;ve likely interfaced with the PyPI repository.</p><h2 id="a-simple-example">A Simple Example</h2><p>Say you want to install the data visualization dependency <a href="https://packagemanager.rstudio.com/client/#/repos/2/packages/ggplot2"><code>ggplot2</code></a> to plot a graph. R can accomplish this by running the <code>install.packages</code> function. 
The output looks something like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R">&gt; install.packages(&#34;ggplot2&#34;)
Installing package into ‘/usr/local/lib/R/site-library’
(as &#39;lib&#39; is unspecified)
also installing the dependencies &#39;colorspace&#39;, &#39;cli&#39;, &#39;crayon&#39;, &#39;utf8&#39;, &#39;farver&#39;, &#39;labeling&#39;, &#39;lifecycle&#39;, &#39;munsell&#39;, &#39;R6&#39;, &#39;RColorBrewer&#39;, &#39;viridisLite&#39;, &#39;ellipsis&#39;, &#39;fansi&#39;, &#39;magrittr&#39;, &#39;pillar&#39;, &#39;pkgconfig&#39;, &#39;vctrs&#39;, &#39;digest&#39;, &#39;glue&#39;, &#39;gtable&#39;, &#39;isoband&#39;, &#39;rlang&#39;, &#39;scales&#39;, &#39;tibble&#39;, &#39;withr&#39;

trying URL &#39;https://packagemanager.rstudio.com/all/__linux__/focal/latest/src/contrib/colorspace_2.0-2.tar.gz&#39;
Content type &#39;binary/octet-stream&#39; length 2621589 bytes (2.5 MB)
==========================================
downloaded 2.5 MB

... omitted for brevity ...

* installing *binary* package &#39;ggplot2&#39;
...
* DONE (ggplot2)

The downloaded source packages are in
‘/tmp/RtmpXOEyc3/downloaded_packages’
</code></pre></div><p>So what is even happening here? We wanted to download the package <code>ggplot2</code> and ended up also downloading <code>colorspace</code>, <code>cli</code>, and a bunch of others! It turns out that software packages usually require or recommend other software packages.</p><p>These recursive dependencies are called a package&rsquo;s <strong>dependency graph</strong>. These dependency graphs can quickly get complicated, as we can see in this visualization:</p><center><p><img src="dependency-graph.jpg" alt="graph showing software packages as nodes and dependency relationships as edges"></p><p><font size="2">A complex dependency graph showing the required and recommended dependencies for the <code>ggplot2</code> package</font></p></center><p>Beyond this graph, a package can also require system dependencies.
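</p><p>Before moving on: you can compute this dependency graph yourself from CRAN metadata. Here is a minimal sketch using base R&rsquo;s <code>tools</code> package (it needs network access to a CRAN mirror, and the exact package count will drift as CRAN evolves):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"># Fetch the CRAN package index, then resolve ggplot2&#39;s recursive
# required dependencies (Depends + Imports)
db &lt;- available.packages(repos = &quot;https://cloud.r-project.org&quot;)
deps &lt;- tools::package_dependencies(&quot;ggplot2&quot;, db = db,
                                    which = c(&quot;Depends&quot;, &quot;Imports&quot;),
                                    recursive = TRUE)
length(deps$ggplot2)  # dozens of packages, for a single plotting library
</code></pre></div><p>System dependencies, however, do not appear in this graph at all.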
You may have installed these in the past using tools like <code>brew</code> for macOS, <code>apt</code> for Linux&rsquo;s Ubuntu distribution, or <code>choco</code> for Windows.</p><p>At this point, you may be saying to yourself, &ldquo;install the system dependencies, install the package&rsquo;s dependencies and those packages&rsquo; dependencies, and finally install the package.&rdquo; It sounds challenging, but it should be doable, right? Unfortunately, we still have a few more challenges to face in this recursive rabbit hole.</p><h2 id="versioning-and-dependency-solvers">Versioning and Dependency Solvers</h2><p>In general, adding software dependencies to a project happens naturally as the project evolves. That is, on day one, you may install <code>tibble</code>, the next day <code>rmarkdown</code>, and so on. However, these projects could also be evolving, so on that first day, <code>tibble</code> was version <code>3.1.4</code>, and the next day it&rsquo;s <code>3.1.5</code>.</p><p>In this example, a problem could arise where the installed version of <code>rmarkdown</code> was developed using the <code>3.1.5</code> version and may not be compatible with older versions.</p><p>You could update <code>tibble</code>, but it might depend on another package that is not compatible with the newer version. This problem also extends to the requisite system dependencies, which can be more challenging to reconcile.</p><center><p><img src="version-example.jpg" alt="A dependency network showing two packages tibble-3.1.4 and apollo installed, the desired rmarkdown package and its incompatible dependency tibble-3.1.5."></p><p><font size="2">An example graph showing an incompatible dependency cycle, assuming we already have <code>tibble-3.1.4</code> and <code>apollo</code> installed, and we want the latest version of <code>rmarkdown</code></font></p></center><p>Different package management systems deploy various strategies to solve this problem.
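</p><p>On the user&rsquo;s side, one pragmatic escape hatch is to pin the exact version a project was developed against. A sketch using the <code>remotes</code> package (assuming it is installed; <code>install_version()</code> builds the archived release from source, so Linux users need a working build toolchain):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"># Install the exact tibble release this project was developed against,
# instead of whatever is newest on CRAN today
remotes::install_version(&quot;tibble&quot;, version = &quot;3.1.4&quot;,
                         repos = &quot;https://cloud.r-project.org&quot;)
</code></pre></div><p>Repositories attack the problem from the other side.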
For example, CRAN employs an approach that enforces that all packages are compatible with one another at any given point in time. Another option is to include a dependency solver like Python&rsquo;s <a href="https://pip.pypa.io/en/stable/topics/dependency-resolution/"><code>pip</code></a> and <a href="https://www.anaconda.com/blog/understanding-and-improving-condas-performance"><code>conda</code></a> tools do.</p><p>A dependency solver systematically works out which packages and versions of those packages are most likely to be compatible. This process can be immensely challenging and take an indeterminate amount of time to get perfect, so most projects tend to rely on heuristics or a guess-and-check method to get it right. For example, <code>pip</code> now uses a backtracking algorithm to perform this operation. This change happened as recently as November 2020, and I recommend <a href="https://pyfound.blogspot.com/2019/12/moss-czi-support-pip.html">looking back</a> at why it was necessary in the first place.</p><p>Let&rsquo;s look at one more wrinkle in installing these dependencies.</p><h2 id="operating-system-and-cpu-architecture">Operating System and CPU Architecture</h2><p>So far, we&rsquo;ve only talked about problems related to dependencies playing nicely with one another, but after we know what packages and package versions we need, is there anything else that can go wrong? Some of you may have noticed this peculiar line in the simple example above:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash">* installing *binary* package <span style="color:#4070a0">&#39;ggplot2&#39;</span>...</code></pre></div><p>This statement means we are installing a compiled binary software package. Most of the software we install today is a &ldquo;binary&rdquo; distribution which means it is pre-built for a specific operating system and CPU architecture. 
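</p><p>You can check which platform your own R session targets, and whether it expects binary or source packages by default (the example values in the comments are illustrative):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"># The platform triplet this build of R (and any compatible binary
# package) was compiled for
R.version$platform  # e.g. &quot;x86_64-pc-linux-gnu&quot; or &quot;aarch64-apple-darwin20&quot;

# The package type install.packages() looks for by default
.Platform$pkgType   # &quot;source&quot; on Linux; &quot;mac.binary&quot; or &quot;win.binary&quot; elsewhere
</code></pre></div><p>Binary packages must match this platform to load at all.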
These distributions install more quickly and require less work than configuring our computers to build from the source code.</p><p>The problems start when our computer differs from the machine used to build the packages. For example, you may have heard of Apple&rsquo;s new M1 laptops that use the <a href="https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms">ARM64 CPU</a> architecture. As a result of this release, existing package binaries compiled for macOS Intel x86 computers will not work on these new machines. Some package managers attempt to solve this by building packages for as many operating systems and CPU architectures as possible, while others try to maximize compatibility through portable build distributions like <a href="https://www.python.org/dev/peps/pep-0513/">Python&rsquo;s manylinux system</a>.</p><p>However, the situation gets more problematic when we again consider system dependencies, but I&rsquo;ll leave that topic for another time as it warrants an article in itself.</p><h2 id="more-information">More Information</h2><p>We covered many topics, including package management, versioning, dependency resolvers, and compiling pre-built package binaries.
Of course, there&rsquo;s so much more to say about these subjects, but hopefully, this helps shed light on why package management is challenging and the problems engineers face every day.</p><p>RStudio recognizes these challenges and has a couple of ways to help with our <a href="https://www.rstudio.com/products/package-manager/">Package Manager</a> product which:</p><ul><li>provides an easy way to pin to CRAN dates to ensure compatibility and determinism</li><li>supports internal, non-CRAN R packages to make sharing and collaboration as simple as possible</li><li>provides pre-built R package binaries for Windows and Linux operating systems and CPU architectures</li><li>and much more</li></ul><p>If these challenges sound exciting or you&rsquo;d like to learn more, the RStudio Package Manager team is hiring! See our <a href="https://www.rstudio.com/about/careers/">careers page</a> for more information.</p><h2 id="references">References</h2><ul><li><a href="https://cran.r-project.org/">https://cran.r-project.org/</a></li><li><a href="https://pypi.org/">https://pypi.org/</a></li><li><a href="https://pip.pypa.io/en/stable/topics/dependency-resolution/">https://pip.pypa.io/en/stable/topics/dependency-resolution/</a></li><li><a href="https://www.anaconda.com/blog/understanding-and-improving-condas-performance">https://www.anaconda.com/blog/understanding-and-improving-condas-performance</a></li><li><a href="https://pyfound.blogspot.com/2019/12/moss-czi-support-pip.html">https://pyfound.blogspot.com/2019/12/moss-czi-support-pip.html</a></li><li><a href="https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms">https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms</a></li><li><a href="https://www.python.org/dev/peps/pep-0513/">https://www.python.org/dev/peps/pep-0513/</a></li></ul></description></item><item><title>RStudio Workbench 2022.02.0 
Release</title><link>https://www.rstudio.com/blog/rstudio-workbench-2022-02-0/</link><pubDate>Tue, 22 Feb 2022 01:20:00 -0600</pubDate><guid>https://www.rstudio.com/blog/rstudio-workbench-2022-02-0/</guid><description><p>RStudio Workbench&rsquo;s 2022.02 release, code-named Prairie Trillium, includes many new and exciting features and updates. Today we&rsquo;ll highlight a few noteworthy features:</p><ul><li><p><a href="#specifying-r-versions-in-the-ad-hoc-launcher">Specifying R Versions in the Ad Hoc Job Launcher</a></p></li><li><p><a href="#session-suspension">Session Suspension</a></p></li><li><p><a href="#ssl-communication-between-rstudio-workbench-and-launcher-sessions">SSL Communication Between RStudio Workbench and Launcher Sessions</a></p></li><li><p><a href="#new-job-launcher-release">New Job Launcher Release</a></p></li></ul><p>To read about all of the new features and updates available in this release, check out the latest <a href="https://www.rstudio.com/products/rstudio/release-notes/" target="_blank">Release Notes</a>. For more detailed explanations, visit the <a href="https://docs.rstudio.com/ide/server-pro" target="_blank">RStudio Workbench Admin Guide</a>.</p><h1 id="specifying-r-versions-in-the-ad-hoc-launcher">Specifying R Versions in the Ad Hoc Launcher</h1><p>You are now able to select the R version your script will run with when running a script as a launcher job. From the Environment tab, simply select the version from the new R version drop-down menu, as shown below. The default option is &ldquo;(Use System Default)&rdquo;, which is the version that was used prior to this new option. <img src="./rVersionLauncher.png" alt="The &ldquo;Run Script as Launcher Job&rdquo; modal in RStudio Workbench"></p><p>If you&rsquo;ve configured the script to run in a different cluster or with a different image from your active RStudio session, a &ldquo;User-specified&hellip;&rdquo; option will be available in the menu.
Selecting this option displays a free-form text field where you can type the <code>R_HOME</code> path of the R version you&rsquo;d like the script to run with.</p><p><img src="rVersionOtherEnvironment.png" alt="The &ldquo;Run Script as Launcher Job&rdquo; modal in RStudio Workbench with the &ldquo;User specified&hellip;&rdquo; &ldquo;R version&rdquo; option selected"></p><h1 id="session-suspension">Session Suspension</h1><p>RStudio sessions now provide more insight into session states that will prevent a session from auto-suspending. This is particularly useful for RStudio Cloud users, or in other RStudio Server or RStudio Workbench environments where users are charged for the amount of time a session is active.</p><p>To indicate that RStudio is doing something that will prevent auto-suspension, a new icon (<img src="suspendBlocked.png" alt="Session suspend blocked icon">) appears in the console toolbar. Hovering over the icon will list everything that is currently blocking auto-suspension:</p><p><img src="./sessionAutoSuspendBlocked.png" alt="Session auto-suspend paused due to an R script job running"></p><p>By default, the icon will appear 5 seconds after a suspension-blocking task begins.
This behavior can be configured or disabled in the Global Options Console pane.</p><h2 id="session-welcome-message">Session Welcome Message</h2><p>When returning to a session that is either active or was suspended, you are now greeted with a welcome message in the R console.</p><p>For example, when returning to a session that was suspended, you will see a message like the following:</p><pre><code>Session restored from your saved work on 2022-Feb-19 11:26:00 UTC (2 days ago)</code></pre><p>And when returning to a session that has been active, you will see a message like this:</p><pre><code>Connected to your session in progress, last started 2022-Feb-11 15:36:00 UTC (4 hours ago)</code></pre><h2 id="session-suspend-logs">Session Suspend Logs</h2><p>Server logs can be generated each time a session is blocked from suspending. These can be useful in determining why a session did not suspend. To view them, ask your System Administrator to enable Information-level logging and look for the <code>SessionTimeoutSuspendBlocked</code> message.</p><h1 id="ssl-communication-between-rstudio-workbench-and-launcher-sessions">SSL Communication Between RStudio Workbench and Launcher Sessions</h1><p>This release adds secure socket communication between RStudio Workbench and sessions running in Kubernetes or Slurm clusters. This applies to all session types: RStudio, VS Code, and Jupyter. It&rsquo;s enabled by default using automatically provisioned certificates, created per job. Read more in the <a href="https://docs.rstudio.com/ide/server-pro/latest/access_and_security/secure_sockets.html#secure-session-communication" target="_blank">RStudio Workbench Admin Guide</a>.</p><h1 id="new-job-launcher-release">New Job Launcher Release</h1><p>RStudio Workbench 2022.02 comes bundled with our latest release of the Job Launcher. The new release adds an exciting new feature - Kubernetes Templating. 
You have complete control over how the Launcher submits jobs and services to Kubernetes, by editing them using a format similar to <a href="https://helm.sh/" target="_blank">Helm Charts</a>. This Helm-style format provides conditional logic that lets you change fields on the job, such as annotations and labels, and run user-defined functions for complete integration with outside business logic.</p><p>This new feature can be enabled by setting <code>use-templating=1</code> in <code>launcher.kubernetes.conf</code>. This feature completely supersedes the job-json-overrides feature, which is now deprecated. See the <a href="https://docs.rstudio.com/job-launcher/kube.html" target="_blank">Kubernetes plugin documentation</a> for more details.</p><p>Thank you for taking the time to read about a few of the features we&rsquo;re most excited about in the 2022.02 RStudio Workbench release! For detailed instructions on upgrading from a prior release or RStudio Open Source, visit <a href="https://docs.rstudio.com/rsw/upgrade/" target="_blank">Upgrade RStudio Workbench</a>.</p></description></item><item><title>RStudio 2022.02.0: What's New</title><link>https://www.rstudio.com/blog/rstudio-2022-02-0-what-s-new/</link><pubDate>Tue, 22 Feb 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-2022-02-0-what-s-new/</guid><description><p>This post highlights some of the improvements in the latest RStudio IDE release 2022.02.0, code-named &ldquo;Prairie Trillium&rdquo;. 
To read about all of the new features and updates available in this release, check out the latest <a href="https://www.rstudio.com/products/rstudio/release-notes/" target="_blank">Release Notes</a>.</p><ul><li><a href="#support-for-r-4-2-0">Support for R (&gt;= 4.2.0)</a><ul><li><a href="#updated-graphics-engine">Updated graphics engine support</a></li><li><a href="#windows-r-4-2-0">Windows support for R (&gt;= 4.2.0)</a></li></ul></li><li><a href="#visual-mode-editor-improvements">RMarkdown Visual editor improvements</a><ul><li><a href="#collapsible-code-chunks">Collapsible code chunks in visual editor</a></li><li><a href="#line-numbering-visual">Optional line numbering in code chunks in visual editor</a></li><li><a href="#code-diagnostics-visual">Optional code diagnostics in visual editor</a></li></ul></li><li><a href="#more-info">More info</a></li></ul><h1 id="support-for-r-4-2-0">Support for R (&gt;= 4.2.0)</h1><p>The development build of R, which will be released as R 4.2.0 later this year, features a <a href="https://cran.r-project.org/doc/manuals/r-devel/NEWS.html" target = "_blank">number of changes</a>. The latest RStudio release supports the most significant of these changes.</p><h2 id="updated-graphics-engine">Updated graphics engine support</h2><p>The graphics engine, which is used to render plots in the RStudio Plots pane, has been upgraded to graphics engine version 15 in R 4.2.0. Older versions of RStudio displayed a warning that <span style="color:red"><tt>R graphics engine version 15 is not supported by this version of RStudio</tt></span> and plots opened in a separate window rather than in the Plots pane. In this release, the Plots Pane is fully compatible with R &gt;= 4.2.0.</p><h2 id="windows-r-4-2-0">Windows support</h2><ul><li><p>R 4.2.0 will use UTF-8 as the native encoding on recent Windows systems, such as Windows 10, Windows Server 2016, or later. Builds of R &gt;= 4.2.0 on Windows use UCRT to support this use of UTF-8. 
This RStudio release provides support for UTF-8 native encoding on Windows, including listing file paths and directories containing non-ASCII UTF-8 characters that may be common for international users working in other languages and locales.</p></li><li><p>R 4.2.0 soft-deprecated the download method of <code>wininet</code>, which was commonly used as a method of downloading files and installing packages in R on Windows platforms. This release updates the file download method used when installing packages; users will no longer see a warning indicating that <span style="color:red"><tt>the &lsquo;wininet&rsquo; method is deprecated</tt></span> when installing R packages via RStudio.</p></li></ul><h1 id="visual-mode-editor-improvements">Visual editor improvements</h1><h2 id="collapsible-code-chunks">Collapsible code chunks</h2><p>The visual editor now has the ability to expand or collapse some or all code chunks to make it easier to navigate the document. You can collapse a code chunk by clicking on the code-folding arrow to the top-left corner of each chunk. Each collapsed chunk will display the number of lines of code that have been collapsed. Output can be collapsed independently of the code itself.</p><p><img src="images/paste-15CA77D2.png" alt="An expanded code chunk"></p><p><img src="images/paste-DB47123A.png" alt="A collapsed code chunk"></p><p>Alternatively, you can use the Command Palette or keyboard shortcuts to collapse or expand a single code chunk, or all code chunks at once. 
You can collapse a single code chunk with <code>Ctrl</code> + <code>Alt</code> + <code>L</code>, with your cursor in the code chunk, and expand it again with <code>Ctrl</code> + <code>Alt</code> + <code>Shift</code> + <code>L</code>.</p><p><img src="images/paste-72DF538D.png" alt="Using Command Palette to collapse code"></p><h2 id="line-numbering-visual">Optional line numbering</h2><p>Users now have the option to display line numbers, code diagnostic markers, or both in the left-hand margin of each code chunk in the visual editor.</p><p>The option to display line numbering in the visual editor is separate from the option controlling line numbering in the source editor, and these line numbers are disabled by default, even if line numbering is turned on in the source editor. To turn on line numbering for the visual editor, go to <strong><code>Tools &gt; Global Options &gt; RMarkdown &gt; Visual</code></strong> and check the box to &ldquo;Show line numbers in code blocks&rdquo;.</p><p><img src="images/Screen%20Shot%202022-02-03%20at%2010.41.48%20AM.png" alt="Enabling line numbers in Visual mode"></p><p>Each code chunk will now display its own line numbering within the code chunk.</p><p><img src="images/paste-8C81F145.png" alt="Code chunk with line numbering">Alternatively, you can also enable line numbering by opening the Command Palette and searching for &ldquo;line numbers&rdquo; to check the setting &ldquo;Show line numbers in visual mode code blocks&rdquo;.</p><p><img src="images/line_numbers-01.png" alt="Enabling line numbers in the Command Palette"></p><h2 id="code-diagnostics-visual">Optional code diagnostics</h2><p>You can also display code diagnostics within the visual editor, using the same settings to enable these in both the visual editor and the source editor. 
For more details on enabling and using code diagnostics in general, see <a href="https://support.rstudio.com/hc/en-us/articles/205753617-Code-Diagnostics-in-the-RStudio-IDE" title="Support: Code diagnostics in the RStudio IDE">the support article</a> on code diagnostics within the IDE.</p><p>After enabling code diagnostics for both the visual and source editor in <strong><code>Tools &gt; Global Options &gt; Code &gt; Diagnostics,</code></strong> lines of code detected as potentially having issues will be indicated in the left-hand margin with an <img src="images/Screen%20Shot%202022-02-03%20at%2010.56.35%20AM-01.png" alt="Error" width="16" style="margin: 0;"/>, <img src="images/Screen%20Shot%202022-02-03%20at%2010.56.46%20AM-01.png" alt="Warning" width="16" height="16" style="margin: 0;"/>, or <img src="images/Screen%20Shot%202022-02-03%20at%2010.56.54%20AM.png" alt="Info" width="16" style="margin: 0;"/> symbol, depending on the nature of the issue.</p><p><img src="images/paste-1E66F4D8.png" alt="Enabling code diagnostics in the IDE"></p><p>For example, in the code sample below, issues are denoted with diagnostic markers next to the line numbers, and for certain issues, the affected code is underlined.</p><p><img src="images/Code%20chunk%20with%20diagnostics.png" alt="Code chunk with diagnostic markers"></p><p>By hovering over the associated marker, we can read a message as to the nature of the issue(s) with that line of code. After editing and saving the file, if the issue has been successfully resolved, the marker, and any associated underlining, will disappear.</p><p><img src="images/Screen%20Shot%202022-02-03%20at%2011.27.21%20AM.png" alt="Diagnostic marker message in tooltip"></p><h1 id="more-info">More Info</h1><p>There&rsquo;s lots more in this release, and it&rsquo;s <a href="https://www.rstudio.com/products/rstudio/download/" target="_blank">available for download today</a>. 
You can read about all the features and bugfixes in the RStudio 2022.02.0 &ldquo;Prairie Trillium&rdquo; release in the <a href="https://www.rstudio.com/products/rstudio/release-notes/" target="_blank">RStudio Release Notes</a>. We&rsquo;d love to hear your feedback about the new release on our <a href="https://community.rstudio.com/c/rstudio-ide/9" target="_blank">community forum</a>.</p></description></item><item><title>2022 Internships</title><link>https://www.rstudio.com/blog/2022-internships/</link><pubDate>Fri, 18 Feb 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2022-internships/</guid><description><p><sup>Photo by <a href="https://twitter.com/CMastication">JD Long</a></sup></p><p>We’re excited to announce our 2022 Summer Internship Program. Through the program, students will be able to apply for full-time summer internships within one of the departments of RStudio. The program allows students to do impactful work that helps both RStudio users and the broader community, and ensures that the community of developers and educators is just as diverse as its community of users.</p><p>Internship projects will give you hands-on experience, exposure to a collaborative culture, and opportunities to sharpen problem-solving and critical-thinking skills. You will acquire professional experience and work with knowledgeable data scientists, software developers, and educators to create and share new tools and ideas.</p><p>These are paid internships for 12 weeks, and there are two cohorts, starting either May 9th or June 13th, depending on your school schedule. To qualify, you must currently be a student (broadly construed: if you think you’re a student, you probably are), have some experience writing code, and have experience using Git and GitHub. To demonstrate these skills, your application needs to include a link to a package, Shiny app, data analysis repository, or other software or educational materials. 
It’s OK if you create something specifically for this application: we just need to know that you’re already familiar with the mechanics of collaborative software development. The postings will remain open until filled, but for a full review, best to get your application in by <strong>March 7, 2022.</strong></p><p>RStudio is a geographically distributed team which means you can be based anywhere in the United States. We hope to expand the program to other countries in the future, but for this year you must legally be able to work in the United States. You will be working 100% remotely and you will meet with your mentor regularly online.</p><p>Internships this year will be in</p><ul><li><a href="https://rstudio.com/about/job-posting/?gh_jid=4962537003" target = "_blank">Engineering</a></li><li><a href="https://www.rstudio.com/about/job-posting/?gh_jid=4960408003" target = "_blank">Open Source Engineering</a></li><li>Education <a href="https://rstudio.com/about/job-posting/?gh_jid=4962623003" target = "_blank">Case Studies</a>, <a href="https://rstudio.com/about/job-posting/?gh_jid=4963596003" target = "_blank">Projects and Exercises</a>, <a href="https://rstudio.com/about/job-posting/?gh_jid=4963543003" target = "_blank">UX/UI</a></li></ul><h2 id="qualifications">Qualifications</h2><ul><li>Current student over the age of 18</li><li>Legally able to work in the United States</li><li>Experience writing code - R, Python, Java, Go, C++, Javascript or Typescript</li><li>Experience using Git, Github or Bitbucket</li><li>An interest in developing tools or educational materials for data science</li></ul><h2 id="how-to-apply">How to Apply</h2><p>You can apply for the position that you’re interested in through our career portal. 
You are welcome to apply to more than one internship opportunity.</p><p>See the Internships on our <a href="https://www.rstudio.com/about/careers/" target = "_blank">Careers page</a>.</p><p>To apply you need</p><ul><li>A resume</li><li>Links to any software projects you’ve worked on (GitHub, BitBucket, GoogleDrive, web pages, or other)</li></ul><p>Questions about the program? Post questions and look for answers at <a href="https://community.rstudio.com/tag/rstudio-internship" target = "_blank">community.rstudio.com/tags/rstudio-internship</a>.</p><p>RStudio is committed to being a diverse and inclusive workplace. We encourage applicants of different backgrounds, cultures, genders, experiences, abilities and perspectives to apply. All qualified applicants will receive equal consideration without regard to race, color, national origin, religion, sexual orientation, gender, gender identity, age, or physical disability. However, applicants must legally be able to work in the United States.</p></description></item><item><title>Working With Databases and SQL in RStudio</title><link>https://www.rstudio.com/blog/working-with-databases-and-sql-in-rstudio/</link><pubDate>Thu, 17 Feb 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/working-with-databases-and-sql-in-rstudio/</guid><description><p>Photo by <a href="https://unsplash.com/@choys_?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Conny Schneider</a> on <a href="https://unsplash.com/">Unsplash</a></p><p>Relational databases are a common way to store information, and SQL is a widely-used language for managing data held in these systems. 
RStudio provides several options to work with these crucial tools.</p><p>Let&rsquo;s explore using a <a href="https://bit.io/ivelasq3/elements" target = "_blank">PostgreSQL database</a> that contains <a href="https://github.com/fivethirtyeight/data/tree/master/bob-ross" target = "_blank">FiveThirtyEight’s data on Bob Ross paintings</a>.</p><h2 id="connecting-to-databases-with-rstudio">Connecting to Databases With RStudio</h2><p>You can connect to databases in RStudio, either by manually writing the connection code or using the Connections Pane.</p><p>Install the packages that correspond to your database. For example, you can connect to a bit.io PostgreSQL database by creating an <a href="https://bit.io/" target = "_blank">account</a> and inserting the repo&rsquo;s details in a code chunk:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"><span style="color:#60a0b0;font-style:italic"># Install these packages if you have not already</span>
<span style="color:#60a0b0;font-style:italic"># install.packages(c(&#39;DBI&#39;, &#39;RPostgres&#39;))</span>
con <span style="color:#666">&lt;-</span> DBI<span style="color:#666">::</span><span style="color:#06287e">dbConnect</span>(RPostgres<span style="color:#666">::</span><span style="color:#06287e">Postgres</span>(),
  dbname <span style="color:#666">=</span> <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">ivelasq3&#39;</span>,
  host <span style="color:#666">=</span> <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">db.bit.io&#39;</span>,
  port <span style="color:#666">=</span> <span style="color:#40a070">5432</span>,
  user <span style="color:#666">=</span> <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">ivelasq3_demo_db_connection&#39;</span>,
  password <span style="color:#666">=</span> <span style="color:#06287e">Sys.getenv</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">BITIO_KEY&#39;</span>) <span style="color:#60a0b0;font-style:italic"># insert your password here</span>
)</code></pre></div><p>In addition to manually writing code, you can connect to databases with the Connections Pane in the IDE. It shows all the connections to supported data sources. You can also scan through your databases, see which connections are currently active, and close connections.</p><p><img src="img/features.png" alt="RStudio IDE Connections Pane and callouts for each of its functionality"></p><center><caption><a href="https://db.rstudio.com/tooling/connections/" target = "_blank">RStudio IDE Connections Pane</a></caption></center><p>For RStudio commercial customers, we offer <a href="https://www.rstudio.com/products/drivers/" target = "_blank">RStudio Professional ODBC Drivers</a>. These are ODBC data connectors that help you connect to some of the most popular databases and use them in a production environment.</p><h2 id="querying-databases-using-rstudio">Querying Databases Using RStudio</h2><p>Once you have your connection set up, you can run database queries in RStudio. There are several ways of doing this. Let’s explore RStudio&rsquo;s SQL integration, the <a href="https://cran.r-project.org/web/packages/DBI/index.html" target = "_blank">DBI package</a>, the <a href="https://dbplyr.tidyverse.org/" target = "_blank">dbplyr package</a>, and <a href="https://rmarkdown.rstudio.com/" target = "_blank">R Markdown</a>.</p><h2 id="sql-integration-in-rstudio">SQL Integration in RStudio</h2><p>The RStudio IDE has direct integration with <code>.sql</code> files. 
You can open, edit, and test those file types inside RStudio.</p><p>Generate a <code>.sql</code> file with your open connection (or go to <strong>File</strong>, <strong>New File</strong>, <strong>SQL Script</strong>) and start writing your query.</p><p>Notice that there’s a comment RStudio added to the top of the file:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#666">-</span><span style="color:#666">-</span> <span style="color:#666">!</span>preview conn<span style="color:#666">=</span>con</code></pre></div><p>This comment tells RStudio to execute the query against the open connection named <code>con</code>. Click <strong>Preview</strong> or press <kbd>Ctrl</kbd> + <kbd>Shift</kbd> + <kbd>Enter</kbd> to run the query, and your results appear in a new tab:</p><p><img src="img/sql-file.png" alt="Screenshot of a SQL query in a SQL file and previewing the results in the RStudio IDE"></p><h3 id="the-dbi-package">The DBI package</h3><p>You can query your data with the <code>DBI::dbGetQuery()</code> function. Paste your SQL code as a quoted string. Using the example database from earlier, let’s query the first three rows of the <code>elements</code> table:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R">DBI<span style="color:#666">::</span><span style="color:#06287e">dbGetQuery</span>(con, <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">SELECT episode, title FROM \&#34;ivelasq3/elements\&#34;.\&#34;elements\&#34; LIMIT 3&#39;</span>)</code></pre></div><pre><code>  episode title
1 S01E01  &quot;\\&quot;A WALK IN THE WOODS\\&quot;&quot;
2 S01E02  &quot;\\&quot;MT. MCKINLEY\\&quot;&quot;
3 S01E03  &quot;\\&quot;EBONY SUNSET\\&quot;&quot;</code></pre><p>The <a href="https://glue.tidyverse.org/" target = "_blank">glue package</a> makes writing SQL queries a little easier. The <code>glue::glue_sql()</code> function handles the SQL quoting and variable placement:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">tbl_glue <span style="color:#666">&lt;-</span>
  glue<span style="color:#666">::</span><span style="color:#06287e">glue_sql</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">SELECT episode, title FROM &#34;ivelasq3/elements&#34;.&#34;elements&#34; LIMIT 3&#39;</span>)
DBI<span style="color:#666">::</span><span style="color:#06287e">dbGetQuery</span>(con, tbl_glue)</code></pre></div><pre><code>  episode title
1 S01E01  &quot;\\&quot;A WALK IN THE WOODS\\&quot;&quot;
2 S01E02  &quot;\\&quot;MT. MCKINLEY\\&quot;&quot;
3 S01E03  &quot;\\&quot;EBONY SUNSET\\&quot;&quot;</code></pre><h3 id="the-dbplyr-package">The dbplyr package</h3><p>You can write your queries with dplyr syntax using the <a href="https://dbplyr.tidyverse.org/" target = "_blank">dbplyr package</a>.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"><span style="color:#06287e">library</span>(dplyr)
tbl_dbplyr <span style="color:#666">&lt;-</span>
  <span style="color:#06287e">tbl</span>(con, dbplyr<span style="color:#666">::</span><span style="color:#06287e">ident_q</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">&#34;ivelasq3/elements&#34;.&#34;elements&#34;&#39;</span>))</code></pre></div><p>The dbplyr package translates dplyr verbs into SQL queries, making it easy to work with the data from your database.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code 
class="language-R" data-lang="R">tbl_dbplyr <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">summarise</span>(total <span style="color:#666">=</span> <span style="color:#06287e">n</span>())</code></pre></div><pre><code># Source: lazy query [?? x 1]
# Database: postgres [ivelasq3_demo_db_connection@db.bit.io:5432/ivelasq3]
  total
  &lt;int64&gt;
1 403</code></pre><p>You can always inspect the SQL translation with the <code>show_query()</code> function. The dbplyr package will switch between SQL syntaxes based on the DB type (e.g., MS, Oracle, PG, etc.).</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R">tbl_dbplyr <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">summarise</span>(total <span style="color:#666">=</span> <span style="color:#06287e">n</span>()) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">show_query</span>()</code></pre></div><pre><code>&lt;SQL&gt;
SELECT COUNT(*) AS &quot;total&quot;, TRUE AS &quot;na.rm&quot;
FROM &quot;ivelasq3/elements&quot;.&quot;elements&quot;</code></pre><p>The dbplyr package allows you to work iteratively like you would in dplyr. 
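</p><p>For instance, dplyr verbs chain together and are only translated into a single SQL query when the results are requested. The snippet below is a sketch, not from the original post; it assumes the same <code>tbl_dbplyr</code> table and the <code>night</code>, <code>episode</code>, and <code>title</code> columns used elsewhere in this post:</p><pre><code># A sketch: filter and select are translated to one SQL query;
# nothing runs on the database until the result is printed or collect()ed
tbl_dbplyr %&gt;%
  filter(night == 1) %&gt;%
  select(episode, title) %&gt;%
  head(3)</code></pre><p>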
All of your code is in R so you do not have to switch between languages to explore the data.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">tbl_dbplyr2 <span style="color:#666">&lt;-</span>
  tbl_dbplyr <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">mutate</span>(night_and_ocean <span style="color:#666">=</span>
    <span style="color:#06287e">case_when</span>(night <span style="color:#666">==</span> <span style="color:#40a070">1</span> <span style="color:#666">&amp;</span> ocean <span style="color:#666">==</span> <span style="color:#40a070">1</span> <span style="color:#666">~</span> <span style="color:#40a070">1</span>,
      <span style="color:#007020;font-weight:bold">TRUE</span> <span style="color:#666">~</span> <span style="color:#40a070">0</span>))
tbl_dbplyr2 <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">summarise</span>(night_sum <span style="color:#666">=</span> <span style="color:#06287e">sum</span>(night),
    ocean_sum <span style="color:#666">=</span> <span style="color:#06287e">sum</span>(ocean),
    night_and_ocean_sum <span style="color:#666">=</span> <span style="color:#06287e">sum</span>(night_and_ocean))</code></pre></div><pre><code># Source: lazy query [?? x 3]
# Database: postgres [ivelasq3_demo_db_connection@db.bit.io:5432/ivelasq3]
  night_sum ocean_sum night_and_ocean_sum
      &lt;dbl&gt;     &lt;dbl&gt;               &lt;dbl&gt;
1        11        36                   4</code></pre><p>Using the function <code>collect()</code>, we can then use our data with other functions or R packages such as ggplot2.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(ggplot2)
tbl_ggplot <span style="color:#666">&lt;-</span>
  tbl_dbplyr <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">collect</span>() <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">rowwise</span>() <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">mutate</span>(total_number <span style="color:#666">=</span>
    <span style="color:#06287e">as.numeric</span>(<span style="color:#06287e">sum</span>(<span style="color:#06287e">c_across</span>(<span style="color:#06287e">where</span>(is.numeric))))) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">ggplot</span>(<span style="color:#06287e">aes</span>(total_number)) <span style="color:#666">+</span>
  <span style="color:#06287e">geom_histogram</span>(fill <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">#A4C689&#34;</span>) <span style="color:#666">+</span>
  <span style="color:#06287e">theme_minimal</span>() <span style="color:#666">+</span>
  <span style="color:#06287e">xlab</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Number of elements by episode&#34;</span>)</code></pre></div><p><img src="img/plot.png" alt="Histogram of number of elements by Bob Ross episode"></p><h3 id="r-markdown">R Markdown</h3><p>Would you rather write verbatim SQL code? You can run SQL code in an R Markdown document. 
Create a <code>sql</code> code chunk and specify your connection with the <code>connection = con</code> code chunk option.</p><pre><code>```{sql}
#| connection = con
SELECT episode, title
FROM &quot;ivelasq3/elements&quot;.&quot;elements&quot;
LIMIT 3
```</code></pre><div class="knitsql-table"><table><caption><span id="tab:unnamed-chunk-8">Table 1: </span>3 records</caption><thead><tr class="header"><th align="left">episode</th><th align="left">title</th></tr></thead><tbody><tr class="odd"><td align="left">S01E01</td><td align="left">“"A WALK IN THE WOODS"”</td></tr><tr class="even"><td align="left">S01E02</td><td align="left">“"MT. MCKINLEY"”</td></tr><tr class="odd"><td align="left">S01E03</td><td align="left">“"EBONY SUNSET"”</td></tr></tbody></table></div><p>R Markdown provides options that simplify using SQL with R. For example, <a href="https://community.rstudio.com/t/rmd-file-with-embedded-sql-chunk-possible-to-move-the-sql-to-external-file-then-source/49651" target = "_blank">this post</a> shows how you can use the <code>cat</code> engine to write the content of a chunk to a file.</p><pre><code>```{cat}
#| engine.opts = list(file = &quot;select_tbl.sql&quot;, lang = &quot;sql&quot;)
SELECT episode, title
FROM &quot;ivelasq3/elements&quot;.&quot;elements&quot;
LIMIT 3
```</code></pre><p>You can read in the file using the <code>code</code> chunk option so you do not have to write out your SQL query.</p><pre><code>```{sql}
#| connection = con, code=readLines(&quot;select_tbl.sql&quot;)
```</code></pre><p>You can send the query output to an R data frame by defining <code>output.var</code> in the code chunk. Then you can reuse that data frame elsewhere in your code.</p><pre><code>```{sql}
#| connection = con,
#| code=readLines(&quot;select_tbl.sql&quot;),
#| output.var = &quot;dat&quot;
```</code></pre><pre><code>```{r}
print(dat)
```</code></pre><pre><code>  episode title
1 S01E01  &quot;\\&quot;A WALK IN THE WOODS\\&quot;&quot;
2 S01E02  &quot;\\&quot;MT. MCKINLEY\\&quot;&quot;
3 S01E03  &quot;\\&quot;EBONY SUNSET\\&quot;&quot;</code></pre><p>These options make working with SQL in R Markdown even smoother.</p><h2 id="learn-more">Learn More</h2><p>This blog post just touched on a few examples of how to work with databases and SQL in RStudio. Check out more resources below.</p><ul><li>Read how to use RStudio products and packages with databases on our website, <a href="https://db.rstudio.com/" target = "_blank">https://db.rstudio.com/</a>. This comprehensive website provides more information on working with databases in RStudio as well as examples of best practices.</li><li>Learn more about <a href="https://www.rstudio.com/blog/rstudio-1-2-preview-sql/" target = "_blank">RStudio&rsquo;s SQL integration</a>.</li><li>Explore the powerful package <a href="https://dbplyr.tidyverse.org/" target = "_blank">dbplyr</a>.</li><li>Find out more about the <a href="https://bookdown.org/yihui/rmarkdown/language-engines.html#sql" target = "_blank">SQL engine in R Markdown</a>.</li><li>Check out some great talks by <a href="https://www.youtube.com/watch?v=gdzONbwfWk0" target = "_blank">Irene Steves</a>, <a href="https://www.youtube.com/watch?v=JwP5KdWSgqE" target = "_blank">Ian Cook</a>, and <a href="https://www.youtube.com/watch?v=aVI4YZ1CB2c" target = "_blank">Edgar Ruiz</a>.</li></ul></description></item><item><title>Save the date for rstudio::conf(2022)!</title><link>https://www.rstudio.com/blog/save-the-date/</link><pubDate>Mon, 14 Feb 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/save-the-date/</guid><description><script src="https://fast.wistia.com/embed/medias/k1ffgjxvr0.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div 
class="wistia_embed wistia_async_k1ffgjxvr0 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/k1ffgjxvr0/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>rstudio::conf(2022) is coming July 25-28 to National Harbor, DC! We&rsquo;ll have more soon on our amazing keynote speakers, our lineup of tutorials, our diversity scholarships program, and what we&rsquo;ll be doing to keep y&rsquo;all safe. For now, please put the date in your calendar, and read on if you&rsquo;re interested in submitting a talk.</p><h2 id="call-for-talks">Call for talks</h2><p>The call for talks at rstudio::conf(2022) is now open! We&rsquo;d love to hear about:</p><ul><li>How you&rsquo;ve used R (by itself or with other tools) to solve a challenging problem.</li><li>Projects and teams where R and Python live together in harmony.</li><li>Your favorite R package and how it makes life easier or unlocks new capabilities.</li><li>Your techniques for teaching data science to help reach new domains and new audiences.</li><li>Your broad reflections on data science, packages, code, and community.</li><li>Anything else you think the RStudio community would love to hear about!</li></ul><p>Talks are 20 minutes long and, while we hope that as many speakers as possible can make it in-person, we understand how much uncertainty the pandemic brings. Our submission form asks you to indicate your current plans regarding in-person attendance and, in general, we will plan for the reality that some speakers will need to present via a pre-recorded talk.</p><p>As the shape of the event becomes more clear, we may expand the program to include pre-recorded lightning talks. If this happens, lightning talks will be drawn from the main pool of talk proposals, i.e. there is no separate application process. Speakers receive complimentary registration for the conference (and, budget allowing, needs-based assistance for accommodation and travel).</p><p>All speakers will receive coaching from <a href="https://www.articulationinc.com/">Articulation Inc</a>. This means your past speaking experience isn&rsquo;t as important as the quality of your ideas: if you have an interesting topic, we encourage you to apply, regardless of your background, experience, or job title. Speakers for rstudio::global(2021) found the coaching both fun and very worthwhile; it&rsquo;s a great opportunity to polish your speaking skills while getting to know the other speakers.</p><p>To submit a talk, you&rsquo;ll need to create a 60-second video that introduces you and your proposed topic. In the video, you should tell us who you are, why your topic is important, and what attendees will take away from it. If you&rsquo;re interested, please submit a proposal at: <a href="https://rstd.io/conf-talks-2022">https://rstd.io/conf-talks-2022</a>. <del>Submission closes March 14, and we&rsquo;ll communicate decisions no later than mid-April.</del></p><p><strong>We&rsquo;ve extended the submission deadline! 
Proposals now due March 28.</strong></p><p><a href="https://rstd.io/conf-talks-2022"><strong>Apply now!</strong></a></p></description></item><item><title>How BI and Data Science Deliver Deeper Insight Together</title><link>https://www.rstudio.com/blog/how-bi-and-data-science-deliver-deeper-insight-together/</link><pubDate>Tue, 08 Feb 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/how-bi-and-data-science-deliver-deeper-insight-together/</guid><description><p>Photo by <a href="https://unsplash.com/@pawel_czerwinski?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Pawel Czerwinski</a> on <a href="https://unsplash.com/s/photos/data?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></p><p>Michael Lippis of The Outlook podcast interviewed RStudio’s Lou Bajuk to discuss why RStudio encourages its customers to break down the analytic silos that often exist between data science and business intelligence (BI) teams. During the interview, Michael and Lou examined four main topics:</p><ul><li><a href="#the-relationship-between-bi-and-data-science">The relationship between BI and data science</a></li><li><a href="#understanding-self-service-bi-tools">Understanding self-service BI tools</a></li><li><a href="#understanding-code-first-tools">Understanding code-first tools</a></li><li><a href="#leveraging-the-strengths-of-both-worlds">Leveraging the strengths of both worlds</a></li></ul><p>We’ve extracted key parts of the podcast interview below and edited the quotes for clarity and length. 
You can listen to the full interview on the <a href="https://www.rstudio.com/collections/additional-talks/bi-data-science-deliver-deeper-insights/" target = "_blank">RStudio website</a>.</p><h2 id="the-relationship-between-bi-and-data-science">The Relationship Between BI and Data Science</h2><div class="question-quote"><span class="speaker-name">Mike:</span>Do organizations view BI and data science as competitors?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>Unfortunately, this is far too often the case. BI and data science share many aspects. They both allow the analytically minded to draw on data from multiple data sources and create rich interactive applications that can be shared with others to improve decision-making.</div><div class="no-speaker-quote">Ironically, these common purposes and capabilities often trap the teams that use these tools into being organizational competitors because these different approaches can end up delivering applications that look fairly similar. The nuances of the two approaches can be obscured to decision-makers, especially to those who hold the budget, and this leads to potential competition between the groups.</div><div class="question-quote"><span class="speaker-name">Mike:</span>Should data science complement or augment self-service BI?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>Both are important. In the complement approach, organizations use the BI and data science tools separately but side by side. BI tools might be used for widespread reporting where simpler analyses and visualizations are sufficient. Code-first data science tools might be used for specialized or complex activities.</div><div class="no-speaker-quote">In the augment approach, organizations use BI and data science tools by directly linking them in various ways. Data science tools might be used to apply more advanced analytic techniques, which can help the BI user focus on what's most important in the data. 
From the perspective of the data science team, delivering these insights through the organization's BI tool can reach a much broader audience. This can be great for boosting the visibility and perceived value of the work that the team does, as well as increasing the impact of their work.</div><h2 id="understanding-self-service-bi-tools">Understanding Self-service BI Tools</h2><div class="question-quote"><span class="speaker-name">Mike:</span>Why are self-service BI tools so widely used?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>We like to talk about code-first data science giving data scientists superpowers. Similarly, self-service BI tools give business analysts superpowers. Tools like Tableau or Power BI are widely used because they allow business analysts who might not be comfortable coding to perform analytic tasks, things like exploring and visualizing data where they can apply their knowledge of the business problem at hand.</div><div class="no-speaker-quote">Generally, BI tools help support a data-driven organization. When these tools are adopted as a corporate standard, they really help provide a common platform for sharing insights and supporting decision making.</div><div class="question-quote"><span class="speaker-name">Mike:</span>Now, what challenges do BI tools present?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>There are a few challenges. BI tools can make it difficult to adapt if the underlying data changes as the data transformations that are done there are often obscured in a series of point-and-click actions. While these tools might have a wide range of visualizations and some basic stats tools, they're largely constrained by whatever capabilities their vendors implement.</div><div class="no-speaker-quote">Another more subtle and potentially quite serious challenge is that the BI tool can help create uncertain conclusions. 
Humans are hardwired to see patterns and create explanations for them even if the patterns aren't real. Or it might be hard to reach any conclusion at all because there's no clear pattern in the data.</div><div class="no-speaker-quote">Finally, these tools require skills that aren't easily transferable. Getting the most out of these BI tools, especially creating a reusable analysis, requires pretty specialized skills. If the analyst moves to another organization or if their organization decides not to renew the commercial software because it's expensive, these product-specific skills can be wasted.</div><h2 id="understanding-code-first-tools">Understanding Code-first Tools</h2><div class="question-quote"><span class="speaker-name">Mike:</span>When compared with self-service BI tools, what do open-source data science tools using R and Python provide? </div><div class="speaker-quote"><span class="speaker-name">Lou:</span>One of the biggest things is that users of these tools can draw on the broad spectrum of capabilities contributed by the open-source community. This ensures that the analysts and data scientists always have the right tool for the analytic problem at hand. Another advantage of open-source tools using R and Python is that they're inherently reusable, extensible, and inspectable.</div><div class="no-speaker-quote">This is a big reason why RStudio advocates and supports code-first data science: these attributes make it easy to track changes over time using version control, maintain reproducibility, and provide more options for customization.<p>Open source tools also make it easy to integrate with other analytic frameworks in an organization. We call this interoperability. 
This helps keep data scientists more productive and ensures better utilization of all these data resources and computational clusters that IT has spent a lot of effort setting up.</p></div><div class="question-quote"><span class="speaker-name">Mike:</span>What challenges do teams that leverage open-source, code-friendly data science face?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>The biggest challenge that people typically raise to a code-first approach is that it requires people creating the analysis to have coding skills. While a Shiny application presents an analysis in a way that doesn't require any familiarity with R and Python by the end-user, ultimately these products do require someone familiar with the languages to develop them.</div><div class="no-speaker-quote">For these open-source environments, you need some way of managing packages and managing the environments. There's also a lack of native deployment capabilities in R and Python, which often leads to these data science teams creating their own ways of sharing applications with their users. These homegrown solutions can be difficult to develop and maintain over time.<p>Finally, the lack of native capabilities for security and scalability in the cloud means that these organizations often struggle to support large teams, whether it&rsquo;s on the development side or the deployment side. This often leads to very siloed data science work.</p></div><div class="question-quote"><span class="speaker-name">Mike:</span>Has RStudio done anything to address those open-source challenges?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>Absolutely. We are dedicated to the proposition that code-first data science is uniquely powerful and that everyone can learn to code. 
We support this through our education efforts, our community side, and generally making R easier to learn and use through our open-source projects, like the tidyverse, the set of packages that are focused on making R easier to use.</div><div class="no-speaker-quote">Our professional product suite, <a href="https://www.rstudio.com/products/team/" target = "_blank">RStudio Team</a>, provides all the enterprise security and scalability, package management, and centralized administration for both development and deployment environments and delivers all the enterprise features that organizations require. Our hosted offerings, <a href="https://www.rstudio.com/products/cloud/" target = "_blank">RStudio Cloud</a> and <a href="https://www.shinyapps.io/" target = "_blank">shinyapps.io</a>, enable data scientists to develop and deploy data products on the cloud without needing to set up their own infrastructure to do so.</div><h2 id="leveraging-the-strengths-of-both-worlds">Leveraging the strengths of both worlds</h2><div class="question-quote"><span class="speaker-name">Mike:</span>What is a good strategy for matching a data science approach to an application need?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>The key is to understand what you're trying to accomplish and what skills and expertise you have to draw on. BI tools provide a lower barrier of entry for someone who may not be comfortable coding in a language like R or Python. However, the questions you are trying to answer get more complex as they require greater analytic depth. Users will encounter a low ceiling to the complexity of the questions they can ask and answer.</div><div class="no-speaker-quote">On the other hand, code-first data science tools represent a relatively high barrier to entry for a novice user. They require those who create the analysis to have some understanding of coding and be familiar with how to apply advanced analytic methods. 
However, the flexibility and analytic breadth of these code-first data science tools combine to create a pretty high ceiling for answering the difficult, valuable, often vague questions an organization faces.</div><div class="question-quote"><span class="speaker-name">Mike:</span>Is there a role for data hand-offs in BI and data science collaboration?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>Absolutely. This is one of the key ways that organizations augment their BI tools with data science. We define a data hand-off as a situation where a data set is created by a data scientist or data science team and stored in a database or other data source where it can be shared with BI analysts for visualization in a BI tool. Data scientists create the data, BI tools visualize it.</div><div class="no-speaker-quote">These sorts of data hand-offs are vital. Data science teams often work with large, messy data from unstructured or novel sources, and then apply advanced analytical methods and statistical rigor to them. And while these teams may create reports and rich interactive applications, they can't hope to address every potential question that might be asked of the data.<p>By making this preprocessed data available to users of BI tools, including any predictions or calculated features in the data, data scientists can give BI users the ability to explore the data themselves. This increases the visibility and reuse of the data and, consequently, the impact of the data science work the team has done.</p></div><div class="question-quote"><span class="speaker-name">Mike:</span>Can you share a use case with the audience?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>One of my favorite customer examples is from Monash University in Australia. The director of the strategic intelligence and insight unit was tasked with providing decision support for their executive team. 
They needed to combine data from a large array of complex data sources for deeper insights.</div><div class="no-speaker-quote">So, they used Shiny and RStudio Pro Products to strengthen strategic decision-making. As he described it, they were able to start with the decision they needed to support and then reverse engineer back to the data they needed, where it was, how it needed to be transformed, and what interactivity they needed to deliver to allow their executives to ask and answer the questions they needed.</div><div class="no-speaker-quote">They adopted this code-first approach to provide an advanced level of flexibility as well as reproducibility so they could clearly communicate to their executives and support the decisions they needed to make.</div><div class="question-quote"><span class="speaker-name">Mike:</span>Where can our audience get more information on RStudio solutions?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>We've done a series of <a href="https://www.rstudio.com/tags/bi-tools/">blog posts</a> that are available on the RStudio blog. I mentioned one customer example here, but you can find more in the <a href="https://www.rstudio.com/about/customer-stories/">customer section of our website</a>. That's a great place to learn about how other customers and organizations have leveraged RStudio tools.</div><h2 id="for-more-information">For More Information</h2><p>Combining open-source tools like R and self-service business intelligence tools helps organizations get the fullest possible value out of their data. 
By avoiding a one-size-fits-all approach, you can develop more agile processes and improve decision-making.</p><p>If you’d like to explore other resources related to this article:</p><ul><li>Read about how RStudio&rsquo;s modular platform helps you get the most value out of your analytic investments on our <a href="https://www.rstudio.com/solutions/interoperability/" target = "_blank">Interoperability page</a>.</li><li>Learn more on how to break down your analytic silos and get deeper insights to drive more value from your data on our <a href="https://www.rstudio.com/solutions/bi-and-data-science/" target = "_blank">BI and data science page</a>.</li></ul></description></item><item><title>Build and Share Jupyter Notebooks on RStudio Team</title><link>https://www.rstudio.com/blog/build-and-share-jupyter-notebooks-on-rstudio-team/</link><pubDate>Thu, 03 Feb 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/build-and-share-jupyter-notebooks-on-rstudio-team/</guid><description><p>Photo by <a href="https://unsplash.com/@jayphoto?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Justin W</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></p><p>Jupyter Notebooks are interactive documents for code, outputs, and text. However, they’re often stuck in data scientists’ local computing environments. Collaborating can be difficult and sharing can be tedious. 
For notebooks to live up to their full potential, data science teams need a way to scale their development securely and efficiently while providing stakeholders easy access to their output and visualizations.</p><p><a href="https://www.rstudio.com/products/team/" target = "_blank">RStudio Team</a>, made up of RStudio Workbench, RStudio Connect, and RStudio Package Manager, brings everything together to help data scientists create, reproduce, and share insights from their Jupyter Notebooks.</p><p>Let’s dive into a real-life example by exploring data from <a href="https://cneos.jpl.nasa.gov/about/" target = "_blank">NASA&rsquo;s Center for Near Earth Objects (NEOs)</a>. Daniel Petzold walks us through his data analysis and reporting. Want to explore the report yourself? Check out the published version on <a href="https://colorado.rstudio.com/rsc/space-tracker/" target = "_blank">RStudio Connect</a> and the code on <a href="https://github.com/danielpetzold/space-tracker" target = "_blank">GitHub</a>.</p><h2 id="analyze-and-visualize-data-within-a-maintained-environment">Analyze and Visualize Data Within a Maintained Environment</h2><p>On <a href="https://www.rstudio.com/products/workbench/" target = "_blank">RStudio Workbench</a>, you have a choice of editors: the RStudio IDE, JupyterLab, Jupyter Notebook, or VS Code. Choose your preference. From here, you can explore your dataset, embed HTML directly in your document, create visualizations, and more. 
Watch Daniel walk through his exploratory analysis on JupyterLab:</p><script src="https://fast.wistia.com/embed/medias/90ce73qe52.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:62.5% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_90ce73qe52 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/90ce73qe52/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><h2 id="publish-directly-to-your-content-hub">Publish Directly to Your Content Hub</h2><p>Now that you’ve run your analyses and created insightful visualizations, you want to be able to share them with your team. RStudio Workbench allows you to publish to <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a>, the content platform from RStudio.</p><p>You have multiple options: push-button deployment from Jupyter Notebook or <a href="https://docs.rstudio.com/how-to-guides/users/basic/publish-jupyter-notebook/" target = "_blank">using terminal commands</a> from JupyterLab. 
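The "using terminal commands" route mentioned above can be sketched as a short shell session. This is an illustrative sketch only, assuming the rsconnect-python package is installed and that you have an RStudio Connect API key; the server URL and notebook filename below are placeholders, not values from this post.

```shell
# Illustrative sketch: publish a Jupyter Notebook to RStudio Connect from a
# terminal. The server URL, API key variable, and notebook filename are
# placeholders you must replace with your own values.
pip install rsconnect-python

rsconnect deploy notebook \
    --server https://connect.example.com \
    --api-key "$CONNECT_API_KEY" \
    my-analysis.ipynb
```

rsconnect can also store server details once with `rsconnect add`, so later deploys only need the notebook path.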
Daniel shows us the push-button approach through Jupyter Notebook:</p><script src="https://fast.wistia.com/embed/medias/5il9gez25g.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:62.5% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_5il9gez25g videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/5il9gez25g/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><h2 id="share-with-your-stakeholders">Share With Your Stakeholders</h2><p>It’s not enough to publish your work. Once on RStudio Connect, you can share with end-users. Make your analysis accessible to specific users or more generally with different authentication measures. 
In addition, you can schedule the document to run at a certain time and send out an email with refreshed data.</p><p>See these functionalities from Daniel’s standpoint:</p><script src="https://fast.wistia.com/embed/medias/nd7yubhla5.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:62.5% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_nd7yubhla5 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/nd7yubhla5/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><h2 id="learn-more">Learn More</h2><p>With RStudio Team, you can develop, collaborate, and share within an integrated architecture. Learn more about <a href="https://www.rstudio.com/products/team/" target = "_blank">RStudio Team</a>.</p><ul><li>Watch Daniel&rsquo;s full walkthrough on <a href="https://youtu.be/x8Wf8qXAGDI" target = "_blank">YouTube</a>.</li><li>Learn more about using Python with RStudio on our <a href="https://solutions.rstudio.com/python/" target = "_blank">Solutions page</a>.</li><li>Want another example of RStudio Team? 
Watch <a href="https://www.youtube.com/watch?v=VrF9EdgiSy8" target = "_blank">RStudio Team Demo | Build &amp; Share Data Products Like The World’s Leading Companies</a> on YouTube.</li></ul></description></item><item><title>RStudio Community Monthly Events Roundup - February 2022</title><link>https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-february-2022/</link><pubDate>Wed, 02 Feb 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-community-monthly-events-roundup-february-2022/</guid><description><sup>Photo by <a href="https://unsplash.com/@nickmorrison?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Nick Morrison</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p>Welcome to the RStudio Community Monthly Events Roundup, where we update you on upcoming events happening at RStudio this month. Missed the great talks and presentations from last month? Find them listed under <a href="#icymi-2022-events">ICYMI: 2022 Events</a>.</p><p>You can <a href="https://www.addevent.com/calendar/wT379734" target = "_blank">subscribe</a> to the Community Events Calendar so that new events will automatically appear on your calendar. Please note that by subscribing, all of the events in the calendar will appear on your own calendar. 
If you wish to add individual events instead, please use the links below.</p><p>We can’t wait to see you there!</p><h2 id="save-the-date">Save the Date</h2><ul><li>February 3, 2022 at 12 ET: Data Science Hangout with Katie Schafer, Head of Analytics at Beam Dental <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>February 7, 2022 at 12 ET: RStudio Finance Meetup: Industry trends from banks to hedge funds to federal agencies | Presented by Merav Yuravlivker and Dmitri Adler <a href="https://www.addevent.com/event/Rc10480836" target = "_blank">(add to calendar)</a></li><li>February 10, 2022 at 12 ET: Data Science Hangout with Matthias Mueller, Sr Director of Analytics at Campaign Monitor <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>February 16, 2022 at 12 ET: RStudio Sports Analytics Meetup: Using RStudio &amp; Google Cloud to Scale Sports Analytics | Presented by Alok Pattani <a href="https://www.addevent.com/event/ZK11476389" target = "_blank">(add to calendar)</a></li><li>February 17, 2022 at 12 ET: Data Science Hangout with Mike Engine, Vice President and Data Science Team Leader at Engine <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>February 24, 2022 at 12 ET: Data Science Hangout with Joseph Korszun, Manager of Data Science at ProCogia <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>March 3, 2022 at 12 ET: Data Science Hangout with Stephen Bailey, Data Engineer at Whatnot <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>March 9, 2022 at 12 ET: Data Visualization Accessibility | Presented by Mara Averick and Maya Gans <a href="https://www.addevent.com/event/jJ11782140" target = "_blank">(add to calendar)</a></li><li>March 10, 2022 at 12 ET: Data Science Hangout with Kristi Angel, Senior Data Scientist at 
Grubhub <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>March 17, 2022 at 12 ET: Data Science Hangout with Joe Gibson, Senior Project Director at de Beaumont Foundation <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li></ul><h2 id="recurring-events">Recurring Events</h2><h3 id="data-science-hangout">Data Science Hangout</h3><p>Last year, we started an informal &ldquo;data science hangout&rdquo; for the data science community to connect and chat about some of the more human-centric questions around data science leadership. These happen every Thursday at 12 ET with a different leader featured each week.</p><p>The conversation is all audience-based and there&rsquo;s no registration needed, so you can jump on whenever it fits your schedule. Add the weekly hangouts to your calendar on <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">AddEvent</a> and check out the new <a href="https://www.rstudio.com/data-science-hangout/" target = "_blank">website</a> with all the recordings.</p><h3 id="rstudio-enterprise-community-meetups">RStudio Enterprise Community Meetups</h3><p>We also host industry meetups for teams to share the work they are doing within their organizations, teach lessons learned, and network with others. 
Join the group on <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target = "_blank">Meetup</a>.</p><h2 id="icymi-2022-events">ICYMI: 2022 Events</h2><ul><li>January 6, 2022 at 12 ET: <a href="https://www.rstudio.com/data-science-hangout/23-ian-anderson" target = "_blank">Data Science Hangout</a> with Ian Anderson, Director of Hockey Analytics at the Philadelphia Flyers</li><li>January 11, 2022 at 4 ET: <a href="https://youtu.be/DQSFOaFLI0M" target = "_blank">An inclusive solution for teaching and learning R during the COVID pandemic</a> | Presented by Patricia Menéndez</li><li>January 13, 2022 at 12 ET: <a href="https://www.rstudio.com/data-science-hangout/24-prabha-thanikasalam" target = "_blank">Data Science Hangout</a> with Prabha Thanikasalam, Senior Director, Analytics and Supply Chain Solutions at Flex</li><li>January 18, 2022 at 12 ET: <a href="https://youtu.be/vRbUM0n_nb8" target = "_blank">R in Supply Chain</a>: Intro to Supply Chain Design &amp; Forecasting Demand with R | Presented by Laura Rose &amp; Ralph Asher</li><li>January 20, 2022 at 12 ET: <a href="https://www.rstudio.com/data-science-hangout/25-asmae-toumi" target = "_blank">Data Science Hangout</a> with Asmae Toumi, Director of Analytics at PursueCare</li><li>January 25, 2022 at 12 ET: <a href="https://youtu.be/MrW5XFf7aps" target = "_blank">Building a Blog with R</a> | Presented by Isabella Velásquez</li><li>January 27, 2022 at 10 ET: <a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oRKK9ByULWulAOO5jN70eXv" target = "_blank">R in Public Sector: Organizational &amp; Technical Aspects of Shiny in Production</a> | Presented by Sjoerd Wieringa &amp; Job Spijker</li><li>January 27, 2022 at 12 ET: <a href="https://www.rstudio.com/data-science-hangout/26-theresa-ward" target = "_blank">Data Science Hangout</a> with Theresa Ward, Sr Manager - New Glenn Production at Blue Origin</li><li>February 1, 2022 at 12 ET: <a href="https://youtu.be/07j22d4B_hA" target = 
"_blank">Capacity Planning for Microsoft Azure Data Centers</a> | Using R &amp; RStudio Connect | Presented by Paul Chang</li></ul><h2 id="call-for-speakers">Call for Speakers</h2><p>If you’re interested in sharing your work at a Meetup (or just starting to consider it for a future date down the road!), <a href="https://forms.gle/EtXMpSoTfhpGopiS8" target = "_blank">please fill out the speaker submission form</a>. We’re always looking for a diverse set of speakers — if you are a member of a group that is underrepresented in data science, including people of color, those with disabilities, elders/older adults, LGBTQ folks, and women/minority genders, we highly encourage you to submit!</p></description></item><item><title>What's New on RStudio Cloud - February 2022</title><link>https://www.rstudio.com/blog/what-s-new-on-rstudio-cloud-february-2022/</link><pubDate>Tue, 01 Feb 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/what-s-new-on-rstudio-cloud-february-2022/</guid><description><p>Whether you want to do, share, teach, or learn data science, <a href="https://www.rstudio.com/products/cloud/" target = "_blank">RStudio Cloud</a> is a cloud-based solution that allows you to do so online. The RStudio Cloud team has rolled out new features and improvements since our <a href="https://www.rstudio.com/blog/what-s-new-on-rstudio-cloud-september-2021/" target = "_blank">last post in September 2021</a>. So what’s new?</p><h2 id="students-pay-option-for-instructors">Students Pay Option for Instructors</h2><p>If you want to use RStudio Cloud to teach, but are not in a position to cover costs for your students, you can now choose to have each student pay for their course access. 
See the <a href="https://rstudio.cloud/plans/instructor?option=student" target = "_blank">Cloud Instructor</a> page for more details on the option, but here are the basics:</p><ul><li>You, the instructor, sign up for the Instructor plan with the <a href="https://rstudio.cloud/plans/instructor?option=student" target = "_blank">Students Pay</a> option. You pay $15 / month.</li><li>You create your course space and invite your students to the space.</li><li>To work on projects in your space, a student must sign up for a paid subscription to Cloud - typically the <a href="https://rstudio.cloud/plans/plus" target = "_blank">Cloud Plus</a> plan for $5 / month is the right option for students.</li><li>If a student attempts to create or open a project in your course space and does not have a paid subscription, they will be prompted to sign up for one.</li><li>Within your course space, you and your students have access to all premium features associated with the Instructor plan, and there are no usage charges for work done there.</li></ul><p>See the <a href="https://rstudio.cloud/learn/guide#course-spaces" target = "_blank">Teaching with Cloud</a> section of the Guide for details on all the available Instructor plan options.</p><h2 id="easier-access-to-project-settings">Easier Access to Project Settings</h2><p>You can now change a project&rsquo;s settings directly from any projects listing. 
Edit the project&rsquo;s name or description, or change who can access it - without opening the project.</p><p><strong>HOW TO</strong></p><ol><li>In a projects listing, press the <img src="images/img1.png" alt="Circle with three dot icon" width="3%"> button to open any project&rsquo;s menu and then choose Settings.</li></ol><center><img src="images/img2.png" alt="Going into the Settings of a sample project" width="70%"></center><ol start="2"><li>Use the Info and Access settings panels to view or edit the project&rsquo;s settings.</li></ol><center><img src="images/img3.png" alt="Settings of a sample project with info and access options" width="50%"></center><p>For more info on project settings, see the <a href="https://rstudio.cloud/learn/guide#project-settings" target = "_blank">Project Settings</a> section of the Guide.</p><h2 id="faster-package-installation">Faster Package Installation</h2><p>We recently integrated RStudio Cloud with <a href="https://packagemanager.rstudio.com/client/#/" target = "_blank">RStudio Public Package Manager</a>. Package Manager provides pre-built binaries for all packages on CRAN. As a result, you will see much faster package installs. 
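Outside of RStudio Cloud, where this integration is preconfigured, a standalone R session on Linux can opt into the same binary builds by pointing its CRAN repository at Package Manager. This is a sketch only; the "__linux__/focal" path segment is an assumed example for Ubuntu 20.04, and you would substitute the code for your own distribution.

```r
# Sketch of an .Rprofile snippet for pulling pre-built binaries from
# RStudio Public Package Manager on Linux. The "__linux__/focal" segment
# is an assumed example for Ubuntu 20.04; adjust it for your distribution.
options(repos = c(CRAN = "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"))

# install.packages() then downloads binaries instead of compiling from source,
# e.g. install.packages("dplyr")
```
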
You can learn more about RStudio Package Manager <a href="https://www.rstudio.com/products/package-manager/" target = "_blank">on our website</a>.</p><h2 id="new-layout--project-filtering-options">New Layout &amp; Project Filtering Options</h2><p>We released several UI improvements, as follows:</p><ul><li><strong>Project Lists Navigator:</strong> You can now switch between different project lists via a navigation pane on the left-hand side of the Projects area.</li></ul><center><img src="images/img4.png" alt="Projects navigation pane" width="30%"></center><ul><li><strong>New Filtering Options:</strong> You can filter any project list according to the access level of the projects via a drop-down menu.</li></ul><center><img src="images/img5.png" alt="Selecting all access from drop down menu" width="50%"></center><ul><li><strong>More Obvious Usage Area Access:</strong> The Usage area for Your Workspace and shared spaces is now accessible via a text navigation item (rather than discreetly tucked away behind an icon).</li></ul><center><img src="images/img6.png" alt="Usage area access in navigation toolbar" width="50%"></center><ul><li><p><strong>More Obvious Trash Access:</strong> The trash for a space is now accessible via the Project Lists Navigator mentioned above (rather than via an icon in the header).</p></li><li><p><strong>Better Navigation on Smaller Screens:</strong> On smaller screens, the different areas in your current context are accessible via a drop-down menu in the header (rather than discreetly tucked away in the main navigation sidebar or user panel).</p></li></ul><center><img src="images/img7.png" alt="Smaller screen drop down menu example" width="30%"></center><h2 id="maximum-ram-doubled">Maximum RAM Doubled</h2><p>Cloud Premium and Instructor plans now allow you to allocate up to 16GB RAM per project, double the previous limit of 8GB.</p><p><strong>HOW TO</strong></p><p>Visit the <a href="https://rstudio.cloud/learn/guide#project-settings-resources" 
target = "_blank">Project Resources</a> section of the Guide to learn how to adjust the RAM and other resources allocated to a project.</p><h2 id="learn-more-about-rstudio-cloud">Learn More About RStudio Cloud</h2><p>We are excited to provide you with more capabilities so that you can jump right into your data science work. For more information and resources, please visit:</p><ul><li><a href="https://www.rstudio.com/products/cloud/" target = "_blank" rel = "noopener noreferrer">RStudio Cloud Product Page</a></li><li><a href="https://rstudio.cloud/learn/whats-new" target = "_blank" rel = "noopener noreferrer">What&rsquo;s New on RStudio Cloud</a></li><li><a href="https://community.rstudio.com/c/rstudio-cloud/14" target = "_blank" rel = "noopener noreferrer">RStudio Cloud Page on RStudio Community</a></li></ul></description></item><item><title>Develop, Collaborate, and Scale Across R and Python</title><link>https://www.rstudio.com/blog/develop-collaborate-and-scale-across-r-and-python/</link><pubDate>Thu, 27 Jan 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/develop-collaborate-and-scale-across-r-and-python/</guid><description><p>Data scientists need many skills, not just in handling data, but working with different programming languages, technical environments, computational requirements, and more. They need to do this while meeting stakeholder expectations quickly and accurately. Juggling their work across systems makes it difficult to coordinate work, creating silos of information and gaps in efficiency and security.</p><p><a href="https://www.rstudio.com/products/workbench/" target = "_blank">RStudio Workbench</a> is the enterprise-level integrated development environment for data scientists who need to develop, collaborate, and scale in R and Python. 
Team members work from a centralized server in the language and development environment of their choice and with the computing power that they need.</p><p>Data scientists on RStudio Workbench can focus on creating insights with the multiple tools available to them, a shared space for all data science development, and easy management of a single infrastructure.</p><script src="https://fast.wistia.com/embed/medias/8gsko694ys.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_8gsko694ys videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/8gsko694ys/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><center><caption><i>Watch our video to learn more about RStudio Workbench.</i></caption></center><h2 id="develop-insights-in-r-and-python">Develop Insights in R and Python</h2><p>With RStudio Workbench, data scientists can get started quickly regardless of their preferred programming languages. Teams have a wide range of editor options: RStudio, JupyterLab, Jupyter Notebooks, and VS Code. Python users can use Python, R users can use R — or team members can use both within a single project in RStudio Workbench.</p><p>A development environment should not hinder data scientists’ work. With RStudio Workbench, users can run concurrent sessions and conduct analyses side-by-side. Users can also manage upgrades and test code by running multiple versions of R and Python at the same time. 
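As one concrete illustration of working across both languages in a single session, the reticulate R package lets R code call into Python. A minimal sketch, assuming a Python installation with pandas is available to reticulate:

```r
library(reticulate)

# 'pd' is a handle to the pandas module in the attached Python installation.
pd <- import("pandas")

# Python calls are made with R syntax; results come back as
# ordinary R objects (here, a data frame).
df <- pd$DataFrame(list(x = 1:3, y = c("a", "b", "c")))
head(df)
```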
The additional computing power allows a data science team to scale their work and meet their stakeholders’ needs.</p><h2 id="improve-collaboration-with-a-centralized-environment">Improve Collaboration With a Centralized Environment</h2><p>RStudio Workbench provides a single place for data science teams to share projects regardless of the environment used. Team members won’t need to spend time trying to find someone’s old analysis or having to install VS Code to pull up a script.</p><p>For data scientists using the RStudio IDE, RStudio Workbench can securely grant their teammates access to a project. When multiple users are active in the project at once, they can see each other&rsquo;s real-time activity and collaboratively edit the file. Team members can support each other to develop high-quality, efficient analyses.</p><h2 id="reduce-start-up-time-with-preconfiguration">Reduce Start-up Time With Preconfiguration</h2><p>RStudio Workbench helps data scientists jump into their first line of code quickly through the configuration of the shared environment. A team can use a standardized set of installed software, ensuring that work is reproducible and valuable time isn’t spent on installation and configuration. A single infrastructure is easy for IT staff to configure and maintain, ensuring the team meets all organizational security requirements.</p><p>Run into issues? RStudio is here to help. 
With RStudio Workbench, we provide individualized support to ensure data scientists have what they need.</p><h2 id="learn-more">Learn More</h2><p>RStudio Workbench makes it easy for data scientists to focus on creating insights.</p><ul><li>More information can be found on our <a href="https://www.rstudio.com/products/workbench/" target = "_blank">product page</a>.</li><li>Watch our RStudio Workbench video on <a href="https://youtu.be/J-JJAjo_5Ew" target = "_blank">YouTube</a>.</li></ul></description></item><item><title>How to Win the RStudio Shiny Contest</title><link>https://www.rstudio.com/blog/how-to-win-the-rstudio-shiny-contest/</link><pubDate>Wed, 26 Jan 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/how-to-win-the-rstudio-shiny-contest/</guid><description><div class="lt-gray-box">This is a guest post from Marcin Dubel, a 2021 Shiny Contest Grand Prize winner and Software Engineer at <a href="https://appsilon.com/" target = "_blank">Appsilon</a>, a Full Service RStudio Partner.</div><p>RStudio’s 4th annual Shiny Contest is just around the corner. As a winner of last year’s contest with <a href="https://community.rstudio.com/t/shark-attack-shiny-contest-submission/104695" target = "_blank">Shark Attack</a>, I thought I’d share my methods for building a quality app. I break down the success of an application into three pillars: the Idea, the Impression, and the Process. Paying attention to detail in each will lead to a good result. Slack on one and you risk the top prize.</p><p>Whether you’re submitting to the contest or seeking to improve real-world skills, I hope you’ll find something in this article to improve your Shiny applications.</p><h2 id="idea">Idea</h2><h3 id="topic">Topic</h3><p>Usually, when working on a Shiny application, the idea and content are predefined in the project. However, in a contest, you are free to select your own subject. But this freedom of choice and opportunity brings about a new set of challenges! 
Coming up with yet another classic dashboard won’t be innovative enough. Don’t hold back. Because the winners of these contests don’t get to that virtual podium by doing the same old tried and true dashboard. This is your chance to be creative. To bring an intriguing app concept to the table.</p><p>Let’s be realistic here. Don’t expect to sit down, think for a minute, and churn out the groundbreaking topic and mechanics for your application. Dedicate a lot of time to this step. Search what interests you and look for inspiration. Be flexible! Try different ideas with quick proof of concepts and check how you envision the future development of this application. Feel free to abandon the initial work if you don’t think it’ll pan out to an interesting project. Don’t let the sunk cost fallacy hold you back.</p><h3 id="data">Data</h3><p>A crucial part of any Shiny application is the underlying data. I’d advise being careful with this: collecting, cleaning, maintaining, checking, and updating the data might be a tough or tedious task. It’ll likely consume most of your time and although the effect of this work is not easily visible to most, it’ll pay off in the end.</p><p>Of course, this doesn’t apply to all cases. I can imagine datasets so unique, interesting, and clean that it could be worth it to build contest apps around them. Whenever you are considering an app like this, adjust your plans to this additional risk factor.</p><p>For example, take a look at the Global Shark Attack data table below. This is a unique dataset, but frankly, it’s hard to look at and some of the data is missing.</p><center><iframe src="https://public.opendatasoft.com/explore/embed/dataset/global-shark-attack/table/?disjunctive.country&disjunctive.area&disjunctive.activity&static=false&datasetcard=false" width="680" height="400" frameborder="0"></iframe></center>After all the hard work of handling that data, you really need to ensure you have a good way to present it. 
Starting with a quick mock dataset is worthwhile to validate the app's purpose. Simply presenting data on the dashboard usually isn’t enough and won’t get you very far. Take advantage of a Shiny application's greatest power: interactivity. Let users dig down, explore, and find surprises in the data. Users will make their own impression of your app as they play around with it.<h2 id="impression">Impression</h2><h3 id="make-them-care">Make them care</h3><p>Even the most interesting dataset won’t be given a second glance if presented as a giant table. You should think about adding some level of purposeful, insightful interactivity. Give users a reason to interact with the application. Make sure that your topic is engaging and makes users want to stick around longer. Don’t just give the user a button, give them a reason to click it.</p><p>Remember, you only make a first impression once. So the key selling point of the application should be displayed upfront and easily accessible. Don’t hide it somewhere in the depths of the code. Also remember to make applications intuitive, so that the users don’t get lost. Introducing some help or descriptions is always welcome.</p><h3 id="visuals---design-over-features">Visuals - Design over features</h3><p>Nothing makes a better impression than a beautiful app. And that first impression is so important. Having a set of bonus features is great if you have the time, but don’t sacrifice your UI. Humans are visual creatures and users will unconsciously rate your application in the first few seconds based on how visually appealing it is. 
If you’re facing a time crunch, it might even be worth sacrificing some extra features to polish the overall style of the application.</p><p><img src="images/image1.gif" alt="GIF of moving shark in Shark Attack app" title="Shark Attack Gif"></p><h2 id="process">Process</h2><h3 id="all-in-time---a-step-by-step-approach">All in time - a step by step approach</h3><p>To keep your work organized, write down tasks in a planner. I used a GitHub project connected to my repository. Whenever I had an idea for a single feature of the application, I wrote it down and prioritized it. Prioritization is the key. You can’t afford to waste time on some minor, nice-to-have feature while your app is still not doing its main job.</p><p>Another crucial rule you may know from big development projects: separate work into small, independent tasks. This allows you to implement some features while being stuck in another branch. Also, it helps to optimize the workday (e.g., dealing with quick and simple tasks when you don’t have much time).</p><p><img src="images/image2.jpg" alt="GitHub Project Kanban Board" title="GitHub Project Kanban Board"></p><h3 id="pocs---function-before-beauty">PoCs - Function before beauty</h3><p>To avoid wasting too much time, start your work with a Proof of Concept – the minimal application that checks if key technical challenges can be overcome. If it doesn’t work, you still have enough time to turn around and change the scope. If the plan is impossible or too complicated to implement, you need to cut your losses.</p><p>Here you can see an early PoC of Shark Attack. I needed to test if ‘key capturing’ was possible and would work in the way I imagined.</p><p><img src="images/image3.gif" alt="Proof of Concept with an envelope moving along a grid" title="Shark Attack App Proof of Concept"></p><p>Even when working on a proof of concept, I recommend keeping good programming practices in place. 
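A key-capturing PoC of that kind can be sketched in a few lines of Shiny plus a snippet of JavaScript. This is a minimal sketch, not the actual Shark Attack code; the input name `pressed_key` is made up for the example:

```r
library(shiny)

ui <- fluidPage(
  # Forward browser keydown events to Shiny as the input 'pressed_key'.
  tags$script(HTML(
    "document.addEventListener('keydown', function(e) {
       Shiny.setInputValue('pressed_key', e.key, {priority: 'event'});
     });"
  )),
  textOutput("last_key")
)

server <- function(input, output, session) {
  output$last_key <- renderText({
    req(input$pressed_key)
    paste("Last key pressed:", input$pressed_key)
  })
}

shinyApp(ui, server)
```

If an app this small already responds to keystrokes, the core technical risk is retired and the rest is iteration.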
If your application grows bigger, you won’t have time to pay down tech debt, and it’ll slow your progress in the long run.</p><h3 id="test-early-and-often">Test early and often</h3><p>Remember, you know your app inside and out. But the users and judges will enter it for the first time. And fresh eyes tend to uncover unintuitive, redundant, or missing features. They might use the app in an altogether different way than you expect. Don’t wait till the last minute to show your progress to a fresh audience. You need to have enough time to implement the feedback that you receive.</p><p>And I don’t mean some sophisticated user feedback collection sessions - just show the app’s current state to your friends or family and ask for honest opinions. If you’re unfamiliar with user tests, you can learn more with this <a href="https://appsilon.com/user-tests-build-better-shiny-apps-with-effective-user-testing/" target = "_blank">tutorial on User Tests</a>.</p><h2 id="summary">Summary</h2><p>Most importantly, to build an award-winning app - have fun! It’s an exciting challenge, with no external pressure or firm restrictions. The RStudio Shiny Contest is an opportunity to engage in creative learning and gives you a platform to share your hard work.</p><p>There are a lot of strong submissions and contestants are getting more creative with each year. Frankly, it’s an honor just to be a part of the contest and share with the community. Don’t get me wrong, it’s nice to win. But seeing what others are doing and how far Shiny can go is the ultimate prize for the Shiny community.</p><p>If you’re interested in achieving more with Shiny, be sure to sign up for the <a href="https://appsilon.com/appsilon-shiny-conference-2022-announcement/" target = "_blank">Appsilon Shiny Conference (27-29 April 2022)</a>. Join the R Shiny community as we welcome keynote talks and guest speakers who specialize in R and Shiny - including Diane Beldame of ThinkR, Eric Nantz of the R-Podcast, and many more! 
Discover recent advancements in R Shiny technology, network and collaborate with the global Shiny dev community, and share your open source packages and Shiny apps at the <a href="https://appsilon.com/2022-appsilon-shiny-conference/" target = "_blank">Appsilon Shiny Conference</a>.</p></description></item><item><title>Three Videos to Supercharge Your R Skills</title><link>https://www.rstudio.com/blog/three-videos-to-supercharge-your-r-skills/</link><pubDate>Tue, 18 Jan 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/three-videos-to-supercharge-your-r-skills/</guid><description><p>Image by Rachael Dempsey</p><p>We have many great videos on the RStudio YouTube channel. You can watch folks discuss <a href="https://youtu.be/E887K1au5ug" target = "_blank">their data science stories and processes</a>, learn about <a href="https://youtu.be/5gqksthQ0cM" target = "_blank">packages</a> and <a href="https://youtu.be/_XNKSEQTo30" target = "_blank">products</a>, and hear <a href="https://youtu.be/IkqItgPSPro" target = "_blank">inspiring examples from others in the community</a>.</p><p>With hundreds of videos, you have hours of content to watch! To get you started, we want to highlight three videos about R tools that will take your R skills to another level, whether for work, fun, or general learning. Happy watching!</p><p><strong>1. Business Reports with R Markdown by Christophe Dervieux</strong></p><p>Want to learn how to style your R Markdown reports to tailor them for your organization? Christophe Dervieux shows us various options to customize our HTML output:</p><ul><li><strong>Using R Markdown’s built-in support for Sass:</strong> R Markdown now has built-in support for Sass through the <a href="https://rstudio.github.io/sass/" target = "_blank">sass</a> R package. Sass is a CSS extension language that helps create CSS rules in more flexible ways than with plain CSS. You can directly supply .scss files to <code>css</code> arguments of your html document. 
It’s easier to work with CSS rules and variables to apply your style guidelines.</li><li><strong>Going further with bslib:</strong> The <a href="https://rstudio.github.io/bslib/" target = "_blank">bslib</a> package provides tools for customizing Bootstrap themes directly from R. R Markdown’s built-in support allows you to customize your documents without CSS or Sass. You can start with pre-packaged themes or create a custom look.</li></ul><p>Christophe also discusses how to develop templates for Office outputs, create PDFs from HTML using the <a href="https://github.com/rstudio/pagedown" target = "_blank">pagedown</a> R package, and more. Watch Christophe’s full talk here:</p><center><iframe width="560" height="315" src="https://www.youtube.com/embed/gQ9he9dyfGs" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center><p><strong>2. Exploratory Data Analysis by Priyanka Gagneja</strong></p><p>Exploratory data analysis, or EDA, is a crucial step of every data science project. However, it can be repetitive and time-consuming.</p><p>Priyanka Gagneja shares packages that have helped her automate her EDA process. She walks through:</p><ul><li><strong>Starting your EDA, feature engineering, and data reporting</strong> using <a href="https://cran.r-project.org/web/packages/DataExplorer/vignettes/dataexplorer-intro.html" target = "_blank">DataExplorer</a>.</li><li><strong>Checking data quality</strong> with <a href="https://cran.r-project.org/web/packages/dataReporter/index.html" target = "_blank">DataReporter</a>.</li><li><strong>Calculating summary statistics</strong> using <a href="https://cran.r-project.org/web/packages/skimr/vignettes/skimr.html" target = "_blank">skimr</a>.</li></ul><p>Once you have completed your EDA, then it’s time to start looking at patterns and relationships. 
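As a quick taste of the summary-statistics step mentioned above, skimr boils a data frame down to a single call (sketched here with a built-in dataset):

```r
library(skimr)

# One call produces per-variable summaries -- missingness, quantiles,
# and compact inline histograms -- grouped by column type.
skim(iris)
```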
Priyanka demonstrates:</p><ul><li><strong>Creating pipeable pivot tables</strong> using <a href="https://rdrr.io/cran/rpivotTable/man/rpivotTable.html" target = "_blank">rPivotTable</a>.</li><li><strong>Visualizing your data</strong> with <a href="https://cran.r-project.org/web/packages/esquisse/vignettes/get-started.html" target = "_blank">esquisse</a>.</li><li><strong>Generating plots and reports</strong> with <a href="https://cran.r-project.org/web/packages/chronicle/index.html" target = "_blank">chronicle</a>.</li></ul><p>Learn about these packages and more in her talk:</p><center><iframe width="560" height="315" src="https://www.youtube.com/embed/qvFeaPRgOns" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center><p><strong>3. Scaling Spreadsheets with R by Nathan Stephens</strong></p><p>We use Excel spreadsheets for the same reasons we use R: to wrangle, transform, analyze, visualize, and communicate our data. However, it can be difficult to work in Excel if your data is large or your analysis is complicated. R, however, is an attractive alternative:</p><ul><li><strong>Increasing file size:</strong> While you start hitting the limits of Excel when you move into gigabytes of data, R can handle those files very easily.</li><li><strong>Handling complexity:</strong> Things quickly become complicated in Excel with Visual Basic scripts, various pivot tables, multiple spreadsheets, and so on. These can often be replaced by a simple R script.</li></ul><p>When would you start to think about using R rather than Excel? 
Nathan Stephens shows us that boundary where you might agree that R is the right tool for the job.</p><center><iframe width="560" height="315" src="https://www.youtube.com/embed/yb_mBJz3iSc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center><h2 id="learn-more">Learn More</h2><p>We hope that you can apply these skills to your future projects. There is a lot more to enjoy!</p><ul><li>Interested in watching these webinars live? Join the <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target = "_blank">RStudio Enterprise Community Meetup</a> to learn more from industry leaders on data science best practices and the capabilities of open-source software.</li><li>Want to keep watching? Check out <a href="https://www.youtube.com/channel/UC3xfbCMLCw1Hh4dWop3XtHg" target = "_blank">RStudio’s YouTube page</a> for more content from RStudio staff and others in the data science community.</li></ul></description></item><item><title>Sharing Secure and Scalable Shiny Apps on RStudio Connect</title><link>https://www.rstudio.com/blog/sharing-shiny-apps-on-rstudio-connect/</link><pubDate>Thu, 13 Jan 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/sharing-shiny-apps-on-rstudio-connect/</guid><description><p><a href="https://shiny.rstudio.com/" target = "_blank">Shiny</a> is an R package that makes it easy for data scientists to build interactive web apps straight from R, without having to learn any new languages or frameworks. 
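To make that concrete, a complete Shiny app really is just a UI definition plus a server function. A minimal sketch (not the Palmer Penguins dashboard discussed in this post):

```r
library(shiny)

ui <- fluidPage(
  sliderInput("n", "Number of observations:", min = 10, max = 500, value = 100),
  plotOutput("hist")
)

server <- function(input, output, session) {
  # Re-renders automatically whenever the slider input changes.
  output$hist <- renderPlot(hist(rnorm(input$n)))
}

shinyApp(ui, server)
```

Beyond the IDE's publish button, deployment can typically also be scripted with the rsconnect package (e.g., `rsconnect::deployApp()` pointed at the app directory).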
Once you’ve created a Shiny app, it’s time to share your great work!</p><p>However, how can you ensure that:</p><ul><li>Only authorized users can view the dashboard?</li><li>Your data is secure once uploaded?</li><li>Users don’t experience a lag time when exploring the data?</li></ul><p><a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a> is an enterprise-level product from RStudio to securely host and share Shiny applications, as well as other data science products. You can publish Shiny apps with a push of a button from the RStudio IDE:</p><p><img src="images/gif1.gif" alt="Push-button publishing from the RStudio IDE" title="Deploying a Shiny dashboard on RStudio Connect"></p><center><i><caption>Push-button publishing from the RStudio IDE</caption></i></center><p>RStudio Connect allows you to share your Shiny app knowing your viewers have the right permissions, your data is secure, and your app will display your results quickly and efficiently.</p><p>Let&rsquo;s explore RStudio Connect with a dashboard using the <a href="https://allisonhorst.github.io/palmerpenguins/" target = "_blank">Palmer Penguins</a> data. Check out the dashboard on <a href="https://colorado.rstudio.com/rsc/palmer-penguins-shiny-example/" target = "_blank">colorado.rstudio.com</a>.</p><h2 id="control-viewership-with-user-authentication">Control Viewership With User Authentication</h2><p>Once you’ve uploaded a Shiny app to Connect, you have the option to:</p><ul><li><strong>Allow anybody to access, no login required:</strong> Any visitor to RStudio Connect will be able to view the content. This includes anonymous users who are not authenticated with the system.</li><li><strong>Allow anybody with a login to access:</strong> Those with RStudio Connect accounts are permitted to view this content.</li><li><strong>Allow only select users or groups to access:</strong> Specific users (or groups of users) are allowed to view this content. 
Other users will not have access.</li></ul><p><img src="images/gif2.gif" alt="Changing the sharing settings on a Shiny app on Connect from anybody to only those with an account" title="Changing the viewership access on a Shiny app on RStudio Connect"></p><center><i><caption>Changing sharing settings on RStudio Connect</caption></i></center><p>This gives you flexibility over how uploaded content is shared. You can also assign roles to individuals or groups of team members. Viewers have read-only access while Collaborators can modify the content.</p><p><img src="images/gif3.gif" alt="Adding a collaborator to a Shiny app" title="Adding a collaborator to a Shiny app"></p><center><i><caption>Adding a collaborator to a Shiny app</caption></i></center><p>RStudio Connect also allows you to enable single sign-on so that your users won’t need to remember an additional login. Check out how we sign into our Shiny apps at RStudio!</p><p><img src="images/gif4.gif" alt="Single sign-on from Google to access our dashboard" title="Single sign-on with RStudio Connect"></p><center><i><caption>Using single sign-on to access the Shiny dashboard</caption></i></center><h2 id="deploy-shiny-applications-securely-on-premises-or-in-your-vpc">Deploy Shiny Applications Securely, on Premises or in Your VPC</h2><p>While SaaS solutions also allow you to share Shiny apps, the data and application are accessible on someone else’s cloud. This is an issue if your data is private or sensitive.</p><p>With RStudio Connect, the application is kept secure within your organization’s own environment. 
You can prevent unauthorized access and ensure that you comply with all data security and privacy guidelines.</p><p>If your environment requires offline package access, we recommend you use a local repository option such as <a href="https://www.rstudio.com/products/package-manager/" target = "_blank">RStudio Package Manager</a>.</p><h2 id="maintain-fast-response-times-with-performance-tuning">Maintain Fast Response Times With Performance Tuning</h2><p>RStudio Connect is built to scale content. After you publish an app, you can change RStudio Connect’s runtime settings to help tune and scale your Shiny applications. By selecting values that support your expected use, you can balance the trade-off between app responsiveness and memory consumption/load time:</p><p><img src="images/gif5.gif" alt="Increasing the max processes on the Shiny dashboard" title="Performance tuning on RStudio Connect"></p><center><i><caption>Increasing the maximum processes on the Shiny dashboard</caption></i></center><p>Your users can rely on your app to show data in a timely manner, and you can adjust the configurations based on use and traffic.</p><h2 id="learn-more">Learn More</h2><p>With RStudio Connect, you can share your Shiny apps in a secure and scalable way.</p><ul><li>Find out the options for <a href="https://shiny.rstudio.com/tutorial/written-tutorial/lesson7/" target = "_blank">sharing your Shiny apps</a>.<ul><li><a href="https://support.rstudio.com/hc/en-us/articles/217240558-What-is-the-difference-between-RStudio-Connect-and-shinyapps-io-" target = "_blank">What is the difference between RStudio Connect and shinyapps.io?</a></li></ul></li></ul><p>Want to see an example of secure, scalable Shiny apps?</p><ul><li>Read how the California Department of Public Health <a href="https://www.rstudio.com/blog/using-shiny-in-production-to-monitor-covid-19/" target = "_blank">created a Shiny app to quickly share data with millions of Californians</a>.</li><li>Explore lessons learned from 
the Georgia Institute of Technology on <a href="https://www.rstudio.com/blog/how-do-you-use-shiny-to-communicate-to-8-million-people/" target = "_blank">building the COVID-19 Event Risk Assessment Planning Tool</a>.</li><li>Join us for an <strong><a href="https://www.addevent.com/event/rV10488631" target = "_blank">R in Public Sector Meetup on January 27th: Organizational &amp; Technical Aspects of Shiny in Production</a>.</strong> Speakers from the Dutch National Institute for Public Health and the Environment will discuss the development of a Shiny app that provides actionable information to 300 health professionals.</li></ul><p>In addition to Shiny, RStudio Connect publishes other types of applications, including Dash, Streamlit, and others. Find out more about <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a>.</p></description></item><item><title>Using RStudio on Amazon SageMaker: Questions from the Community</title><link>https://www.rstudio.com/blog/using-rstudio-on-amazon-sagemaker-faq/</link><pubDate>Mon, 10 Jan 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/using-rstudio-on-amazon-sagemaker-faq/</guid><description><p>On December 6th, 2021, the <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target = "_blank">RStudio Enterprise Community Meetup</a> hosted an event, Using RStudio on Amazon SageMaker.</p><p>In the meetup, James Blair covered:</p><ol><li>Setting up <a href="https://www.rstudio.com/products/workbench/" target = "_blank">RStudio Workbench</a> on SageMaker</li><li>A use case for training a model and deploying to SageMaker</li><li>Using that model from a Shiny app that was then published to <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a></li></ol><p>The presentation was followed by questions and answers (Q&amp;A) with both the RStudio and Amazon SageMaker teams. 
The Q&amp;A below includes both questions that were answered during the event and those that we were unable to answer live.</p><p>We’ve grouped the questions below into buckets for organizational purposes:</p><ul><li><a href="#general-rstudiosagemaker-questions">General RStudio/SageMaker questions</a></li><li><a href="#sagemaker-specific-questions">SageMaker-specific questions</a></li><li><a href="#infrastructure-questions">Infrastructure questions</a></li><li><a href="#questions-about-connecting-to-data-and-other-tools">Questions about connecting to data and other tools</a></li><li><a href="#questions-specific-to-modeling">Questions specific to modeling</a></li><li><a href="#questions-about-environment-management">Questions about environment management</a></li><li><a href="#pricing-and-license-management-questions">Pricing and license management questions</a></li><li><a href="#questions-about-shiny-and-rstudio-connect">Questions about Shiny and RStudio Connect</a></li><li><a href="#helpful-links">Helpful Links</a></li></ul><p>Watch a snippet from the Meetup below:</p><script src="https://fast.wistia.com/embed/medias/vhsguqnuzs.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_vhsguqnuzs videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/vhsguqnuzs/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><div align="right">Full meetup recording on <a 
href="https://www.youtube.com/watch?v=fmgSVRWgXDg" target="_blank"> YouTube</a>.</div><p>If you have questions about setting this up at your own organization, you can schedule time to talk with the RStudio team on <a href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio" target = "_blank">our booking system</a>.</p><h2 id="meetup-qa">Meetup Q&amp;A:</h2><h3 id="general-rstudiosagemaker-questions">General RStudio/SageMaker questions</h3><p><strong>What are the differences between RStudio on SageMaker and the R kernel in SageMaker Studio?</strong></p><p><strong>James Blair:</strong> Functionality-wise, there&rsquo;s not going to be much of a difference in terms of what you can or can’t do or what types of workflows are or aren&rsquo;t supported. It comes down to just a preference. If you’re comfortable using the R kernel in the native Jupyter-style environment, then you can interact with your environment that way.</p><p>One of the main motivating factors behind this integration has been that many R users have come to SageMaker and said we’d love to have RStudio because it&rsquo;s familiar. We feel comfortable and have fewer barriers to entry when we’re using that environment. So, it’s just an environment preference.</p><h3 id="sagemaker-specific-questions">SageMaker-specific questions</h3><p><strong>Will RStudio be available in the new Amazon SageMaker Studio Lab?</strong></p><p>As it stands today, Amazon SageMaker Studio Lab only comes with JupyterLab support, but please let your SageMaker representative know if that would be something that would be valuable to you. We will take into consideration where the service goes in the future.</p><p><strong>Can you also send jobs to AWS via the future package from SageMaker?</strong></p><p>At this time, there isn’t any tooling that makes it possible to send jobs to AWS via the future package from within SageMaker.</p><p><strong>Is interoperability the same with SageMaker Studio as well? 
If yes, are there any further benefits when it comes to editor selection that we can opt for?</strong></p><p>Interoperability between SageMaker Studio and RStudio is achieved through the attached EFS storage. Each user has their own EFS storage that is common and persistent across IDEs. This means that a user may start working on a project using R on RStudio to analyze the data and then switch to SageMaker Studio to use TensorFlow in Python to build their deep learning model. Additionally, interoperability from within RStudio can be accomplished using the <a href="https://rstudio.github.io/reticulate/" target = "_blank">reticulate</a> R package to interact with both R and Python.</p><p><strong>How does SageMaker help put machine learning (ML) solutions designed with R into production? Is there any specific ML package that SageMaker can deal with (e.g., tidymodels, caret)?</strong></p><p><strong>James Blair:</strong> When you interact directly with SageMaker, most of the functionality is going to come through that SageMaker Python SDK. Like in the demo, we used reticulate, brought that module in, and then we applied methods and functions from that module to manipulate, train, and deploy a model.</p><p>One of the underlying components of that is the fact that the whole process — model training, tuning, and deployment — is actually happening in Python while being managed from R with reticulate. The container that we ran in the demo was running XGBoost and interacting with that via Python.</p><p>There is support in SageMaker for being able to define custom containers and images for model training and deployment, where you could define ways to interact with R processes. To my understanding today, there’s not a lot of native R tooling there yet. 
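</p><p>The reticulate pattern described above can be sketched in a few lines of R. This is an illustrative, untested sketch rather than code from the demo; it assumes the SageMaker Python SDK is installed in the Python environment that reticulate discovers and that the session runs with a SageMaker execution role:</p><pre><code># Import the SageMaker Python SDK into R via reticulate
library(reticulate)
sagemaker &lt;- import(&quot;sagemaker&quot;)

# A SageMaker session and the execution role attached to it
session &lt;- sagemaker$Session()
role &lt;- sagemaker$get_execution_role()

# Training, tuning, and deployment then proceed through the
# SDK classes (e.g., estimators), managed from R via reticulate.</code></pre><p>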
However, this is something I’d love to see change and am hopeful will change over time as this partnership continues to grow.</p><p><strong>Michael Hsieh:</strong> I’d like to add that SageMaker’s managed training and processing, which you access through the SageMaker SDK, are built on container technology. That means you can bring in a container with your package dependencies and run it on SageMaker’s managed infrastructure. You can put your tidymodels or caret libraries into a container and start your training. You can also host models built natively within RStudio on SageMaker and handle all the ML Ops using the SageMaker SDK from within your R environment. We do have a couple of examples showing how to do so:</p><ul><li><a href="https://github.com/aws-samples/reinvent2020-aim404-productionize-r-using-amazon-sagemaker" target = "_blank">Training a forecasting model with fable in RStudio and SageMaker</a></li><li><a href="https://github.com/aws-samples/amazon-sagemaker-statistical-simulation-rstudio" target = "_blank">Statistical simulation in RStudio and SageMaker</a></li></ul><p>Note that the material above uses open-source RStudio. The use of SageMaker SDK with reticulate remains identical in RStudio on Amazon SageMaker.</p><h3 id="infrastructure-questions">Infrastructure questions</h3><p><strong>How large is the automatically mounted home directory? Is it a user-specific or shared directory?</strong></p><p>The home directory is an EFS storage that can scale automatically to petabyte sizes. You can read about the service here: <a href="https://aws.amazon.com/efs/faq/" target = "_blank">Amazon EFS FAQs</a>.</p><p>The home directory is hosted on Amazon EFS and you can access it as you do with other EFS storage. 
For instance, if you mount this EFS to an EC2 instance with the appropriate permissions you would be able to access the files in your home directory.</p><p>When in SageMaker Studio/RStudio, each user has their own EFS storage, isolated from other users.</p><p><strong>Is RStudio Workbench fully on SageMaker? Or is it using SageMaker as a Launcher backend, which would still require a frontend Workbench server?</strong></p><p><strong>James Blair:</strong> When you set up your SageMaker domain to support RStudio, one of the things that you do is identify a persistent instance that will host RStudio. There does need to be some persistent environment that runs and launches those sessions into SageMaker and allows SageMaker to manage the resources behind those sessions.</p><p>That environment gets automatically provisioned for you when you set this up inside your SageMaker domain and you can decide what instance type you want. I think t3.medium is the default; it doesn’t have to be very big because there’s no computation happening there.</p><p><strong>Georgios Schinas:</strong> Yes, it’s a t3.medium instance type and is also provided for free as part of the RStudio offering on SageMaker.</p><p><strong>Any information on when we can expect integration with CloudFormation/CDK? We strive for a strict IaC solution in our company.</strong></p><p>We currently cannot estimate when this feature will be released but we are aware that this is something that our customers want. Please reach out to your AWS representative to let them know of this request and they will make sure to update you with news and details on this.</p><p><strong>Will there be fluid integration with Redshift? I&rsquo;m thinking IAM authentication that mainly uses the user profile role.</strong></p><p>The user on RStudio is already authenticated with the role attached to the user and can access the rest of the AWS resources based on the role’s IAM permissions. 
The easiest way to access these would probably be by using reticulate and boto3.</p><p>The default Docker image on SageMaker comes with the <a href="https://docs.rstudio.com/pro-drivers/" target = "_blank">RStudio Professional Drivers</a> pre-installed, so you can also connect to Redshift via ODBC.</p><p><strong>Regarding the home directory in RStudio Workbench: Is the home directory accessible via Amazon/AWS block storage (EBS) as well?</strong></p><p>The home directory is an EFS volume, so not an EBS volume. However, you can still mount the EFS volume to some other instance to access the data if this is required, similar to how you would do with an EBS.</p><h3 id="questions-about-connecting-to-data-and-other-tools">Questions about connecting to data and other tools</h3><p><strong>Can I use the Athena ODBC driver to run Athena queries from RStudio on SageMaker?</strong></p><p>The default Docker image on SageMaker comes with the RStudio Professional Drivers pre-installed, which includes an ODBC driver for Athena. You can use this driver in connection with the <a href="https://db.rstudio.com/r-packages/dbi/" target = "_blank">DBI</a> and <a href="https://db.rstudio.com/r-packages/odbc/" target = "_blank">odbc</a> R packages to run Athena queries from RStudio on SageMaker.</p><p><strong>Can I share files with others from my RStudio project in SageMaker?</strong></p><p>Currently, this functionality is not available from RStudio. Instead, you may want to consider using Git for code sharing or S3 for large file sharing and collaboration.</p><p><strong>Can I use sparklyr with EMR from RStudio on SageMaker?</strong></p><p>Currently, this functionality is not natively available. 
However, you can manually install all the necessary packages and configurations to use this library and connect to an EMR from RStudio, similar to how you would currently do on a self-managed RStudio instance.</p><p><strong>Would I connect to Snowflake and query within RStudio Workbench, launching from SageMaker in the same way as when it is not on SageMaker?</strong></p><p>The Snowflake ODBC driver is included in the default Docker image for RStudio on SageMaker. You can use this ODBC driver in combination with the DBI and odbc R packages to connect to and query Snowflake from RStudio on SageMaker.</p><p><img src="images/image1.png" alt="RStudio options for connecting to existing data sources, including Snowflake"></p><h3 id="questions-specific-to-modeling">Questions specific to modeling</h3><p><strong>Would anything different need to be done to parallelize model training with tidymodels, compared to what would be done outside of SageMaker?</strong></p><p>You can continue using the tooling of your choice (including tidymodels) when running RStudio on SageMaker, similar to how you were doing before. Currently, there is no functionality in tidymodels to take advantage of SageMaker-specific resources. However, within SageMaker you can define your custom training container to run training jobs in parallel.</p><p><strong>Does any connector exist to interface with the tidymodels ecosystem?</strong></p><p>Currently, there is no native connector available, but you can continue using the tooling of your choice when running RStudio on SageMaker, similar to how you were doing before. This means that you can use tidymodels within RStudio on SageMaker, but all execution will be local to the session running in SageMaker.</p><h3 id="questions-about-environment-management">Questions about environment management</h3><p><strong>What are the expectations for environment management? Managed by AWS? RStudio? Me? 
Will I have access to the same environment two years from now to maintain reproducibility?</strong></p><p><strong>James Blair:</strong> I think there may be two sides to this question because &ldquo;environment&rdquo; can mean a couple of things.</p><p>For packages, your local R environment is within your control, so if you want, you can install packages and those will go into your home directory. That home directory is persistent, so different sessions that start will have access to the same files and packages that you were working with previously. There&rsquo;s automatic support for this, so if you wanted to create a very specific collection of packages for a very specific project and have a very specific set of versions, that&rsquo;s supported through tools like <a href="https://rstudio.github.io/renv/articles/renv.html" target = "_blank">renv</a>. (You can also configure this to work with <a href="https://www.rstudio.com/products/package-manager/" target = "_blank">RStudio Package Manager</a>, which mitigates issues by managing the package repository centrally for your organization so that data scientists can install packages quickly and securely, and ensure project reproducibility and repeatability.)</p><p>In combination with what you&rsquo;re getting out of SageMaker, your local environment is totally within your control and certainly, you can set it up so that you can come back to that environment two years down the road and continue working from where you left off.</p><p><strong>Amazon SageMaker team:</strong> You can install packages as you would normally do now. Further to this, the packages are installed on your &ldquo;personal&rdquo; EFS storage (persistent storage) and will be available in future/other sessions as well.</p><p><strong>James Blair:</strong> The other side of this is the administrative environment, like the server and settings for RStudio Workbench, which falls under SageMaker’s control. 
If you say, I’m going to set up and manage RStudio Workbench through SageMaker, part of what you’re doing is saying, I&rsquo;m going to let SageMaker handle the RStudio administrative tasks for me, which can be a huge benefit because you don’t have to worry about that anymore.</p><p>This also means that I’m letting SageMaker handle that for me, so if there are questions or concerns or things that I want to adjust there, or features I want to be enabled, then that becomes a conversation to have with SageMaker.</p><p>We at RStudio are excited to see how this helps a lot of organizations that we work with who are very Windows-centric. Our software is Linux-based and sometimes that is a hurdle to overcome for organizations.</p><p>Regarding the system libraries component, when you’re running your session on SageMaker, you have root access to that container, so when you go into the Terminal tab of RStudio you can modify that container as you need. If there&rsquo;s a system dependency that you need, you should be able to get in and add it to that container at that time. That&rsquo;s not persistent though, so if the session terminates and that container goes away, you&rsquo;re going to have to reinstall that dependency at that point. 
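</p><p>For example, an ad-hoc install of a missing system library from the R console (or directly in the Terminal tab) might look like the sketch below. The library name is only illustrative, and as noted above the change is lost when the session&rsquo;s container terminates:</p><pre><code># Ad-hoc system dependency install inside the session container
# (root access is available; the change is not persistent)
system(&quot;sudo apt-get update &amp;&amp; sudo apt-get install -y libxml2-dev&quot;)</code></pre><p>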
The other side is this custom image functionality that we anticipate coming to SageMaker in the near future, which would allow you to identify at the Docker level what system dependencies you need available in your environment and that&rsquo;s how you would manage those persistently.</p><p><strong>Amazon SageMaker team:</strong> This integration is designed to be a fully managed environment; the expectations for management are outlined in the AWS documentation here: <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/rstudio.html" target = "_blank">RStudio on Amazon SageMaker</a>.</p><p><strong>Can you limit or control which package versions or installations are available via Package Manager?</strong></p><p><strong>Amazon SageMaker team:</strong> You can install packages as you would normally do now. Further to this, the packages are installed on your &ldquo;personal&rdquo; EFS storage (persistent storage) and will be available in future/other sessions as well.</p><p>When you set up your SageMaker domain with RStudio, you are prompted to input a URL for RStudio Package Manager. 
If you are maintaining your own installation of RStudio Package Manager, you can provide the URL during this setup process and it will be set as the default package repository for all RStudio users.</p><h3 id="pricing-and-license-management-questions">Pricing and license management questions</h3><p><strong>Once I have a license of Workbench, will it automatically show in SageMaker or does it require any migration steps?</strong></p><p>To use RStudio on SageMaker, you need to bring your RStudio license to License Manager by completing the following steps:</p><ol><li><p>If you don’t have an RStudio Workbench license, you can purchase one on RStudio Pricing or by contacting RStudio Sales (<a href="mailto:sales@rstudio.com">sales@rstudio.com</a>).</p></li><li><p>To add RStudio on SageMaker to your existing RStudio Workbench Enterprise purchase, or to convert an RStudio Workbench Standard license to SageMaker, contact your RStudio Sales Representative (<a href="mailto:sales@rstudio.com">sales@rstudio.com</a>), who will send you the appropriate electronic order form.</p></li><li><p>RStudio grants your RStudio Workbench licenses to your AWS accounts through License Manager in the U.S. East (N. Virginia) Region. You can expect the license grant process to be completed within 3 business days after you share the AWS account IDs with RStudio.</p></li><li><p>After the license is granted, you receive notification from RStudio with instructions to log in to the License Manager console&rsquo;s Granted Licenses page in the U.S. East (N. 
Virginia) Region to accept the license grant.</p></li><li><p>If this is your first time using License Manager, choose &ldquo;Create customer managed license&rdquo;.</p></li></ol><p><img src="images/image2.jpg" alt="AWS License Manager page with the ability to create customer managed license"></p><p>You can find additional details here: <a href="https://aws.amazon.com/blogs/machine-learning/get-started-with-rstudio-on-amazon-sagemaker/" target = "_blank">Get started with RStudio on Amazon SageMaker</a>.</p><p><strong>Do we have to pay for two instances, one for RStudio Workbench, and one for RSession (at minimum) per user?</strong></p><p>The RStudio Workbench server will by default be running on an ml.t3.medium instance, which will be provided for free. If this instance type is changed, then charges will be incurred based on the instance type. RSessions are also charged based on instance type/size.</p><p><strong>Do you need a paid RStudio Workbench license to use RStudio in SageMaker?</strong></p><p>Yes. RStudio on Amazon SageMaker is a paid product and requires that each user is appropriately licensed. Amazon SageMaker does not sell or provide RStudio licenses. RStudio on Amazon SageMaker requires a separate product license from RStudio. To elaborate on that a bit further, you could have either RStudio Workbench Standard — which is licensed for one server activation — or Enterprise, which gives you unlimited server activations. If you have RStudio Workbench Standard, then SageMaker becomes your one installation of RStudio. If you have RStudio Workbench Enterprise, you could be running RStudio on SageMaker, as well as your own separate on-prem solution. You have a lot of flexibility at that point.</p><p>For current customers of RStudio Workbench Enterprise, licenses are issued at no additional cost. 
Read more here: <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/rstudio-license.html" target = "_blank">RStudio license</a>.</p><p><strong>A basic RStudio Workbench license includes 5 seats; does each user then have the ability to spin up their own separate instance with separate memory?</strong></p><p>Yes, here is an overview of functionality from the data scientist&rsquo;s perspective: <a href="https://aws.amazon.com/blogs/aws/announcing-fully-managed-rstudio-on-amazon-sagemaker-for-data-scientists/" target = "_blank">Announcing Fully Managed RStudio on Amazon SageMaker for Data Scientists</a>.</p><h3 id="questions-about-shiny-and-rstudio-connect">Questions about Shiny and RStudio Connect</h3><p><strong>Is this a good solution for hosting and developing Shiny applications?</strong></p><p><strong>James Blair:</strong> For developing, definitely. It gives you a valid environment where you can develop <a href="https://shiny.rstudio.com/" target = "_blank">Shiny</a> applications in a way that natively reaches out to and interacts with your AWS resources. It becomes really easy to read data in from S3 and to interact with other AWS resources without needing to worry about things like credential management.</p><p>On the deployment side, the story is a little bit different. Today, <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a> — which would be the preferred method for deploying Shiny applications — isn’t something that’s offered through SageMaker. 
In the meantime, you would develop inside of RStudio Workbench on SageMaker and then publish that application to another location — whether that’s RStudio Connect that you have separately installed and managed or <a href="https://www.shinyapps.io/" target = "_blank">shinyapps.io</a> or whichever option you use.</p><p>RStudio provides a number of options for hosting Shiny applications: shinyapps.io, <a href="https://www.rstudio.com/products/shiny/download-server/" target = "_blank">Shiny Server open source</a>, and RStudio Connect. It is our hope to see an integration for RStudio Connect in SageMaker in the future — much like the RStudio IDE integration you&rsquo;re seeing today.</p><p>To host RStudio Connect and RStudio Package Manager in AWS, you can learn more in this post: <a href="https://aws.amazon.com/blogs/machine-learning/host-rstudio-connect-and-package-manager-for-ml-development-in-rstudio-on-amazon-sagemaker/" target = "_blank">Host RStudio Connect and Package Manager for ML development in RStudio on Amazon SageMaker</a>.</p><p><strong>How do I add an authentication layer in a Shiny application using SageMaker and RStudio Workbench? Is it the same as with RStudio Connect?</strong></p><p>RStudio Workbench doesn&rsquo;t have Shiny application hosting features like authentication; it provides development tools. RStudio Connect provides authentication for hosted apps as well as other types of data science content.</p><p><strong>Is making Shiny/API hosting in AWS EC2 automated/semi-automated a future possibility?</strong></p><p>Support for a fully managed RStudio Connect/Package Manager might be coming later in 2022. 
However, you can today leverage the template explained in the following blog post to self-host and manage an RStudio Connect server and Package Manager on your AWS account.</p><p>Read more here: <a href="https://aws.amazon.com/blogs/machine-learning/host-rstudio-connect-and-package-manager-for-ml-development-in-rstudio-on-amazon-sagemaker/" target = "_blank">Host RStudio Connect and Package Manager for ML development in RStudio on Amazon SageMaker</a>.</p><p><strong>Is there a timeline for the AWS-managed RStudio Connect?</strong></p><p>We would like to bring the full suite of RStudio Team, including Connect, to SageMaker. This is a topic we are actively exploring with the SageMaker team and will share more news on this as we can. In the meantime, if this is something that would be useful for you, our product management team would love to hear from you and understand your requirements. In the short term, this blog post explains how to configure RStudio on Amazon SageMaker with your own instance of Connect: <a href="https://aws.amazon.com/blogs/machine-learning/host-rstudio-connect-and-package-manager-for-ml-development-in-rstudio-on-amazon-sagemaker/" target = "_blank">Host RStudio Connect and Package Manager for ML development in RStudio on Amazon SageMaker</a>.</p><h2 id="helpful-links">Helpful Links</h2><ul><li><a href="https://www.youtube.com/watch?v=fmgSVRWgXDg" target = "_blank">James Blair - Using RStudio on Amazon SageMaker Meetup Recording</a><ul><li><a href="https://github.com/blairj09-talks/rstudio-sagemaker-webinar" target = "_blank">Meetup Slides</a></li></ul></li><li><a href="https://aws.amazon.com/blogs/machine-learning/get-started-with-rstudio-on-amazon-sagemaker/" target = "_blank">Getting Started with RStudio on Amazon SageMaker (Amazon Blog)</a></li><li><a href="https://aws.amazon.com/blogs/aws/announcing-fully-managed-rstudio-on-amazon-sagemaker-for-data-scientists/" target = "_blank">Announcing Fully Managed RStudio on Amazon SageMaker for Data 
Scientists (Amazon Blog)</a></li><li><a href="https://www.rstudio.com/blog/announcing-rstudio-on-amazon-sagemaker/" target = "_blank">Announcing RStudio on Amazon SageMaker (RStudio Blog)</a></li></ul><p><strong><center><font size = "5"><a href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio" target = "_blank">Schedule a call with RStudio to talk about RStudio on Amazon SageMaker.</a></font></center></strong></p></description></item><item><title>R Markdown Lesser-Known Tips & Tricks #2: Cleaning Up Your Code</title><link>https://www.rstudio.com/blog/r-markdown-tips-tricks-2-cleaning-up-your-code/</link><pubDate>Thu, 06 Jan 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-markdown-tips-tricks-2-cleaning-up-your-code/</guid><description><p>The R Markdown file format combines R programming and the markdown language to create dynamic, reproducible documents. Authors use R Markdown for reports, slide shows, blogs, books &mdash; even <a href="https://bookdown.org/yihui/rmarkdown/shiny-start.html" target = "_blank">Shiny apps</a>! But, how can users ensure that their R Markdown documents are easy to write, read, and maintain?</p><p>We asked our Twitter friends <a href="https://twitter.com/_bcullen/status/1333878752741191680" target = "_blank">the tips and tricks that they have picked up</a> along their R Markdown journey. There was a flurry of insightful responses ranging from organizing files to working with YAML, and we wanted to highlight some of the responses so that you can apply them to your work, as well.</p><p>This is the second of a four-part series to help you on your path to R Markdown success, where we discuss <strong>cleaning up your code using R Markdown features.</strong></p><p><strong>1. Set up your document</strong></p><p>Set yourself up for success! At the top of your document, create a code chunk that lists all the packages that you will use. 
This makes sure you’re ready for the rest of your workflow.</p><pre><code>```{r}
library(ggplot2)
library(flexdashboard)
library(tensorflow)
```</code></pre><p><strong>2. Name code chunks for easier navigation</strong></p><p>Label code chunks to remember what each chunk is for. For example, you could call the chunk with your packages <code>setup</code>:</p><pre><code>```{r setup}
library(ggplot2)
library(flexdashboard)
library(tensorflow)
```</code></pre><p>In the RStudio IDE, you can navigate to specific chunks. Open the code chunk navigation window, located in the bottom left-hand side of the Source pane:</p><center><img src="img/img2.png" alt="Code chunk navigation window in the bottom left-hand side of the RStudio code pane" width="70%"></center><p><strong>WARNING!</strong> You can label code chunks with hyphens but we don&rsquo;t recommend using underscores or spaces. Think &ldquo;kebabs, not snakes&rdquo;.</p><p><img src="img/img1.png" alt="Allison Horst drawing of a snake screaming snake case to a camel. There&rsquo;s another snake with snake_case written on it, a kebab with kebab case written on it, and a camel with camel case written on it."></p><center><caption>Artwork by @allison_horst, <a href="https://www.allisonhorst.com/" target = "_blank">https://www.allisonhorst.com/</a></caption></center><p>For help with naming code chunks, check out the <a href="https://itsalocke.com/namer/articles/namer" target = "_blank">namer package</a>.</p><p><strong>3. Add chunk options to customize your code chunks</strong></p><p>When you knit your file, you may want your code chunks to look a certain way. 
You can add <a href="https://yihui.org/knitr/options/" target = "_blank">chunk options</a> to customize the components of your code chunks.</p><p>For example, if you want the code to show up in your knitted file without any messages or warnings, you can write <code>message = FALSE</code> and <code>warning = FALSE</code> in the chunk header:</p><pre><code>```{r setup, message = FALSE, warning = FALSE}
library(car)
```</code></pre><p>If you do not want to see the code, the messages, or the warnings, but still want the code evaluated, you can use <code>include = FALSE</code>:</p><pre><code>```{r setup, include = FALSE}
source(&quot;my-setup.R&quot;, local = knitr::knit_global())
```</code></pre><p>There are many useful chunk options. For a full list, see the <a href="https://yihui.org/knitr/options/" target = "_blank">knitr package documentation</a>.</p><p><strong>4. Use global options for your chunks</strong></p><p>Did you know that you can use the same settings across all the code chunks in your R Markdown document? Set your global R options with <code>options()</code> and your knitr global chunk options with <code>knitr::opts_chunk$set()</code>.</p><p>For example, if you know that you want all your numbers to have three digits and all your figures to have a width of 8, use the code below:</p><pre><code>```{r setup, include = FALSE}
# set up global R options
options(digits = 3)

# set up knitr global chunk options
knitr::opts_chunk$set(fig.width = 8)
```</code></pre><p><strong>5. Control where figures are saved with <code>fig.path</code></strong></p><p>By default, your R Markdown document&rsquo;s figures are saved to a <code>_files</code> folder, but they are deleted after the output document is generated. 
R Markdown offers <a href="https://bookdown.org/yihui/rmarkdown-cookbook/keep-files.html" target = "_blank">a few options</a> for keeping your plot files.</p><p>However, you may want to save your figures in a specific spot rather than writing <code>ggplot2::ggsave(path = path/to/folder)</code> after each one. To do so, use <code>fig.path</code> to designate the folder. Conveniently, knitr will create the <code>fig.path</code> folder for you if it does not already exist in your working directory. Make sure to include the trailing slash!</p><pre><code>```{r cars-plot, fig.path = &quot;figures/&quot;}
ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point()
```</code></pre><p>If you want to save all your figures in the same folder, specify your <code>fig.path</code> in the global options mentioned above. With <code>fig.path</code>, you can also add a prefix to your plot file names.</p><pre><code>```{r setup}
knitr::opts_chunk$set(
  fig.width = 8,
  fig.path = &quot;figures/prefix-&quot;
)
```</code></pre><p><strong>6. Write chunk options inside code chunks with <code>#|</code></strong></p><p>We&rsquo;ve seen a number of useful chunk options so far, but that only scratches the surface of <a href="https://yihui.org/knitr/options/#chunk-options" target = "_blank">what knitr provides</a>. If you have many chunk options to specify, your code can quickly become difficult to read.</p><pre><code>```{r cars-plot, echo = FALSE, message = FALSE, fig.width = 6, fig.height = 6, fig.path = &quot;figures/&quot;, fig.cap = &quot;This is a long caption that fits better inside of a code chunk.&quot;, fig.alt = &quot;This is a long description that conveys the meaning of the visual.&quot;}
ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point()
```</code></pre><p>As of <a href="https://github.com/yihui/knitr/releases/tag/v1.35" target = "_blank">knitr v1.35</a>, you can now write chunk options <em>inside</em> of a code chunk using a special comment symbol, <code>#|</code>. 
You can write your chunk options over as many lines as you like.</p><p>Notice how much easier these chunk options are to read:</p><pre><code>```{r cars-plot}
#| echo = FALSE,
#| message = FALSE,
#| fig.width = 6, fig.height = 6,
#| fig.path = &quot;figures/&quot;,
#| fig.cap = &quot;This is a long caption that fits better inside of a code chunk&quot;,
#| fig.alt = &quot;This is a long description that conveys the meaning of the visual.&quot;
ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point()
```</code></pre><p><strong>7. Split up your code into child documents</strong></p><p>Feel that your document is getting too long? To make your code more modular, you can split your code into multiple documents and knit them together to create a single output. Provide paths to one or more &ldquo;child&rdquo; documents using the chunk option <code>child</code>:</p><pre><code>```{r, child = c(&quot;first_file.Rmd&quot;, &quot;second_file.Rmd&quot;)}
```</code></pre><p>You can also use child documents conditionally. Say you wanted to publish one report if Brazil wins the World Cup and another if Germany wins. Create a variable for the winner:</p><pre><code>```{r, include = FALSE}
winner &lt;- &quot;brazil&quot;
```</code></pre><p>Once you know the game result, change <code>winner</code> and the correct report will knit:</p><pre><code>```{r, child = if (winner == 'brazil') 'brazil.Rmd' else 'germany.Rmd'}
```</code></pre><p><strong>8. Clean up messy code with the styler addin</strong></p><p>When it comes to writing code, keeping consistent style and formatting is important so that others (your future self included) are better able to read and understand your code. But it&rsquo;s easy to let this slip, especially after working on the same code for many hours. 
You can use the <a href="https://styler.r-lib.org/" target = "_blank">styler</a> addin to easily re-format messy code in a snap:</p><p><img src="img/img3.gif" alt="Cleaning up messy code with the Styler addin"></p><p>With styler, you have the option to re-format a specific section of code, an entire .Rmd document, or even a whole directory of .R and/or .Rmd files. You can also build this directly into your knitting process using the chunk option <code>tidy</code>:</p><pre><code>```{r setup}
knitr::opts_chunk$set(tidy = &quot;styler&quot;)
```</code></pre><p>This ensures that every time you knit, any code that is shown in the rendered document will be properly formatted.</p><h2 id="continue-the-journey">Continue the Journey</h2><p>We hope that these tips &amp; tricks help you clean up your code in R Markdown. Thank you to everybody who shared advice, workflows, and features!</p><p>Stay tuned for the third post in this four-part series: <strong>Time-savers &amp; trouble-shooters.</strong></p><h2 id="resources">Resources</h2><ul><li>For more on R Markdown in data science, read <a href="https://r4ds.had.co.nz/r-markdown.html" target = "_blank">R for Data Science</a> and the <a href="https://bookdown.org/yihui/rmarkdown-cookbook/" target = "_blank">R Markdown Cookbook</a>.</li><li>Check out the <a href="https://raw.githubusercontent.com/rstudio/cheatsheets/main/rmarkdown.pdf" target = "_blank">Dynamic documents with rmarkdown cheatsheet</a> for quick reference on chunk options and more.</li><li>Need R Markdown in production? 
Publish and schedule reports, enable self-service customization, and distribute beautiful emails using <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a>.</li></ul></description></item><item><title>RStudio Community Monthly Events - January 2022</title><link>https://www.rstudio.com/blog/rstudio-community-monthly-events-january-2022/</link><pubDate>Wed, 05 Jan 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-community-monthly-events-january-2022/</guid><description><sup>Photo by <a href="https://unsplash.com/@nickmorrison?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Nick Morrison</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p>Welcome to RStudio Community Monthly Events Roundup, where we update you on upcoming events happening at RStudio this month. Missed the great talks and presentations from last month? Find them listed under <a href="#icymi-december-2021-events">ICYMI: December 2021 Events</a>.</p><p>You can <a href="https://www.addevent.com/calendar/wT379734" target = "_blank">subscribe</a> to the Community Events Calendar so that new events will automatically appear on your calendar. Please note that by subscribing, all of the events in the calendar will appear on your own calendar. 
If you wish to add individual events instead, please use the links below.</p><p>We can’t wait to see you there!</p><h2 id="save-the-date">Save the Date</h2><ul><li>January 6, 2022 at 12 ET: Data Science Hangout with Ian Anderson, Director of Hockey Analytics at the Philadelphia Flyers <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>January 11, 2022 at 4 ET: An inclusive solution for teaching and learning R during the COVID pandemic | Presented by Patricia Menéndez <a href="https://www.addevent.com/event/gC11155394" target = "_blank">(add to calendar)</a></li><li>January 13, 2022 at 12 ET: Data Science Hangout with Prabha Thanikasalam, Senior Director, Analytics and Supply Chain Solutions at Flex <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>January 18, 2022 at 12 ET: R in Supply Chain: Intro to Supply Chain Design &amp; Forecasting Demand with R | Presented by Laura Rose &amp; Ralph Asher <a href="https://www.addevent.com/event/by10435991" target = "_blank">(add to calendar)</a></li><li>January 20, 2022 at 12 ET: Data Science Hangout with Asmae Toumi, Director of Analytics at PursueCare <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>January 25, 2022 at 12 ET: Building a Blog with R | Presented by Isabella Velásquez <a href="https://www.addevent.com/event/mS11158422" target = "_blank">(add to calendar)</a></li><li>January 27, 2022 at 10 ET: R in Public Sector: Organizational &amp; Technical Aspects of Shiny in Production | Presented by Sjoerd Wieringa &amp; Job Spijker <a href="https://www.addevent.com/event/rV10488631" target = "_blank">(add to calendar)</a></li><li>January 27, 2022 at 12 ET: Data Science Hangout with Theresa Ward, Sr Manager - New Glenn Production at Blue Origin <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>February 1, 2022 at 12 ET: Capacity 
Planning for Microsoft Azure Data Centers | Using R &amp; RStudio Connect | Presented by Paul Chang <a href="https://www.addevent.com/event/Py10759092/" target = "_blank">(add to calendar)</a></li><li>February 3, 2022 at 12 ET: Data Science Hangout with Katie Schafer, Manager of Advanced Analytics at Beam Dental <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>February 7, 2022 at 12 ET: RStudio Finance Meetup | The shift to data: Industry trends from banks to hedge funds to federal agencies | Presented by Dmitri Adler &amp; Merav Yuravlivker <a href="https://www.addevent.com/event/Rc10480836" target = "_blank">(add to calendar)</a></li><li>February 10, 2022 at 12 ET: Data Science Hangout with Matthias Mueller, Sr Director of Marketing Analytics at Campaign Monitor <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li></ul><h2 id="recurring-events">Recurring Events</h2><h3 id="data-science-hangout">Data Science Hangout</h3><p>Last year, we started an informal &ldquo;data science hangout&rdquo; at RStudio for the data science community to connect and chat about some of the more human-centric questions around data science leadership. These happen every Thursday at 12 ET with a different leader featured each week.</p><p>The conversation is all audience-based and there&rsquo;s no registration needed, so you can jump on whenever it fits your schedule. Add the weekly hangouts to your calendar on <a href="https://www.addevent.com/event/Qv9211919" target = "_blank" rel = "noopener noreferrer">AddEvent</a>.</p><h3 id="rstudio-enterprise-community-meetups">RStudio Enterprise Community Meetups</h3><p>We also host industry meetups for teams to share the work they are doing within their organizations, teach lessons learned, and network with others. 
Join the group on <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target = "_blank" rel = "noopener noreferrer">Meetup</a>.</p><h2 id="icymi-december-2021-events">ICYMI: December 2021 Events</h2><ul><li>December 2, 2021 at 12 ET: <a href="https://youtu.be/jyzmOBe4qKY" target = "_blank">Data Science Hangout with Jarus Singh</a>, Director of Quantitative Analytics at Pandora</li><li>December 6, 2021 at 12 ET: <a href="https://youtu.be/fmgSVRWgXDg" target = "_blank"> Using RStudio on Amazon SageMaker</a> | Presented by James Blair</li><li>December 7, 2021 at 12 ET: <a href="https://youtu.be/vIiQJY5V__E" target = "_blank">R en la Administración Pública &amp; informes de técnicas psicométricas con R Markdown</a> | Presented by Daniela Garcia &amp; Julieta Nieva</li><li>December 9, 2021 at 12 ET: <a href="https://youtu.be/G1NThC90ZF8" target = "_blank">Data Science Hangout with Aliyah Wakil</a>, Epidemiology Team Lead at TX Department of State Health Services</li><li>December 9, 2021 at 11 ET: <a href="https://youtu.be/6nz_N_xA3I8" target = "_blank">Cut down on the grunt work and deliver insights more effectively with RStudio Connect, R Markdown, and Jupyter</a> | Presented by Tom Mock</li><li>December 9, 2021 at 2 ET: <a href="https://youtu.be/-kDO_Y8SctU" target = "_blank">Leveraging the Cloud for Analytics Instruction at Scale: Challenges and Opportunities</a> | Presented by Dr. 
Brian Anderson</li><li>December 14, 2021 at 12 ET: <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/events/281675939/" target = "_blank">Power Calculations in R: How much data is enough?</a> | Panel meetup with Ethan Brown, Jianmei Wang, and Richard Webster</li><li>December 16, 2021 at 12 ET: <a href="https://youtu.be/rCrKIioEO_Q" target = "_blank">Data Science Hangout with Ryan Garnett</a>, Manager Data Management Insights &amp; Analytics at Green Shield Canada</li></ul><h2 id="call-for-speakers">Call for Speakers</h2><p>If you’re interested in sharing your work at a Meetup (or just starting to consider it for a future date down the road!), <a href="https://forms.gle/EtXMpSoTfhpGopiS8" target = "_blank" rel = "noopener noreferrer">please fill out the speaker submission form</a>. We’re always looking for a diverse set of speakers — if you are a member of a group that is underrepresented in data science, including people of color, those with disabilities, elders/older adults, LGBTQ folks, and women/minority genders, we highly encourage you to submit!</p></description></item><item><title>2021 at RStudio: A Year in Review</title><link>https://www.rstudio.com/blog/2021-a-year-in-review/</link><pubDate>Tue, 04 Jan 2022 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2021-a-year-in-review/</guid><description><p>Photo by <a href="https://unsplash.com/@eyestetix?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Eyestetix Studio</a> on <a href="https://unsplash.com/s/photos/2022?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></p><p>Happy New Year! As we’re settling back into our routines, we wanted to share some highlights of the work done across the company last year. These are just a few of our teams’ great accomplishments in supporting the R and Python data science community — there are too many to list! 
— and we encourage you to take a look at the links to find out more.</p><h2 id="rstudio-and-interoperability">RStudio and Interoperability</h2><ul><li>We continued <a href="https://www.rstudio.com/solutions/interoperability/" target = "_blank">our commitment to interoperability</a> with support for <a href="https://www.rstudio.com/solutions/bi-and-data-science/" target = "_blank">business intelligence (BI) tools</a>.</li><li>We expanded our <a href="https://solutions.rstudio.com/python/" target = "_blank">support for Python</a>.</li><li>We also have more options for how we support our customers&rsquo; journey in the <a href="https://www.rstudio.com/solutions/rstudio-in-the-cloud/" target = "_blank">cloud</a>, including announcing <a href="https://www.rstudio.com/sagemaker/" target = "_blank">RStudio on Amazon SageMaker</a>.</li></ul><h2 id="rstudio-and-the-community">RStudio and the Community</h2><ul><li>RStudio&rsquo;s 2021 virtual conference, rstudio::global, ran around the clock with exciting events for its 17,000 attendees. 
See the talks on <a href="https://www.rstudio.com/resources/rstudioglobal-2021/" target = "_blank">the RStudio website</a>.<ul><li>We offered <a href="https://www.rstudio.com/blog/diversity-scholarships/" target = "_blank">70 diversity scholarships</a> to attendees from underrepresented groups.</li></ul></li><li>We mentored 20 BIPOC students as part of <a href="https://www.rstudio.com/blog/the-inspire-u2-program/" target = "_blank">our mentorship program</a>.</li><li>We saw so many amazing examples from the community during the 3rd Annual <a href="https://www.rstudio.com/blog/winners-of-the-3rd-annual-shiny-contest/" target = "_blank">Shiny contest</a>, 2nd Annual <a href="https://www.rstudio.com/blog/winners-of-the-2021-table-contest/" target = "_blank">Table contest</a>, and 1st Annual <a href="https://rviews.rstudio.com/2021/08/04/r-views-blog-contest/" target = "_blank">RViews Call for Documentation</a>.</li><li>We launched the <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target = "_blank">RStudio Enterprise Community Meetups</a> and Data Science Hangouts for open and friendly opportunities to meet with others from the community. Join our next <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">Data Science Hangout</a> on Thursday, January 6th!</li></ul><h2 id="rstudio-open-source-packages">RStudio Open-Source Packages</h2><p>There are a lot of exciting releases for open-source packages. 
Below, we link to relevant places where you can find out more.</p><ul><li>Check out the tidyverse team’s announcements of package releases and updates on <a href="https://www.tidyverse.org/categories/package/" target = "_blank">the tidyverse website</a>.<ul><li>The tidymodels team wrote a 2021 round-up on <a href="https://www.tidyverse.org/blog/2021/12/tidymodels-2021-q4/" target = "_blank">the tidyverse blog</a>.</li></ul></li><li>See updates from the mlverse on the <a href="https://blogs.rstudio.com/ai/" target = "_blank">RStudio AI blog</a>.</li><li>Find R Markdown updates and general open source topics in the &lsquo;<a href="https://www.rstudio.com/blog/categories/open-source/" target = "_blank">Open Source</a>&rsquo; category of the RStudio blog.</li><li>Read about the publication of several packages and their updates on CRAN, just a few of which include:<ul><li><a href="https://www.rstudio.com/blog/2021-spring-rmd-news/" target = "_blank">rmarkdown</a>;</li><li><a href="https://www.rstudio.com/blog/plumber-v1-1-0/" target = "_blank">plumber</a>;</li><li><a href="https://www.rstudio.com/blog/knitr-fig-alt/" target = "_blank">knitr</a>;</li><li><a href="https://blogs.rstudio.com/ai/posts/2021-11-18-keras-updates/" target = "_blank">keras</a>;</li><li><a href="https://www.rstudio.com/blog/pins-1-0-0/" target = "_blank">pins</a>;</li><li><a href="https://www.rstudio.com/blog/shiny-1-6-0/" target = "_blank">Shiny</a>; and</li><li><a href="https://blogs.rstudio.com/ai/posts/2021-07-06-sparklyr-1.7.0-released/" target = "_blank">sparklyr</a>.</li></ul></li></ul><h2 id="rstudio-products">RStudio Products</h2><p>On the product side, we significantly enhanced the capabilities of both our open source and commercial products.</p><ul><li>We moved to <a href="https://www.rstudio.com/blog/calendar-versioning-for-commercial-rstudio-products/" target = "_blank">calendar versioning</a> for our commercial products.</li><li>In addition, we have a new and improved <a 
href="https://solutions.rstudio.com/" target = "_blank">Solutions website</a> to help you get the most from the products you’ve purchased.</li></ul><h3 id="rstudio-cloud">RStudio Cloud</h3><p>RStudio Cloud has made it easier to share, do, teach, and learn data science. See the <a href="https://rstudio.cloud/learn/whats-new" target = "_blank">What’s New with RStudio Cloud</a> page for more information, such as the addition of more disk space and hours in RStudio Cloud plans.</p><h3 id="rstudio-connect">RStudio Connect</h3><p>Users of RStudio Connect have more tools to create great experiences for their stakeholders. We summarize the updates in <a href="https://www.rstudio.com/blog/rstudio-connect-2021-year-in-review/" target = "_blank">the RStudio Connect Year in Review blog post</a>, including:</p><ul><li>Connectwidgets for organizing, distributing, and finding projects;</li><li>Updates for Python developers; and</li><li>Support for <a href="https://www.rstudio.com/blog/rstudio-connect-2021-09-0-tableau-analytics-extensions/" target = "_blank">Tableau Analytics Extensions</a>.</li></ul><h3 id="rstudio-ide">RStudio IDE</h3><p>We released RStudio 1.4, with many exciting features for the IDE. 
Check out the <a href="https://www.rstudio.com/products/rstudio/release-notes/" target = "_blank">release notes</a> for the full list of new features, including:</p><ul><li>A visual markdown editor;</li><li>New Python capabilities; and</li><li>Support for rainbow parentheses.</li></ul><p><em>Watch a quick tour of RStudio 1.4 here:</em></p><script src="https://fast.wistia.com/embed/medias/7rhnqchmyu.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_7rhnqchmyu videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/7rhnqchmyu/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><h3 id="rstudio-workbench">RStudio Workbench</h3><p>In June, we renamed RStudio Server Pro to RStudio Workbench to reflect the product’s growing support for a wide range of different development environments. 
This year, we’ve added:</p><ul><li>VS Code as a fully supported development environment;</li><li>Multiple Python-based improvements; and</li><li>Additional R and RStudio-based improvements.</li></ul><p>More information is available in the <a href="https://www.rstudio.com/products/rstudio/release-notes/" target = "_blank">release notes</a>.</p><h3 id="rstudio-package-manager">RStudio Package Manager</h3><p>RStudio Package Manager has also seen a variety of upgrades, with a more versatile repository calendar, more flexibility in serving multiple binary package versions, and more options for configuring git sources. The <a href="https://docs.rstudio.com/rspm/news/#rstudio-package-manager-1221" target = "_blank">release notes</a> have more details on these features.</p><h2 id="keep-in-touch">Keep in Touch</h2><p>Thanks for taking a look at all the exciting work from 2021! Want to stay informed?</p><ul><li>Sign up on our <a href="https://www.rstudio.com/about/subscription-management/" target = "_blank">subscription page</a> to receive announcements on new blog posts, events, and more.</li><li>You can also follow us on <a href="https://www.linkedin.com/company/rstudio-pbc/" target = "_blank">LinkedIn</a>, <a href="https://twitter.com/rstudio" target = "_blank">Twitter</a>, and <a href="https://www.facebook.com/rstudiopbc/" target = "_blank">Facebook</a>.</li><li>If you’d like to learn more about any of the professional products, please drop a line to <a href="mailto:sales@rstudio.com">sales@rstudio.com</a> or schedule a time to chat with us <a href="https://rstudio.chilipiper.com/book/rst-demo" target = "_blank">using our booking system</a>.</li></ul><p>We’re excited for what 2022 has in store and are so grateful for our community. 
Thank you for all the ways you use data to improve lives and our knowledge of the world in such incredibly diverse ways.</p><center><iframe width="560" height="315" src="https://www.youtube.com/embed/gVsDXNvtah8" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center></description></item><item><title>RStudio Connect 2021 Year in Review</title><link>https://www.rstudio.com/blog/rstudio-connect-2021-year-in-review/</link><pubDate>Wed, 22 Dec 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-2021-year-in-review/</guid><description><h1 id="rstudio-connect-highlights-for-data-scientists">RStudio Connect Highlights for Data Scientists</h1><p>It&rsquo;s been a busy year for the RStudio Connect team. In case you missed it, here is a quick summary of the most interesting product highlights for <strong>RStudio Connect Publishers</strong>:</p><ul><li><a href="#content-curation">Content curation with the <code>connectwidgets</code> package</a></li><li><a href="#updates-for-python-developers">Updates for Python developers</a></li><li><a href="#updates-for-developers-working-with-tableau">Updates for data scientists who work with Tableau</a></li><li><a href="#upcoming-changes-preparing-for-2022">Upcoming changes: How to prepare for 2022</a></li></ul><h2 id="content-curation">Content Curation</h2><p>As you publish more data science artifacts (applications, documents, APIs, etc.) 
to RStudio Connect, the organization, distribution, and discovery of those projects can become more challenging.</p><h3 id="introducing-connectwidgetshttpsrstudiogithubioconnectwidgets">Introducing <a href="https://rstudio.github.io/connectwidgets/">connectwidgets</a></h3><p><em>This feature requires RStudio Connect version 1.9.0 or newer.</em></p><p>This year the RStudio Connect team produced an R package that can be used to query a Connect server for your existing content items, then organize them within <code>htmlwidget</code> components in an R Markdown document or Shiny application.</p><p><strong>Present your content</strong> in cards, grids, or tables:</p><p><img src="images/card-grid-view.png" alt="" title="Example of a Card and a Grid created with connectwidgets"></p><p>Card and grid components display metadata about each piece of content. Each card or grid item links to the &ldquo;open solo&rdquo; version of the associated content item on RStudio Connect.</p><p><strong>Filter content</strong> using <code>connectwidgets</code> helper functions and <code>dplyr</code> to produce the curated set of content you&rsquo;d like to display:</p><ul><li><code>by_tags()</code> Filters the data frame to only include content that has been tagged with the specified tag name(s).</li><li><code>by_owners()</code> Filters the data frame to only include content with the specified owner(s) by username.</li></ul><p><strong>Customization is up to you</strong> - <code>connectwidgets</code> components support styling via the <code>bslib</code> package.</p><p><img src="images/connectwidgets-themes.png" alt="" title="Customization examples with bootswatch themes"></p><p><strong>Learn more:</strong></p><ul><li>Visit the <a href="https://docs.rstudio.com/connect/user/curating-content/">RStudio Connect User Guide</a></li><li><a href="https://youtu.be/GBNzhIkObyE">Watch a webinar</a></li></ul><h3 id="content-access-requests">Content Access Requests</h3><p><em>This feature requires 
RStudio Connect version 1.8.8 or newer.</em></p><p>Since <code>connectwidgets</code> components are rendered with the same permissions you have on the RStudio Connect server, viewers of your curated presentation pages may encounter links to content they wouldn&rsquo;t otherwise have access to. If a viewer follows a link to a content item they don&rsquo;t have permission to visit, they will be directed to request access.</p><p><img src="images/access-request.png" alt="" title="Publisher content access request flow"><br>In this example, the user is a <strong>Publisher</strong>, so they can choose to request either <strong>Collaborator</strong> or <strong>Viewer</strong> permissions. This action triggers an email notification to the content managers who can confirm or deny the request.</p><p><strong>Note:</strong> Administrators can disable content access requests for an entire server with <a href="https://docs.rstudio.com/connect/admin/appendix/configuration/#Applications.PermissionRequest"><code>Applications.PermissionRequest</code></a>.</p><h3 id="send-visitors-to-a-custom-page">Send Visitors to a Custom Page</h3><p><em>This feature requires RStudio Connect version 2021.08.0 or newer.</em></p><p>If you&rsquo;ve created a showcase page that you&rsquo;d like to route all RStudio Connect visitors to see upon logging in, work with your server administrator to configure <a href="https://docs.rstudio.com/connect/admin/appendix/configuration/#Server.RootRedirect"><code>Server.RootRedirect</code></a>.</p><p><img src="images/root-redirect.png" alt="" title="Example options for Server.RootRedirect configuration"></p><p><code>Server.RootRedirect</code> is a configuration setting that can be used to divert users to a URL other than the standard RStudio Connect dashboard.</p><p>If your administrator customizes the <code>RootRedirect</code> URL, it will be important to notify publishers and other administrators about where they can access the content dashboard view of RStudio 
Connect. This URL can be customized with the <a href="https://docs.rstudio.com/connect/admin/appendix/configuration/#Server.DashboardPath"><code>Server.DashboardPath</code></a> setting. By default, the content dashboard is available at <code>/connect</code>.</p><p>Follow this <a href="https://docs.rstudio.com/how-to-guides/users/pro-tips/widgets/">How To Guide</a> to learn more.</p><p><strong>Note:</strong> If your organization would rather change the branding of RStudio Connect itself, there are new customization options available including logo, favicon, and platform display name. These can all be configured by a server administrator. <a href="https://docs.rstudio.com/connect/admin/appendix/branding/">Learn more here</a> or <a href="https://youtu.be/2cCOeC_bPxU">watch a video demo</a>.</p><h2 id="updates-for-python-developers">Updates for Python Developers</h2><p><em>These features require RStudio Connect version 2021.08.0 or newer.</em></p><h3 id="asgi-frameworks">ASGI Frameworks</h3><p>The list of supported Python content types has been growing over the last two years. In August, our team released an update to RStudio Connect which adds support for ASGI frameworks including FastAPI, Quart, Falcon, and Sanic.</p><p><strong>RStudio Connect Supported Python Content Types in 2021</strong></p><table><thead><tr><th>Content Type</th><th>Framework</th></tr></thead><tbody><tr><td>Documents &amp; Notebooks</td><td>Jupyter Notebooks</td></tr><tr><td>Interactive Applications</td><td>Dash, Streamlit, Bokeh</td></tr><tr><td>WSGI Frameworks</td><td>Flask</td></tr><tr><td>ASGI Frameworks</td><td>FastAPI, Quart, Falcon, Sanic</td></tr></tbody></table><p>FastAPI and other ASGI-compatible APIs can be deployed to RStudio Connect with the <a href="https://docs.rstudio.com/rsconnect-python/index.html">rsconnect-python</a> package. 
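As a rough sketch of that deployment step (the server URL, the <code>CONNECT_API_KEY</code> variable, and the project path below are placeholders, and the exact flags may vary across rsconnect-python releases), a FastAPI project can be published from the command line like so:

```bash
# Sketch only: deploy a local FastAPI project directory to RStudio Connect.
# https://connect.example.com and $CONNECT_API_KEY are placeholders for your
# own server address and a valid API key.
rsconnect deploy fastapi \
    --server https://connect.example.com \
    --api-key "$CONNECT_API_KEY" \
    ./my-fastapi-app/
```

The analogous <code>rsconnect deploy</code> subcommands cover the other supported Python content types, so the same workflow carries over from Dash, Streamlit, and Bokeh deployments. 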
To get started, follow the <a href="https://docs.rstudio.com/connect/user/publishing/#publishing-python-apis">same basic deployment steps</a> required from our other Python content types.</p><p>Read more: <a href="https://www.rstudio.com/blog/rstudio-connect-2021-08-python-updates/">Announcement blog post</a></p><h3 id="jupyter-notebooks-feature-hiding-code-cells">Jupyter Notebooks Feature: Hiding Code Cells</h3><p>Hiding input code cells can be useful when preparing notebooks for audiences where a &ldquo;cleaner&rdquo; or less code-heavy presentation would be more appreciated.</p><p><img src="images/input-shownvhidden.png" alt="" title="Comparison of an example notebook with code cells shown and hidden"></p><p>RStudio Connect now supports two options for hiding input code cells in Jupyter Notebooks:</p><ul><li><a href="https://docs.rstudio.com/connect/user/jupyter-notebook/#hide-all-input">Hide all input code cells</a></li><li><a href="https://docs.rstudio.com/connect/user/jupyter-notebook/#hide-tagged-input">Hide only selected input code cells</a></li></ul><p>Work with a server administrator to upgrade <a href="https://docs.rstudio.com/rsconnect-jupyter/upgrading/">rsconnect-jupyter</a> and <a href="https://docs.rstudio.com/rsconnect-python/#installation">rsconnect-python</a> so you can get access to the new publishing features.</p><h2 id="updates-for-developers-working-with-tableau">Updates for Developers Working with Tableau</h2><p><em>These features require RStudio Connect version 2021.09.0 or newer.</em></p><p>In September, RStudio Connect introduced support for <a href="https://help.tableau.com/current/pro/desktop/en-us/r_connection_manage.htm">Tableau Analytics Extensions</a>, our first external integration with a BI tool. 
Data Scientists can use Analytics Extensions backed by Plumber and FastAPI to replace arbitrary R and Python code execution in Tableau Workbooks.</p><ul><li><a href="https://youtu.be/t25Lbi5D6kg">Watch the Webinar</a></li></ul><p><strong>Why did we do this?</strong></p><p>In the past, when Tableau users wanted to leverage R scripts in their workbooks, they would use an open source tool called RServe to establish a connection with an execution server running an R session. The RServe solution was not developed or supported by RStudio, and this caused some confusion. Tableau users wanted RStudio to help solve various installation, environment management, configuration, and security challenges they encountered with RServe.</p><p>As an alternative to RServe, we&rsquo;ve invested in new open source packages for creating analytics extensions using Plumber (R) and FastAPI (Python). The APIs created with <a href="https://rstudio.github.io/plumbertableau/"><code>plumbertableau</code></a> and <a href="https://github.com/rstudio/fastapitableau"><code>fastapitableau</code></a> are easy to host on RStudio Connect, which is configured by default to host and serve them.</p><p><strong>What are Analytics Extensions?</strong></p><p><a href="https://help.tableau.com/current/pro/desktop/en-us/r_connection_manage.htm">Tableau Analytics Extensions</a> provide a way to create calculated fields in workbooks that can execute scripts outside of the Tableau environment. To use Analytics Extensions, you must configure an instance of Tableau Server, Tableau Online, or Tableau Desktop (<a href="https://docs.rstudio.com/rsc/integration/tableau/#tableau-setup">Instructions</a>).</p><p><strong>Why not just build a Shiny application?</strong></p><p>Do you already build interactive applications in Shiny or a Python framework? Great! We know many RStudio users don&rsquo;t have or use Tableau, and that&rsquo;s totally reasonable. We love when people choose Shiny over other alternatives. 
We also want to support data science teams who have access to both RStudio Pro products and Tableau. This integration is for the folks who need to bridge both worlds.</p><p><strong>How do you get started?</strong></p><p>In principle, extending Tableau should be as simple as directing a workbook to reach out to any existing web API, but Tableau Analytics Extensions require special handling to make valid requests and receive results. To simplify this process, we introduced two new open source libraries which add functionality to Plumber and FastAPI:</p><ul><li>For R: <a href="https://rstudio.github.io/plumbertableau/"><code>plumbertableau</code></a></li><li>For Python: <a href="https://github.com/rstudio/fastapitableau"><code>fastapitableau</code></a></li></ul><p>These libraries can be used to create as many extensions as you want to manage. Data Scientists can learn more in the <a href="https://docs.rstudio.com/connect/user/tableau/">RStudio Connect User Guide</a>. Server administrators should review the full <a href="https://docs.rstudio.com/rsc/integration/tableau/">integration and set up instructions</a> upon upgrade.</p><p><strong>Take a look at an example:</strong> <a href="https://github.com/sol-eng/tableau-examples/tree/main/superstore">Detect outliers and predict profit with the built-in Superstore Tableau dataset</a>.</p><h2 id="upcoming-changes-preparing-for-2022">Upcoming Changes: Preparing for 2022</h2><p>In early 2022, RStudio Connect will release an edition that removes support for the following:</p><ul><li><strong>Experimental v1 Server APIs</strong></li><li><strong>Python 2</strong></li></ul><p>To prepare for these changes, we recommend reviewing the <a href="https://docs.rstudio.com/connect/api/#overview--versioning-of-the-api">API Reference Documentation</a> and the Python 2 support <a href="https://www.rstudio.com/blog/rstudio-connect-2021-08-python-updates/">announcement post</a> from August 2021.</p><p><strong>Data Scientists</strong> should 
review the official porting guide and redeploy any mission critical content that currently relies on Python 2.</p><ul><li>Read the official <a href="https://docs.python.org/3/howto/pyporting.html">&ldquo;Porting Python 2 Code to Python 3&rdquo; guide</a> and the <a href="https://python3statement.org/practicalities/">Python 3 Statement Practicalities</a> for advice on how to sunset your Python 2 code.</li></ul><p>Notify your <strong>Administrators</strong> so they can determine whether Python 2 content exists on your RStudio Connect server today.</p><ul><li>Audit the complete list of content items on their server and which versions of R/Python they use by deploying <a href="https://github.com/sol-eng/rsc-audit-reports/blob/main/environment-audit/environment-audit-report.Rmd">this report</a>.</li></ul><p>If you have any questions or concerns about these upcoming changes, please contact your RStudio Customer Success representative.</p><h2 id="upgrade-to-rstudio-connect-2021120">Upgrade to RStudio Connect 2021.12.0</h2><p>The latest release of RStudio Connect is 2021.12.0. To perform an upgrade, a <strong>server administrator</strong> should download and run the installation script. The script installs a new version of Connect on top of the earlier one. Existing configuration settings are respected. 
Be sure to consult the <a href="http://docs.rstudio.com/connect/news">release notes</a> before beginning an upgrade to make note of any breaking changes introduced since your last installation.</p><pre><code># Download the installation script
curl -Lo rsc-installer.sh https://cdn.rstudio.com/connect/installer/installer-v1.9.5.sh
# Run the installation script
sudo -E bash ./rsc-installer.sh 2021.12.0</code></pre></description></item><item><title>Integrating Dynamic R and Python Models in Tableau Using plumbertableau</title><link>https://www.rstudio.com/blog/dynamic-r-and-python-models-in-tableau-using-plumbertableau/</link><pubDate>Mon, 20 Dec 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dynamic-r-and-python-models-in-tableau-using-plumbertableau/</guid><description><p>RStudio believes that you can attain greater business intelligence with interoperable tools that <a href="https://www.rstudio.com/solutions/interoperability/" target = "_blank">take full advantage of open-source data science</a>. Your organization may rely on Tableau for reporting purposes, but how can you ensure that you&rsquo;re using the full power of your data science team&rsquo;s R and Python models in your dashboards?</p><p>With the <a href="https://rstudio.github.io/plumbertableau/index.html" target = "_blank">plumbertableau</a> package (and its corresponding Python package, <a href="https://rstudio.github.io/fastapitableau/" target = "_blank">fastapitableau</a>), you can use functions or models created in R or Python from Tableau through an API. 
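</p><p>To give a feel for the shape of these extensions before the full walkthrough below, here is a minimal plumbertableau sketch. The title, route, argument, and function are illustrative placeholders; only the annotation syntax and the <code>tableau_extension</code> footer come from the package:</p>

```r
library(plumber)
library(plumbertableau)

#* @apiTitle String Utilities
#* @tableauArg text:character A string to transform
#* @tableauReturn character The uppercased string
#* @post /uppercase
function(text) {
  toupper(text)
}

# Required footer: converts the Plumber router into a
# Tableau-compatible analytics extension
#* @plumber
tableau_extension
```

<p>Saved as <code>plumber.R</code>, this file can be published like any other Plumber API.</p><p>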
These packages allow you to showcase cutting-edge data science results in your organization’s preferred dashboard tool.</p><p>While this post demonstrates R, anything possible with R and plumbertableau is also doable with Python and fastapitableau.</p><h2 id="foster-data-analytics-capabilities-with-plumbertableau">Foster Data Analytics Capabilities With plumbertableau</h2><p>With plumbertableau, you can fully develop your model with code-first data science. The package uses <a href="https://www.rplumber.io/" target = "_blank">plumber</a> to create an API directly from your code. Since your model is fully developed in your data science editor, it can use all the packages and complex calculations it needs.</p><p>Because your model is not constrained by Tableau&rsquo;s environment, you can draw on the full range of R&rsquo;s capabilities to extract the best data science results.</p><h2 id="improve-data-quality-with-apis-for-continuous-use">Improve Data Quality With APIs for Continuous Use</h2><p>Seamless integration between analytic platforms prevents issues like using outdated, inaccurate, or incomplete data. Rather than depending on a manual process, data scientists can depend on their data pipelines to ensure data integrity.</p><p>With plumbertableau, your tools are integrated through an API. The Tableau dashboard displays results without any intermediate manipulation like copying and pasting code or uploading datasets. You can work with confidence knowing your results are synchronized, accurate, and reproducible.</p><h2 id="increase-deliverability-by-streamlining-data-pipelines">Increase Deliverability by Streamlining Data Pipelines</h2><p>If your model has many dependencies or versioning requirements, it can be difficult to handle them outside of the development environment. 
Debugging is even more time-consuming when you need to work in separate environments to figure out what went wrong.</p><p>With <a href="https://connect.rstudioservices.com/connect/" target = "_blank">RStudio Connect</a>, you can publish plumbertableau extensions directly from the RStudio IDE. RStudio Connect automatically manages your API&rsquo;s dependent packages and files to recreate an environment closely mimicking your local development environment. And since all your code remains in R, you can use your usual data science techniques to efficiently resolve issues.</p><p>Read more on the <a href="https://www.rplumber.io/articles/hosting.html/" target = "_blank">Hosting</a> page of the plumber package.</p><h2 id="how-to-use-plumbertableau-xgboost-with-dynamic-model-output-example">How to Use plumbertableau: XGBoost with Dynamic Model Output Example</h2><p><img src="img/gif4.gif" alt="Showing predictive values in Tableau dashboard"></p><p>In this walkthrough, we will be using data from the <a href="https://data.seattle.gov/" target = "_blank">Seattle Open Data Portal</a> to predict the paid parking occupancy percentage in various areas around the city. We will run an XGBoost model in RStudio, create a plumbertableau extension to embed into Tableau, and visualize and interact with the model in a Tableau dashboard. The code is here for reproducibility purposes; however, it will <strong>require</strong> an RStudio Connect account to complete.</p><p>The plumbertableau and fastapitableau packages have wonderful documentation. Be sure to read them for more information on:</p><ul><li>The anatomy of the extensions</li><li>Details on setting up RStudio Connect and Tableau</li><li>Other examples to try out in your Tableau dashboards</li></ul><p><img src="img/img2.png" alt="Displaying dynamic model output in Tableau steps"></p><h3 id="1-build-the-model">1. Build the model</h3><p>First, we need to build a model. 
This walkthrough won’t cover how to create, tune, or validate a model. If you&rsquo;d like to learn more about models and machine learning, check out the <a href="https://www.tidymodels.org/" target = "_blank">tidymodels</a> website and Julia Silge&rsquo;s fantastic <a href="https://juliasilge.com/category/tidymodels/" target = "_blank">screencasts and tutorials</a>.</p><p><strong>Load Libraries</strong></p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(tidyverse)
<span style="color:#06287e">library</span>(RSocrata)
<span style="color:#06287e">library</span>(lubridate)
<span style="color:#06287e">library</span>(usemodels)
<span style="color:#06287e">library</span>(tidymodels)</code></pre></div><p><strong>Download and Clean Data</strong></p><p>The Seattle Open Data Portal uses <a href="https://www.tylertech.com/products/socrata" target = "_blank">Socrata</a>, a data management tool, for its APIs. 
We can use the <a href="https://cran.r-project.org/web/packages/RSocrata/index.html" target = "_blank">RSocrata</a> package to download the data.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">parking_data <span style="color:#666">&lt;-</span>RSocrata<span style="color:#666">::</span><span style="color:#06287e">read.socrata</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">https://data.seattle.gov/resource/rke9-rsvs.json?$where=sourceelementkey &lt;= 1020&#34;</span>)parking_id <span style="color:#666">&lt;-</span>parking_data <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(blockfacename, location.coordinates) <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(id <span style="color:#666">=</span> <span style="color:#06287e">cur_group_id</span>()) <span style="color:#666">%&gt;%</span><span style="color:#06287e">ungroup</span>()parking_clean <span style="color:#666">&lt;-</span>parking_id <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(<span style="color:#06287e">across</span>(<span style="color:#06287e">c</span>(parkingspacecount, paidoccupancy), as.numeric),occupancy_pct <span style="color:#666">=</span> paidoccupancy <span style="color:#666">/</span> parkingspacecount) <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(id <span style="color:#666">=</span> id,hour <span style="color:#666">=</span> <span style="color:#06287e">as.numeric</span>(<span style="color:#06287e">hour</span>(occupancydatetime)),month <span style="color:#666">=</span> <span style="color:#06287e">as.numeric</span>(<span style="color:#06287e">month</span>(occupancydatetime)),dow <span style="color:#666">=</span> <span style="color:#06287e">as.numeric</span>(<span style="color:#06287e">wday</span>(occupancydatetime)),date <span 
style="color:#666">=</span> <span style="color:#06287e">date</span>(occupancydatetime)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">summarize</span>(occupancy_pct <span style="color:#666">=</span> <span style="color:#06287e">mean</span>(occupancy_pct, na.rm <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">drop_na</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">ungroup</span>()</code></pre></div><p>We will also need information on the city blocks, so let&rsquo;s create that dataset.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">parking_information <span style="color:#666">&lt;-</span>parking_id <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(loc <span style="color:#666">=</span> location.coordinates) <span style="color:#666">%&gt;%</span><span style="color:#06287e">select</span>(id, blockfacename, loc) <span style="color:#666">%&gt;%</span><span style="color:#06287e">distinct</span>(id, blockfacename, loc) <span style="color:#666">%&gt;%</span><span style="color:#06287e">unnest_wider</span>(loc, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">loc1&#39;</span>, <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">loc2&#39;</span>))</code></pre></div><p><strong>Create Training Data</strong></p><p>Now, let&rsquo;s create the training set from our original data.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">parking_split <span style="color:#666">&lt;-</span>parking_clean <span style="color:#666">%&gt;%</span><span style="color:#06287e">arrange</span>(date) <span style="color:#666">%&gt;%</span><span 
style="color:#06287e">select</span>(<span style="color:#666">-</span>date) <span style="color:#666">%&gt;%</span><span style="color:#06287e">initial_time_split</span>(prop <span style="color:#666">=</span> <span style="color:#40a070">0.75</span>)</code></pre></div><p><strong>Train and Tune the Model</strong></p><p>Here, we train and tune the model. We select the model with the best RSME to use in our dashboard.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">xgboost_recipe <span style="color:#666">&lt;-</span><span style="color:#06287e">recipe</span>(formula <span style="color:#666">=</span> occupancy_pct <span style="color:#666">~</span> ., data <span style="color:#666">=</span> parking_clean) <span style="color:#666">%&gt;%</span><span style="color:#06287e">step_zv</span>(<span style="color:#06287e">all_predictors</span>()) <span style="color:#666">%&gt;%</span><span style="color:#06287e">prep</span>()xgboost_folds <span style="color:#666">&lt;-</span>recipes<span style="color:#666">::</span><span style="color:#06287e">bake</span>(xgboost_recipe,new_data <span style="color:#666">=</span> <span style="color:#06287e">training</span>(parking_split)) <span style="color:#666">%&gt;%</span>rsample<span style="color:#666">::</span><span style="color:#06287e">vfold_cv</span>(v <span style="color:#666">=</span> <span style="color:#40a070">5</span>)xgboost_model <span style="color:#666">&lt;-</span><span style="color:#06287e">boost_tree</span>(mode <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">regression&#34;</span>,trees <span style="color:#666">=</span> <span style="color:#40a070">1000</span>,min_n <span style="color:#666">=</span> <span style="color:#06287e">tune</span>(),tree_depth <span style="color:#666">=</span> <span style="color:#06287e">tune</span>(),learn_rate <span style="color:#666">=</span> <span 
style="color:#06287e">tune</span>(),loss_reduction <span style="color:#666">=</span> <span style="color:#06287e">tune</span>()) <span style="color:#666">%&gt;%</span><span style="color:#06287e">set_engine</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">xgboost&#34;</span>, objective <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">reg:squarederror&#34;</span>)xgboost_params <span style="color:#666">&lt;-</span><span style="color:#06287e">parameters</span>(<span style="color:#06287e">min_n</span>(),<span style="color:#06287e">tree_depth</span>(),<span style="color:#06287e">learn_rate</span>(),<span style="color:#06287e">loss_reduction</span>())xgboost_grid <span style="color:#666">&lt;-</span><span style="color:#06287e">grid_max_entropy</span>(xgboost_params,size <span style="color:#666">=</span> <span style="color:#40a070">5</span>)xgboost_wf <span style="color:#666">&lt;-</span>workflows<span style="color:#666">::</span><span style="color:#06287e">workflow</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">add_model</span>(xgboost_model) <span style="color:#666">%&gt;%</span><span style="color:#06287e">add_formula</span>(occupancy_pct <span style="color:#666">~</span> .)xgboost_tuned <span style="color:#666">&lt;-</span> tune<span style="color:#666">::</span><span style="color:#06287e">tune_grid</span>(object <span style="color:#666">=</span> xgboost_wf,resamples <span style="color:#666">=</span> xgboost_folds,grid <span style="color:#666">=</span> xgboost_grid,metrics <span style="color:#666">=</span> yardstick<span style="color:#666">::</span><span style="color:#06287e">metric_set</span>(rmse, rsq, mae),control <span style="color:#666">=</span> tune<span style="color:#666">::</span><span style="color:#06287e">control_grid</span>(verbose <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>))xgboost_best <span 
style="color:#666">&lt;-</span>xgboost_tuned <span style="color:#666">%&gt;%</span>tune<span style="color:#666">::</span><span style="color:#06287e">select_best</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rmse&#34;</span>)xgboost_final <span style="color:#666">&lt;-</span>xgboost_model <span style="color:#666">%&gt;%</span><span style="color:#06287e">finalize_model</span>(xgboost_best)</code></pre></div><p>We bundle the recipe and fitted model in an object so we can use it later.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">train_processed <span style="color:#666">&lt;-</span><span style="color:#06287e">bake</span>(xgboost_recipe, new_data <span style="color:#666">=</span> <span style="color:#06287e">training</span>(parking_split))prediction_fit <span style="color:#666">&lt;-</span>xgboost_final <span style="color:#666">%&gt;%</span><span style="color:#06287e">fit</span>(formula <span style="color:#666">=</span> occupancy_pct <span style="color:#666">~</span> .,data <span style="color:#666">=</span> train_processed)model_details <span style="color:#666">&lt;-</span> <span style="color:#06287e">list</span>(model <span style="color:#666">=</span> xgboost_final,recipe <span style="color:#666">=</span> xgboost_recipe,prediction_fit <span style="color:#666">=</span> prediction_fit)</code></pre></div><p><strong>Save Objects for the plumbertableau Extension</strong></p><p>We&rsquo;ll want to save our data and our model so that we can use them in the extension. 
If you have an RStudio Connect account, the <a href="https://pins.rstudio.com/" target = "_blank">pins</a> package is a great choice for saving these objects.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">rsc <span style="color:#666">&lt;-</span>pins<span style="color:#666">::</span><span style="color:#06287e">board_rsconnect</span>(server <span style="color:#666">=</span> <span style="color:#06287e">Sys.getenv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">CONNECT_SERVER&#34;</span>),key <span style="color:#666">=</span> <span style="color:#06287e">Sys.getenv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">CONNECT_API_KEY&#34;</span>))pins<span style="color:#666">::</span><span style="color:#06287e">pin_write</span>(board <span style="color:#666">=</span> rsc,x <span style="color:#666">=</span> model_details,name <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">seattle_parking_model&#34;</span>,description <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Seattle Occupancy Percentage XGBoost Model&#34;</span>,type <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rds&#34;</span>)pins<span style="color:#666">::</span><span style="color:#06287e">pin_write</span>(board <span style="color:#666">=</span> rsc,x <span style="color:#666">=</span> parking_information,name <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">seattle_parking_info&#34;</span>,description <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Seattle Parking Information&#34;</span>,type <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">rds&#34;</span>)</code></pre></div><h3 id="2-create-a-plumbertableau-extension">2. Create a plumbertableau Extension</h3><p>Next, we will use our model to create a plumbertableau extension. As noted previously, the plumbertableau extension is a Plumber API with some special annotations.</p><p>Create an R script called <code>plumber.R</code>. At the top, we list the libraries we&rsquo;ll need.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(plumber)
<span style="color:#06287e">library</span>(pins)
<span style="color:#06287e">library</span>(tibble)
<span style="color:#06287e">library</span>(xgboost)
<span style="color:#06287e">library</span>(lubridate)
<span style="color:#06287e">library</span>(dplyr)
<span style="color:#06287e">library</span>(tidyr)
<span style="color:#06287e">library</span>(tidymodels)
<span style="color:#06287e">library</span>(plumbertableau)</code></pre></div><p>We want to bring in our model details and our data. 
If you pinned your data, you&rsquo;ll change the name of the pin below.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">rsc <span style="color:#666">&lt;-</span>pins<span style="color:#666">::</span><span style="color:#06287e">board_rsconnect</span>(server <span style="color:#666">=</span> <span style="color:#06287e">Sys.getenv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">CONNECT_SERVER&#34;</span>),key <span style="color:#666">=</span> <span style="color:#06287e">Sys.getenv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">CONNECT_API_KEY&#34;</span>))xgboost_model <span style="color:#666">&lt;-</span>pins<span style="color:#666">::</span><span style="color:#06287e">pin_read</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">isabella.velasquez/seattle_parking_model&#34;</span>, board <span style="color:#666">=</span> rsc)</code></pre></div><p>Now, we add our <a href="https://www.rplumber.io/articles/annotations.htm" target = "_blank">annotations</a>. 
Note that we use plumbertableau annotations, which are slightly different from the ones from plumber.</p><ul><li>We use <code>tableauArg</code> rather than <code>params</code>.</li><li>We specify what is returned to Tableau with <code>tableauReturn</code>.</li><li>We must use <code>post</code> as the HTTP method for the endpoint that returns results.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#* @apiTitle Seattle Parking Occupancy Percentage Prediction API</span>
<span style="color:#60a0b0;font-style:italic">#* @apiDescription Return the predicted occupancy percentage at various Seattle locations</span>
<span style="color:#60a0b0;font-style:italic">#* @tableauArg block_id:integer numeric block ID</span>
<span style="color:#60a0b0;font-style:italic">#* @tableauArg ndays:integer number of days in the future for the prediction</span>
<span style="color:#60a0b0;font-style:italic">#* @tableauReturn [numeric] Predicted occupancy rate</span>
<span style="color:#60a0b0;font-style:italic">#* @post /pred</span></code></pre></div><p>Now, we create our function with the arguments <code>block_id</code> and <code>ndays</code>. These will have corresponding arguments in Tableau. 
The function will output our predicted occupancy percentage, which will be what we visualize and interact with in the dashboard.</p><p>This function takes the city block and number of days in the future to give us the predicted occupancy percentage at that time.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">function</span>(block_id, ndays) {times <span style="color:#666">&lt;-</span> <span style="color:#06287e">Sys.time</span>() <span style="color:#666">+</span> lubridate<span style="color:#666">::</span><span style="color:#06287e">ddays</span>(ndays)current_time <span style="color:#666">&lt;-</span>tibble<span style="color:#666">::</span><span style="color:#06287e">tibble</span>(times <span style="color:#666">=</span> times,id <span style="color:#666">=</span> block_id)current_prediction <span style="color:#666">&lt;-</span>current_time <span style="color:#666">%&gt;%</span><span style="color:#06287e">transmute</span>(id <span style="color:#666">=</span> id,hour <span style="color:#666">=</span> <span style="color:#06287e">hour</span>(times),month <span style="color:#666">=</span> <span style="color:#06287e">month</span>(times),dow <span style="color:#666">=</span> <span style="color:#06287e">wday</span>(times),occupancy_pct <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">NA</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">bake</span>(xgboost_model<span style="color:#666">$</span>recipe, .)parking_prediction <span style="color:#666">&lt;-</span>xgboost_model<span style="color:#666">$</span>prediction_fit <span style="color:#666">%&gt;%</span><span style="color:#06287e">predict</span>(new_data <span style="color:#666">=</span> current_prediction)predictions <span style="color:#666">&lt;-</span>parking_prediction<span 
style="color:#666">$</span>.predpredictions[[1]]}</code></pre></div><p>Finally, we finish off our script with the extension footer needed for plumbertableau extensions.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#* @plumber</span>tableau_extension</code></pre></div><p>Here is the full <code>plumber.R</code> script:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(plumber)<span style="color:#06287e">library</span>(pins)<span style="color:#06287e">library</span>(tibble)<span style="color:#06287e">library</span>(xgboost)<span style="color:#06287e">library</span>(lubridate)<span style="color:#06287e">library</span>(dplyr)<span style="color:#06287e">library</span>(tidyr)<span style="color:#06287e">library</span>(tidymodels)<span style="color:#06287e">library</span>(plumbertableau)rsc <span style="color:#666">&lt;-</span>pins<span style="color:#666">::</span><span style="color:#06287e">board_rsconnect</span>(server <span style="color:#666">=</span> <span style="color:#06287e">Sys.getenv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">CONNECT_SERVER&#34;</span>),key <span style="color:#666">=</span> <span style="color:#06287e">Sys.getenv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">CONNECT_API_KEY&#34;</span>))xgboost_model <span style="color:#666">&lt;-</span>pins<span style="color:#666">::</span><span style="color:#06287e">pin_read</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">isabella.velasquez/seattle_parking_model&#34;</span>, board <span style="color:#666">=</span> rsc)<span style="color:#60a0b0;font-style:italic">#* @apiTitle Seattle Parking Occupancy Percentage Prediction API</span><span 
style="color:#60a0b0;font-style:italic">#* @apiDescription Return the predicted occupancy percentage at various Seattle locations</span><span style="color:#60a0b0;font-style:italic">#* @tableauArg block_id:integer numeric block ID</span><span style="color:#60a0b0;font-style:italic">#* @tableauArg ndays:integer number of days in the future for the prediction</span><span style="color:#60a0b0;font-style:italic">#* @tableauReturn [numeric] Predicted occupancy rate</span><span style="color:#60a0b0;font-style:italic">#* @post /pred</span><span style="color:#06287e">function</span>(block_id, ndays) {times <span style="color:#666">&lt;-</span> <span style="color:#06287e">Sys.time</span>() <span style="color:#666">+</span> lubridate<span style="color:#666">::</span><span style="color:#06287e">ddays</span>(ndays)current_time <span style="color:#666">&lt;-</span>tibble<span style="color:#666">::</span><span style="color:#06287e">tibble</span>(times <span style="color:#666">=</span> times,id <span style="color:#666">=</span> block_id)current_prediction <span style="color:#666">&lt;-</span>current_time <span style="color:#666">%&gt;%</span><span style="color:#06287e">transmute</span>(id <span style="color:#666">=</span> id,hour <span style="color:#666">=</span> <span style="color:#06287e">hour</span>(times),month <span style="color:#666">=</span> <span style="color:#06287e">month</span>(times),dow <span style="color:#666">=</span> <span style="color:#06287e">wday</span>(times),occupancy_pct <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">NA</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">bake</span>(xgboost_model<span style="color:#666">$</span>recipe, .)parking_prediction <span style="color:#666">&lt;-</span>xgboost_model<span style="color:#666">$</span>prediction_fit <span style="color:#666">%&gt;%</span><span style="color:#06287e">predict</span>(new_data <span style="color:#666">=</span> current_prediction)predictions 
<span style="color:#666">&lt;-</span>parking_prediction<span style="color:#666">$</span>.pred
predictions[[1]]
}
<span style="color:#60a0b0;font-style:italic">#* @plumber</span>
tableau_extension</code></pre></div><h3 id="3-host-your-api">3. Host your API</h3><p>We have to host our API so that it can be accessed in Tableau. In our case, we publish it to RStudio Connect.</p><p>Once hosted, plumbertableau automatically generates a documentation page. Notice that the <code>SCRIPT_*</code> value is not R code. This is a Tableau command that we will use to connect our extension and Tableau.</p><p><img src="img/img3.png" alt="Automatically generated plumbertableau documentation page"></p><caption><center><i>Automatically generated plumbertableau documentation page</i></center></caption><h3 id="4-create-a-calculated-field-in-tableau">4. Create a calculated field in Tableau</h3><p>There are a few steps you need to take so that Tableau can use your plumbertableau extension. If you are using RStudio Connect, read the documentation on how to <a href="https://docs.rstudio.com/rsc/integration/tableau/" target = "_blank">configure RStudio Connect as an analytic extension</a>.</p><p>Create a new workbook and upload the <code>parking_information</code> file. Under Analysis, turn off Aggregate Measures. Drop <code>Lat</code> into Rows and <code>Lon</code> into Columns, which will create a map. Save the workbook.</p><p>Make sure your workbook knows to connect to RStudio Connect by going to Analysis &gt; Manage Analytic Extensions Connection &gt; Choose a Connection. Then, select your Connect account.</p><p>Drag <code>Id</code> into the &ldquo;Detail&rdquo; mark. Create a parameter called &ldquo;Days in the Future&rdquo;. We&rsquo;re using our model to predict parking occupancy percentage for that date. 
Show the parameter on the worksheet.</p><p><img src="img/gif1.gif" alt="Creating a parameter in Tableau"></p><p>Create a calculated field using the <code>SCRIPT</code> from the plumbertableau documentation page:</p><pre><code>SCRIPT_REAL(&quot;/plumbertableau-xgboost-example/pred&quot;, block_id, ndays)</code></pre><p>For each <code>tableauArg</code> we have listed in the extension, we will replace it with its corresponding Tableau value. If you&rsquo;re following along, this means <code>block_id</code> will become <code>ATTR([Id])</code> and <code>ndays</code> will become <code>ATTR([Days in the Future])</code>.</p><pre><code>SCRIPT_REAL(&quot;/plumbertableau-xgboost-example/pred&quot;, ATTR([Id]), ATTR([Days in the Future]))</code></pre><p><img src="img/gif2.gif" alt="Creating a calculated field from a plumbertableau extension"></p><h3 id="5-run-model-and-visualize-results-in-tableau">5. Run model and visualize results in Tableau</h3><p>That&rsquo;s it! Once you embed your extension in Tableau’s calculated fields, you can use your model&rsquo;s results in your Tableau dashboard like any other measure or dimension.</p><p><img src="img/gif3.gif" alt="Showing predictive results in Tableau"></p><p>We can change the <code>ndays</code> argument to get new predictions from our XGBoost model and display them on our Tableau dashboard.</p><p><img src="img/gif5.gif" alt="Showing predictive results in Tableau dashboard by changing the number of days in the future"></p><p>You can style your Tableau dashboard and then provide your users something that is not only aesthetically pleasing, but is dynamically calculating predictions based on a model you have created in R.</p><h2 id="conclusion">Conclusion</h2><p>With plumbertableau, you can showcase sophisticated model results that are easy to integrate, debug, and reproduce. 
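</p><p>When a calculated field misbehaves, it can help to test the extension outside of Tableau. Tableau&rsquo;s Analytics Extensions protocol sends a POST request to the extension&rsquo;s <code>/evaluate</code> endpoint, passing the script string along with the field values as <code>_arg1</code>, <code>_arg2</code>, and so on. A rough sketch of reproducing such a request from R with httr; the server URL is a placeholder and the payload shape reflects the general Analytics Extensions convention, so treat it as illustrative:</p>

```r
library(httr)

# Illustrative only: mimic the request Tableau makes for the
# SCRIPT_REAL call above. _arg1 maps to ATTR([Id]) and _arg2 to
# ATTR([Days in the Future]); the server URL is a placeholder.
resp <- POST(
  "https://connect.example.com/evaluate",
  body = list(
    script = "/plumbertableau-xgboost-example/pred",
    data = list("_arg1" = c(12, 47), "_arg2" = c(3, 3))
  ),
  encode = "json"
)
content(resp)  # one predicted occupancy percentage per input row
```

<p>Inspecting the raw response this way makes it easier to tell whether a problem lives in the extension or in the Tableau configuration.</p><p>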
Your work will be at the forefront of data science while being visualized in Tableau&rsquo;s easy, point-and-click interface.</p><h2 id="learn-more">Learn More</h2><p>Watch James Blair showcase plumbertableau in Leveraging R &amp; Python in Tableau with RStudio Connect:</p><script src="https://fast.wistia.com/embed/medias/hl37qvfnml.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_hl37qvfnml videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/hl37qvfnml/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>More on how RStudio supports interoperability across tools can be found on our <a href="https://www.rstudio.com/solutions/bi-and-data-science/" target = "_blank">BI and Data Science Overview Page</a>.</p></description></item><item><title>Winners of the 2021 Table Contest</title><link>https://www.rstudio.com/blog/winners-of-the-2021-table-contest/</link><pubDate>Thu, 16 Dec 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/winners-of-the-2021-table-contest/</guid><description><p><img src="winner.png" alt="2021 RStudio Table Contest. Put the able in table"></p><p>We are excited to announce this year&rsquo;s winners of the 2021 RStudio Table Contest.</p><p>Display tables are a fundamental way we summarize and communicate information. But in many instances, these are boring and created without as much thought as they deserve. 
Until recently, customizing and styling a table to make it just right could be a painful and time-consuming experience.</p><p>But not today. We now have numerous R packages at our disposal to generate well-designed and beautiful presentation tables. And this community has gone out of its way to share some great examples and tutorials on how to do this.</p><h2 id="evaluation-and-judging">Evaluation and Judging</h2><p>Submissions were evaluated based on technical merit and artistic achievement. We would like to thank all of this year&rsquo;s judges. Thank you to <a href="https://github.com/glin">Greg Lin</a>, <a href="https://twitter.com/thomas_mock">Tom Mock</a>, <a href="https://www.linkedin.com/in/samanthatoet/">Samantha Toet</a>, and <a href="https://twitter.com/ivelasq3">Isabella Velásquez</a> for all your help in evaluating submissions and your thoughtful comments.</p><p><strong>Types of Submissions</strong></p><ul><li><strong>Single Table Example</strong>: These may highlight interesting structuring of content, useful and tricky features, or serve as an example of a common table popular in a specific field. All submissions include well-organized, documented code reproducing the table.</li><li><strong>Tutorial</strong>: It’s all about teaching us how to craft an excellent table or understand a package’s features. These may include several tables and include narrative.</li></ul><p><strong>Categories</strong> - We placed tables into one of the following categories:</p><ul><li><strong>Interactive HTML</strong> - An HTML table, built with R, with interactivity.</li><li><strong>Interactive Shiny</strong> - Interactive tables built with Shiny.</li><li><strong>Static</strong> - Tables that do not have interactive elements. 
These may be designed for print or HTML.</li></ul><h2 id="the-winner">The Winner</h2><p><img src="images/01-satellites.gif" alt="Screenshot of table showing details of satellites around the earth"></p><p><strong>Satellites</strong> - by Vladislav Fridkin - <em>interactive / Shiny</em></p><p><em>An interactive table of satellites built with Shiny, {reactable}, and {gt}</em></p><p>There might be prettier ways to show the thousands of satellites orbiting us, but how does one effectively organize all the associated data? This Shiny app generates a table summarizing data associated with a selection of satellites. This seems like a really smart way to engage interested people in exploring this type of data.</p><p><a href="https://vfridkin.shinyapps.io/Satellites/">App</a> – <a href="https://github.com/vfridkin/satellite_table">Repo</a> – <a href="https://community.rstudio.com/t/satellites-table-contest-submission/120539">Community</a></p><hr><h2 id="runner-up">Runner Up</h2><p><img src="images/02-one-farm.png" alt="Screenshot of table with the ten most cultivated crops of the world"></p><p><strong>One Farm</strong> - by Benjamin Nowak - <em>static / HTML</em></p><p><em>A set of static tables that look like they were ripped out from the pages of a magazine!</em></p><p>These tables are beautifully designed and they present a lot of information in a very digestible way.
Love the use of hierarchy, color, and text with plots/images.</p><p><a href="https://raw.githubusercontent.com/BjnNowak/CultivatedPlanet/main/Tables/CultivatedPlanet_V2.png">Table</a> – <a href="https://github.com/BjnNowak/CultivatedPlanet/tree/main">Repo</a> – <a href="https://community.rstudio.com/t/one-farm-table-contest-submission/117744">Community</a></p><hr><h2 id="honorable-mentions">Honorable Mentions</h2><p><img src="images/03-describer.png" alt="Screenshot of table showing descriptive analytics for a dataset and interactive figures"></p><p><strong>Describer: An Interactive Table Interface for Data Summaries</strong> - by Agustin Calatroni, Rebecca Krouse and Stephanie Lussier - <em>interactive / HTML</em></p><p><em>A dream come true for summarizing and inspecting datasets!</em></p><p>A capable tool in a Shiny app for summarizing any dataset. A lot of information is accessible and easily scannable in the default view, and the small plots are welcome and extremely useful. That it handles variable descriptions and provides detailed summaries is a great achievement.</p><p><a href="https://agstn.github.io/describer/adsl_describer.html">App</a> – <a href="https://github.com/agstn/describer">Repo</a> – <a href="https://community.rstudio.com/t/describer-an-interactive-table-interface-for-data-summaries-table-contest-submission/121483">Community</a></p><hr><p><img src="images/04-openair-gt-gtextras.png" alt="Screenshot of table showing air quality monitoring summary with associated maps"></p><p><strong>Using {gt} and {openair} to Present Air Quality Data</strong> - by Jack Davison - <em>static / HTML</em></p><p><em>Tabulated air quality summaries from the legendary {openair} package.</em></p><p>This tutorial provides an introduction to {gt} and is aimed toward air quality professionals who already have some foundation in R.
It shows how to take outputs from the well-established {openair} package and present them effectively in tables with {gt} and {gtExtras}.</p><p><a href="https://rpubs.com/JackDavison/gt-openair">Article</a> – <a href="https://github.com/jack-davison/rstudio_table-contest_2021">Repo</a> – <a href="https://community.rstudio.com/t/using-gt-and-openair-to-present-air-quality-data-table-contest-submission/119603">Community</a></p><hr><p><img src="images/05-presentation-ready-data-summary.png" alt="Screenshot of table showing data summary statistics"></p><p><strong>Presentation-Ready Data Summary</strong> - by Michael Curry and Daniel Sjoberg - <em>static / tutorial</em></p><p><em>An informative package vignette for creating a compelling data summary.</em></p><p>This is from the authors of the {gtsummary} package. Turns out, you can do a lot after introducing a dataset to the <code>tbl_summary()</code> function. Customizations galore here folks.</p><p><a href="https://www.danieldsjoberg.com/gtsummary/articles/tbl_summary.html">Article</a> – <a href="https://github.com/ddsjoberg/gtsummary/blob/master/vignettes/tbl_summary.Rmd">Source</a> – <a href="https://community.rstudio.com/t/presentation-ready-data-summary-table-contest-submission/120994">Community</a></p><hr><p><img src="images/06-r-clinical-study-reports-submission.png" alt="Screenshot of clinical study reports bookdown"></p><p><strong>R for Clinical Study Reports and Submission</strong> - by Yilong Zhang, Nan Xiao, and Keaven Anderson - <em>static / tutorial</em></p><p><em>If you’re in Pharma and making tables, make sure to read this thoroughly.</em></p><p>It’s rare that we get guidance on anything but if you’re responsible for creating tables for regulatory submissions in the Pharmaceutical Industry, then this online book is for you.</p><p><a href="https://elong0527.github.io/r4csr/index.html">Book</a> – <a href="https://github.com/elong0527/r4csr">Repo</a> – <a 
href="https://community.rstudio.com/t/r-for-clinical-study-reports-and-submission-table-contest-submission/116788">Community</a></p><hr><p><img src="images/07-fast-big-data-tables.png" alt="Screenshot of Rick and Morty characters in a table"></p><p><strong>Fast Big Data Tables in Shiny</strong> - by Ryszard Szymański - <em>interactive / tutorial</em></p><p><em>Big <em>and</em> fast tables?! Well, that’s a dream come true and we have Shiny and Plumber to thank for it.</em></p><p>If you have some pretty large data you can benefit from pagination from external resources. Thanks to Plumber, this is relatively easy. Thanks to Shiny, we can make it all highly interactive. Thanks to the myriad characters in Rick and Morty, we have a lot of records in the example app.</p><p><a href="https://rszymanski.shinyapps.io/table-contest/">App</a> – <a href="https://github.com/rszymanski/table-contest/blob/main/README.md">Tutorial</a> – <a href="https://community.rstudio.com/t/fast-big-data-tables-in-shiny-table-contest-submission/121358">Community</a></p><hr><p><img src="images/08-sparklines-with-reactablefmtr.png" alt="Screenshot showing tables with sparklines"></p><p><strong>Sparklines with {reactablefmtr}</strong> - by Kyle Cuilla - <em>static / tutorial</em></p><p><em>This was once very hard to do. Not anymore!</em></p><p>There are times when you want little plots to go inside a table. Might as well make it sparky. 
Thanks to {reactablefmtr}, a package that makes it easy to format reactable tables, you can do this with ease!</p><p><a href="https://kcuilla.github.io/reactablefmtr/articles/sparklines.html">Article</a> – <a href="https://github.com/kcuilla/reactablefmtr/blob/main/vignettes/sparklines.Rmd">Source</a> – <a href="https://community.rstudio.com/t/interactive-sparklines-with-reactablefmtr-table-contest-submission/120665">Community</a></p><hr><p><img src="images/09-riding-tables-gt-gtExtras.png" alt="Screenshot showing most successful riders in the Tour De France with flags, pictograms, and bar charts"></p><p><strong>Riding Tables with {gt} and {gtExtras}</strong> - by Benjamin Nowak - <em>static / tutorial</em></p><p><em>Learn a lot about {gt} and {gtExtras} and create a deluxe table in the end.</em></p><p>This tutorial focuses on making an awesome table with {gt} and {gtExtras}. You’ll learn all the little tricks on how to make the final table look super impressive!</p><p><a href="https://bjnnowak.netlify.app/2021/10/04/r-beautiful-tables-with-gt-and-gtextras">Article</a> – <a href="https://github.com/BjnNowak/BjnNowak/blob/main/content/post/2021-10-04-r-beautiful-tables-with-gt-and-gtextras/index.Rmd">Source</a> – <a href="https://community.rstudio.com/t/riding-tables-with-gt-and-gtextras-table-contest-submission/117184">Community</a></p><hr><p><img src="images/10-imperial-march-redux.png" alt="Screenshot of table with heatmap of seed points of tournaments"></p><p><strong>Imperial March Redux</strong> - by Bill Schmid - <em>interactive / HTML</em></p><p><em>A souped up version of the previous year’s entry, this time using {reactable}!</em></p><p>This submission is a new version of the table submitted for last year’s contest. This time, the {reactable} package was used to add a lot more interactivity to the table (plus, details sections for every row). 
Check it out!</p><p><a href="https://schmid07.github.io/NBA-R/plots/04/2020_41_bball_react.html">Table</a> – <a href="https://github.com/schmid07/NBA-R">Source</a> – <a href="https://community.rstudio.com/t/imperial-march-ncaa-basketball-reactable-reactablefmtr-table-contest-submission/121602">Community</a></p><hr><p><img src="images/11-from-vines-to-wines.png" alt="Screenshot of table of exceptional wines and associated data"></p><p><strong>From Vines to Wines: the most exceptional wines from all over the world</strong> - by Abdoul ISSA BIDA - <em>static / HTML</em></p><p><em>Looks great and really increases your knowledge of fine wines.</em></p><p>A great-looking informational table about wines. We all really enjoyed the pairing of icons with text throughout (especially in the column header area). Both R and Python were used to scrape the data for this one.</p><p><a href="https://github.com/AbdoulMa/RStudio-Table-Contest-2021">Table and Source</a> – <a href="https://community.rstudio.com/t/from-vines-to-wines-the-most-exceptional-wines-from-all-over-the-world-table-contest-submission/121492">Community</a></p><hr><p><img src="images/12-pokemon-info-table.png" alt="Screenshot of app showing the Pokemon Mew and its information"></p><p><strong>Pokémon Red/Blue/Yellow Information Table</strong> - by Kyle Butts - <em>interactive / Shiny</em></p><p><em>A dashboard/Pokédex/Shiny app that uses tables built from CSS grid + flex.</em></p><p>If you ever wanted to see how the {shiny.tailwind} package could be used to generate a really nice Shiny app with no need for custom CSS, have a look at this entry.
The result is a highly informational, yet beautiful, app that is a joy to behold.</p><p><a href="https://kyle-butts.shinyapps.io/RStudio_2021_Table_Comp/">App</a> – <a href="https://github.com/kylebutts/RStudio_2021_Table_Comp">Source</a> – <a href="https://community.rstudio.com/t/pokemon-red-blue-yellow-information-table-table-contest-submission/121448">Community</a></p><hr><p><img src="images/13-premier-league-standings-2021.png" alt="Screenshot of premier league standings table"></p><p><strong>Premier League Standings 2021</strong> - by Greta Gasparac - <em>interactive / Shiny</em></p><p><em>A treasure trove of data for the Premier League, all presented quite beautifully.</em></p><p>This is a good example of a minimalistic table with exploration built in. The shiny components are useful here and make cruising around the data fun to do. The design choices were carefully considered and that makes the display of information much easier to parse.</p><p><a href="https://ggapac.shinyapps.io/PL-table-shiny/">App</a> – <a href="https://github.com/ggapac/PL-table-shiny">Source</a> – <a href="https://community.rstudio.com/t/premier-league-standings-2021-table-contest-submission/121418">Community</a></p><hr><p><img src="images/14-histable.png" alt="Screenshot of histogram in a table"></p><p><strong>Histable</strong> - by Milos Vilotic - <em>interactive / Shiny</em></p><p><em>Is it a table? Is it a histogram? It’s kinda both!</em></p><p>The Histable is a Shiny app with an interactive DT table. Clicking on a column does something a little unexpected but a whole lot useful: the table shows a histogram of the selected values! 
Because of this, users can quickly visualize a distribution across the table.</p><p><a href="https://capandchange.shinyapps.io/histable/">App</a> – <a href="https://github.com/milosvil/histable">Source</a> – <a href="https://community.rstudio.com/t/histable-table-contest-submission/121002">Community</a></p><hr><p><img src="images/15-crosstable.png" alt="Screenshot of crosstab with information"></p><p><strong>Crosstable, easily describe your dataset</strong> - by Dan Chaltiel - <em>static / tutorial</em></p><p><em>Summarize and tabularize your dataset with just one function.</em></p><p>This is a very nice tutorial that introduces people to the {crosstable} package. The package provides a single function, <code>crosstable()</code>, that computes descriptive statistics on datasets and interfaces with {officer} to create automated reports.</p><p><a href="https://danchaltiel.github.io/crosstable/">Article</a> – <a href="https://github.com/DanChaltiel/crosstable">Repo</a> – <a href="https://community.rstudio.com/t/crosstable-easily-describe-your-dataset-table-contest-submission/120496">Community</a></p><hr><h2 id="in-closing">In Closing</h2><p>We want to thank you all for making this Table Contest so great. It is incredibly hard to judge submissions with such an overall high level of quality. We fully acknowledge that there are many other really great entries we did not highlight in this article. We encourage you to check out all of the entries at RStudio Community.</p><p>One thing we love about the R community is how open and generous you are in sharing the code and process you use to solve problems. This lets others learn from your experience and invites feedback to improve your work.
We hope this contest encourages more sharing and helps to recognize the many outstanding ways people work with and display data with R.</p></description></item><item><title>Sharing Data With the pins Package</title><link>https://www.rstudio.com/blog/sharing-data-with-the-pins-package/</link><pubDate>Wed, 15 Dec 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/sharing-data-with-the-pins-package/</guid><description><caption>Photo by <a href="https://unsplash.com/@universaleye?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Universal Eye</a> on <a href="https://unsplash.com/@ivelasq/likes?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></caption><p>Teams often need access to key data to do their work, but have you ever opened your coworker&rsquo;s script to see:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">dat <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">C://Users/someone_else/data/dataset.csv&#34;</span>)

more_dat <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">S://Path_to_mapped_drive_that_you_dont_have/dataset.csv&#34;</span>)</code></pre></div><p>Yikes! How will you get these files? Let&rsquo;s hope you can reach your coworker before they’ve logged off for the day.</p><p>How can your code be reproducible if you have to manually change the file paths? <em>Shudder</em>.</p><p>What if you need to make edits to the data? Will you have to keep copying CSVs and emailing files forever? <em>Double shudder.</em></p><p>What if your coworker accidentally forwards your email to someone who is not supposed to have access?
<em>Oh no.</em></p><p>We can struggle to share data assets easily and safely, relying on emailed files to keep our analyses up to date. This makes it difficult to keep current or know what version of the data we’re using. If you&rsquo;ve ever experienced any of the scenarios above, consider <a href="https://www.rstudio.com/blog/pins-1-0-0/" target = "_blank">pins</a> as a solution that can help you share your data assets.</p><h2 id="what-_is_-a-pin-anyway">What <em>is</em> a pin, anyway?</h2><p>Pins, from the <a href="https://pins.rstudio.com/" target = "_blank">R package of the same name</a>, are a versatile way to publish R objects on a virtual corkboard so you can share them across projects and people.</p><p>Good pins are data or assets that are a few hundred megabytes or smaller. You can pin just about any object: data, models, JSON files, feather files from the Arrow package, and more. One of the most frequent use cases is pinning small data sets — often ephemeral data or reference tables that don&rsquo;t quite merit being in a database, but seemingly don&rsquo;t have a good home elsewhere (until now).</p><p>Pins get published to a board, which can be an <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a> server, an AWS S3 bucket or Azure Blob Storage, a shared drive like Dropbox or Sharepoint, or a <a href="https://pins.rstudio.com/reference/index.html#section-boards" target = "_blank">variety of other options</a>. 
Try it out for yourself — read in this data set we’ve pinned for you on RStudio Connect!</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># Install the latest pins from CRAN</span>
<span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">pins&#34;</span>)
<span style="color:#06287e">library</span>(pins)

<span style="color:#60a0b0;font-style:italic"># Identify the board</span>
board <span style="color:#666">&lt;-</span> <span style="color:#06287e">board_url</span>(<span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">penguins&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">https://colorado.rstudio.com/rsc/example_pin/&#34;</span>))

<span style="color:#60a0b0;font-style:italic"># Read the shared data</span>
board <span style="color:#666">%&gt;%</span> <span style="color:#06287e">pin_read</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">penguins&#34;</span>)</code></pre></div><p>In short, if you’ve ever wondered where to put an R object that you or your colleague will need to use again, you might just want to pin it.</p><h2 id="pins-for-sharing-across-projects-and-teams">Pins for Sharing Across Projects and Teams</h2><p>One of the greatest strengths of pins is how your pin becomes accessible directly from your R scripts <em>and</em> the R scripts of anyone else to whom you’ve given access. Different projects can include code that reads the same pin without creating more copies of the data:</p><p><img src="images/image1.png" alt="Three projects using the same pin to download data"></p><p>It&rsquo;s easier (and safer) to share a pin across multiple projects or people than to email files around.
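</p><p>As a concrete sketch of that sharing pattern (assuming a shared folder path that both projects can reach; the path and pin name here are hypothetical, and any board type works the same way):</p><pre><code>library(pins)

# A board on a shared network drive (hypothetical path)
board &lt;- board_folder("//fileserver/team-pins")

# One project publishes the data...
board %&gt;% pin_write(mtcars, name = "cars", type = "rds")

# ...and a colleague's project reads the same pin
board %&gt;% pin_read("cars")</code></pre><p>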
Pins respect the access controls of the board. Say you’ve pinned to RStudio Connect: you can control who gets to use the pin, just like any other piece of content.</p><h2 id="pins-for-updating-and-versioning">Pins for Updating and Versioning</h2><p>You may be wondering why you should use pins if you already have a shared drive with your teammates. But what happens if you need to replace the dataset with a new one? Do you email everybody to let them know? Is it dataFINALv2.csv? Or dataFINALfinal.csv?</p><p>The pins package retrieves the newest version of the pin by default. That means pin users never have to worry about getting a stale version of the pin. If you need to update your pin regularly, a scheduled R Markdown on RStudio Connect can handle this task for you, so your pin stays fresh.</p><p>But you’re not locked into losing old versions of a pin. You can version pins so that writing to an existing pin adds a new copy rather than replacing the existing data.</p><p>Here&rsquo;s what versioning looks like using a temporary board:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(pins)

board2 <span style="color:#666">&lt;-</span> <span style="color:#06287e">board_temp</span>(versioned <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)

board2 <span style="color:#666">%&gt;%</span> <span style="color:#06287e">pin_write</span>(<span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">5</span>, name <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>, type <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rds&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; Creating new version &#39;20210304T050607Z-ab444&#39;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; Writing to pin &#39;x&#39;</span>

board2 <span style="color:#666">%&gt;%</span> <span style="color:#06287e">pin_write</span>(<span style="color:#40a070">2</span><span style="color:#666">:</span><span style="color:#40a070">6</span>, name <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>, type <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rds&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; Creating new version &#39;20210304T050607Z-a077a&#39;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; Writing to pin &#39;x&#39;</span>

board2 <span style="color:#666">%&gt;%</span> <span style="color:#06287e">pin_write</span>(<span style="color:#40a070">3</span><span style="color:#666">:</span><span style="color:#40a070">7</span>, name <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>, type <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rds&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; Creating new version &#39;20210304T050607Z-0a284&#39;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; Writing to pin &#39;x&#39;</span>

<span style="color:#60a0b0;font-style:italic"># see all versions</span>
board2 <span style="color:#666">%&gt;%</span> <span style="color:#06287e">pin_versions</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 3 × 3</span>
<span style="color:#60a0b0;font-style:italic">#&gt;   version                created             hash </span>
<span style="color:#60a0b0;font-style:italic">#&gt;   &lt;chr&gt;                  &lt;dttm&gt;              &lt;chr&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 1 20210304T050607Z-0a284 2021-03-04 05:06:00 0a284</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 2 20210304T050607Z-a077a 2021-03-04 05:06:00 a077a</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 3 20210304T050607Z-ab444 2021-03-04 05:06:00 ab444</span></code></pre></div><h2 id="learn-more">Learn More</h2><p>With pins, you and your teammates can know where your important data assets are, how to access them, and whether they are the correct version. You can work with confidence knowing you’re using the right asset, your work is reproducible, and you’re following good practices for data management.</p><p>There’s more to explore with pins. We’re excited to share how you can adopt them into your workflow.</p><p>Learn more about how and when to use pins:</p><ul><li><a href="https://pins.rstudio.com/" target = "_blank">The pins package documentation</a></li><li><a href="https://docs.rstudio.com/how-to-guides/users/pro-tips/pins/" target = "_blank">RStudio Pro Tips: Creating Efficient Workflows with <code>pins</code> and RStudio Connect</a></li></ul><p>See pins in action:</p><ul><li>Pins can pull intensive ETL processes out of your apps, improve performance, and save you the hassle of redeploying whenever the underlying data changes.<ul><li>Watch: <a href="https://www.rstudio.com/resources/rstudioconf-2020/deploying-end-to-end-data-science-with-shiny-plumber-and-pins/" target = "_blank">Deploying End-To-End Data Science with Shiny, Plumber, and Pins</a></li></ul></li><li>Pins can play a key role in MLOps, publishing versioned models, and monitoring model metrics.<ul><li>Read: <a href="https://www.rstudio.com/blog/model-monitoring-with-r-markdown/" target = "_blank">Model Monitoring with R Markdown, pins, and RStudio Connect</a></li></ul></li></ul></description></item><item><title>Using Keras for Deep Learning With R</title><link>https://www.rstudio.com/blog/deep-learning-with-r-keras-for-r-updates/</link><pubDate>Wed, 08 Dec 2021 00:00:00
+0000</pubDate><guid>https://www.rstudio.com/blog/deep-learning-with-r-keras-for-r-updates/</guid><description><caption>Photo by <a href="https://unsplash.com/@sallybrad2016?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Preethi Viswanathan</a> on <a href="https://unsplash.com/s/photos/neural-network?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></caption><p>We are excited to announce new developments in <a href="https://blogs.rstudio.com/ai/posts/2021-11-18-keras-updates/" target = "_blank">Keras for R</a>. Together with our current integration with <a href="https://torch.mlverse.org/" target = "_blank">torch</a>, data scientists can use the most popular and powerful deep learning frameworks all within R.</p><h2 id="expand-data-science-capabilities-with-deep-learning">Expand data science capabilities with deep learning</h2><p>Data scientists use machine learning to create models that improve without explicit instructions. Deep learning is a subset of machine learning. It is particularly powerful in applications such as image recognition, natural language processing, and audio processing.</p><p>Deep learning allows data scientists to create more accurate and efficient models, sometimes even outperforming human cognition. With improved machine learning capabilities, data scientists can provide better answers to their data questions.</p><h2 id="add-essential-tools-to-your-toolkit">Add essential tools to your toolkit</h2><p>Data scientists have done significant work to develop the deep learning sector. Among deep learning libraries, <a href="https://keras.io/" target = "_blank">Keras</a> stands out for its productivity, flexibility, and user-friendly API. TensorFlow is a machine learning platform that is both extremely adaptable and well-suited for production. 
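</p><p>To make that concrete, here is a minimal sketch of defining and compiling a small network with the keras R package (assuming keras and a TensorFlow backend are installed; the layer sizes are illustrative, not from any example in this post):</p><pre><code>library(keras)

# A small fully connected classifier (illustrative sizes)
model &lt;- keras_model_sequential() %&gt;%
  layer_dense(units = 32, activation = "relu", input_shape = c(784)) %&gt;%
  layer_dense(units = 10, activation = "softmax")

model %&gt;% compile(
  optimizer = "rmsprop",
  loss = "categorical_crossentropy",
  metrics = "accuracy"
)</code></pre><p>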
Together, users can use these libraries to train and deploy powerful deep learning models.</p><p>The <a href="https://keras.rstudio.com/" target = "_blank">Keras for R package</a> provides an R interface to Keras. With it, data scientists can leverage the power of Keras and Tensorflow in R.</p><h2 id="train-neural-networks-with-easy-to-write-code">Train neural networks with easy-to-write code</h2><p>Keras for R allows data scientists to run deep learning models in an R interface. They can write in their preferred programming language while taking full advantage of the deep learning methods and architecture.</p><ul><li><strong>The package provides familiar syntax.</strong> Users can write natural-feeling, idiomatic-looking code with Keras for R (including the pipe operator!). Take a look: the <a href="https://tensorflow.rstudio.com/tutorials/beginners/" target = "_blank">image classification example</a> on the Tensorflow for R website uses code very familiar to those who use the tidyverse syntax.</li></ul><p><img src="images/image1.gif" alt="Scrolling through a Tensorflow for R tutorial"></p><ul><li><strong>The package supports users in translating Python code.</strong> Using <code>%py_class%</code>, data scientists can directly subclass Python objects, making it much easier to translate Python code found on the web.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">NonNegative</span>(keras<span style="color:#666">$</span>constraints<span style="color:#666">$</span>Constraint) <span style="color:#666">%py_class%</span> {<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">__call__&#34;</span> <span style="color:#666">&amp;</span>lt;<span style="color:#666">-</span> <span style="color:#06287e">function</span>(x) {w <span style="color:#666">*</span> <span style="color:#06287e">k_cast</span>(w <span 
style="color:#666">&gt;=</span> <span style="color:#40a070">0</span>, <span style="color:#06287e">k_floatx</span>())}}</code></pre></div><p>No need to switch between environments or languages — users can use Keras functionality all within R.</p><h2 id="whats-next">What’s next?</h2><p>We will continue developing Keras for R to help R users develop sophisticated deep learning models in R. Stay tuned for:</p><ul><li>A new version of <em>Deep Learning with R</em>, with updated functionality and architecture;</li><li>Continued low-level refactoring and enhancements to Keras for R; and</li><li>More detailed introductions to the powerful new features.</li></ul><p>Check out the “<a href="https://blogs.rstudio.com/ai/posts/2021-11-18-keras-updates/" target = "_blank">Keras for R is Back!</a>” post to learn about the state of the ecosystem and the package&rsquo;s new functionalities. Subscribe to the <a href="https://blogs.rstudio.com/ai/" target = "_blank">RStudio AI Blog</a> for the latest news, insights, and examples of using AI-related technologies with R.</p></description></item><item><title>Three Ways to Program in Python With RStudio</title><link>https://www.rstudio.com/blog/three-ways-to-program-in-python-with-rstudio/</link><pubDate>Mon, 06 Dec 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/three-ways-to-program-in-python-with-rstudio/</guid><description><caption>Photo by <a href="https://unsplash.com/@timothycdykes?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Timothy Dykes</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></caption><p>RStudio is known for making excellent tools and packages for R programming. But did you know that you can use RStudio for Python programming, as well? Whether you want to use R and Python together or work solely in Python, there are a variety of ways for you to develop your code. 
You can:</p><ul><li><a href="#1-run-python-scripts-in-the-rstudio-ide">Run Python Scripts in the RStudio IDE</a></li><li><a href="#2-use-r-and-python-in-a-single-project-with-the-reticulate-package">Use R and Python in a single project with the reticulate Package</a></li><li><a href="#3-use-your-python-editor-of-choice-within-rstudio-tools">Use your Python editor of choice within RStudio tools</a></li></ul><p>Let&rsquo;s explore these features using the <a href="https://allisonhorst.github.io/palmerpenguins/" target = "_blank">Palmer Penguins</a> dataset.</p><h2 id="1-run-python-scripts-in-the-rstudio-ide">1. Run Python Scripts in the RStudio IDE</h2><p>The <a href="https://www.rstudio.com/products/rstudio" target = "_blank">RStudio IDE</a> is a free and open-source IDE for Python, as well as R. You can write scripts, import modules, and interactively use Python within the RStudio IDE.</p><p>To get started writing Python in the RStudio IDE, go to File, New File, then Python Script. Code just as you would in an R script.</p><p>The RStudio IDE provides several useful tools for your Python development:</p><ul><li>The RStudio environment pane displays the contents of Python modules.</li><li>Explore Python objects either by calling the <code>View()</code> function or by using the associated right-most buttons in the Environment pane.</li><li>The RStudio IDE presents matplotlib and seaborn plots within the Viewer pane.</li></ul><p><img src="gif/gif2.gif" alt="Running a Python script in the RStudio IDE"></p><p>Need help remembering your Python function? RStudio also provides code completion for Python scripts:</p><p><img src="gif/gif3.gif" alt="Getting Python function help in the RStudio IDE"></p><p><strong>Learn more about <a href="https://rstudio.github.io/reticulate/articles/rstudio_ide.html" target = "_blank">RStudio IDE Tools for reticulate</a>.</strong></p><h2 id="2-use-r-and-python-in-a-single-project-with-the-reticulate-package">2. 
Use R and Python in a Single Project With the reticulate Package</h2><p>Once the reticulate package is installed, you can call Python in R scripts. In this example, we turn the Palmer Penguins dataset into a Pandas data frame. Then, we run the <code>pandas.crosstab</code> function.</p><p><img src="gif/gif6.gif" alt="Turning an R data frame to a Python pandas data frame"></p><p>We can turn the Pandas data frame back into an R object:</p><p><img src="gif/gif7.gif" alt="Turning a Pandas data frame back to an R data frame"></p><p>Interoperability works in R Markdown as well. Create and execute Python chunks in your .Rmd file:</p><pre><code>```{python}
from palmerpenguins import load_penguins
penguins = load_penguins()
penguins.describe()
```</code></pre><p>With reticulate, you can use Python in R packages, Shiny apps, and more.</p><p><strong>Find out more about the <a href="https://rstudio.github.io/reticulate/articles/r_markdown.html" target = "_blank">R Markdown Python engine</a> and how to <a href="https://rstudio.github.io/reticulate/articles/calling_python.html" target = "_blank">call Python from R</a>.</strong></p><h2 id="3-use-your-python-editor-of-choice-within-rstudio-tools">3. Use Your Python Editor of Choice Within RStudio Tools</h2><p>RStudio tools, such as RStudio Workbench and RStudio Cloud, integrate with interfaces beyond the RStudio IDE.</p><h3 id="rstudio-workbench">RStudio Workbench</h3><p>As your data science team grows, your tools need to scale as well. 
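The R-to-Python round trip described in section 2 can be sketched in a few lines of R. This is an illustrative sketch (it assumes reticulate and a Python installation with pandas are available); here mtcars stands in for the Palmer Penguins data:

```r
library(reticulate)

# Import a Python module and use it from R
pd <- import("pandas")

# Convert an R data frame to a pandas DataFrame...
cars_py <- r_to_py(mtcars)

# ...call a pandas method on the Python object...
summary_py <- cars_py$describe()

# ...and convert the result back to an R data frame
summary_r <- py_to_r(summary_py)
```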
With <a href="https://www.rstudio.com/products/workbench/" target = "_blank">RStudio Workbench</a>, data scientists collaboratively work from a centralized server using their editor of choice: RStudio, JupyterLab, Jupyter Notebook, or VSCode.</p><p><img src="gif/gif5.gif" alt="Running a Python script in the RStudio IDE"></p><p>Within the editor, data scientists can write Python code with:</p><ul><li><strong>Better collaboration:</strong> Data scientists use the same back-end infrastructure, which makes it easier for them to share files, data, libraries, and other resources.</li><li><strong>Concurrent sessions:</strong> RStudio Workbench enables users to have multiple concurrent R or Python sessions on a single server or a load-balanced cluster of servers.</li><li><strong>Security:</strong> Python programmers can work in securely-managed session sandboxes.</li></ul><p>In the RStudio IDE, data scientists can also collaborate in real time. When multiple users are active in the project at once, you can see each others&rsquo; activity and work together.</p><p><strong>Learn more about <a href="https://www.rstudio.com/products/workbench/" target = "_blank">RStudio Workbench</a>.</strong></p><h3 id="rstudio-cloud">RStudio Cloud</h3><p><a href="https://www.rstudio.com/products/cloud/" target = "_blank">RStudio Cloud</a> is a cloud-based solution that allows you to run, share, teach and learn Python. Jupyter Notebook projects are now available to Premium, Instructor, or Organization account holders. Once in RStudio Cloud, it is easy to install modules, share Jupyter notebooks, and run Python code:</p><p><img src="gif/gif4.gif" alt="Running a Python script in the RStudio IDE"></p><p>Want to try out this notebook? 
Check it out in <a href="https://rstudio.cloud/project/2997990" target = "_blank">your browser</a> (no paid subscription needed but login required).</p><p><strong>Learn more about <a href="https://www.rstudio.com/products/cloud/" target = "_blank">RStudio Cloud</a>.</strong></p><h2 id="conclusion">Conclusion</h2><p>We want you to do your best work in your preferred environment and language. RStudio provides exciting options for your Python projects: Python scripts in the RStudio IDE, mixed language development with reticulate, and editor options in RStudio Cloud and RStudio Workbench.</p><p>Data science goes beyond coding. With RStudio, you can also:</p><ul><li><p><strong>Manage your Python packages:</strong> With <a href="https://www.rstudio.com/products/package-manager/" target = "_blank">RStudio Package Manager</a>, you can mirror the Python Package Index (PyPI) to organize and centralize packages behind your firewall.</p></li><li><p><strong>Share Python content via RStudio Connect:</strong> Need to publish and share your Python content? 
<a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect</a> allows data scientists to publish Jupyter Notebooks, Flask applications, Shiny applications that call Python scripts, and much more.</p></li></ul><p><strong>See RStudio Connect and Jupyter notebooks in action during Tom Mock&rsquo;s live webinar on Thursday, December 9th at 11-12 ET - <a href="https://www.youtube.com/watch?v=iJspIB-Wh38" target = "_blank">Cut down on the grunt work and deliver insights more effectively with RStudio Connect, R Markdown, and Jupyter.</a></strong></p><p>Read more about how RStudio can support your Python development:</p><ul><li><a href="https://www.rstudio.com/solutions/r-and-python/" target = "_blank">RStudio: A Single Home for R &amp; Python</a></li><li>Solutions page for <a href="https://solutions.rstudio.com/python/" target = "_blank">Python with RStudio</a></li></ul></description></item><item><title>RStudio Community Monthly Events - December 2021</title><link>https://www.rstudio.com/blog/rstudio-community-monthly-events-december-2021/</link><pubDate>Thu, 02 Dec 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-community-monthly-events-december-2021/</guid><description><sup>Photo by <a href="https://unsplash.com/@nickmorrison?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Nick Morrison</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p>Welcome to RStudio Community Monthly Events Roundup, where we update you on upcoming events happening at RStudio this month. Missed the great talks and presentations from last month? 
Find them listed under <a href="#icymi-november-2021-events">ICYMI: November 2021 Events</a>.</p><p>You can <a href="https://www.addevent.com/calendar/wT379734" target = "_blank" rel = "noopener noreferrer">subscribe</a> to the Community Events Calendar so that new events will automatically appear on your calendar. Please note that by subscribing, all of the events in the calendar will appear on your own calendar. If you wish to add individual events instead, please use the links below.</p><p>We can’t wait to see you there!</p><h2 id="save-the-date">Save the Date</h2><ul><li>December 2, 2021 at 12 ET: Data Science Hangout with Jarus Singh, Director of Quantitative Analytics at Pandora <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>December 6, 2021 at 12 ET: Using RStudio on Amazon SageMaker | Presented by James Blair <a href="https://www.addevent.com/event/Ch9725290" target = "_blank">(add to calendar)</a></li><li>December 7, 2021 at 12 ET: R en la Administración Pública &amp; informes de técnicas psicométricas con R Markdown (meetup hosted in Spanish by Sergio Garcia Mora) | Presented by Daniela Garcia &amp; Julieta Nieva <a href="https://www.addevent.com/event/HG10141478" target = "_blank">(add to calendar)</a></li><li>December 9, 2021 at 11 ET: Cut down on the grunt work and deliver insights more effectively with RStudio Connect, R Markdown, and Jupyter <a href="https://www.addevent.com/event/SG10488594" target = "_blank">(add to calendar)</a></li><li>December 9, 2021 at 12 ET: Data Science Hangout with Aliyah Wakil, Quantitative and Qualitative Analytics Team Lead at TX Department of State Health Services <a href="https://www.addevent.com/event/Qv9211919" target = "_blank">(add to calendar)</a></li><li>December 9, 2021 at 2 ET: Leveraging the Cloud for Analytics Instruction at Scale: Challenges and Opportunities | Presented by Dr. 
Brian Anderson <a href="https://www.addevent.com/event/uj10488920" target = "_blank">(add to calendar)</a></li><li>December 14, 2021 at 12 ET: Power Calculations in R: How much data is enough? | Panel meetup with Ethan Brown, Jianmei Wang, and Richard Webster <a href="https://www.addevent.com/event/iq9737382" target = "_blank">(add to calendar)</a></li></ul><h2 id="recurring-events">Recurring Events</h2><h3 id="data-science-hangout">Data Science Hangout</h3><p>Earlier this year, we started an informal &ldquo;data science hangout&rdquo; at RStudio for the data science community to connect and chat about some of the more human-centric questions around data science leadership. These happen every Thursday at 12 ET with a different leader featured each week.</p><p>The conversation is all audience-based and there&rsquo;s no registration needed, so you can jump on whenever it fits your schedule. Add the weekly hangouts to your calendar on <a href="https://www.addevent.com/event/Qv9211919" target = "_blank" rel = "noopener noreferrer">AddEvent</a>.</p><h3 id="rstudio-enterprise-community-meetups">RStudio Enterprise Community Meetups</h3><p>We also host industry meetups for teams to share the work they are doing within their organizations, teach lessons learned, and network with others. 
Join the group on <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target = "_blank" rel = "noopener noreferrer">Meetup</a>.</p><h2 id="icymi-november-2021-events">ICYMI: November 2021 Events</h2><ul><li><a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oTu3bUoyYknD-vpR7Uq6bsR" target = "_blank">Past Data Science Hangouts</a></li><li><a href="https://www.youtube.com/watch?v=_XNKSEQTo30" target = "_blank">RStudio Hosted Evaluation Walkthrough - Build &amp; Share Data Products Like TheWorld’s Leading Companies</a> | Presented by Tom Mock, RStudio</li><li><a href="https://www.youtube.com/watch?v=e2h-BVgY4VA" target = "_blank">R in Public Sector - The data you were promised&hellip;and the data that you got</a> | Presented by Hlynur Hallgrímsson, City of Reykjavik</li><li><a href="https://www.youtube.com/watch?v=FggD93l7NmA" target = "_blank">R in Sports Analytics - NFL Big Data Bowl &amp; Analyzing Tracking Data</a> | Presented by Thompson Bliss, NFL</li><li><a href="https://www.youtube.com/watch?v=4Gt1VIP07nc" target = "_blank">R in Marketing - Survey Design for Applications of Machine Learning</a> | Presented by Bryan Butler, Eastern Bank</li><li><a href="https://www.youtube.com/watch?v=-zhTXiiCj58" target = "_blank">R in Epidemiology - Connecting Primary Care Providers to their own data</a> | Presented by Andy Choens, Acuitas Health</li><li><a href="https://www.youtube.com/watch?v=PSiAwbRmYaA" target = "_blank">ML Ops - Machine Learning as an Engineering Discipline</a> | Presented by Suteja Kanuri, ShopBack</li></ul><h2 id="call-for-speakers">Call for Speakers</h2><p>If you’re interested in sharing your work at a Meetup (or just starting to consider it for a future date down the road!), <a href="https://forms.gle/EtXMpSoTfhpGopiS8" target = "_blank" rel = "noopener noreferrer">please fill out the speaker submission form</a>. 
We’re always looking for a diverse set of speakers — if you are a member of a group that is underrepresented in data science, including people of color, those with disabilities, elders/older adults, LGBTQ folks, and women/minority genders, we highly encourage you to submit!</p></description></item><item><title>R Markdown Lesser-Known Tips & Tricks #1: Working in the RStudio IDE</title><link>https://www.rstudio.com/blog/r-markdown-tips-tricks-1-rstudio-ide/</link><pubDate>Mon, 22 Nov 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-markdown-tips-tricks-1-rstudio-ide/</guid><description><sup>Photo by <a href="https://unsplash.com/@kumoknits?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Karina L</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p>The R Markdown file format combines R programming and the markdown language to create dynamic, reproducible documents. R Markdown can be used for reports, slide shows, blogs, books &mdash; even <a href="https://bookdown.org/yihui/rmarkdown/shiny-start.html" target = "_blank">Shiny apps</a>! With so many possibilities, authors learn how to use their tools in effective ways.</p><p>We asked our Twitter friends <a href="https://twitter.com/_bcullen/status/1333878752741191680" target = "_blank">the tips and tricks that they have picked up</a> along their R Markdown journey. There was a flurry of insightful replies, ranging from organizing files to working with YAML. We wanted to highlight some of the responses so that you can also use them when creating R Markdown documents.</p><p>This is the first of a four-part series to help you on your path to R Markdown success, starting with <strong>working with R Markdown documents in the RStudio IDE.</strong></p><p><strong>1. 
Create new chunks with shortcuts</strong></p><p>We understand the pain of typing out all those backticks to create a new chunk, and <a href="https://yihui.org/en/2021/10/unbalanced-delimiters/" target = "_blank">it is also error-prone</a>. Instead, insert an R code chunk by clicking the Insert button on the document toolbar.</p><center><img src="img/img9.png" alt="Insert button in document toolbar that looks like a green square with a C and a plus sign." width="80%"/></center><br>You can also type the keyboard shortcut <kbd>Ctrl</kbd> + <kbd>Alt</kbd> + <kbd>I</kbd> (<kbd>Cmd</kbd> + <kbd>Option</kbd> + <kbd>I</kbd> on macOS). Use the shortcut inside a chunk to split it into two:<p><img src="img/img1.gif" alt="Breaking up one code chunk into two using Ctrl+Cmd+I Shortcut in RStudio IDE"></p><p><strong>2. Run all (or some) chunks</strong></p><p>Within RStudio, the Run button on the right-hand side of the document toolbar opens a drop-down menu. The menu contains handy shortcuts for running code chunks.</p><center><img src="img/img10.png" alt="Run dropdown menu with various options for running code chunks." width="50%"/></center><br><p>For example, you don&rsquo;t have to run chunks individually. Run all chunks below your cursor by clicking <strong>Run All Chunks Below</strong>.</p><p><img src="img/img2.gif" alt="Selections available from the Run button menu in RStudio"></p><p><strong>3. Show plots in the Viewer pane</strong></p><p>By default, code chunks display R Markdown plots &ldquo;inline&rdquo;, or directly underneath the code chunk. If you would rather see the plot in the Viewer pane, go to <strong>RStudio</strong> &gt; <strong>Preferences</strong> &gt; <strong>R Markdown</strong> and unselect &ldquo;Show output inline for all R Markdown documents&rdquo;.</p><center><img src="img/img3.png" alt="R Markdown checkbox for the output options in Global Options" width="70%"/></center><p>Voilà! 
Next time you run the document, the plot will show in the Viewer pane as opposed to inline.</p><p><em>Before&hellip;</em></p><p><img src="img/img4.png" alt="Output inline with code"></p><p><em>After&hellip;</em></p><p><img src="img/img5.png" alt="Output in Viewer pane"></p><p><strong>4. Drag and drop formulas from Wikipedia into your R Markdown document</strong></p><p>You can include LaTeX formulas in your R Markdown files. Enclose them between dollar signs (<code>$</code>) to see the rendered formula.</p><p>Since Wikipedia uses LaTeX HTML formatting on its website, this means you can highlight formulas and drag them into your R Markdown document.</p><script src="https://fast.wistia.com/embed/medias/2sqms83hj9.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:33.54% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_2sqms83hj9 videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><p><strong>5. Use the visual markdown editor</strong></p><p><a href="https://www.rstudio.com/products/rstudio/download/" target = "_blank">RStudio v1.4</a> has a visual markdown editing mode. This lets you see what your R Markdown document will look like without knitting. You can edit your document in this mode, as well.</p><p>Click the compass button on the far-right end of the document toolbar to switch into visual markdown editing mode.</p><center><img src="img/img11.png" alt="Visual markdown editor button on the right-hand side of toolbar." 
width="80%"/></center><br><p>Alternatively, you can use the <kbd>⇧</kbd>+<kbd>⌘</kbd>+<kbd>F4</kbd> keyboard shortcut.</p><p><img src="img/img7.gif" alt="Switching to Visual Markdown Editing Mode on RStudio using shortcuts"></p><p>Typing <kbd>⌘/</kbd> finds and inserts what you need into the document:</p><script src="https://fast.wistia.com/embed/medias/0q6hpkuvhm.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:94.17% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_0q6hpkuvhm videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><p>These are only a few of the many features available in the visual markdown editor. Read more in <a href="https://rstudio.github.io/visual-markdown-editing/" target = "_blank">the RStudio Visual Markdown Editing documentation</a>.</p><h2 id="continue-the-journey">Continue the Journey</h2><p>We hope that these tips &amp; tricks help you when you are working with R Markdown documents in the RStudio IDE. 
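As a quick recap of tip 4, dollar-sign delimiters are all it takes to embed formulas in the Markdown source. For example:

```markdown
The sample mean is $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ (inline math).

Display math goes between double dollar signs:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
```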
Thank you to everybody who shared advice, workflows, and features!</p><p>Stay tuned for the second post in this four-part series: <strong>Cleaning up your code.</strong></p><h2 id="resources">Resources</h2><ul><li>For more information on R Markdown and the RStudio IDE, see <a href="https://rmarkdown.rstudio.com/articles_integration.html" target = "_blank">R Markdown Integration in the IDE</a>.</li><li>Read more about Visual R Markdown in the <a href="https://rstudio.github.io/visual-markdown-editing/" target = "_blank">documentation</a> and the <a href="https://blog.rstudio.com/2020/09/30/rstudio-v1-4-preview-visual-markdown-editing/" target = "_blank">accompanying blog post</a>.</li><li>To learn about RStudio Connect, a platform on which you can schedule and deploy for R Markdown documents so they are accessible to all the relevant stakeholders in your organization, check out the <a href="https://www.rstudio.com/products/connect/" target = "_blank">RStudio Connect product page</a>.</li></ul></description></item><item><title>Announcing the RStudio Blog’s New Vision and Design</title><link>https://www.rstudio.com/blog/announcing-the-rstudio-blog-s-new-vision-and-design/</link><pubDate>Wed, 17 Nov 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-the-rstudio-blog-s-new-vision-and-design/</guid><description><caption>Photo by <a href="https://unsplash.com/@maksym_tymchyk?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Maksym Tymchyk</a> on <a href="https://unsplash.com/s/photos/hexagon?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></caption><p>Welcome to rstudio.com/blog!</p><p>For over 10 years, the RStudio blog has been your source for updates, products, and perspectives from RStudio. We’ve heard that you love reading about our new tools and upcoming events, and we will continue sharing information on those topics. 
We’ve also heard that you would like to hear from our community members on the cool stuff they’re doing with their tools and how you could use them, too.</p><p>Given what we’ve learned, we are so excited to announce the new RStudio blog vision and look.</p><p><strong>Our blog vision is to showcase what&rsquo;s possible with great data science tools by highlighting the inspiring work being done across RStudio and in the community.</strong> We will do this through a variety of content, including:</p><ul><li>News and updates</li><li>Walkthroughs</li><li>Stories from the community</li><li>And much more!</li></ul><p>We also want to make sure that it is easy to find what you are looking for. We’ve redesigned our blog so that you can find, read, and discuss the RStudio information that will help you on your data science journey.</p><p><strong>1. Find what you want to know</strong></p><p>The RStudio blog has new ways to help you find what you care about. We have six main categories for our posts. You can <a href="https://www.rstudio.com/about/subscription-management/" target = "_blank">subscribe</a> to as many as you&rsquo;d like:</p><ul><li>Company News and Events</li><li>Data Science Leadership</li><li>Industry</li><li>Open Source</li><li>Products and Technology</li><li>Training and Education</li></ul><p>Want more blog content from RStudio? Be sure to follow these blogs as well:</p><ul><li><a href="http://blogs.rstudio.com/ai" target = "_blank">AI Blog</a>: Learn about AI-related technologies in R</li><li><a href="http://rviews.rstudio.com" target = "_blank">RViews</a>: Read highlights from the R community</li><li><a href="https://www.tidyverse.org/blog/" target = "_blank">Tidyverse Blog</a>: Explore what’s happening in the tidyverse</li></ul><p><strong>2. Read what is possible</strong></p><p>Our blog posts will focus on what’s possible with RStudio. 
Whether you are a data practitioner, team leader, programmer, instructor, researcher, someone who uses open source software in their spare time, CTO, or any type of data enthusiast that this list may have missed, we will share news, stories, and walkthroughs to help guide you on your data science journey.</p><p>RStudio is an amazing place and there’s so much happening. We are working across the company to:</p><ul><li>Provide updates on all things RStudio,</li><li>Round up any key information from different teams, and</li><li>Point you to where you can learn more.</li></ul><p><strong>3. Discuss what you care about</strong></p><p>The blog is now integrated with the <a href="https://community.rstudio.com/" target = "_blank">RStudio Community Site</a> so that you can share your thoughts and updates with others as well. Discuss, share, and engage with others in the RStudio Community.</p><p><strong>The Future</strong></p><p>We’re not done yet. We want to make sure that we create a better overall experience for everybody and will make purposeful changes to RStudio over time.</p><ul><li>We will be continuously adding design features as we learn more about how people use our blog.</li><li>We are working on a contribution guide for those who want to share what’s possible with RStudio tools on the RStudio blog.</li><li>Interested in continuing to guide the future of our communications formats and channels? Take the <a href="https://blog.rstudio.com/2021/10/27/announcing-the-2021-rstudio-communications-survey/">2021 RStudio Communications Survey</a>.</li></ul><p>Our new blog design has been an incredible team effort and we appreciate the support from across the RStudio team. We’re excited that the new vision and look will help us showcase examples of great data science tools and processes.</p><p>Want to stay connected? 
You can <a href="https://www.rstudio.com/about/subscription-management/" target = "_blank">subscribe to the RStudio blog</a> and follow us on <a href="https://twitter.com/rstudio" target = "_blank">Twitter</a>, <a href="https://www.linkedin.com/company/rstudio-pbc" target = "_blank">LinkedIn</a>, and <a href="https://www.facebook.com/rstudiopbc/" target = "_blank">Facebook</a>.</p><p>Thank you and we can’t wait to share more!</p></description></item><item><title>How To Augment Tableau With R & Python - A Webinar and Case Study from Sweden</title><link>https://www.rstudio.com/blog/augment-tableau-with-r-python/</link><pubDate>Mon, 15 Nov 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/augment-tableau-with-r-python/</guid><description><div class="lt-gray-box">Filip Wästberg, Vilgot Österlund, and Jesper Ludvigsen are data science consultants at Solita Sweden. <a href="https://www.solita.fi/en/rstudio/" target = "_blank">Solita</a>, an RStudio Full Service Partner, is made up of a community of skilled experts in northern Europe with a goal of delivering impact that lasts. Together with their clients, Solita helps design and build solutions in various fields such as Data Platforms, Cloud Connectivity and Data Science.</div><p>A business intelligence (BI) tool is an excellent gateway into advanced analytics and data science. At Solita, we have helped many organizations scale their visual analytics and business intelligence using Tableau. However, data visualization is (as Hadley Wickham points out in <a href="https://r4ds.had.co.nz/introduction.html" target = "_blank">R for Data Science</a>) a fundamental human activity. 
At some point, analysts and data scientists start to think about how they can abstract visualizations into something that can be automated, and this is typically when we start to think about statistical modelling and machine learning.</p><p>Many of our clients come to us asking how they can use data science and advanced analytics to enrich their BI tools. Of course, there are many different ways to do this and the most suitable solution will depend on all of your other infrastructure. But if you have a data scientist who has used R or Python to build something that cannot (or should not) be done in a BI tool, there are some practices that you can use and that we recommend.</p><p>In our upcoming joint webinar with RStudio, <a href="https://www.rstudio.com/registration/using-r-and-python-to-augment-tableau/" target = "_blank">Using R &amp; Python to Augment Tableau</a>, we will discuss a case from one of our clients, one of Sweden’s largest government agencies, that exemplifies these practices. Since the outbreak of COVID-19, a team of Solita data engineers, analysts and data scientists have helped the agency collect data and visualize it with Tableau.</p><p>Two of our data scientists realized that the visualized data could be used to make forecasts for an important part of their operation. They built a minimum viable product to explore this idea and got buy-in from management. During this webinar, we will walk through this case and how we have successfully combined open-source data science with Tableau. But more importantly, we will have a discussion on where we go from here.</p><p>Realizing one data science project is vastly different from realizing 10 data science projects. Typically, data science projects bring great value in the beginning, but if we don’t think about how to scale data science, there is a risk of getting caught in a maintenance trap. Instead of realizing more use cases, we are stuck maintaining old projects with home-built solutions. 
The data scientist gets tired of all the maintenance, quits for a new job, and we are back to square one.</p><p>In this webinar, we will talk about how you can combine RStudio’s professional products and Tableau to avoid the maintenance trap and successfully scale data science using two great tools <em>together</em>.</p><p>See you there!</p><p>Filip Wästberg, Jesper Ludvigsen and Vilgot Österlund from Solita</p><p><strong>Watch our webinar with Solita on Using R &amp; Python to Augment Tableau below:</strong></p><script src="https://fast.wistia.com/embed/medias/odsglk2cj2.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_odsglk2cj2 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/odsglk2cj2/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></description></item><item><title>Building Code Movies With flipbookr</title><link>https://www.rstudio.com/blog/building-code-movies-with-flipbookr/</link><pubDate>Mon, 08 Nov 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/building-code-movies-with-flipbookr/</guid><description><script src="https://www.rstudio.com/blog/building-code-movies-with-flipbookr/index_files/header-attrs/header-attrs.js"></script><script src="https://www.rstudio.com/blog/building-code-movies-with-flipbookr/index_files/fitvids/fitvids.min.js"></script><sup>Photo by <a 
href="https://unsplash.com/@alexlitvin?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Alex Litvin</a> on <a href="https://unsplash.com/?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></sup><div class="lt-gray-box"><p>This is a guest post from <a href="https://twitter.com/evamaerey" target = "_blank" rel = "noopener noreferrer">Gina Reynolds</a> with contributions from Rachel Goodman, Luca Picci, Conner Surrency, and Brit Woodrum, who provided research assistance while completing their master’s degrees at the University of Denver’s Josef Korbel School of International Studies. Gina taught research methodology at the University of Denver from 2018 to 2020 and currently teaches statistics and probability at West Point. Her research focuses on tools for proximate comparison and translation for data analysis and visualization.</p></div><p>Have you heard of “code movies” or “code flipbooks”? Maybe not? This blog post will tell you what they are, introduce the flipbookr package to help build them in R, and showcase student work as examples.</p><p>I use the terms ‘code movies’ and ‘flipbooks’ interchangeably.</p><div id="what-are-flipbooks" class="level1"><h3>What Are Flipbooks?</h3><p>Flipbooks help you demonstrate how to get from ‘A’ to ‘B’ in data manipulation, analysis, or visualization code pipelines. When using R Markdown or Jupyter notebooks, we usually only see the initial input and final output for a pipeline of steps. Having the inputs and outputs close to one another helps communicate the big picture of what is being accomplished with a chunk of code.</p><p>But you might have trouble figuring out what the <em>individual steps</em> in a pipeline accomplish. This is where flipbooks come in! They seek to illuminate what’s going on in <em>each</em> step of the pipeline or plot. Flipbooks show the <em>within</em>-pipeline output for every line of code.</p><p>Here’s an example where we build a ggplot with the mtcars dataset. 
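The pipeline behind it is along these lines (an illustrative sketch; the exact code in the embedded flipbook may differ):

```r
# Illustrative sketch of a ggplot2 pipeline built up step by step.
# Each line ends with balanced parentheses, which is where flipbookr
# finds its default break points, so the flipbook can show the
# partial plot after every step.
library(ggplot2)

ggplot(data = mtcars) +
  aes(x = factor(cyl), y = mpg) +
  geom_boxplot() +
  labs(x = "Cylinders", y = "Miles per gallon") +
  annotate(geom = "text", x = 1, y = 24, label = "box spans the interquartile range")
```

In a xaringan deck, passing this chunk’s name to <code>flipbookr::chunk_reveal()</code> is what turns a chunk like this into the slide sequence shown below.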
We build the plot and then add an annotation to describe the components of a boxplot.</p><p>Click in the frame below and use arrows or swipe to go through the slideshow and see the plot build with each new line of code.</p><div class="shareagain video-responsive" style="min-width:300px;margin:1em auto;"><iframe src="https://colorado.rstudio.com/rsc/content/aa570b5a-2bbd-4c81-8c59-e2bc5ae2c7c3" width="720" height="500" style="border:2px solid currentColor;" loading="lazy" allowfullscreen></iframe><script>fitvids('.shareagain', {players: 'iframe'});</script></div></div><div id="code-movie-examples" class="level1"><h3>Code Movie Examples</h3><p>You may have seen code movies in coding presentations. Presenting pipelines as a movie helps audiences digest workflows, so it’s worth choreographing a set of slides to break them down. If the alignment between slides is good — and if we don’t have disruptive slide transitions (wipes, spins, fades) — we get to enjoy a little movie: the coordinated evolution of code and output! Here are some examples:</p><ul><li><a href="https://youtu.be/DhDOTxojQ3k?t=350" target = "_blank" rel = "noopener noreferrer"><em>Forecasting</em> - Mitchell O’Hara-Wild</a></li><li><a href="https://pkg.garrickadenbuie.com/gentle-ggplot2/#40" target = "_blank" rel = "noopener noreferrer"><em>A Gentle Guide to the Grammar of Graphics with ggplot2</em> - Garrick Aden-Buie</a></li><li><a href="https://youtu.be/sB8CYGlPN0o?t=158" target = "_blank" rel = "noopener noreferrer"><em>3D mapping, plotting, and printing with rayshader</em> - Tyler Morgan-Wall</a></li></ul></div><div id="building-flipbooks-quickly-and-reliably-with-flipbookr" class="section level1"><h3>Building Flipbooks Quickly and Reliably With flipbookr</h3><p>While code movies deliver helpful insight to audiences, it can be time-consuming to put together the experience. 
There’s a lot of copy-and-paste that has to happen to create the right partial code sequence — and you can mix yourself up trying to coordinate it (I’ve been there!).</p><p>The <a href="https://evamaerey.github.io/flipbookr/" target = "_blank" rel = "noopener noreferrer">flipbookr</a> package’s goal is to help create these easy-to-follow, step-by-step experiences — without the copy-and-paste pain! All you need to do is write your pipeline once. Then, you can let flipbookr take over to create a flipbook that shows the code and its corresponding output.</p><p>Together with the R Markdown slideshow package <a href="https://github.com/yihui/xaringan" target = "_blank" rel = "noopener noreferrer">xaringan</a>, flipbookr does four things:</p><ol style="list-style-type: decimal"><li>Parses an .Rmd code pipeline from the chunk you indicate (you name the chunk),</li><li>Identifies good break points in that code chunk pipeline (the default is finding balanced parentheses at the ends of lines),</li><li>Spawns a bunch of code chunks with these <em>partial builds of code</em>, separated by slide breaks, and</li><li>Displays partial code in HTML slides.</li></ol><p>The slides are shown side-by-side and sequentially, giving us a movie-like experience.</p><p>There is so much decision-making packed into our code pipelines. The flipbookr project makes it easy to bring those decisions to light so they can be appreciated, examined, and discussed!</p></div><div id="taking-flipbookr-for-a-spin" class="section level1"><h3>Taking flipbookr for a Spin</h3><p>After installing flipbookr with <code>install.packages("flipbookr")</code>, there are a couple of ways to get started:</p><ol style="list-style-type: decimal"><li>Use the <a href="https://evamaerey.github.io/flipbooks/flipbook_recipes#1" target = "_blank" rel = "noopener noreferrer">Easy Flipbook Recipes</a> guide. 
You can put together a basic flipbook with step-by-step instructions.</li></ol><ol start="2" style="list-style-type: decimal"><li>Use the “A Minimal Flipbook” template that comes with the flipbookr package. After installation, you can request the basic template in the RStudio IDE by going to <code>File -&gt; New File -&gt; R Markdown -&gt; From Template -&gt; A Minimal Flipbook</code>.</li></ol></div><div id="what-to-expect-flipbookr-examples" class="level1"><h3>What to Expect: flipbookr Examples</h3><p>Before you start building your own flipbooks, it might also be useful to see some examples from some other folks.</p><p>Four of my graduate research assistants at the <a href="https://korbel.du.edu/" target = "_blank" rel = "noopener noreferrer">University of Denver’s Korbel School of International Studies</a>, with the support of an <a href="https://www.r-consortium.org/" target = "_blank" rel = "noopener noreferrer">R Consortium</a> grant, have built excellent flipbooks that showcase ggplot2 mapping, tmap, magick, and gganimate.</p><p>Rachel demos how to <a href="https://evamaerey.github.io/rstudio_education_blog/Idaho_Mapping/Idaho_mapping.html" target = "_blank" rel = "noopener noreferrer">build maps with ggplot2</a> by looking at the political landscape in Idaho. She uses several thematic elements with the <code>theme()</code> function. Her flipbook displays the incremental effect of each thematic decision.</p><div class="shareagain video-responsive" style="min-width:300px;margin:1em auto;"><iframe src="https://colorado.rstudio.com/rsc/content/db11f0d2-1357-4223-82fb-854bebe5cea0" width="720" height="500" style="border:2px solid currentColor;" loading="lazy" allowfullscreen></iframe><script>fitvids('.shareagain', {players: 'iframe'});</script></div><p>Rachel also contributed this comment about how building plots intended for a flipbook differs from the usual build:</p><blockquote><p>“The process of producing a flipbook pushed me to think differently about both data wrangling and data visualization. It required me to be more deliberate in how I wrote and ordered my code, and it revealed redundancies and other inefficiencies in my script. The process also deepened my understanding of the commands that I employed by allowing me to see the output of each individual line of code.”</p></blockquote><p>Conner explores <a href="https://evamaerey.github.io/rstudio_education_blog/AUS_InteractMap/AUS_tmap.html" target = "_blank" rel = "noopener noreferrer">the tmap package</a> by showing city population sizes in Australia. 
He also dives into world map projections, cycling through various tmap projection options.</p><div class="shareagain video-responsive" style="min-width:300px;margin:1em auto;"><iframe src="https://colorado.rstudio.com/rsc/content/385910e7-ce93-4d86-8d41-b33633d5d8d5" width="720" height="500" style="border:2px solid currentColor;" loading="lazy" allowfullscreen></iframe><script>fitvids('.shareagain', {players: 'iframe'});</script></div><p>Brit demos <a href="https://evamaerey.github.io/rstudio_education_blog/magick/magick.html" target = "_blank" rel = "noopener noreferrer">the magick package</a>, showing how image manipulation pipelines unfold.</p><div class="shareagain video-responsive" style="min-width:300px;margin:1em auto;"><iframe src="https://colorado.rstudio.com/rsc/content/358b0b0b-ee86-41fc-811c-5f35a501468a" width="720" height="500" style="border:2px solid currentColor;" loading="lazy" allowfullscreen></iframe><script>fitvids('.shareagain', {players: 'iframe'});</script></div><p>Finally, Luca <a href="https://evamaerey.github.io/rstudio_education_blog/unemployment/unemployment.html" target = "_blank" rel = "noopener noreferrer">visualizes changes in youth unemployment in Europe</a>, first by faceting in ggplot2 by year and then using gganimate.</p><div class="shareagain video-responsive" style="min-width:300px;margin:1em auto;"><iframe src="https://colorado.rstudio.com/rsc/content/8b1b7f8a-cfae-4340-a3fa-fb0292385466" width="720" height="500" style="border:2px solid currentColor;" loading="lazy" allowfullscreen></iframe><script>fitvids('.shareagain', {players: 'iframe'});</script></div><hr /><div id="acknowledgments" class="level3"><h3>Acknowledgments</h3><p>The flipbookr package builds code movies using the wonderful xaringan, knitr, and R Markdown tools. 
It’s inspired by data manipulation and visualization tools that let you work incrementally, particularly ggplot2, dplyr, and magrittr.</p><p>Lots of folks have helped build flipbookr, especially Emi Tanaka and Garrick Aden-Buie. Garrick’s code movie in <a href="https://pkg.garrickadenbuie.com/gentle-ggplot2/#40" target = "_blank" rel = "noopener noreferrer">‘A Gentle Guide to the Grammar of Graphics’</a> is the first one I noticed and is not to be missed! Both Garrick and Emi Tanaka were inspired to work on automating the code movie build and helped get the flipbookr project off the ground.</p></div></div></description></item><item><title>RStudio Community Monthly Events - November 2021</title><link>https://www.rstudio.com/blog/rstudio-community-monthly-events-november-2021/</link><pubDate>Wed, 03 Nov 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-community-monthly-events-november-2021/</guid><description><sup>Photo by <a href="https://unsplash.com/@nickmorrison?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Nick Morrison</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p>Welcome to RStudio Community Monthly Events Roundup, where we update you on upcoming events happening at RStudio this month. Missed the great talks and presentations from last month? Find them listed under <a href="#font-size-4-icymi-october-2021-events-font">ICYMI: October 2021 Events</a>.</p><p>You can <a href="https://www.addevent.com/calendar/wT379734" target = "_blank" rel = "noopener noreferrer">subscribe</a> to the Community Events Calendar so that new events will automatically appear on your calendar. Please note that by subscribing, all of the events in the calendar will appear on your own calendar. 
If you wish to add individual events instead, please use the links below.</p><p>We can’t wait to see you there!</p><h2 id="save-the-date">Save the Date</h2><ul><li>November 4, 2021: RStudio Hosted Evaluation Walkthrough - Build &amp; Share Data Products Like The World’s Leading Companies <a href="https://www.addevent.com/event/Ls9646911" target = "_blank" rel = "noopener noreferrer">(add to calendar)</a></li><li>November 9, 2021: R in Public SectoR - The data you were promised…and the data that you got <a href="https://www.addevent.com/event/WY9623146" target = "_blank" rel = "noopener noreferrer">(add to calendar)</a></li><li>November 10, 2021: R in Sports Analytics - NFL Big Data Bowl &amp; Analyzing Tracking Data <a href="https://www.addevent.com/event/We9624265" target = "_blank" rel = "noopener noreferrer">(add to calendar)</a></li><li>November 15, 2021: R in Marketing - Survey Design for Applications of Machine Learning <a href="https://www.addevent.com/event/tr9625532" target = "_blank" rel = "noopener noreferrer">(add to calendar)</a></li><li>November 17, 2021: R in Epidemiology - Connecting Primary Care Providers to their own data <a href="https://www.addevent.com/event/oq9656702" target = "_blank" rel = "noopener noreferrer">(add to calendar)</a></li><li>November 30, 2021: R in Retail &amp; E-Commerce - ML Ops for Recommendation Engines <a href="https://www.addevent.com/event/Il9625767" target = "_blank" rel = "noopener noreferrer">(add to calendar)</a></li></ul><h2 id="recurring-events">Recurring Events</h2><h4 id="data-science-hangout">Data Science Hangout</h4><p>We&rsquo;ve started an informal &ldquo;data science hangout&rdquo; at RStudio for the data science community to connect and chat about some of the more human-centric questions around data science leadership. These happen every Thursday at 12 ET with a different leader featured each week. 
Join us this week to chat with Chase Carpenter, Director of Strategy &amp; Analytics at the Chicago Cubs.</p><p>The conversation is all audience-based and there&rsquo;s no registration needed, so you can jump on whenever it fits your schedule. Add the weekly hangouts to your calendar on <a href="https://www.addevent.com/event/Qv9211919" target = "_blank" rel = "noopener noreferrer">AddEvent</a>.</p><h4 id="rstudio-enterprise-community-meetups">RStudio Enterprise Community Meetups</h4><p>We also host industry meetups for teams to share the work they are doing within their organizations, teach lessons learned, and network with others. Join the group on <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target = "_blank" rel = "noopener noreferrer">Meetup</a>.</p><h2 id="icymi-october-2021-events">ICYMI: October 2021 Events</h2><ul><li><a href="https://www.youtube.com/playlist?list=PL9HYL-VRX0oTu3bUoyYknD-vpR7Uq6bsR" target = "_blank" rel = "noopener noreferrer">Past Data Science Hangouts</a></li><li><a href="https://youtu.be/gQ9he9dyfGs" target = "_blank" rel = "noopener noreferrer">Business Reports with R Markdown</a> | Presented by Christophe Dervieux</li><li><a href="https://www.youtube.com/watch?v=VrF9EdgiSy8" target = "_blank" rel = "noopener noreferrer">RStudio Team Demo</a> | Presented by Tom Mock</li><li><a href="https://www.rstudio.com/resources/webinars/hiring-great-data-science-teams/" target = "_blank" rel = "noopener noreferrer">Hiring Great Data Science Teams</a> | Panel Webinar with Rhonda Crate, Jesse Mostipak, Katie Schafer, Jarus Singh, and Iyue Sung</li><li><a href="https://youtu.be/Id2H499q8IU" target = "_blank" rel = "noopener noreferrer">R in Sports Analytics - Introduction to GitHub Actions</a> | Presented by Michelle Brandão</li><li><a href="https://youtu.be/yb_mBJz3iSc" target = "_blank" rel = "noopener noreferrer">Scaling Spreadsheets with R</a> | Presented by Nathan Stephens</li><li><a href="https://youtu.be/t25Lbi5D6kg" 
target = "_blank" rel = "noopener noreferrer">Leveraging R &amp; Python in Tableau with RStudio Connect</a> | Presented by James Blair / Q&amp;A with Kelly O’Briant</li></ul><h2 id="call-for-speakers">Call for Speakers</h2><p>If you’re interested in sharing your work at a Meetup (or just starting to consider it for a future date down the road!), <a href="https://forms.gle/EtXMpSoTfhpGopiS8" target = "_blank" rel = "noopener noreferrer">please fill out the speaker submission form</a>. We’re always looking for a diverse set of speakers — if you are a member of a group that is underrepresented in data science, including people of color, those with disabilities, elders/older adults, LGBTQ folks, and women/minority genders, we highly encourage you to submit!</p></description></item><item><title>Announcing RStudio on Amazon SageMaker</title><link>https://www.rstudio.com/blog/announcing-rstudio-on-amazon-sagemaker/</link><pubDate>Tue, 02 Nov 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rstudio-on-amazon-sagemaker/</guid><description><h2 id="data-science-in-the-cloud">Data Science in the Cloud</h2><p>As more organizations migrate their data science work to the cloud, they naturally want to bring along their favorite data science tools, including RStudio, R, and Python. These organizations are embracing the cloud to achieve various goals, including to:</p><ul><li>Simplify and reduce startup costs.</li><li>Promote collaboration between organizations or groups.</li><li>Mitigate the high costs of maintaining their own computing infrastructure.</li><li>Scale to meet variable demand.</li><li>Minimize data movement.</li></ul><p>While RStudio provides <a href="https://blog.rstudio.com/2020/11/12/cloud-strategy/" target = "_blank" rel = "noopener noreferrer">many different ways to support an organization’s cloud strategy</a>, we’ve heard from many customers who also use Amazon SageMaker. 
They wanted an easier way to combine RStudio’s professional products with SageMaker’s rich machine learning and deep learning capabilities, and to incorporate RStudio into their data science infrastructure on SageMaker.</p><h2 id="rstudio-on-amazon-sagemaker">RStudio on Amazon SageMaker</h2><p>Based on this feedback, we are excited to announce RStudio on Amazon SageMaker, developed in collaboration with the SageMaker team.</p><p><a href="https://aws.amazon.com/pm/sagemaker/" target = "_blank" rel = "noopener noreferrer">Amazon SageMaker</a> helps data scientists and developers to prepare, build, train, and deploy high-quality machine learning models quickly by bringing together a broad set of capabilities purpose-built for machine learning.</p><blockquote><p>RStudio is excited to collaborate with the Amazon SageMaker team on this release as they make it easier for organizations to move their open-source data science workloads to the cloud. We are committed to helping our joint customers use our commercial offerings to bring their production workloads to Amazon’s SageMaker, and to further collaborations with the Amazon SageMaker team.</p><p>— Tareef Kawaf, President, RStudio PBC</p></blockquote><p><img src="image1.png" alt="RStudio IDE showing output using Amazon SageMaker capabilities"></p><h3 id="easy-access-to-sagemaker-for-data-scientists">Easy Access to SageMaker for Data Scientists</h3><p>Data scientists can quickly get to work, spinning up their favorite development environment on SageMaker. 
They can:</p><ul><li><strong>Launch RStudio Workbench</strong> with a simple click.</li><li><strong>Start a new session</strong> with a fully-configured environment.</li><li><strong>Choose an instance type</strong> with the desired compute and memory for the job at hand, from a wide array of ML instances available.</li></ul><p><img src="image2.png" alt="RStudio Workbench options to choose different instance types"></p><p>Within that environment, they can access their organization’s data stored on AWS. They also have access to all of SageMaker’s deep learning capabilities via Python libraries, using the <a href="https://rstudio.github.io/reticulate/" target = "_blank" rel = "noopener noreferrer">reticulate</a> package. This preconfigured environment includes all the necessary SageMaker libraries to get started.</p><p>This offering complements Amazon SageMaker Studio Notebooks, which provide access to Python coding in a Jupyter Notebook environment. This means that data scientists proficient with both R and Python can freely switch between RStudio and SageMaker Studio Notebooks. All of their work, including code, datasets, repositories, and other artifacts, is synchronized between the two environments through the default Amazon Elastic File System (Amazon EFS) storage.</p><p>For more information from the data scientist perspective, see <a href="https://aws.amazon.com/blogs/aws/announcing-fully-managed-rstudio-on-amazon-sagemaker-for-data-scientists/" target = "_blank" rel = "noopener noreferrer">Announcing Fully Managed RStudio on Amazon SageMaker for Data Scientists</a>.</p><h3 id="familiar-management-tools-for-devops-teams">Familiar Management Tools for DevOps Teams</h3><p>As a fully managed offering on Amazon SageMaker, this release makes it easy for DevOps teams and IT Admins to administer, secure, and scale their organization’s centralized data science infrastructure. 
They can:</p><ul><li><strong>Quickly create a multi-user RStudio Workbench environment</strong> in AWS SageMaker for their team’s data science work, without the need to install and configure RStudio Workbench.</li><li><strong>Administer this environment using familiar AWS tools and frameworks</strong>, including managing licenses, security, and domains.</li></ul><p><img src="image3.png" alt="Arrow pointing to RStudio option to launch app in SageMaker Domain"></p><p>For more information from the DevOps perspective, see <a href="https://aws.amazon.com/blogs/machine-learning/get-started-with-rstudio-on-amazon-sagemaker/" target = "_blank" rel = "noopener noreferrer">Getting Started with RStudio on Amazon SageMaker</a>.</p><h3 id="data-driven-insights-for-organizations">Data-Driven Insights for Organizations</h3><p>For data-driven organizations already using AWS, this provides a way to migrate their self-managed RStudio environments to AWS SageMaker, using their existing RStudio Workbench licenses without an incremental cost.</p><p>When RStudio for SageMaker is configured for use with RStudio Connect, data scientists using both RStudio for SageMaker and SageMaker Studio can easily share their R and Python insights with their decision-makers.</p><p><a href="https://www.rstudio.com/products/connect/" target="_blank">RStudio Connect</a> makes it easy to deliver key insights to decision-makers, at the right time, in the right format. 
Connect supports a spectrum of data products, static or dynamic, developed in R and Python: dashboards, applications, APIs, reports, and more.</p><p>For more information, see <a href="https://aws.amazon.com/blogs/machine-learning/host-rstudio-connect-and-package-manager-for-ml-development-in-rstudio-on-amazon-sagemaker/" target = "_blank" rel = "noopener noreferrer">Host RStudio Connect and Package Manager for ML Development in RStudio on Amazon SageMaker</a>.</p><h2 id="getting-started-with-rstudio-on-amazon-sagemakerfont">Getting Started with RStudio on Amazon SageMaker</h2><p>RStudio for Amazon SageMaker enables RStudio Workbench customers to bring their existing licenses to SageMaker. If you are an existing customer, or would like to learn more, please reach out to your customer success manager or <a href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio" target = "_blank" rel = "noopener noreferrer">schedule a time to talk with us</a>.</p><center><strong><a href="https://www.addevent.com/event/Ch9725290" target = "_blank" rel = "noopener noreferrer">Join us on Dec. 
6th for an RStudio on SageMaker Meetup</a><p><a href="https://www.addevent.com/event/Ch9725290"><img src="image4.png" alt="White tile with blue hexes saying RStudio on Amazon SageMaker December 6th at 12ET" title="RStudio on Amazon SageMaker Event on December 6th"></a></p></strong></center></description></item><item><title>How the "Clusterbuster" Shiny App Helps Hundreds of Doctors and Epidemiologists Battle COVID-19 in the Netherlands</title><link>https://www.rstudio.com/blog/how-the-clusterbuster-shiny-app-helps-battle-covid-19-in-the-netherlands/</link><pubDate>Tue, 02 Nov 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/how-the-clusterbuster-shiny-app-helps-battle-covid-19-in-the-netherlands/</guid><description><p>As 2020 closed, Eveline Geubbels, former COVID-19 Surveillance Coordinator at the Dutch National Institute for Public Health and the Environment (RIVM), faced an important question: How can we help the public health doctors and epidemiologists that work within the 25 Dutch regional health services gain insight into clusters of COVID-19 cases?</p><p>Now ten months into 2021, RIVM has an answer. For the past seven months, a Shiny web application called the Clusterbuster has been providing valuable information to hundreds of doctors and epidemiologists on the facts and figures surrounding COVID-19 on a daily basis.</p><h2 id="first-steps-in-creating-a-valuable-and-safe-application-fast">First Steps in Creating a Valuable (and Safe) Application Fast</h2><p>Eveline tasked Sjoerd Wierenga, a recently hired data scientist, to help RIVM answer this question. Sjoerd learned that the organization had experience with R and Shiny and was working on providing secure access to their applications. His advice was to move forward with creating a COVID-19 web app with these tools. 
Together with his colleague Jossy van den Boogaard, a medical epidemiologist, he quickly set up an advisory board made up of doctors and epidemiologists to outline further development.</p><blockquote><p>&ldquo;At that time, the COVID control teams at the MHSs [Municipal Health Services] were struggling with keeping an eye on newly developing clusters in the midst of the ever-increasing COVID-19 caseload. The members of the advisory board were therefore keen to create a tool that could help them with tracking (and tackling) clusters.&rdquo;</p></blockquote><blockquote><p>&mdash;Jossy van den Boogaard</p></blockquote><p>Team Clusterbuster recruited more data scientists within a few weeks. Senior Data Scientist Job Spijker tackled one of the major prerequisites in getting an application into production safely and efficiently: creating a deployment stack for the application and the data.</p><blockquote><p>&ldquo;Because we were working with sensitive data, we preferred to use on-premise solutions. Thanks to the open-source nature of many RStudio products, it was a breeze to implement our own stack on the institute&rsquo;s infrastructure.&rdquo;</p></blockquote><blockquote><p>&mdash;Job Spijker</p></blockquote><p>At the same time, many others were working behind the scenes doing risk analysis, privacy assessments, infrastructure buildouts, and much more. After a couple of months, everything seemed to fall into place. Thanks to the advisory board, the team was confident that they were sharing valuable insights. With the help of supporting staff from legal, IT, and others, they could proceed knowing that they could share the insights in a safe and responsible way.</p><p><img src="image1.png" alt="COVID-19 dashboard with a map showing hotspots on the left and progress of notifications on the right" title="COVID-19 Shiny App"></p><caption>*The first prototype for a COVID-19 Shiny application by Sjoerd Wierenga, which was hosted on shinyapps.io. 
This screen shot shows synthetic data. The text in the image was translated from the original Dutch using Google Translate.*</caption><p>With the major prerequisites met and the outcomes of weekly advisory board meetings to guide them, the team created a Shiny web application that quickly generated valuable insights.</p><h2 id="scaling-up-impact">Scaling Up Impact</h2><blockquote><p>&ldquo;We operate with the philosophy that the impact we generate is a function of both quality and volume. So: numbers matter!&rdquo;</p></blockquote><blockquote><p>&mdash; Sjoerd Wierenga</p></blockquote><p>The application was developed for (and partly by) professionals with a background in infectious disease control. In March, the first group of around 30 professionals received access to the application. The group grew to 200 over the following months. At the time of writing, Team Clusterbuster provides access to close to 400 doctors, epidemiologists, and other front-line professionals, and they add new users almost every day.</p><p>Because the team operates with the philosophy that the impact they generate is a function of both quality and volume, these numbers matter! More users continue to request access to the application to:</p><ol><li>Draw insights from new data</li><li>Create custom visualizations</li><li>Incorporate user feedback</li></ol><p>Let&rsquo;s look at how the team achieved each of these goals.</p><p><strong>1. Draw insights from new data</strong></p><p>The Clusterbuster app has evolved with the pandemic. The team started off visualizing only reported cases, but as the pandemic spread, they added new data such as the geographical distribution of clusters. 
In the last couple of months, the team has also incorporated data from the Dutch National COVID-19 Immunization Program.</p><p><img src="image2.png" alt="Explorer showing two maps with different types of COVID-19 situations, such as linked cases" title="COVID-19 Shiny App"> <caption> <em>A dashboard page displaying the geographical distribution of linked cases for a given situation. This screen shot shows synthetic data. The text in the image was translated from the original Dutch using Google Translate.</em> </caption></p><p><strong>2. Create custom visualizations</strong></p><p>The Clusterbuster combines reliable data sources and visualizes them in new ways. One visualization combined the vaccination rate and the number of reported infections with an interactive bivariate geographical representation. This visualization allowed epidemiologists to gain insight into which neighborhoods lagged behind in vaccinations while potentially being at risk because of an increase in reported cases.</p><blockquote><p>&ldquo;By using R Shiny, we had a lot of flexibility in creating insights. We weren&rsquo;t limited to off-the-shelf functionality &mdash; we could be as creative as our coding skills allowed! Linking visualizations and allowing for a lot of interactivity really helped with making the advisory board&rsquo;s ideas come to life.&rdquo;</p></blockquote><blockquote><p>&mdash; Jolien ten Velden</p></blockquote><p><img src="image3.png" alt="Bivariate map showing both vaccine coverage and incidence" title="COVID-19 Shiny App"></p><caption>*Showing a combination of the vaccination rate and reported cases with a bivariate choropleth map. This screen shot shows synthetic data. The text in the image was translated from the original Dutch using Google Translate.*</caption><p><strong>3. Incorporate user feedback</strong></p><p>Team Clusterbuster put a lot of thought into realizing an application that users actually want to use. 
They also regularly incorporated feedback on the user-friendliness and overall look of the app. As a result, word of mouth became the biggest driver of new professionals seeking authorization.</p><p>Thanks to the work of Marjanne Plasmans, who created a separate Shiny application to track the usage of the Clusterbuster, the team knows that the application is being accessed hundreds of times each week. One of the users, a Public Health Doctor for the MHS VGGM, describes how the application helps them in monitoring COVID-19:</p><blockquote><p>&ldquo;The Clusterbuster is very useful in maintaining an overview of clusters and outbreaks. We use certain visualizations to track where cases are on the rise, and how this relates to specific concerns. For example: for our daily reports we use the Clusterbuster&rsquo;s insights to monitor outbreaks in areas with a relatively low vaccination coverage, or clusters in nursing homes that require our attention.&rdquo;</p></blockquote><blockquote><p>&mdash; Patrick van Schelven</p></blockquote><h2 id="future-developments">Future Developments</h2><p>Job Spijker takes pride in what he and his colleagues achieved during the COVID-19 crisis: implementing the technology to make Shiny applications available for professionals while complying with strict privacy regulations. Using this experience, Job, Sjoerd, and their colleague William Schuch are actively creating templates and procedures to facilitate efficient creation of future applications.</p><p>The development team plans to update Clusterbuster to adapt to the evolution of the pandemic. More options will be added in the upcoming months to help with the (hopefully) final push in the battle against COVID-19. 
While the team keeps improving the application and providing access to more authorized users, they also philosophize about the post-COVID-19 potential.</p><blockquote><p>&ldquo;By using open source software and a code first approach we can drastically reduce the &lsquo;production costs&rsquo; of valuable insights. This is, in my view, one of the key aspects in achieving more data driven public policy making.&rdquo;</p></blockquote><blockquote><p>&mdash; Sjoerd Wierenga</p></blockquote><p><img src="image4.png" alt="Template of COVID-19 Clusterbuster dashboard with a map on the left and text on the right" title="COVID-19 Shiny App"></p><caption><em>The template is based on the Clusterbuster, making it possible to create new ‘Clusterbusters’ quickly and easily.</em></caption><p>Finally, the team is actively having conversations about making the code open source. This could facilitate co-creation of future applications, something the RIVM has <a href="https://www.werkenvoornederland.nl/organisaties/ministerie-van-volksgezondheid-welzijn-en-sport/via-open-source-keken-slimste-developers-mee-bij-het-maken-van-corona-app" target = "_blank" rel = "noopener noreferrer">recently experienced</a>. 
The team considers it a way of giving back to the open source community, without which the Clusterbuster would not have existed in the first place.</p><p>We at RStudio would like to thank the following professionals for their time in developing this post:</p><ul><li>Sjoerd Wierenga, Data Scientist and Team Lead, Centre for Infectious Disease Control, RIVM</li><li>Jossy van den Boogaard, Medical Epidemiologist, Centre for Infectious Disease Control, RIVM</li><li>Job Spijker, Senior Data Scientist, Environment and Safety, RIVM</li><li>Jolien ten Velden, Data Scientist and Researcher, GGD Hart voor Brabant</li><li>William Schuch, Geo Information Specialist, Mathematics, Data and GIS, RIVM</li><li>Marjanne Plasmans, Researcher, Public Health and Health Services, RIVM</li><li>Patrick van Schelven, Public Health Doctor, VGGM</li></ul></description></item><item><title>Announcing the RStudio 2021 Communications Survey</title><link>https://www.rstudio.com/blog/announcing-the-2021-rstudio-communications-survey/</link><pubDate>Wed, 27 Oct 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-the-2021-rstudio-communications-survey/</guid><description><p>Whether you quickly scan RStudio information or are an active reader of everything we release, we want to hear from you!</p><p>We want to better communicate with you, but first we need to know what RStudio channels you are interacting with, where you are seeing content, and what you would like to see more often. <strong>Please take the <a href="https://rstd.io/2021-com-survey" target = "_blank" rel = "noopener noreferrer">RStudio 2021 Communications Survey </a> so that you can receive RStudio information in your desired formats and channels!</strong></p><p>The survey should take less than 10 minutes. If you’re able, please take the time to fill out the free-form text boxes — we read each and every one! 
The survey is anonymous, but as a small token of appreciation, you can provide your contact information for a chance to receive RStudio swag.</p><p><a href="https://rstd.io/2021-com-survey"><img src="survey.png" alt="Blue button saying take the survey here and thank you in advance" title="Click this image to take survey"></a></p><p>Once we close the survey, we will review the results (and share our analysis process!). Then, we’ll use the information to tailor our communications to better address your data science needs.</p><p>Thank you for collaborating on this with us and we look forward to seeing your responses!</p></description></item><item><title>RStudio at R/Pharma 2021</title><link>https://www.rstudio.com/blog/rstudio-at-r-pharma-2021/</link><pubDate>Wed, 27 Oct 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-at-r-pharma-2021/</guid><description><sup>Photo by <a href="https://unsplash.com/@_louisreed?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Louis Reed</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p><a href="https://rinpharma.com/" target = "_blank" rel = "noopener noreferrer">R/Pharma</a> is here! Running until November 4th, this free conference focuses on the use of R and other open-source software in the development of pharmaceuticals. We are excited to announce various workshops and sessions led by our RStudio colleagues. 
<a href="https://hopin.com/events/r-pharma-2021/registration" target = "_blank" rel = "noopener noreferrer">Register for the event</a> to learn how RStudio is supporting open source in pharma.</p><p><strong>October 28th</strong></p><ul><li><strong>Workshop:</strong> Clinical Tables in gt, hosted by Rich Iannone (registration closed)</li></ul><p><strong>November 1st</strong></p><ul><li><strong>Workshop:</strong> R Admin - RStudio Connect, hosted by Kelly O&rsquo;Briant (<a href="https://www.eventbrite.com/e/r-admin-rstudio-connect-tickets-187197481707" target = "_blank" rel = "noopener noreferrer">registration page</a>)</li></ul><p><strong>November 2nd</strong></p><ul><li><strong>Keynote:</strong> The gt Package: Past, Present, and Future, hosted by Rich Iannone</li></ul><p>Rich introduces gt, a package that creates customizable tables in R, and describes its history and evolution over time.</p><p><strong>November 3rd</strong></p><ul><li><strong>Talk:</strong> Survival analysis with tidymodels: The censored package, hosted by Max Kuhn and Hannah Frick</li></ul><p>Max and Hannah describe tidymodels design goals, show some syntax for modeling, and describe subsequent additions.</p><p><strong>November 4th</strong></p><ul><li><strong>Panel:</strong> Scaling R and Shiny, with Winston Chang (RStudio) joining David Edwards (Amgen), David Granjon (Novartis), Eric Nantz (Eli Lilly), and Hanni Willenbrock Thomsen (Novo Nordisk)</li></ul><p>Join Winston and other thought leaders for a panel on shaping the future of reporting in clinical studies.</p><p>Find out more about R/Pharma on <a href="https://rinpharma.com/post/thewhy/" target = "_blank" rel = "noopener noreferrer">their website</a>, where you can see the <a href="https://rinpharma.com/event/rinpharma2021/" target = "_blank" rel = "noopener noreferrer">conference agenda</a> and <a href="https://rinpharma.com/workshop/2021conference/" target = "_blank" rel = "noopener noreferrer">links to all workshops</a>. 
Interested in seeing what R/Pharma 2020 was like? Session videos are available on <a href="http://youtube.com/c/RinPharma" target = "_blank" rel = "noopener noreferrer">YouTube</a>.</p></description></item><item><title>How Data Scientists and Security Teams Can Effectively Work Together</title><link>https://www.rstudio.com/blog/how-data-scientists-and-security-teams-can-work-together/</link><pubDate>Tue, 26 Oct 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/how-data-scientists-and-security-teams-can-work-together/</guid><description><sup>Photo by <a href="https://unsplash.com/@liliane?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Liliane Limpens</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p>Data scientists are hungry for every bit of data they can use in their work. Security teams, on the other hand, are primarily concerned with making sure that data stays put and no one ever gets access without authorization. There&rsquo;s a natural tension, which can result in friction and miscommunication.</p><p><a href="https://www.linkedin.com/in/ACoAAAEYB6gB3vIwnR8fiq-cA4aCOrs99Me2jWc" target = "_blank" rel = "noopener noreferrer">Gordon Shotwell</a>, lead data scientist at <a href="https://www.socure.com/" target = "_blank" rel = "noopener noreferrer">Socure</a>, has dealt with this tension firsthand. The team at Socure builds best-in-class fraud models for top banks and credit card companies, so they’re constantly working with sensitive data. 
During an <a href="https://www.youtube.com/watch?v=UnLpB4IDpZU" target = "_blank" rel = "noopener noreferrer">RStudio Enterprise Meetup</a>, he explained how his quickly growing team cooperates with Socure’s security team to move fast without harming organizational security.</p><h2 id="become-friends-to-achieve-collective-goals">Become friends to achieve collective goals</h2><h5 id="istep-into-the-mindset-of-securityi"><i>Step into the mindset of security</i></h5><p>Data scientists should show their intention of being allies to the security folks in their organizations, starting by putting themselves in the security team&rsquo;s shoes. Imagine the scenario: you worry all day about events that, though unlikely, could be catastrophic to your organization. You constantly anger your colleagues by saying ‘no’ to cool, new tools because of the risk to security. You are rarely recognized when your work goes well, but everyone will know if something goes wrong.</p><p>By empathizing with security teams, data scientists can better understand where they are coming from, acknowledge the potential risks, and understand why they are important.</p><h5 id="iadvocate-for-security-projectsi"><i>Advocate for security projects</i></h5><p>Even important security projects can get buried because of more urgent tasks. Data scientists should advocate for security improvements to their work. By raising these projects to other teams, data scientists can reinforce that they’re on the same side as their security-focused colleagues.</p><h5 id="iprove-that-you-can-make-security-improvementsi"><i>Prove that you can make security improvements</i></h5><p>Data scientists shouldn&rsquo;t be all talk when it comes to security projects. They should prove that they can actually improve their security practices. This means knowing the context of what the different teams want to do, finding solutions that work for both of them, <em>and following through on what was decided</em>. 
The relationship between the teams strengthens when security knows that data scientists can fulfill their promises.</p><h2 id="mutually-understand-value-and-threats">Mutually understand value and threats</h2><p>Security professionals usually don’t have an intuitive sense of the value that data science brings to the organization. By articulating the business value to the security organization, data scientists can get security teams on their side.</p><p>Does creating that public-facing app provide a critical new capability for customers? Does accessing that internal database allow for automation that will save staff time and money? Does having write access to the database allow for machine learning models that will impact the company’s bottom line?</p><p>At the same time, security teams should describe the <i>“threat model”</i> — that improbable but devastating event that they are trying to prevent. Are they concerned about data scientists accidentally putting proprietary data in a public app? Or are they worried that outside hackers could find a way in to steal customer payment information? Do they stay up at night worrying that a disgruntled employee could exfiltrate intellectual property? Or are there regulatory regimes in place that specify how they’re allowed to provide access to data that identify customers? Very different prevention and mitigation strategies are warranted depending on the threat.</p><p>Data scientists who understand the threat model can help ensure the gravest threats are less likely, and they can also point out where security choices don’t make sense given the threat model.</p><p>A common example is a database that contains a combination of sensitive and non-sensitive data. A data scientist might want to use sales order data to build a model of the customers that are the highest value for marketing purposes. 
But if that data is in the same database as customer names, addresses, and credit card information, security is going to be (and should be!) really restrictive on where that data goes.</p><p>Both the data science and security teams can appreciate the potential value of identifying valuable customers and the threat of exposing customer data. It might become obvious that splitting the database so the sales order data isn’t merged with the sensitive customer data would serve everyone.</p><h2 id="make-it-easy-to-follow-the-rules">Make it easy to follow the rules</h2><p>For fast-growing organizations, there’s no way security education can keep up with team growth — the resources and time needed to continuously train every new data scientist will become unsustainable. And once a company is big, security cannot audit everything that everybody is doing.</p><p>Data scientists are hired because they&rsquo;re smart problem-solvers. So if they are locked in a room without something they need, they will waste time trying to get to it — and they&rsquo;re unlikely to discover the safest or best path. A better plan would be to figure out how to create an environment for data scientists that&rsquo;s super secure but doesn&rsquo;t leave them needing more.</p><p>Let’s take the example of a database that requires access authentication. Instead of having everyone develop their own connection to the database, security can write R and Python packages that include wrapper functions for access. Everybody is getting the data in a secure way and when there is a reason to update — say, to a more secure connection method — users don’t have to change their code and can just upgrade to the new version of the package. 
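</p><p>As a minimal sketch of such a wrapper in R (the function names, DSN, environment variables, and table name below are illustrative assumptions, not an actual implementation), an internal package maintained with the security team might export something like this:</p><pre><code class="language-{r}" data-lang="{r}"># Hypothetical functions exported by an internal package.
db_connect &lt;- function() {
  DBI::dbConnect(
    odbc::odbc(),
    dsn = &quot;analytics&quot;,                  # DSN configured centrally by security
    uid = Sys.getenv(&quot;ANALYTICS_USER&quot;),  # credentials come from the environment,
    pwd = Sys.getenv(&quot;ANALYTICS_PASS&quot;)   # never from user code
  )
}

get_orders &lt;- function() {
  con &lt;- db_connect()
  on.exit(DBI::dbDisconnect(con))        # always close the connection
  DBI::dbGetQuery(con, &quot;SELECT * FROM sales_orders&quot;)
}</code></pre><p>Analysts simply call <code>get_orders()</code> and never touch credentials; when security changes the connection method, only the package needs an update.</p><p>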
The system may change over time but the users can continue working with minimal interruption.</p><h5 id="iset-up-child-proof-data-science-environments-to-work-efficiently-and-securelyi"><i>Set up child-proof [data science environments] to work efficiently and securely</i></h5><p><em>A place that has all of their stuff and is really nice, and they can’t burn your house down.</em></p><p>Organizations can embed security directly into systems by setting up “child-proof rooms”. These closed systems for data ensure that users adhere to the organization’s security boundaries. Less training is needed for new users since the environment has made it impossible to do the wrong thing, allowing them to quickly and safely get started on their work.</p><p>In general, closed systems are more secure than open ones. But if those systems aren’t provisioned with the things data scientists need, they’re unlikely to get used. Instead, it’s important to pair restrictions with power. If it’s necessary to lock data scientists into a specific environment, let it be a playroom where they can do what they need while everyone else rests easy.</p><p>A server that can’t connect to the internet or edit corporate data would be restrictive on its own. Security teams can provide read access, a closed analytics database, and offline access to data science tools (such as R and Python packages through <a href="https://www.rstudio.com/products/package-manager/" target = "_blank" rel = "noopener noreferrer">RStudio Package Manager</a>), empowering data scientists to run their models inside a safe environment.</p><h5 id="ibuy-tools-dont-build-them-for-continuous-high-quality-securityi"><i>Buy tools (don&rsquo;t build them) for continuous, high-quality security</i></h5><p>Organizations sometimes create their own security systems or tools. However, most don’t make money on security, so it can be under-resourced and is often first on the chopping block when times are tough. 
On the other hand, security is a <em>feature</em> for software vendors. They have a vested interest in creating and maintaining secure features and systems. Vendors also aim to make their tools user-friendly and pretty (making following the rules easy!).</p><p>By buying tools, organizations can turn their own cost center into someone else’s revenue stream. Security gets prioritized appropriately and the company can instead focus on the growth of their people and capabilities.</p><h2 id="arrive-at-the-good-place">Arrive at the good place</h2><p>Tension between data science and security teams is common and even expected, but that doesn’t mean they can’t work together so that data scientists can get their jobs done without opening security vulnerabilities. Through continuous conversation, closed systems for data, and streamlined tools, organizations can set up the relationships and systems needed in order to be successful.</p><p>Watch Gordon’s full talk below. <strong>Interested in working at the good place? 
<a href="https://www.socure.com/about/careers" target = "_blank" rel = "noopener noreferrer">Socure</a> is hiring!</strong></p><script src="https://fast.wistia.com/embed/medias/kvxdrd04wr.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_kvxdrd04wr videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/kvxdrd04wr/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></description></item><item><title>RStudio Professional Drivers 2021.10.0</title><link>https://www.rstudio.com/blog/pro-drivers-2021-10-0-release/</link><pubDate>Tue, 26 Oct 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pro-drivers-2021-10-0-release/</guid><description><h2 id="updates-in-this-release">Updates in this release</h2><p>In this release of the <a href="https://www.rstudio.com/products/drivers/" target="_blank">RStudio Pro Drivers</a>, you can now authenticate to Hive and Impala using single sign-on (SSO) via SAML 2.0. 
For more information, see the <a href="https://docs.rstudio.com/pro-drivers/documentation/#obdc-driver-installation-and-configuration-guides" target="_blank">Installation and Configuration Guides</a>. This release also contains feature enhancements, security updates, and bug fixes for these other drivers:</p><ul><li>BigQuery</li><li>MySQL</li><li>Netezza</li><li>Oracle</li><li>PostgreSQL</li><li>Snowflake</li><li>SQL Server</li><li>Teradata</li></ul><p>For a full list of changes in this release, refer to the <a href="https://docs.rstudio.com/drivers/2021.10.0/release-notes/" target="_blank">release notes</a>.</p><h2 id="upgrading">Upgrading</h2><p>We strongly encourage all customers to upgrade to the latest release of the RStudio Pro Drivers. <a href="https://docs.rstudio.com/pro-drivers/upgrade/" target="_blank">Upgrading drivers</a> takes just minutes and can help prevent future security and administrative issues. Please note that this is also the first RStudio Pro Driver release to use our new <a href="https://blog.rstudio.com/2021/08/30/calendar-versioning-for-commercial-rstudio-products/" target="_blank">calendar-based versioning scheme</a>.</p><h2 id="about">About</h2><p>RStudio offers ODBC database drivers to all current customers using our professional products at no additional charge, so that data scientists and organizations can take full advantage of their data. 
The RStudio Pro Drivers are commercially licensed and covered by our <a href="https://www.rstudio.com/about/support-agreement/" target="_blank">support program</a>.</p></description></item><item><title>The INSPIRE U2 Program: Student Reflections on Data Science Training</title><link>https://www.rstudio.com/blog/the-inspire-u2-program-student-reflections/</link><pubDate>Fri, 22 Oct 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/the-inspire-u2-program-student-reflections/</guid><description><blockquote><p>This is a guest post from Kathleen Bostic (Spelman College) and Michel Ruiz-Fuentes (Smith College), participants in the INSPIRE U2 summer site directed by <a href="https://anayenablanksonphd.weebly.com/" target = "_blank" rel = "noopener noreferrer">Dr. A. Nayena Blankson</a>. The INSPIRE U2 Program provides a learning pathway for underrepresented female students to enter advanced degrees and careers in statistical fields. Learn more in our previous <a href="https://blog.rstudio.com/2021/10/20/the-inspire-u2-program/" target = "_blank" rel = "noopener noreferrer">blog post</a> and on the <a href="https://sites.spelman.edu/inspireu2-reu/" target = "_blank" rel = "noopener noreferrer">program website</a>.</p></blockquote><h3 id="navigating-leadership-and-learning-through-rstudio">Navigating Leadership and Learning Through RStudio</h3><h4 id="kathleen-bostick">Kathleen Bostick</h4><img src="kathleen.JPG"><p>I served as the Junior Mentor for the INSPIRE U2 Scholars. As a Junior Mentor, part of my role included participating in various aspects of the summer program, such as attending the RStudio Bootcamp taught by Dr. Blankson and completing an independent project. My experience with RStudio was an incredible opportunity that allowed me to immerse myself in coding.</p><p>My RStudio mentor was Mine Çetinkaya-Rundel, who helped me through the process of enhancing my coding skills. 
We met once a week and she really took the time to walk me step by step on how to code. We were able to create something incredible together! I am so grateful for the teachings of Dr. Blankson that gave me the foundation necessary to take my research to the next level with Mine. Jessica Coates and Dr. Mentewab Ayalew were both critical to helping me piece everything together. With the materials I created with Dr. Blankson and Mine, I was able to produce a presentation unique to the research Jessica Coates and Dr. Ayalew guided me through. Thanks to the great teams at INSPIRE U2 and RStudio, my research is in the process of being published!</p><p><img src="Forest_plot_KATHLEEN_BOSTICK.png" alt="Image of dataset summary statistics with a forest plot"></p><p>Learning and using RStudio, and being involved with the INSPIRE U2 Program, have played a critical role in the evolution of my coding skills, leadership skills, research skills, and presentation skills. If anyone has the opportunity to learn how to code, I highly recommend starting with RStudio and utilizing their mentors because it will make all the difference in the world.</p><h3 id="reflecting-on-inspire-u2">Reflecting on INSPIRE U2</h3><h4 id="michel-ruiz-fuentes">Michel Ruiz-Fuentes</h4><img src="Michel-Ruiz-Fuentes_Headshot.jpeg"><p>My name is Michel Ruiz-Fuentes, and I’m a sophomore at Smith College. I am passionate about advocacy and innovation, and INSPIRE U2 was a memorable opportunity that has allowed me to see the intersection of my passions with Statistics and Data Science.</p><p>During my first year in college, I explored a plethora of disciplines, intending to find a field that I enjoyed and allowed me to lead meaningful change. Taking a statistics course in my first semester titled <em>Communicating with Data</em> exposed me to the crucial role that statistics and data science play in our everyday lives. 
In particular, it showed me how statistics helps us identify, analyze, and articulate information that is crucial for decision-making processes. In class, we conducted projects to combat barriers in statistics like data accessibility and created visualizations that are understandable for any individual regardless of their background in statistics. I was particularly fascinated by learning what statistics was and how to use it ethically to make well-informed decisions. My professor, Sara Stoudt, shared the INSPIRE U2 research opportunity with our class and encouraged historically underrepresented students to apply!</p><p>I am fortunately now able to reflect on being a part of a robust cohort and research experience. I worked with Dr. Tharu, Assistant Professor of Mathematics at Spelman, and Mr. Jeff Allen, RStudio Mentor, to analyze a market research question and precisely measure the strength of the relationship between elements of market structure and the net income of large marketing-intensive firms. I sourced secondary data from the Industrial Book of Economics, performed multiple regression and Box-Cox transformations, and created data visualizations, such as scatterplots, to show my findings.</p><p><img src="Scatterplot_Michel_Ruiz-Fuentes.png" alt="Scatterplot showing relationship between S (average ad-to-sales ratio) and PT (net income + interest expense / total assets) of firm product markets, to showcase an example of student work"></p><p>I enjoyed working with RStudio because it required critical thought, patience, and purpose. Every line of code has a meaning. It is a delicate and rewarding process to write your code and transform your raw data into a story with meaning. 
I hope to share my research findings &mdash; that marketing is a crucial aspect of increasing your company&rsquo;s net income &mdash; with women of color or immigrant entrepreneurs to help their companies enter the market and be strong competitors in their industry.</p><p>I had never used R nor taken an introductory statistics course before INSPIRE U2, so it was both thrilling and overwhelming to read over our syllabus and see what was planned for our summer. However, I had immense support from my mentors in navigating difficult concepts and code. Therefore, I would encourage any student, regardless of their background in statistics or coding, to apply! Dr. Blankson and Dr. Tharu were excellent teachers and empowered me to search for answers and keep working hard. The Bootcamp led by Dr. Blankson and the homework practice I did with Dr. Tharu prepared me for the research component of our program. In addition to engaging in the Bootcamp and research, I worked on an individual project to create a personal website with Mr. Allen! I shared with him that I wanted to build my proficiency and explore other ways R can be used, so we brainstormed possible projects, and my favorite idea was to create a website and build a portfolio!</p><p>INSPIRE U2 challenged me to dream big, persist through obstacles, and explore uncertainty. I am grateful for this opportunity because through this experience, I pursued my interests and found a new passion!</p><h3 id="learn-more">Learn More</h3><p>For more information about the INSPIRE U2 Program, visit <a href="https://sites.spelman.edu/inspireu2-reu/" target = "_blank" rel = "noopener noreferrer">sites.spelman.edu/inspireu2-reu</a>. 
Applications for the 2022 program will open on November 15, 2021.</p><p>This post is cross-posted on the <a href="https://sites.spelman.edu/inspireu2-reu/" target = "_blank" rel = "noopener noreferrer">INSPIRE U2 website</a>.</p></description></item><item><title>Embedding Shiny Apps in Tableau Dashboards Using shinytableau</title><link>https://www.rstudio.com/blog/embedding-shiny-apps-in-tableau-dashboards-using-shinytableau/</link><pubDate>Thu, 21 Oct 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/embedding-shiny-apps-in-tableau-dashboards-using-shinytableau/</guid><description><sup>Screenshot of a Tableau dashboard with a shinytableau extension</sup><p>At RStudio, we strive to help you <a href="https://blog.rstudio.com/2021/03/18/bi-and-data-science-the-tradeoffs/" target = "_blank" rel = "noopener noreferrer">combine the power of code-first data science with the other tools in your toolkit</a>. Many organizations rely on <a href="https://www.tableau.com/" target = "_blank" rel = "noopener noreferrer">Tableau</a> for creating data dashboards, but there may be moments where you wish you could take advantage of R’s powerful reporting and visualization capabilities as well.</p><p>With the experimental <a href="https://rstudio.github.io/shinytableau/index.html" target = "_blank" rel = "noopener noreferrer">shinytableau</a> package, you can use the power of R and Shiny to customize objects that you embed in your Tableau dashboards. This package opens up the possibility to include interactive features beyond Tableau’s native capabilities, allowing you to be more flexible in meeting your organization’s data needs.</p><h2 id="embed-custom-r-objects-in-a-tableau-dashboard">Embed Custom R Objects in a Tableau Dashboard</h2><p><img src="images/image2.png" alt="Screenshot of Tableau barchart visualization and ggplot2 violin plot visualization together on a Tableau dashboard"></p><p>Is that a ggplot2 visualization in Tableau? 
With shinytableau, you can sit Tableau’s visualizations side-by-side with anything you can create in a Shiny application, including custom charts created with ggplot2. Since the Shiny app is embedded in Tableau, users of the dashboard won’t need to learn R or even realize they’re using a different tool.</p><h2 id="configure-shiny-apps-to-talk-to-tableau">Configure Shiny Apps to “Talk” to Tableau</h2><p>Say you want to interactively use your extension with Tableau, such as pulling data from a worksheet to populate your chart. You can use Shiny to create a friendly user interface for specifying settings for your extension. With the right configuration, your Shiny app and Tableau can interact so that you can tell a cohesive story in your Tableau dashboard.</p><script src="https://fast.wistia.com/embed/medias/9zhoe1gnar.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:53.96% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_9zhoe1gnar videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/9zhoe1gnar/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><h2 id="deploy-and-host-shiny-apps-for-production">Deploy and Host Shiny Apps for Production</h2><p>You can deploy shinytableau extensions like any other Shiny app, which allows you to use them in production. RStudio offers RStudio Connect and shinyapps.io as options to publish and host your Shiny apps. 
Read more on the <a href="https://rstudio.github.io/shinytableau/articles/deployment.html" target = "_blank" rel = "noopener noreferrer">Deployment and Hosting</a> page of the shinytableau documentation.</p><h2 id="how-to-use-shinytableau">How to Use shinytableau</h2><p>The authors of the shinytableau package, Joe Cheng, Richard Iannone, and Javier Luraschi, wrote a <a href="https://rstudio.github.io/shinytableau/articles/shinytableau.html" target = "_blank" rel = "noopener noreferrer">great tutorial on getting started with shinytableau</a>. We highly recommend going through the tutorial to get a full understanding of the components of shinytableau, how they work together, and why each step is needed.</p><p>Here, we’ll break down the main steps of creating the authors’ <code>ggviolin</code> extension and how to embed it in Tableau’s <a href="https://www.tableau.com/solutions/gallery/superstore" target = "_blank" rel = "noopener noreferrer">Superstore sample workbook</a>. Note that the violin plot in the videos has an applied palette and theme while the violin plot in the tutorial does not. Like any ggplot2 visualization, you can customize the plot to look how you would like!</p><p>Let&rsquo;s begin in your RStudio console.</p><p><strong>1. Start a <a href="https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects" target = "_blank" rel = "noopener noreferrer">new RStudio project</a></strong></p><p><strong>2. Install packages</strong></p><p>Install your required packages. In addition to the packages you need for your Shiny app, you will also need the remotes and shinytableau packages.</p><pre><code class="language-{r}" data-lang="{r}">install.packages(&quot;remotes&quot;)
remotes::install_github(&quot;rstudio/shinytableau&quot;)</code></pre><p><strong>3. Edit the <code>manifest.yml</code> file</strong></p><p>Once shinytableau is installed, run the code below to open the <code>manifest.yml</code> file. Edit the metadata to fit your extension.
See an example in the <a href="https://github.com/rstudio/shinytableau/blob/7790a566dcef9092863ad231fd58ba14596a6300/inst/examples/ggviolin/manifest.yml" target = "_blank" rel = "noopener noreferrer">shinytableau GitHub repository</a>.</p><pre><code class="language-{r}" data-lang="{r}">shinytableau::yaml_skeleton()</code></pre><p><img src="images/image3.png" alt="Example of the manifest.yml file produced in shinytableau which details the metadata that you can provide your shinytableau extension" title="image_tooltip"></p><p><strong>4. Create the shinytableau extension Shiny app</strong></p><p>The next step is to create the extension. In addition to creating a Shiny app, you also need to configure the app so that it can interact with Tableau. This is the most technical part of the whole workflow, and the tutorial walks through it thoughtfully and in detail.</p><p>You can use the <code>ggviolin</code> app code in the <a href="https://github.com/rstudio/shinytableau/blob/7790a566dcef9092863ad231fd58ba14596a6300/inst/examples/ggviolin/app.R" target = "_blank" rel = "noopener noreferrer">shinytableau GitHub repository</a> as an example. Save this file as <code>app.R</code> in your project.</p><p><strong>5. Run the app and download the <code>.trex</code> file</strong></p><p>Run the <code>app.R</code> file and a dialogue box will appear. Click “Download” to download a <code>.trex</code> file, which is what Tableau will use to create the extension.</p><p><img src="images/image4.png" alt="Dialogue box for the shinytableau extension where you can download the .trex file" title="image_tooltip"></p><p>Now, let’s move to Tableau. Be sure to leave your Shiny app running in RStudio! Otherwise, the connection will be lost and Tableau will not be able to open the extension.</p><p><strong>6.
Open the Superstore sample workbook in Tableau</strong></p><p>The Superstore workbook should be available under Sample Workbooks:</p><p><img src="images/image5.png" alt="First page of Tableau where you can open the Superstore workbook" title="image_tooltip"></p><p>If you want to create the same violin plots as you see in the images in this post, go to the “Performance&rdquo; worksheet. Under &ldquo;Measure Names&rdquo;, you can find &ldquo;Profit Ratio&rdquo;. Drag “Profit Ratio” under “Marks”.</p><p><img src="images/image6.png" alt="Screenshot of Superstore workbook&rsquo;s measures with Profit Ratio under Marks" title="image_tooltip"></p><p><strong>7. Create the dashboard and load the .trex file</strong></p><p>Go to “Dashboard” then “New Dashboard”. In the new dashboard, drag the “Performance” worksheet to where it says “Drop Sheets Here”, and then drag the “Extension” object to the dashboard. The &ldquo;Add an Extension&rdquo; box will automatically open.</p><p>Click “Access Local Extensions”, which will open a window. Navigate to and open your .trex file. Tableau will ask if you agree to open the extension.
Click &ldquo;OK&rdquo; and your extension will appear in the dashboard.</p><script src="https://fast.wistia.com/embed/medias/glqjbeolmq.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:60.42% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_glqjbeolmq videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/glqjbeolmq/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p><strong>8. Configure the extension</strong></p><p>The Shiny app is in Tableau, but it doesn&rsquo;t know what data to use. The next step is configuring the extension so that it uses the &ldquo;Performance&rdquo; worksheet data to render the violin plot.</p><p>Click the triangle button of the extension object, then go to “Configure”. This will open a dialogue box created in the Shiny app.</p><p>Fill out the form. 
Give it a title, choose the “Performance” worksheet, and select “Category” as the dimension and “AGG(Profit Ratio)” as the measure.</p><script src="https://fast.wistia.com/embed/medias/gwzb58ff56.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:55.83% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_gwzb58ff56 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/gwzb58ff56/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>When you click “OK”, you will see the violin plot populate in the extension object. Congratulations! With the shinytableau package, you created a Tableau dashboard extension using R and Shiny.</p><h3 id="conclusion">Conclusion</h3><p>Using the power of R and Shiny, you can combine your available tools to create insightful dashboards. The shinytableau package is still experimental and <a href="https://community.rstudio.com/?_ga=2.14181236.216058284.1634563605-1803916348.1631026563" target = "_blank" rel = "noopener noreferrer">we would love your feedback</a> on what you hope to see from the package.</p><h3 id="learn-more">Learn More</h3><p>We recommend reviewing the <a href="https://rstudio.github.io/shinytableau/articles/shinytableau.html?_ga=2.118303690.216058284.1634563605-1803916348.1631026563" target = "_blank" rel = "noopener noreferrer">tutorial</a> for shinytableau to learn more about how to customize it to your needs.</p><p>Looking for other ways of leveraging the power of R and Python with Tableau? 
Last week, we announced <a href="https://blog.rstudio.com/2021/10/12/rstudio-connect-2021-09-0-tableau-analytics-extensions/" target = "_blank" rel = "noopener noreferrer">support for Tableau Analytics Extensions on RStudio Connect</a>, which allows you to create calculated fields in workbooks that can execute R and Python scripts outside of the Tableau environment.</p><p><strong>Learn more about leveraging business intelligence tools like Tableau alongside open source data science at our RStudio Community Enterprise Meetup on <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/events/281209187/" target = "_blank" rel = "noopener noreferrer">Leveraging R &amp; Python in Tableau with RStudio Connect</a> on Friday, October 29, 2021.</strong></p></description></item><item><title>The INSPIRE U2 Program: Training Students in Big Data and Statistics Using RStudio</title><link>https://www.rstudio.com/blog/the-inspire-u2-program/</link><pubDate>Wed, 20 Oct 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/the-inspire-u2-program/</guid><description><sup>Logos of the National Science Foundation, Spelman College, and RStudio</sup><blockquote><p>This is a guest post from <a href="https://anayenablanksonphd.weebly.com/" target = "_blank" rel = "noopener noreferrer">Dr. A. Nayena Blankson</a>, Professor of Psychology at Spelman College. Dr. Blankson is the Director of the <a href="https://sites.spelman.edu/inspireu2-reu/" target = "_blank" rel = "noopener noreferrer">INSPIRE U2</a> summer site (NSF Award #1852056), as well as a researcher, consultant, and award-winning instructor.</p></blockquote><h3 id="about-the-inspire-u2-program">About the INSPIRE U2 Program</h3><p>From television game shows to children&rsquo;s movies, the importance of data is evident. For example, in an episode of the television game show “Wheel of Fortune” that aired on May 28, 2018, the category was Occupation and the puzzle was DATA SCIENTIST. 
In the children&rsquo;s movie Cars 3, there is a female Data Analyst character who presents the results of her research right at the start of the movie. The need for individuals who are statistically informed and trained in data analysis is increasingly recognized.</p><p>However, there is a critical need for diverse perspectives in statistical inquiry. Women of color and minorities are less likely to pursue careers in statistical fields because of a lack of uniquely stimulating opportunities and appropriate support within the college environment. In recognition of the barriers that underrepresented students may face in pursuit of science degrees, there has been increased development of “pipeline” programs aimed at strengthening opportunities for underrepresented undergraduates.</p><p>Funded in 2018 by the National Science Foundation (NSF Award #1852056), the <u>I</u>ncreasing <u>S</u>tatistical <u>P</u>reparation in <u>R</u>esearch <u>E</u>ducation for <u>U</u>nderrepresented <u>U</u>ndergraduates (INSPIRE U2 Program) is a Research Experiences for Undergraduates site at Spelman College. The interdisciplinary program aims to provide a learning pathway that will set underrepresented female students on a track towards graduate studies and careers in statistical and data fields. It is expected that this initiative will: 1) increase student interest in advanced degree programs; 2) provide support and mentorship to students; and 3) serve as a pipeline for entry into advanced degree programs.</p><p>Each student will have the opportunity to conduct an independent research project, and in doing so, students will develop the skills, confidence, and inspiration to pursue advanced statistics opportunities within the sciences.
Key innovations include: 1) the merging of two evidence-based training approaches, specifically the former <a href="https://sites.google.com/site/qtugsmep/home" target = "_blank" rel = "noopener noreferrer">Quantitative Training for Underrepresented Groups program</a> and <a href="https://passiondrivenstatistics.wescreates.wesleyan.edu/" target = "_blank" rel = "noopener noreferrer">the Passion-Driven Statistics curriculum</a>; 2) training in the flexible application of knowledge; 3) analysis of data in real world contexts; and 4) intensive one-on-one mentoring and support.</p><h3 id="2021-inspire-u2-program-session">2021 INSPIRE U2 Program Session</h3><p>The 2021 INSPIRE U2 Program ran from June 7- July 30. Eleven students participated in the program and their majors ranged from biology to journalism. Over the course of the eight-week program, INSPIRE U2 Scholars participated in a series of activities including professional development sessions, weekly mindfulness sessions (led by <a href="https://www.linkedin.com/in/natalie-watson-singleton-phd-4254477/" target = "_blank" rel = "noopener noreferrer">Dr. Natalie Watson-Singleton</a>), and a Statistics Bootcamp using RStudio (taught by myself). INSPIRE U2 Scholars also worked on an independent research project using freely available Big Data sets. They developed their own research questions, conducted data analyses to answer those research questions, and presented their work.</p><p>Per Brown, Davis, and McClendon (1999), there are three essential components of mentoring underrepresented students: role modeling, role molding, and collegial friendships. The INSPIRE U2 program stressed all three components. In particular, knowing that mentorship plays a large role in student outcomes, the program took great care to place students into teams that included student (Peer Mentors and Junior Mentors) and faculty mentors (Senior Mentors). 
Senior Mentors were <a href="https://www.spelman.edu/academics/majors-and-programs/psychology/faculty/alexandria-hadd" target = "_blank" rel = "noopener noreferrer">Dr. Alexandria Hadd</a> (Spelman College), <a href="https://www.wesleyan.edu/academics/faculty/ldierker/profile.html" target = "_blank" rel = "noopener noreferrer"> Dr. Lisa C. Dierker</a> (Wesleyan University), <a href="https://www.spelman.edu/academics/majors-and-programs/mathematics/faculty/bhikhari-tharu" target = "_blank" rel = "noopener noreferrer">Dr. Bhikhari Tharu</a> (Spelman College), <a href="https://web.uri.edu/psychology/meet/lisa-harlow/" target = "_blank" rel = "noopener noreferrer">Dr. Lisa L. Harlow</a> (University of Rhode Island), <a href="https://www.spelman.edu/academics/faculty/directory/profile/mentewab-ayalew" target = "_blank" rel = "noopener noreferrer">Dr. Mentewab Ayalew</a> (Spelman College), and myself (<a href="https://anayenablanksonphd.weebly.com/" target = "_blank" rel = "noopener noreferrer">Dr. A. Nayena Blankson</a>). Kathleen Bostic, a third-year biology major at Spelman, served as the Junior Mentor.</p><h3 id="partnership-with-rstudio">Partnership With RStudio</h3><p>Additionally, through the establishment of a partnership with RStudio prior to the start of the summer program, RStudio Mentors were added to the planned student teams. The RStudio Mentors were Edgar Ruiz, Jeff Allen, Mara Averick, Mine Çetinkaya-Rundel, Curtis Kephart, and Jesse Mostipak. Students met with their Senior Mentors for at least one hour per week. Meetings with RStudio Mentors at times occurred jointly with Senior Mentors so that all team members were on the same page regarding the research topic and research question. Scholars also met independently with their RStudio Mentors to discuss their data wrangling, visualization, and analysis code, along with other topics depending on the student and mentor. 
In the Bootcamp sessions, the Scholars were introduced to R and RStudio along with statistical topics such as analysis of variance, analysis of covariance, and multiple regression. By working with their Senior and RStudio mentors, Scholars were able to go beyond the main lessons learned in the Bootcamp.</p><p>All Scholars presented their research projects in a Summer Research Symposium at the end of the summer. Research topics ranged from plant biology to income and race differences in police contact. The presentations were outstanding, and that is an understatement. In particular, at the time of the program, eight of the 11 Scholars had just finished their first year of college. Most had never taken a statistics course before the summer program. Even more pertinent is the fact that none of the Scholars had ever used R or RStudio before the summer program. Students experienced the full breadth of conducting analyses with secondary data, which can be very messy; data are not always what you expect them to be. Sometimes information is missing. Some Scholars started off with one data set, learned the details about how those data were collected, examined the sample and variables, only to realize that the data set might not be the best for answering their proposed research question. They sifted through many research articles, data sets, and variables. They reformatted and restructured their data sets for analyses. They checked their data for missing values, transformed variables, etc. That students were able to accomplish so much in the span of a few short weeks highlights the impact that such programs can make on the educational and career trajectories of students.</p><p>Due to COVID, the inaugural Summer 2020 program was cancelled. To avoid a second year of cancellation, the 2021 program was entirely virtual. Prior to COVID, plans for the program included virtual mentoring. 
Therefore, shifting entirely to a virtual environment for the 2021 program was not completely off course from the original plans. Moreover, it allowed greater flexibility in the availability of both the Senior Mentors as well as RStudio Mentors for the program. Overall, the program was an incredible success, according to an internal evaluation report. Scholars benefited from having a team of mentors who supported them on their research projects, including their RStudio Mentors.</p><h3 id="student-testimonials">Student Testimonials</h3><p>“I was ecstatic to get the chance to combine my two favorite subjects, mathematics and computer science, and see how I can use both outside the classroom settings!” (Samantha Armijo, Dominican University)</p><p>&ldquo;The Inspire U2 Program was the first time I applied statistics to the real world. My project was on healthcare costs and how gender, age, and type of service might affect it. It was also my first time using RStudio. Having a background in coding, RStudio was easy for me to pick up. It was a lot of fun learning how to beautify my data with all sorts of colors, graphs, and fonts! It was also fun manipulating data for the first time and drawing conclusions based on your data. The Inspire U2 Program and RStudio taught me a lot about statistics, and I look forward to using what I learned in my future studies. Having a head start already puts me ahead of the curve and I feel more confident in conducting research and manipulating data.&rdquo; (Angel Bryant, Howard University)</p><p>&ldquo;INSPIRE U2 challenged me to research with purpose and persist through obstacles. Thank you, Dr. Blankson, Dr. Tharu, Mr. Allen, and all my mentors and peers for this memorable experience. 
I encourage my peers to think critically about the role data plays into your everyday life and then analyze data through independent projects and research programs like INSPIRE U2!” (Michel Ruiz-Fuentes, Smith College)</p><h3 id="learn-more">Learn More</h3><p>For more information about the INSPIRE U2 Program, visit <a href="https://sites.spelman.edu/inspireu2-reu/" target = "_blank" rel = "noopener noreferrer">sites.spelman.edu/inspireu2-reu</a>. Applications for the 2022 program will open on November 15, 2021.</p><p>This post is cross-posted on <a href="https://anayenablanksonphd.weebly.com/blog" target = "_blank" rel = "noopener noreferrer">Dr. A. Nayena Blankson&rsquo;s blog</a>.</p></description></item><item><title>Why Your Data Science Team Might Need a Shiny Deployment Engineer</title><link>https://www.rstudio.com/blog/why-your-ds-team-might-need-a-shiny-deployment-engineer/</link><pubDate>Thu, 14 Oct 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/why-your-ds-team-might-need-a-shiny-deployment-engineer/</guid><description><caption>Photo by Vineesh Ramayyan at <a href="https://nilgiris.nic.in/tourist-place/rose-garden-ooty/" target = "_blank" rel = "noopener noreferrer">Ooty Rose Garden</a></caption><p>What do data science teams need to ensure their Shiny apps work as intended once they hit “publish”? Software industry best practices — like continuous integration, deployment, and delivery (CI/CD) — can support the creation of production-ready data tools. 
Teams can test algorithms and models as part of the development cycle, and data scientists can deploy their apps at any moment with confidence.</p><p>This is why <a href="https://guidehouse.com/" target = "_blank" rel = "noopener noreferrer">Guidehouse</a>, a leading global provider of consulting services to the public sector and commercial markets, is currently looking for a <a href="https://careers.guidehouse.com/jobs/14420?lang=en-us" target = "_blank" rel = "noopener noreferrer">Senior Shiny Deployment Engineer</a>. Helping drive adoption of DevOps methodologies for R- and Python-based web applications, the Senior Shiny Deployment Engineer will bridge the worlds between data science and software development.</p><p>In this post, we interview <a href="https://guidehouse.com/professionals/w/weatherford-vergil" target = "_blank" rel = "noopener noreferrer">Vergil Weatherford</a>, Associate Director on the Advanced Solutions team at Guidehouse. A long-time proponent of open-source tools, Vergil oversees the architecture needed to streamline complex data collection and smooth development of data products.</p><p>We were excited to learn more about Vergil’s work planning and implementing data infrastructure, why he drives the adoption of open source at Guidehouse, and his vision for a Senior Shiny Deployment Engineer that will contribute to the data science team’s culture of quality.</p><h3 id="welcome-could-you-tell-us-a-bit-about-yourself-and-your-work">Welcome! Could you tell us a bit about yourself and your work?</h3><p>I am an energy consultant-turned-data tooling enthusiast. Day in and day out, I help my team adopt modern data science tools to boost productivity and solve our clients’ most complex challenges.</p><p>My first day on the job as a consultant was more than 12 years ago. I was given dozens of CSV files and asked to analyze the time-series residential air conditioner runtime data using Excel. 
I quickly found myself asking, &ldquo;Is this really the best way to do this?&rdquo; The answer led me on a long journey to where I am today, where I work on a specialized team dedicated to supporting code-first data science infrastructure at Guidehouse.</p><h3 id="what-does-a-modern-data-science-infrastructure-allow-your-team-to-do">What does a modern data science infrastructure allow your team to do?</h3><p>I mainly build infrastructure for an analytics team working on our clients’ most complex business challenges related to the clean energy transition. This work requires us to use a myriad of datasets. We work regularly with &ldquo;smart meter&rdquo; data, SCADA data, customer demographics, and building characteristics. We run surveys and deploy data acquisition devices to collect detailed energy usage data. We&rsquo;re given time series data from IoT devices like smart thermostats and data feeds from electric vehicle (EV) charging stations. We then look across the broad spectrum of data and technologies to assess and develop solutions.</p><p>This is one of the things that makes analytics in consulting unique: getting all of these varied datasets from external sources poses challenges not found when data comes from inside an organization. We need tools that help us create meaning and value from data instead of focusing on data wrangling or cleaning.</p><p>We also need approaches that let us standardize analysis methods and develop reusable interfaces regardless of what the data looks like. The consulting industry is seeing big growth in systematizing solutions and building data products that can be quickly redeployed for different clients. “Modularity” has become the operative word. 
By having a strong infrastructure in place, we can use our previous work to put together solutions more quickly.</p><h3 id="how-did-you-end-up-adopting-open-source-tools">How did you end up adopting open-source tools?</h3><p>As a long-time Linux hobbyist, I am continually looking for ways to leverage open source to solve problems. So when the management team was looking for alternatives to a proprietary — and expensive! — statistical analysis platform seven years ago, I suggested R mainly due to its strong following in academia and its open source nature.</p><p>Fast-forward to the present, and our tooling looks very different than it did when I joined. We&rsquo;re still using some proprietary tools where they are a good fit but our &ldquo;daily driver&rdquo; toolkit is mainly open source: R, Python, and Git, with Linux under the hood. Central to delivering that toolkit, we use the full suite of RStudio Team Enterprise tools to scale our data science work and share results with clients.</p><h3 id="what-made-you-pick-shiny-specifically">What made you pick Shiny specifically?</h3><p>Our data science team has a large and talented R user base. They have experience solving all kinds of unique and challenging problems, like detecting the effects of programs designed to incentivize energy savings, forecasting electric vehicle demand, or optimizing the pathway to achieve decarbonization goals. We have built up a lot of R code over the years as we have solved these problems.</p><p>Early on, we experimented with GUI-based dashboarding tools, but the jump from analysis in R to visualization in other business intelligence tools had too many gaps. With tools like Shiny, we can seamlessly apply the skills our data scientists have built up over time.</p><p>Shiny also allows data scientists to carry the solution development process much further along the deployment lifecycle. The team is able to turn custom analyses into web applications. 
Then, we can use software like RStudio Connect to rapidly deploy and tweak those web applications. We’re able to iterate and innovate as quickly and often as we would like.</p><h3 id="why-are-you-looking-for-a-senior-shiny-deployment-engineer">Why are you looking for a Senior Shiny Deployment Engineer?</h3><p>We have a lot of depth and expertise in being able to solve our clients’ most complex problems with R and want to improve our client’s end-to-end experience with our data products. To enable this, we recognize the need for formalizing the Software Development Lifecycle (SDLC) early in the process by bringing software industry best practices to bear.</p><p>The Senior Shiny Deployment Engineer will help us do that. However, while our main front-end framework is Shiny, we want to make sure we do not miss out on potential candidates coming from a computer science or formal software development background. The skills required to deploy robust data applications require knowledge of good SDLC practices for testing, CI/CD, and DevOps principles. Those are the sorts of things that data scientists usually have to read up on but are standard skill sets for deployment engineers.</p><p>Writing a good Shiny app and deploying that Shiny app into production are two different things. The purpose of the role is to help push client-facing solutions over the finish line. We need to build applications thoughtfully from the early stages and have strategies in place for testing, quality assurance, versioning, updates, and ongoing maintenance. This is where the Senior Shiny Deployment Engineer can really make a difference.</p><h3 id="what-else-would-you-like-to-add-about-the-role">What else would you like to add about the role?</h3><p>Team members with understanding of both dashboard development and deployment infrastructure are at a premium. 
When I shopped around the idea of this role with a few solution development leads, they were extremely enthusiastic and said they were very interested to learn from someone with this expertise.</p><p>We are really looking for someone who is ready to take their career to the next level: a teammate who can partner with different teams to help establish a culture of quality of the final deployed product starting in the Shiny app architecture and development phase. We look forward to the Senior Shiny Deployment Engineer working alongside other team members to embed software engineering best practices in a community of more research-focused individuals.</p><h3 id="conclusion">Conclusion</h3><p>Working alongside a Shiny Deployment Engineer, data science teams can apply deployment best practices to their development lifecycle. Data scientists are able to automate and innovate their frameworks, and clients reap the benefits from robust data science products. <strong>Interested in applying your SDLC skills as a Senior Shiny Deployment Engineer? 
<a href="https://careers.guidehouse.com/jobs/14420?lang=en-us" target = "_blank" rel = "noopener noreferrer">Apply here!</a></strong></p><p>Learn more about <a href="https://guidehouse.com/" target = "_blank" rel = "noopener noreferrer">Guidehouse</a>:</p><ul><li>4 consecutive years as a Forbes top employer</li><li>Rated Top 50 Companies for Diversity by DiversityInc</li><li>Certified as a Great Place to Work</li><li>Committed to science-based targets to reduce our GHG emissions</li></ul></description></item><item><title>RStudio Connect 2021.09.0 Tableau Analytics Extensions</title><link>https://www.rstudio.com/blog/rstudio-connect-2021-09-0-tableau-analytics-extensions/</link><pubDate>Tue, 12 Oct 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-2021-09-0-tableau-analytics-extensions/</guid><description><h2 id="improve-your-reach">Improve your reach</h2><p>This edition of RStudio Connect introduces support for Tableau Analytics Extensions, our first external integration with a BI tool. Tableau Analytics Extensions provide a way to create calculated fields in workbooks that can execute scripts outside of the Tableau environment. 
This RStudio Connect integration enables you to create R or Python HTTP API extensions for use across all your Tableau workbooks.</p><p>Compared to existing methods for integrating R and/or Python in Tableau, integration via APIs hosted on RStudio Connect provides better security and dependency management:</p><ul><li>Logic is contained within the API, preventing arbitrary code execution on the server.</li><li>RStudio Connect is a commercially supported platform that provides access management, dials for tuning performance to meet the expected demand, and the ability to manage dependencies for multiple versions of R and Python on a per-project basis.</li><li>RStudio Connect also allows a single Tableau workbook to use R and Python extensions simultaneously.</li></ul><h3 id="why-build-web-apis">Why Build Web APIs?</h3><p>Plumber and FastAPI are popular HTTP API generators for R and Python, respectively. They can quickly and easily be leveraged to create powerful web APIs that get used by other developers, applications, and systems. Learning one (or both) of these frameworks allows you, a data scientist, to turn your functions into tools. In fact, by writing a function, you may have already created a powerful tool:</p><blockquote><h5 id="one-of-the-best-ways-to-improve-your-reach-as-a-data-scientist-is-to-write-functions---r-for-data-science-chapter-19">One of the best ways to improve your reach as a data scientist is to write functions. - <em>R for Data Science, Chapter 19</em></h5></blockquote><p>In <a href="https://r4ds.had.co.nz/functions.html">R for Data Science</a>, authors Hadley Wickham and Garrett Grolemund show us that functions are an excellent way to automate common tasks. Functions make code easier to understand and maintain, and reduce the chance of introducing unforced errors. But to truly extend your reach, you need a means for making those functions available to other people and systems.
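With Plumber, that step is a small one: a complete plumber.R file can be as short as the sketch below. The /echo route and its message are invented for illustration:

```r
library(plumber)

# plumber.R: each roxygen-style comment block turns the function that
# follows it into an HTTP endpoint.

#* Echo back a message
#* @param msg The message to echo
#* @get /echo
function(msg = "") {
  list(msg = paste0("The message is: '", msg, "'"))
}
```

Serving it locally is one call, plumber::pr_run(plumber::pr("plumber.R"), port = 8000), after which GET /echo?msg=hello returns the message as JSON.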
To distribute functions that reach users and consumers outside of your own domain, it&rsquo;s hard to beat the benefits of building and hosting a web API. We wrote about the many advantages of building APIs in <a href="https://blog.rstudio.com/2021/05/04/rstudio-and-apis/">this blog post</a>.</p><h3 align="center"><a href="https://www.rstudio.com/solutions/bi-and-data-science/">Visit our BI and Data Science overview page</a></h3><h3 id="r--python-for-tableau">R &amp; Python for Tableau</h3><p>In principle, extending Tableau should be as simple as directing a workbook to reach out to any existing web API, but <a href="https://tableau.github.io/analytics-extensions-api/">Tableau Analytic Extensions</a> require special handling to make valid requests and receive results. To simplify this process, RStudio has introduced two new open source libraries which add functionality to Plumber and FastAPI:</p><ul><li>For R: <a href="https://rstudio.github.io/plumbertableau/"><code>plumbertableau</code></a></li><li>For Python: <a href="https://github.com/rstudio/fastapitableau"><code>fastapitableau</code></a></li></ul><p>These libraries can be used to create as many extensions as you want to manage. Publishing Tableau extensions to RStudio Connect works just like regular Plumber and FastAPI content, and the new Tableau integration is enabled by default after upgrading Connect to this release. Publishers can learn more in the <a href="https://docs.rstudio.com/connect/user/tableau/">RStudio Connect User Guide</a>. Administrators should review the full <a href="https://docs.rstudio.com/rsc/integration/tableau/">integration and set up instructions</a> upon upgrade.</p><p>Extensions hosted on RStudio Connect allow Tableau users to reference extensions without needing to know how the extension was implemented, or even what language the extension is using. 
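</p><p>To give a feel for what authoring one of these extensions looks like, here is a sketch of a <code>plumbertableau</code> extension in the spirit of the package&rsquo;s capitalization example; treat the endpoint and argument names as illustrative rather than as the package&rsquo;s exact API surface:</p><pre class="r"><code>library(plumber)
library(plumbertableau)

#* Capitalize incoming text
#* @tableauArg str_value:[character] Strings to be capitalized
#* @tableauReturn [character] The capitalized string(s)
#* @post /capitalize
function(str_value) {
  toupper(str_value)
}

#* @plumber
tableau_extension</code></pre><p>Once published to RStudio Connect, a Tableau calculated field can call the <code>/capitalize</code> endpoint without the analyst ever seeing the R code behind it.</p><p>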
<code>plumbertableau</code> and <code>fastapitableau</code> provide documentation for getting started:</p><p><img src="fastapitableau.gif" alt="" title="Example fastapitableau extension"></p><p>RStudio Connect is currently the only platform that allows a single Tableau workbook to use R and Python extensions simultaneously. This solution is different from Tableau&rsquo;s existing integrations with Rserve and TabPy, both of which require passing R or Python scripts to an external language interpreter. By using <code>plumbertableau</code> and/or <code>fastapitableau</code>, all of the logic for the extension is contained in the API hosted on Connect. As shown in the documented example above, this should help simplify setup and usage in Tableau.</p><h3 align="center"><a href="https://www.linkedin.com/events/6850853311420108800/">October 29th: Join us for a Tableau Integration Meetup</a></h3><h3 id="resources-for-getting-started">Resources for Getting Started</h3><p>In addition to the documentation sites for <a href="https://rstudio.github.io/plumbertableau/"><code>plumbertableau</code></a> and <a href="https://github.com/rstudio/fastapitableau"><code>fastapitableau</code></a>, we have added a new chapter to <a href="https://docs.rstudio.com/connect/user/tableau/">RStudio Connect User Guide</a>, and deployable Jump Start extension examples that build on Tableau&rsquo;s &ldquo;Superstore&rdquo; example dataset.</p><p><img src="jumpstart.png" alt="" title="RStudio Connect Jump Start Examples"></p><p>To learn more about Tableau Analytics Extensions, visit the <a href="https://tableau.github.io/analytics-extensions-api/">documentation site</a> maintained by Tableau.</p><p>To learn more about how RStudio Team can be positioned in relation to traditional BI tooling, take a look at our <a href="https://blog.rstudio.com/2021/03/18/bi-and-data-science-the-tradeoffs/">Data Science Leadership article</a>.</p><h3 id="interested-in-integrating-shiny-with-tableau">Interested in 
Integrating Shiny with Tableau?</h3><p>RStudio also has a new experimental package called <a href="https://rstudio.github.io/shinytableau/"><code>shinytableau</code></a> and we&rsquo;d love to <a href="https://community.rstudio.com/">get your feedback</a> on it. <code>shinytableau</code> makes use of a new extensibility feature in Tableau called Dashboard Extensions. This feature lets programmers use JavaScript to create custom objects that Tableau users can drop into their dashboard layouts, providing custom visualizations and interactive features beyond Tableau&rsquo;s native capabilities. The <code>shinytableau</code> package is a bridge between the JavaScript-based Tableau Dashboard Extension API, and Shiny code that you as an R practitioner will write.</p><p>To get started:</p><ul><li>Review the <a href="https://rstudio.github.io/shinytableau/">package documentation</a></li><li>Read the Tutorial and Motivating Example: <a href="https://rstudio.github.io/shinytableau/articles/shinytableau.html">Introduction to shinytableau</a></li></ul><h2 id="rstudio-connect-administrator-digest">RStudio Connect Administrator Digest</h2><p><strong>Important:</strong> Support for Ubuntu 16.04 LTS will be removed in the next RStudio Connect release. Please review our <a href="https://www.rstudio.com/about/platform-support/">Platform Support</a> page for information on which vendor operating systems are supported. Operating system server migrations can take time and planning. <a href="https://docs.rstudio.com/connect/admin/directories/#server-migrations">Server migration instructions</a> can be found in the Admin Guide.</p><p><strong>New Features of Note:</strong></p><ul><li>The configuration option <code>Server.ViewerKiosk</code> will now take effect on content permission requests in addition to its existing role request behavior. 
When enabled, users with a &ldquo;viewer&rdquo; role will not be allowed to request access to individual content items or elevated role privileges.</li><li>An <a href="https://docs.rstudio.com/connect/api/#get-/v1/experimental/groups/%7Bguid%7D/content">API endpoint has been added</a> to return a list of all the content items a given group has access to. This can be useful for auditing access control lists for content on your server, or validating that the various groups you manage have access to all the content they should. This API is being released as <a href="https://docs.rstudio.com/connect/api/#overview--experimentation">experimental</a> and requires administrator privileges to utilize. See the <a href="https://docs.rstudio.com/connect/cookbook/groups/#search-content-group-ownership">cookbook section</a> on this API for more information.</li></ul><p><strong>LDAP Authentication Updates:</strong></p><ul><li><strong>BUGFIX</strong> RStudio Connect will now correctly use locally-defined groups for access control in a double-bind LDAP setup.</li><li><strong>KNOWN ISSUE</strong> <a href="https://docs.rstudio.com/connect/admin/authentication/ldap-based/ldap-double-bind/#user-role-mapping">Automatic user role mapping</a> currently contains a bug where user roles are only set on first login. This issue will be addressed in an upcoming release. If you need to change roles for users after they log in for the first time, the <a href="https://docs.rstudio.com/connect/__unreleased__/admin/appendix/cli/#usermanager">usermanager command line tool</a> can be used as a manual workaround.</li></ul><p>Additional updates are described in the <a href="http://docs.rstudio.com/connect/news">full release notes</a>.</p><h2 id="upgrade-to-rstudio-connect-2021090">Upgrade to RStudio Connect 2021.09.0</h2><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>Upgrading RStudio Connect should require less than five minutes. 
If you are upgrading from a version earlier than 2021.08.2, be sure to consult the <a href="http://docs.rstudio.com/connect/news">release notes</a> for the intermediate releases, as well. As noted above, support for Ubuntu 16.04 LTS will be removed in an upcoming release. We recommend starting <a href="https://docs.rstudio.com/connect/admin/directories/#server-migrations">migration planning</a> as soon as possible.</p></blockquote><p>To perform an RStudio Connect upgrade, download and run the installation script. The script installs a new version of Connect on top of the earlier one. Existing configuration settings are respected.</p><pre><code># Download the installation script
curl -Lo rsc-installer.sh https://cdn.rstudio.com/connect/installer/installer-v1.9.4.sh
# Run the installation script
sudo bash ./rsc-installer.sh 2021.09.0</code></pre><h3 align="center"><a href="https://rstudio.com/about/subscription-management/">Sign up for RStudio Professional Product Updates</a></h3></description></item><item><title>Teaching a Biomedical Data Science Course Using RStudio Cloud</title><link>https://www.rstudio.com/blog/teaching-data-science-with-rstudio-cloud/</link><pubDate>Wed, 06 Oct 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/teaching-data-science-with-rstudio-cloud/</guid><description><sup>Photo by <a href="https://unsplash.com/@f7photo?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Michael Longmire</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p>When teaching a course on biomedical data science, even the initial setup can get in the way of learning. What versions of R are students using? What if their computer doesn’t let them install a necessary package? How can they access the files they need, when they need them?
If these questions aren’t easily answerable, students (and teachers) will need to spend time tinkering with their tools rather than applying them in the lesson.</p><p>Chirag Patel, Associate Professor of Biomedical Informatics at Harvard Medical School, faced this challenge when developing Harvard’s Data Science for Medical Decision Making course. The class focused on applying data science techniques to improve rational medical decision making.</p><p>Students needed access to large datasets, computing clusters, and R and Python environments with the goal of running high-quality biomedical analysis. Students also needed to be able to show and verify work — something critical for replicable medical findings. Normally, the process of figuring out logins, sending documents by email, and configuring workspaces would impose a huge workload on the course instructor at the beginning of the semester, just when they are busiest.</p><p>Professor Patel discovered that <a href="https://www.rstudio.com/products/cloud/" target = "_blank" rel = "noopener noreferrer">RStudio Cloud</a> is a solution that supports his students through efficient onboarding and reproducible analysis.</p><h3 id="efficient-onboarding-for-learning-programming">Efficient onboarding for learning programming</h3><p>Professor Patel realized that a critical first step in helping his class succeed was getting them started quickly. He wanted students to focus on learning to become better consumers of data rather than on figuring out how to set up their laptops.</p><p>Professor Patel selected RStudio Cloud as the platform for his class. It allowed students to quickly dive into data by starting up a computing environment directly in their browser without installing any new software.
Package installation and versioning were already taken care of, and Professor Patel could distribute his curriculum by copying his GitHub repositories directly into the workspace.</p><h3 id="reproducible-analysis-for-better-medical-decision-making">Reproducible analysis for better medical decision making</h3><p>With installation out of the way, Professor Patel was able to focus on another key lesson of this course: code-first, reproducible analysis. Using RStudio Cloud, students shared their work in an environment that others could run. Their thought process was reflected in the code and others could review it step by step. Fellow classmates could run the analysis themselves and replicate the findings.</p><p>More than just being a useful teaching technique, code-first data science reinforced Professor Patel’s belief that analytic choices should be clear to the viewer so that they can make informed decisions based on the results.</p><h3 id="better-experience-better-data-consumers">Better experience, better data consumers</h3><p>RStudio Cloud removed the barrier of setting up a computing environment while allowing for reproducible data analysis. Professor Patel&rsquo;s class could instead focus on the data science approaches, interpretation, and results to become better, more active consumers of data. 
Read more about Professor Patel’s experience with RStudio Cloud on our <a href="https://www.rstudio.com/about/customer-stories/harvard-medical/" target = "_blank" rel = "noopener noreferrer">Customer Stories</a> page.</p><p>To learn more about teaching and learning with RStudio Cloud, please check out these resources:</p><ul><li><a href="https://www.rstudio.com/products/cloud/" target = "_blank" rel = "noopener noreferrer">RStudio Cloud Product Page</a></li><li><a href="https://blog.rstudio.com/2020/08/05/rstudio-cloud-announcement/" target = "_blank" rel = "noopener noreferrer">Do, Share, Teach, and Learn Data Science with RStudio Cloud</a></li><li><a href="https://www.youtube.com/watch?v=gCZ7oueZw6Q" target = "_blank" rel = "noopener noreferrer">Teaching and learning with RStudio Cloud</a></li><li><a href="https://blog.rstudio.com/2020/09/17/rstudio-cloud-a-student-perspective/" target = "_blank" rel = "noopener noreferrer">Learning Data Science with RStudio Cloud: A Student&rsquo;s Perspective</a></li><li><a href="https://blog.rstudio.com/2021/08/05/rstudio-cloud-an-inclusive-solution-for-learning-r/" target = "_blank" rel = "noopener noreferrer">RStudio Cloud: An inclusive solution for learning R</a></li></ul></description></item><item><title>pins 1.0.0</title><link>https://www.rstudio.com/blog/pins-1-0-0/</link><pubDate>Mon, 04 Oct 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pins-1-0-0/</guid><description><sup>Photo by <a href="https://unsplash.com/@kelsoknight" target="_blank" rel="noopener noreferrer">Kelsey Knight</a> on <a href="https://unsplash.com/">Unsplash</a></sup><p>I’m delighted to announce that <a href="https://pins.rstudio.com">pins</a> 1.0.0 is now available on CRAN. The pins package publishes data, models, and other R objects, making it easy to share them across projects and with your colleagues. You can pin objects to a variety of pin boards, including folders (to share on a networked drive or with services like Dropbox), RStudio
Connect, Amazon S3, and Azure blob storage. Pins can be versioned, making it straightforward to track changes, re-run analyses on historical data, and undo mistakes. Our users have found numerous ways to use this ability to fluently share and version data and other objects, such as <a href="https://pins.rstudio.com/dev/articles/rsc.html">automating ETL for a Shiny app</a>.</p><p>You can install pins with:</p><pre class="r"><code>install.packages(&quot;pins&quot;)</code></pre><p>pins 1.0.0 includes a major overhaul of the API. The legacy API (<code>pin()</code>, <code>pin_get()</code>, <code>board_register()</code>, and friends) will continue to work, but new features will only be implemented with the new API, so we encourage you to switch to the modern API as quickly as possible. If you’re an existing pins user, you can learn more about the changes and how to update your code in <a href="https://pins.rstudio.com/articles/pins-update.html"><code>vignette("pins-update")</code></a>.</p><div id="basics" class="level2"><h2>Basics</h2><p>To use the pins package, you must first create a pin board. A good place to start is <code>board_folder()</code>, which stores pins in a directory you specify. Here I’ll use a special version of <code>board_folder()</code> called <code>board_temp()</code>, which creates a temporary board that’s automatically deleted when your R session ends. This is great for examples, but obviously you shouldn’t use it for real work!</p><pre class="r"><code>library(pins)
board &lt;- board_temp()
board
#&gt; Pin board &lt;pins_board_folder&gt;
#&gt; Path: &#39;/tmp/RtmpLu2Bkx/pins-114af466104ab&#39;
#&gt; Cache size: 0</code></pre><p>You can “pin” (save) data to a board with <code>pin_write()</code>. It takes three arguments: the board to pin to, an object, and a name:</p><pre class="r"><code>board %&gt;% pin_write(head(mtcars), &quot;mtcars&quot;)
#&gt; Guessing `type = &#39;rds&#39;`
#&gt; Creating new version &#39;20211004T155644Z-f8797&#39;
#&gt; Writing to pin &#39;mtcars&#39;</code></pre><p>As you can see, the data is saved as an <code>.rds</code> by default, but depending on what you’re saving and who else you want to read it, you might use the <code>type</code> argument to instead save it as a <code>csv</code>, <code>json</code>, <code>arrow</code>, or <code>qs</code> file.</p><p>You can later retrieve the pinned data with <code>pin_read()</code>:</p><pre class="r"><code>board %&gt;% pin_read(&quot;mtcars&quot;)
#&gt;                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#&gt; Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#&gt; Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#&gt; Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#&gt; Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#&gt; Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#&gt; Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1</code></pre></div><div id="sharing-pins" class="level2"><h2>Sharing pins</h2><p>A board on your computer is a good place to start, but the real power of pins comes when you use a board that’s shared with multiple people. To get started, you can use <a href="https://pins.rstudio.com/reference/board_folder.html"><code>board_folder()</code></a> with a directory on a shared drive or using Dropbox, or if you use <a href="https://www.rstudio.com/products/connect/">RStudio Connect</a> you can use <a href="https://pins.rstudio.com/reference/board_rsconnect.html"><code>board_rsconnect()</code></a>:</p><pre class="r"><code>board &lt;- board_rsconnect()
#&gt; Connecting to RSC 1.9.0.1 at &lt;https://connect.rstudioservices.com&gt;
board %&gt;% pin_write(tidy_sales_data, &quot;sales-summary&quot;, type = &quot;rds&quot;)
#&gt; Writing to pin &#39;hadley/sales-summary&#39;</code></pre><p>Then, someone else (or an automated Rmd report) can read and use your pin:</p><pre class="r"><code>board &lt;- board_rsconnect()
board %&gt;% pin_read(&quot;hadley/sales-summary&quot;)</code></pre><p>You can easily control who gets to access the data using the
RStudio Connect permissions pane.</p></div><div id="other-boards" class="level2"><h2>Other boards</h2><p>As well as <code>board_folder()</code> and <code>board_rsconnect()</code>, pins 1.0.0 provides:</p><ul><li><p><a href="https://pins.rstudio.com/reference/board_azure.html"><code>board_azure()</code></a>, which uses Azure’s blob storage.</p></li><li><p><a href="https://pins.rstudio.com/reference/board_s3.html"><code>board_s3()</code></a>, which uses Amazon’s S3 storage platform.</p></li><li><p><a href="https://pins.rstudio.com/reference/board_ms365.html"><code>board_ms365()</code></a>, which uses Microsoft’s OneDrive or SharePoint. (Thanks to a contribution from <a href="https://github.com/hongooi73">Hong Ooi</a>.)</p></li></ul><p>Future versions of the pins package are likely to include other backends as we learn from our users what would be most useful.</p></div></description></item><item><title>An RStudio Table Contest for 2021</title><link>https://www.rstudio.com/blog/rstudio-table-contest-2021/</link><pubDate>Thu, 30 Sep 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-table-contest-2021/</guid><description><h2 id="the-rstudio-table-contest-2021-edition">The RStudio Table Contest: 2021 Edition</h2><p>We love tables here at RStudio. They serve as a fantastic means to communicate information both quantitative and qualitative. We really saw that in the <a href="https://blog.rstudio.com/2020/12/23/winners-of-the-2020-rstudio-table-contest/">2020 Table Contest</a>, where the entries were phenomenal. The tables you shared used a variety of R packages for generating tables, and they were imaginative and very interesting to read. We’ve also seen beautiful examples of tables produced in R shared on social media, and we were similarly blown away by them. Because we can’t get enough of your well-put-together work, we’re announcing the RStudio Table Contest of 2021.
It will run from <strong>September 30th</strong> to <strong>November 15th, 2021</strong>.</p><p>We’ve seen time and again how open and generous the R community is in sharing the code and process needed to solve problems. We appreciate it because sharing lets others learn, gets conversations going, and strengthens the community. We hope this year’s contest will provide more opportunities for sharing and education. We, in turn, want to recognize and celebrate all the ways people work with and display tabular data with R.</p><h3 id="entry-requirements-and-evaluation-criteria">Entry Requirements and Evaluation Criteria</h3><p>Every submission needs to include all code and data that was used (this is so that each entry can be reproduced). However, an entry can be submitted in one of several different forms; examples include: an <strong>R Markdown</strong> document with code (which might be published to RPubs or shinyapps.io), a repository, or an RStudio Cloud project.</p><p>A submission can use any table-making package available in R, and there are <em>lots</em> of them (<strong>DT</strong>, <strong>gt</strong>, <strong>flextable</strong>, <strong>kableExtra</strong>, <strong>reactable</strong>, <strong>huxtable</strong>, etc.).</p><h4 id="submission-types">Submission Types</h4><p>We are looking for two main types of table submissions:</p><ol><li><strong>Single Table Example</strong>: This may highlight interesting structuring of content, useful and tricky features – for example enabling interaction – or serve as an example of a common table popular in a specific field. Please document your code for clarity and keep these concise and reproducible.</li><li><strong>Tutorial</strong>: It’s all about teaching us how to craft an excellent table or understand a package’s features.
This may include several tables and narrative.</li></ol><h4 id="the-submission-form-and-contest-deadline">The Submission Form and Contest Deadline</h4><p>You can submit your entry for the contest by filling in an <a href="https://rstd.io/table-contest-2021">online form</a>. The form will generate a post on RStudio Community, which you can then edit further if you like. Feel free to make multiple entries if you have lots to share.</p><p>Given that tables have different features and purposes, we’d also like you to categorize your submission. There are four categories: <code>static-HTML</code>, <code>interactive-HTML</code>, <code>static-print</code>, and <code>interactive-Shiny</code>. Simply choose the one that best fits your table.</p><p>The deadline for submissions is <strong>November 15th, 2021</strong> at midnight Pacific Time.</p><h4 id="evaluation-of-entries">Evaluation of Entries</h4><p>After the submission deadline has passed, we will evaluate the entries in a timely manner. Tables will be judged based on technical merit, artistic design, and quality of documentation. We know that certain tables may excel in only some of these criteria, and our evaluation process will keep this in mind.</p><h3 id="prizes">Prizes</h3><p>We have some great prizes lined up this year! (We like to reward table work.) We will announce the winners and their submissions on the RStudio Blog, in RStudio Community, and also on Twitter.</p><ul><li><strong>The Grand Prize</strong><ul><li>A randomized combination of RStudio t-shirts, books, and mugs (worth up to $200), plus the prizes below.<br><sup>(*Please note that we may not be able to send t-shirts, books, or other items larger than stickers to non-US addresses for which shipping and customs costs are high.)</sup></li></ul></li><li><strong>Runners Up</strong><ul><li>Face time with people making table-making packages!
They are all excellent conversationalists and it&rsquo;ll be fun.</li><li>This prize also includes a one-year subscription to the <a href="https://www.shinyapps.io/">Shinyapps.io</a> Basic plan, plus the prize below.</li></ul></li><li><strong>Honorable Mentions</strong><ul><li>A larger-than-large helping of hexagon-shaped stickers for RStudio packages plus a side of hex for table-making packages (and other goodies).</li><li>Inclusion in the soon to be announced RStudio Table Gallery.</li></ul></li></ul><h3 id="a-note-on-a-future-tables-gallery">A Note on a Future Tables Gallery</h3><p>The <a href="https://shiny.rstudio.com/gallery/">Shiny Gallery</a> is a great resource and it has been driven by entries to several Shiny contests. We are still planning on developing an analogous Tables Gallery; winners and other participants may be invited to feature their work in such a gallery. We&rsquo;ll definitely keep you posted on our progress with this initiative.</p></description></item><item><title>How to Use shinyMatrix and plotly Graphs as Inputs in a Shiny App</title><link>https://www.rstudio.com/blog/how-to-use-shinymatrix-and-plotly-graphs/</link><pubDate>Wed, 29 Sep 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/how-to-use-shinymatrix-and-plotly-graphs/</guid><description><script src="https://www.rstudio.com/blog/how-to-use-shinymatrix-and-plotly-graphs/index_files/header-attrs/header-attrs.js"></script><caption>Photo by <a href="https://unsplash.com/@clayton_cardinalli" target="_blank" rel="noopener noreferrer">Clayton Cardinalli</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></caption><br><br><div class="lt-gray-box"><p>This is a guest post from Taylor Rodgers, Senior Data Scientist and project lead at PKGlobal. 
He’s written extensively on data science topics, including the book <em><a href="https://www.taylorrodgers.com/store/p4/how-to-manage-successful-data-team.html" target = "_blank" rel = "noopener noreferrer">Data Work: A Jargon-Free Guide to Managing Successful Data Teams</a></em>. His next book, <em><a href="https://www.taylorrodgers.com/store/p5/beginner-r-programming-plain-english.html" target = "_blank" rel = "noopener noreferrer">R Programming in Plain English</a></em>, is also available for free in beta.</p></div><div id="introduction" class="section level3"><h3>Introduction</h3><p>At <a href="https://pkglobal.com/" target = "_blank" rel = "noopener noreferrer">PKGlobal</a>, we had a manufacturing client that wanted a Shiny app for their engineers and plant workers. This Shiny app would allow their employees, none of whom are data scientists or machine learning experts, to use a machine learning algorithm with over 60 inputs to make a prediction. The prediction would allow them to use less material and significantly cut down on costs. (We’re talking tens of thousands of dollars a week!)</p><p>This posed a challenge, though. How do you democratize an ML model for manufacturing workers with so many parameters?</p><p>The solution was a Shiny app with an intuitive interface: one that presented the parameters in roughly the same way they would see them on the factory floor and allowed them multiple options for how they input those values.</p><p>This project showcased Shiny’s great strength: flexible functionality. Packages such as shinyMatrix and plotly gave the end user the ability to determine their manufacturing inputs the way they wanted.
Specific Shiny functions, such as <code>observeEvent</code> and <code>reactive</code>, ensured these methods could communicate with one another without the “wonkiness” that happens with less flexible tools, such as PowerBI or Tableau.</p><p><em>Please note that much of this code is based on online resources, such as Carson Sievert’s Stack Overflow <a href="https://stackoverflow.com/questions/47280032/draggable-line-chart-in-r-shiny" target = "_blank" rel = "noopener noreferrer">comment</a> and Hadley Wickham’s <a href="https://mastering-shiny.org/" target = "_blank" rel = "noopener noreferrer">Mastering Shiny</a>, as well as package-specific documentation. This article simply ties these resources together to teach you how to use them and why they work.</em></p></div><div id="so-what-did-this-tool-look-like" class="section level3"><h3>So What Did This Tool Look Like?</h3><p>Our client gave us their blessing to showcase this Shiny functionality, but they did request that we not share the details on what the model predicted or their company name.We can, however, reveal what these inputs looked like. If you look below, you’ll see an animation showcasing this functionality.</p><center><img src="example01.gif" style="width:50.0%" /></center><p>There are a lot of details here, so let me break them out one-by-one.</p><p>Within the application, we showed a time-series plot with seven points. Each point represents an input (speed) at a given time in the manufacturing process. All points were required in the final algorithm.</p><center><img src="image01.png" style="width:50.0%" /></center><p>The user can drag and drop these plots, which then changed the output on the algorithm. (The algorithm isn’t included in this demo.)</p><p>Most users liked this option for inputting parameters. However, some wanted to get more specific. They wanted to <em>hand type</em> their inputs. 
That’s why we gave them the option to toggle to a <code>matrixInput</code>.</p><center><img src="image02.png" style="width:50.0%" /></center><p>And here’s where things started to get complicated.</p><p>We had to ensure both input methods would sync up with one another. For example, if the end-user altered point 2 on the drag-and-drop plot, that same value had to appear on the matrix input. You can see this on the GIF below. I move point 2 on the graph and we find that same value appears on the matrix. When we change point 2 on the matrix, the graph updates to reflect that change.</p><center><img src="example02.gif" style="width:50.0%" /></center><p>These input methods also had to account for user error. The algorithm had specific rules the user had to follow. The first and last x-axis (time) values had to be 0 and 70, respectively. We had to find a way to enforce those rules to make sure the algorithm worked correctly.</p><p>So to recap, we needed an app that would accomplish the following:</p><ol style="list-style-type: decimal"><li>Allow the user to drag-and-drop plot points to feed an algorithm</li><li>Allow the user to hand type those same points into a matrix that would feed an algorithm</li><li>Ensure the plot and the matrix would match one another</li><li>Ensure that the user couldn’t break the app by not following “the rules”</li></ol><p>Believe it or not, there was even more functionality included. We actually had <em>six</em> of these plot / matrix combos. We also had to show the nearest neighbor, depending on which tab the user selected. And that wasn’t all either. We also had to give the ability to select a different nearest neighbor without breaking the functionality we built!</p><p>I won’t cover those additional features in this how-to document.
The good news is that they are largely an extension of the topics we’ll cover in the article below.</p></div><div id="what-packages-and-functions-did-this-app-use" class="section level3"><h3>What Packages and Functions Did This App Use?</h3><p>Delivering this functionality required the following packages:</p><pre class="r"><code>install.packages(&quot;shiny&quot;)
install.packages(&quot;tidyverse&quot;)
install.packages(&quot;plotly&quot;)
install.packages(&quot;shinyMatrix&quot;)</code></pre><p>We made heavy use of the functions <code>observeEvent</code> and <code>reactive</code>. I suggest reviewing this <a href="https://shiny.rstudio.com/articles/reactivity-overview.html" target = "_blank" rel = "noopener noreferrer">article</a> on the subject to better understand the concept. You don’t need to be an expert on these functions, though. My article below will provide a good demo of how to use them.</p></div><div id="building-the-ui" class="section level3"><h3>Building the UI</h3><p>Down below, you’ll see the “skeleton” of the app we’ll build. This will include the packages, data sets, and UI that you’ll need. I also include three table outputs that will help illustrate how the app handles the values within the <code>matrixInput</code> and <code>reactiveValues</code> functions.</p><p>To get started, go into RStudio and create a new Shiny app.
Paste the code below into the app.R file of your Shiny app:</p><details><summary>app.R</summary><pre class="r"><code>### Load packages
library(shiny)
library(tidyverse)
library(plotly)
library(shinyMatrix)

### Define default matrix
rateInputs_m &lt;- matrix(
  c(0, 10, 15, 26, 29, 39, 70, 0.78, 1.05, 1.21, 0.67, 0.61, 0.67, 0.67),
  nrow = 7,
  ncol = 2,
  dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
)

### Define UI
ui &lt;- fluidPage(
  titlePanel(&quot;Plotly and Shiny Matrix Input Demonstration&quot;),
  column(
    4,
    radioButtons(
      &quot;toggleInputSelect&quot;,
      &quot;Input Method:&quot;,
      choices = c(&quot;Drag-and-Drop&quot; = &quot;dragDrop&quot;, &quot;Hand Typed&quot; = &quot;handTyped&quot;)
    ),
    br(),
    conditionalPanel(
      condition = &quot;input.toggleInputSelect==&#39;dragDrop&#39;&quot;,
      plotlyOutput(&quot;speed_p&quot;, height = &quot;250px&quot;)
    ),
    conditionalPanel(
      condition = &quot;input.toggleInputSelect==&#39;handTyped&#39;&quot;,
      matrixInput(
        &quot;rateInputs_mi&quot;,
        value = rateInputs_m,
        class = &quot;numeric&quot;,
        row = list(names = FALSE)
      )
    )
  ),
  column(
    8,
    tabsetPanel(
      id = &quot;tabs&quot;,
      tabPanel(
        &quot;Algorithm Tab&quot;,
        value = &quot;algorithmOutput&quot;,
        column(3, br(), tags$h4(&quot;Original Values&quot;), tableOutput(&quot;table1&quot;)),
        column(3, br(), tags$h4(&quot;Matrix Inputs&quot;), tableOutput(&quot;table2&quot;)),
        column(3, br(), tags$h4(&quot;Reactive Values&quot;), tableOutput(&quot;table3&quot;))
      )
    )
  )
)

### Define server logic
server &lt;- function(input, output, session) {
  output$table1 &lt;- renderTable({ rateInputs_m })
  output$table2 &lt;- renderTable({ input$rateInputs_mi })
  output$table3 &lt;- renderTable({
    req(rv$time)
    data.frame(rv$time, rv$speed)
  })

  # Creating Reactive Values
  rv &lt;- reactiveValues()
}

### Run the application
shinyApp(ui = ui, server = server)</code></pre></details><p>If you run the app, you may notice that altering the matrix input on the left-hand side only changes the table labeled “Matrix Inputs.” The “original” table is the one defined at the beginning of the
app.R script. shinyMatrix uses those inputs, but starts to function as its own object once the app runs.</p><p>You can see this demonstrated below. Notice that when we change the speed in the third row, only the “Matrix Input” table on the right updates.</p><center><img src="example03.gif" style="width:70.0%" /></center><p>This is an important distinction to make. We’ll end up adding a third object that will operate in the background. Like the shinyMatrix, this will also use the original matrix defined at the beginning of our app, but become independent later.</p><p>One other thing I want you to notice is the conditional panels (<code>conditionalPanel</code>) in the UI script.</p><pre class="r"><code>conditionalPanel(
  condition = &quot;input.toggleInputSelect==&#39;dragDrop&#39;&quot;,
  plotlyOutput(&quot;speed_p&quot;, height = &quot;250px&quot;)
),
conditionalPanel(
  condition = &quot;input.toggleInputSelect==&#39;handTyped&#39;&quot;,
  matrixInput(
    &quot;rateInputs_mi&quot;,
    value = rateInputs_m,
    class = &quot;numeric&quot;,
    row = list(names = FALSE)
  )
)),</code></pre><p>These conditional panels make it possible to toggle back and forth between the two input methods we’ll use later in the app.</p></div><div id="building-the-plotly-drag-and-drop-inputs" class="section level3"><h3>Building the Plotly Drag-and-Drop Inputs</h3><p>Now that we have our UI, we can add to our server function. Within this server function, we’ll build a set of smaller functions that will coordinate with one another to produce the functionality we want.
These functions include <code>reactiveValues</code>, <code>renderPlotly</code>, and <code>observeEvent</code>.</p><ul><li><p><code>reactiveValues</code> stores a list of values that we’ll use as both a source and destination in the other two functions.</p></li><li><p><code>renderPlotly</code> allows us to display a plotly graph within our application and will use the reactive values as its data.</p></li><li><p><code>observeEvent</code> observes the end-user as they interact with the application. We will use this function to update the reactive value list, but only if the user updates the matrix input or moves a point on the plotly graph.</p></li></ul><p>If you look below, you can see how these will interact. We store the original matrix inputs in the reactive value function. The user can update either the plotly graph or the matrix input. We’ll have two event observation functions that will update the reactive values with the new inputs.</p><center><img src="image03.png" style="width:50.0%" /></center><p>This graph is a somewhat simplistic representation of the steps we’ll be programming, but overall, it highlights what we’ll be working towards.</p></div><div id="creating-the-reactive-values" class="section level3"><h3>Creating the Reactive Values</h3><p>The first thing we’ll add is the reactive value function. I already have an empty reactive function in the app.R script shared earlier. Go ahead and adjust the <code>rv</code> object to match what you see below.</p><pre class="r"><code># Creating Reactive Values
rv &lt;- reactiveValues(
  time = rateInputs_m[, 1],
  speed = rateInputs_m[, 2]
)</code></pre><p>In this function, we pulled the <code>rateInputs_m</code> matrix defined at the top of our app.R script and assigned each column to its respective name. So column one (<code>rateInputs_m[,1]</code>) is time and column two (<code>rateInputs_m[,2]</code>) is speed.
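To make this concrete, here is a quick sketch of what those two columns contain if you inspect the matrix at a plain R console, outside the app.</p><pre class="r"><code>rateInputs_m &lt;- matrix(
  c(0, 10, 15, 26, 29, 39, 70, 0.78, 1.05, 1.21, 0.67, 0.61, 0.67, 0.67),
  nrow = 7,
  ncol = 2,
  dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
)
rateInputs_m[, 1] # the seven Time values: 0, 10, 15, 26, 29, 39, 70
rateInputs_m[, 2] # the seven Speed values: 0.78, 1.05, 1.21, 0.67, 0.61, 0.67, 0.67</code></pre><p>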
The functions we’ll add later will call upon this reactive value list, rather than the original <code>rateInputs_m</code> matrix.</p><p>If you run the app, you should see values populate in the “Reactive Values” table on the far right.</p><center><img src="image09.png" style="width:70.0%" /></center><p>These values won’t change yet, but it’s important to note we now have three matrices in the background: the original matrix, the shinyMatrix, and now these reactive values.</p><p>There are a few other nuances I want to point out as well.</p><p>First, pay close attention to the parentheses <code>(...)</code> surrounding the <code>reactiveValues</code> function in the code above. If you read the documentation (<code>?reactiveValues</code>), this function operates more like a list.</p><p>That means using the <code>({...})</code> notation will prevent this reactive value list from working. Instead, we’ll need to use <code>(...)</code> notation.</p><p>The second thing is the use of the <code>=</code> sign rather than <code>&lt;-</code> to assign names. Normally, you would assign object names with <code>&lt;-</code>, but since these are not individual objects <em>per se</em>, but items within a list-like function, we’ll need to use the equal <code>=</code> sign.</p><p>The third thing is the use of <code>,</code> between each list item. Unlike the code inside other reactive functions in a Shiny app, these items must be comma-separated.</p></div><div id="creating-the-renderplotly-function" class="section level3"><h3>Creating the renderPlotly Function</h3><p>Next we’ll add our render function for plotly.
We start with the <code>renderPlotly</code> function:</p><pre class="r"><code># Speed 1&#39;s Plot and Table and Feedback
output$speed_p &lt;- renderPlotly({

})</code></pre><p>We’ll need to alter this later, but to start, let’s just add the full <code>plot_ly</code> function to see what it looks like:</p><pre class="r"><code># Speed 1&#39;s Plot and Table and Feedback
output$speed_p &lt;- renderPlotly({
  plot_ly() %&gt;%
    add_lines(x = rv$time, y = rv$speed, color = I(&quot;black&quot;)) %&gt;%
    layout(
      xaxis = list(title = &quot;Time&quot;),
      yaxis = list(title = &quot;Speed&quot;),
      showlegend = FALSE
    )
})</code></pre><p>This will plot the reactive values defined earlier. We can see that we add lines for the x-axis and y-axis, which are <code>rv$time</code> and <code>rv$speed</code>, respectively. We also do some minor formatting with the <code>layout</code> function.</p><p>Now here’s where things start to get tricky. If you remember, we want the end-user to drag-and-drop these plot points to determine how they feed into the algorithm. And we’ll need to ensure the reactive value list updates with the new values after they move one of those points.</p><p>Let’s first add the ability to move the plot points. We can do this with the <code>config</code> function, as seen below:</p><pre class="r"><code># Speed 1&#39;s Plot and Table and Feedback
output$speed_p &lt;- renderPlotly({
  plot_ly() %&gt;%
    add_lines(x = rv$time, y = rv$speed, color = I(&quot;black&quot;)) %&gt;%
    layout(
      xaxis = list(title = &quot;Time&quot;),
      yaxis = list(title = &quot;Speed&quot;),
      showlegend = FALSE
    ) %&gt;%
    config(edits = list(shapePosition = TRUE), displayModeBar = FALSE)
})</code></pre><p>If we try to run the app now, we still wouldn’t be able to move the lines on the plot. The reason is that we only allowed for “shapes” to be moved on the plot. We still need to create our shapes.</p><p>If you look at the code below, you can see I added a function called <code>map2</code>.
We include these shapes by passing them to the <code>layout</code> function near the bottom.</p><pre class="r"><code># Speed 1&#39;s Plot and Table and Feedback
output$speed_p &lt;- renderPlotly({
  speed_c &lt;- map2(
    rv$time,
    rv$speed,
    ~ list(
      type = &quot;circle&quot;,
      xanchor = .x,
      yanchor = .y,
      x0 = -4, x1 = 4, y0 = -4, y1 = 4,
      xsizemode = &quot;pixel&quot;,
      ysizemode = &quot;pixel&quot;,
      fillcolor = &quot;grey&quot;,
      line = list(color = &quot;black&quot;)
    )
  )
  plot_ly() %&gt;%
    add_lines(x = rv$time, y = rv$speed, color = I(&quot;black&quot;)) %&gt;%
    layout(
      xaxis = list(title = &quot;Time&quot;),
      yaxis = list(title = &quot;Speed&quot;),
      showlegend = FALSE,
      shapes = speed_c
    ) %&gt;%
    config(edits = list(shapePosition = TRUE), displayModeBar = FALSE)
})</code></pre><p>Within the <code>map2</code> function, I defined a list of arguments that will create the circles we can move. This maps those arguments across both <code>rv$time</code> and <code>rv$speed</code>, the values from our reactive value list.</p><p>By referencing this object in the <code>layout</code> function near the bottom of this script, we will see those points appear on our Shiny application.</p><p>There’s one last thing to add to the <code>plot_ly</code> function before moving forward. We will want to observe interactions on this plot by the end-user. For that reason, we need to give it a “name” that we can reference later.
So we’ll add the argument <code>source=&quot;speed_s&quot;</code>, and you can see that in the final code for this section below:</p><pre class="r"><code># Speed 1&#39;s Plot and Table and Feedback
output$speed_p &lt;- renderPlotly({
  speed_c &lt;- map2(
    rv$time,
    rv$speed,
    ~ list(
      type = &quot;circle&quot;,
      xanchor = .x,
      yanchor = .y,
      x0 = -4, x1 = 4, y0 = -4, y1 = 4,
      xsizemode = &quot;pixel&quot;,
      ysizemode = &quot;pixel&quot;,
      fillcolor = &quot;grey&quot;,
      line = list(color = &quot;black&quot;)
    )
  )
  plot_ly(source = &quot;speed_s&quot;) %&gt;%
    add_lines(x = rv$time, y = rv$speed, color = I(&quot;black&quot;)) %&gt;%
    layout(
      shapes = speed_c,
      xaxis = list(title = &quot;Time&quot;),
      yaxis = list(title = &quot;Speed&quot;),
      showlegend = FALSE
    ) %&gt;%
    config(edits = list(shapePosition = TRUE), displayModeBar = FALSE)
})</code></pre><p>Now let’s run the app and move the dots around. The lines don’t move with the dots, do they?</p><p>That’s because the app doesn’t know how to respond when we move a plot point. It’s not updating the reactive list we made earlier. That’s where the <code>observeEvent</code> function comes in handy.</p></div><div id="creating-the-observeevent-for-plotly" class="section level3"><h3>Creating the observeEvent for Plotly</h3><p>Within Shiny, we can use either the <code>observeEvent</code> or <code>observe</code> functions to see how the end-user interacts with the app. Those functions can then make adjustments to the app, based on what code we add.</p><p>As you can imagine, they have a wide range of uses, and I suggest you get very comfortable with them if you want to build more Shiny apps.</p><p>You can use both functions in relatively similar ways, but I prefer the <code>observeEvent</code> function myself.</p><p>Let’s create our <code>observeEvent</code> function.
Take the code below and add it to your <code>server</code> function in the Shiny app.</p><pre class="r"><code>observeEvent(event_data(event = &quot;plotly_relayout&quot;, source = &quot;speed_s&quot;), {

})</code></pre><p>Now there are two things I want you to pay attention to here. The first is the <code>event_data</code> function. This is a plotly function and allows us to specify which plotly event we want to observe.</p><p>In this example, we want to know whether a “plotly_relayout” event occurred and whether it came from the plotly graph called “speed_s”. If you recall, “speed_s” is the name we gave the plotly graph we created earlier.</p><p>The second thing I want you to notice is where I placed the <code>event_data</code> function. I placed it between the first <code>(</code> and the first <code>{</code>. By placing the <code>event_data</code> function at this location and using “speed_s” as the source, the <code>observeEvent</code> function will proceed only if there’s an event associated with “speed_s”. It will then execute the code enclosed within the <code>{...}</code> brackets.</p><p>Now with that out of the way, we can start filling in the interior of this <code>observeEvent</code> function. We’ll add the same <code>event_data()</code> as earlier, and we’ll then sub-select specific values from it.</p><pre class="r"><code>observeEvent(event_data(event = &quot;plotly_relayout&quot;, source = &quot;speed_s&quot;), {
  # Speed 1 Event Data
  speed_ed &lt;- event_data(&quot;plotly_relayout&quot;, source = &quot;speed_s&quot;)
  speed_sa &lt;- speed_ed[grepl(&quot;^shapes.*anchor$&quot;, names(speed_ed))]
})</code></pre><p>There are two steps here. The first is determining the actual event data. I provide an example of what this looks like below:</p><center><img src="image05.png" style="width:50.0%" /></center><p>The problem with this function is that it sometimes reports events unrelated to the ones we want.
For example, sometimes it’ll record a user adjusting the range on the plot, like you see below:</p><center><img src="image04.png" style="width:50.0%" /></center><p><code>speed_sa</code> helps us overcome this problem. It includes only the events whose names contain the word “shapes”. When it does include that word, the values will be the same as the <code>speed_ed</code> object above it.</p><center><img src="image06.png" style="width:50.0%" /></center><p>After we pull the new shape points, we need to do some additional data transformations. If you recall, we’re updating the reactive values. However, there are seven rows we could update. How will Shiny know which row to update?</p><p>The good news is that our event data includes a numeric value associated with the shape the user moved. Unfortunately, that value does not correspond to the row number. If you have seven values, like our app does, it will begin the count at zero. So instead of 1 through 7, we have 0 through 6. We can account for this by extracting the number from the column name and adding 1 to it.</p><pre class="r"><code>observeEvent(event_data(event = &quot;plotly_relayout&quot;, source = &quot;speed_s&quot;), {
  # Speed 1 Event Data
  speed_ed &lt;- event_data(&quot;plotly_relayout&quot;, source = &quot;speed_s&quot;)
  speed_sa &lt;- speed_ed[grepl(&quot;^shapes.*anchor$&quot;, names(speed_ed))]
  speed_ri &lt;- unique(readr::parse_number(names(speed_sa)) + 1)
})</code></pre><p>We’ll also want to pull the new values associated with the new location of the plot points.
That’s what the <code>speed_pts &lt;- as.numeric(speed_sa)</code> code added below accomplishes.</p><pre class="r"><code>observeEvent(event_data(event = &quot;plotly_relayout&quot;, source = &quot;speed_s&quot;), {
  # Speed 1 Event Data
  speed_ed &lt;- event_data(&quot;plotly_relayout&quot;, source = &quot;speed_s&quot;)
  speed_sa &lt;- speed_ed[grepl(&quot;^shapes.*anchor$&quot;, names(speed_ed))]
  speed_ri &lt;- unique(readr::parse_number(names(speed_sa)) + 1)
  speed_pts &lt;- as.numeric(speed_sa)
})</code></pre><p>Now we get to the fun part! We’ll need to include some extra logic to take into account some of the rules we want in place.</p><p>The first rule is that we want to re-sort the plot points. The second is to ensure that the x-axis values for the first and last points are always 0 and 70, respectively.</p><center><img src="image08.png" style="width:50.0%" /></center><p>While there are probably several ways to implement these two rules, I found it easier to make a matrix to serve as a temporary home for our values. I could then alter this matrix the way I needed to ensure these rules stayed in place and limit the amount of “wonkiness” in the application.</p><p>To start this process, look at the additional lines of code I added below.
You’ll see the new matrix called “temp_matrix.” All this does is record the values from the reactive values list and put them in the temporary matrix.</p><pre class="r"><code>observeEvent(event_data(event = &quot;plotly_relayout&quot;, source = &quot;speed_s&quot;), {
  # Speed 1 Event Data
  speed_ed &lt;- event_data(&quot;plotly_relayout&quot;, source = &quot;speed_s&quot;)
  speed_sa &lt;- speed_ed[grepl(&quot;^shapes.*anchor$&quot;, names(speed_ed))]
  speed_ri &lt;- unique(readr::parse_number(names(speed_sa)) + 1)
  speed_pts &lt;- as.numeric(speed_sa)

  # Speed 1 Point Updates
  temp_matrix &lt;- matrix(
    c(round(rv$time, 2), round(rv$speed, 2)),
    nrow = 7,
    ncol = 2,
    dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
  )
  temp_matrix[speed_ri, 1] &lt;- round(speed_pts[1], 2)
  temp_matrix[speed_ri, 2] &lt;- round(speed_pts[2], 2)
})</code></pre><p>We then update the temporary matrix with the new values. We use “speed_ri”, which is the “row index,” to determine the proper row to update. We then use <code>speed_pts[1]</code> and <code>speed_pts[2]</code> to update that row with the new x- and y-axis values.</p><p>Next, we want to re-sort the values. Remember, these are time values and we want them sorted in order of seconds. If we don’t, the model inputs may be in the incorrect order.
And we also want to ensure the first and last value are 0 and 70, respectively.</p><pre class="r"><code>observeEvent(event_data(event = &quot;plotly_relayout&quot;, source = &quot;speed_s&quot;), {
  # Speed 1 Event Data
  speed_ed &lt;- event_data(&quot;plotly_relayout&quot;, source = &quot;speed_s&quot;)
  speed_sa &lt;- speed_ed[grepl(&quot;^shapes.*anchor$&quot;, names(speed_ed))]
  speed_ri &lt;- unique(readr::parse_number(names(speed_sa)) + 1)
  speed_pts &lt;- as.numeric(speed_sa)

  # Speed 1 Point Updates
  temp_matrix &lt;- matrix(
    c(round(rv$time, 2), round(rv$speed, 2)),
    nrow = 7,
    ncol = 2,
    dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
  )
  temp_matrix[speed_ri, 1] &lt;- round(speed_pts[1], 2)
  temp_matrix[speed_ri, 2] &lt;- round(speed_pts[2], 2)
  temp_matrix &lt;- temp_matrix[order(temp_matrix[, 1], decreasing = FALSE), ]
  temp_matrix[1, 1] &lt;- 0
  temp_matrix[7, 1] &lt;- 70
})</code></pre><p>After we do that, we want to update the reactive value list with these new values found in the temporary matrix.</p><pre class="r"><code>observeEvent(event_data(event = &quot;plotly_relayout&quot;, source = &quot;speed_s&quot;), {
  # Speed 1 Event Data
  speed_ed &lt;- event_data(&quot;plotly_relayout&quot;, source = &quot;speed_s&quot;)
  speed_sa &lt;- speed_ed[grepl(&quot;^shapes.*anchor$&quot;, names(speed_ed))]
  speed_ri &lt;- unique(readr::parse_number(names(speed_sa)) + 1)
  speed_pts &lt;- as.numeric(speed_sa)

  # Speed 1 Point Updates
  temp_matrix &lt;- matrix(
    c(round(rv$time, 2), round(rv$speed, 2)),
    nrow = 7,
    ncol = 2,
    dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
  )
  temp_matrix[speed_ri, 1] &lt;- round(speed_pts[1], 2)
  temp_matrix[speed_ri, 2] &lt;- round(speed_pts[2], 2)
  temp_matrix &lt;- temp_matrix[order(temp_matrix[, 1], decreasing = FALSE), ]
  temp_matrix[1, 1] &lt;- 0
  temp_matrix[7, 1] &lt;- 70

  # Update reactive values
  rv$time &lt;- round(temp_matrix[, 1], 2)
  rv$speed &lt;- round(temp_matrix[, 2], 2)
})</code></pre><p>With this code in place, we should now be able to move the dots on the plotly
graph. Go ahead and try it in your app!</p><p>Pay close attention to which tables on the right are changing. If you’ll notice, only the reactive value list is changing.</p><center><img src="example04.gif" style="width:100.0%" /></center></div><div id="building-the-shinymatrix-inputs" class="section level3"><h3>Building the shinyMatrix Inputs</h3><p>If you recall, we wanted to provide our end-user the ability to toggle back and forth between the drag-and-drop plot and hand-typed inputs.</p><center><img src="example01.gif" style="width:50.0%" /></center><p>So if they want to visualize how speed changes over time, they can use the plotly graph. If they want to input precise values, they can hand type them.</p><p>Fortunately, there’s a great package out there called shinyMatrix that allows this.</p><p>So how do we add this?</p><p>Well, if you look at the UI script, you’ll notice that I had included an input called <code>matrixInput</code>.</p><pre class="r"><code>matrixInput(
  &quot;rateInputs_mi&quot;,
  value = rateInputs_m,
  class = &quot;numeric&quot;,
  row = list(names = FALSE)
)</code></pre><p>This function actually works well on its own, but we need to do more. Currently, the new values you can type into the matrix only apply to the matrix itself. If we run the app, we’ll notice that the reactive values (which are what the plotly graph uses) do not update. And rules, such as requiring the first and last values to equal 0 and 70, are not enforced either.</p><center><img src="example05.gif" style="width:70.0%" /></center><p>We’ll need to add an additional <code>observeEvent()</code> function to update the reactive value list, similar to how we did the plotly relayout.</p><pre class="r"><code>observeEvent(
  req(input$rateInputs_mi &amp;
        input$toggleInputSelect == &quot;handTyped&quot;), {

})</code></pre><p>We will add a Boolean statement to this event observation between the first <code>(</code> and the first <code>{</code>.
Much like our other event observation function, this will only continue in certain instances. The user will need to select both the “Hand Typed” option in the UI and change the matrix input for this function to continue.</p><p>It may not be clear now why this is important, but later on, we’ll be adding to our earlier <code>observeEvent</code> function for the “plotly_relayout” events. We want to prevent these two event observations from competing with one another. Adding these statements between the <code>(</code> and <code>{</code> ensures these functions only proceed at the right moment.</p><p>Now let’s actually add our code!</p><p>This one is a lot easier to write. The shinyMatrix package makes updates a relatively straightforward process.</p><p>First, we’ll take all the values in the <code>rateInputs_mi</code> matrix (the shinyMatrix input in the UI) and assign them to a “temp matrix.” Please note, it is important to include the full matrix definition below for the values to register with the plotly graph. We then enforce the same rules as we did for the plotly inputs, to ensure the end-user follows the rules required for the algorithm input.</p><pre class="r"><code>observeEvent(
  req(input$rateInputs_mi &amp;
        input$toggleInputSelect == &quot;handTyped&quot;), {
  temp_matrix &lt;- matrix(
    input$rateInputs_mi,
    nrow = 7,
    ncol = 2,
    dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
  )
  temp_matrix[1, 1] &lt;- 0
  temp_matrix[7, 1] &lt;- 70
  temp_matrix &lt;- temp_matrix[order(temp_matrix[, 1], decreasing = FALSE), ]
})</code></pre><p>As you can see in the code above, we force the first and last value to be 0 and 70, respectively.
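If you want to see the effect of that clamping and re-sorting in isolation, here is a small stand-alone sketch with hypothetical hand-typed values you can run at the console.</p><pre class="r"><code># Hypothetical hand-typed values: out of order, last time entered as 71
temp_matrix &lt;- matrix(
  c(12, 40, 35, 71, 1.0, 0.8, 0.9, 0.7),
  nrow = 4,
  ncol = 2,
  dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
)
temp_matrix[1, 1] &lt;- 0  # force the first time value to 0
temp_matrix[4, 1] &lt;- 70 # force the last time value to 70
temp_matrix &lt;- temp_matrix[order(temp_matrix[, 1], decreasing = FALSE), ]
temp_matrix[, 1] # the Time column is now 0, 35, 40, 70</code></pre><p>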
We also re-order the rows by their time values, from lowest to highest.</p><p>Next, we’ll update the reactive value list.</p><pre class="r"><code>observeEvent(
  req(input$rateInputs_mi &amp;
        input$toggleInputSelect == &quot;handTyped&quot;), {
  temp_matrix &lt;- matrix(
    input$rateInputs_mi,
    nrow = 7,
    ncol = 2,
    dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
  )
  temp_matrix[1, 1] &lt;- 0
  temp_matrix[7, 1] &lt;- 70
  temp_matrix &lt;- temp_matrix[order(temp_matrix[, 1], decreasing = FALSE), ]
  rv$time &lt;- temp_matrix[, 1]
  rv$speed &lt;- temp_matrix[, 2]
})</code></pre><p>But there’s one more thing we need to do.</p><p>Normally with shinyMatrix, we don’t need to worry about updating the shinyMatrix with new values. It usually does that itself. But in this case, the end-user may have entered 71 for the last time value. If you look at the script above, we changed that value back to 70. That change only applies to the reactive value list, though! We need to make sure the shinyMatrix updates too.</p><p>As seen in the GIF below, we are able to change the first and last value to something other than 0 and 70. I can also change the order of the values. While the reactive list on the far right changes, the table for the shinyMatrix does not. You can replicate this yourself in the app as it currently stands.</p><center><img src="example06.gif" style="width:100.0%" /></center><p>To fix this, we update the matrix input with the reactive values.
You can see this with the <code>updateMatrixInput</code> function below:</p><pre class="r"><code>observeEvent(
  req(input$rateInputs_mi &amp;
        input$toggleInputSelect == &quot;handTyped&quot;), {
  temp_matrix &lt;- matrix(
    input$rateInputs_mi,
    nrow = 7,
    ncol = 2,
    dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
  )
  temp_matrix[1, 1] &lt;- 0
  temp_matrix[7, 1] &lt;- 70
  temp_matrix &lt;- temp_matrix[order(temp_matrix[, 1], decreasing = FALSE), ]
  rv$time &lt;- temp_matrix[, 1]
  rv$speed &lt;- temp_matrix[, 2]
  updateMatrixInput(session, &quot;rateInputs_mi&quot;, temp_matrix)
})</code></pre><p>Now try the same experiment as before! It works, right? If you look below, you can see that we can no longer “break the rules”. The matrix corrects itself.</p><center><img src="example07.gif" style="width:70.0%" /></center><p>But we’re still not done! We still need to ensure the plotly graph talks back to the matrixInput and vice versa!</p></div><div id="how-to-make-these-inputs-options-work-together" class="section level3"><h3>How to Make These Input Options Work Together</h3><p>When running your earlier experiment, you may have noticed the plotly graph doesn’t seem to change the matrix input.</p><p>We had programmed the second event observation function to update the plotly graph on the first tab. We did not program it to work the other way… yet!</p><p>This is an easy change to make. All you need to do is add the same <code>updateMatrixInput</code> function to your <code>observeEvent()</code> function for the plotly graph.
Here’s the full code below:</p><pre class="r"><code>observeEvent(event_data(event = &quot;plotly_relayout&quot;, source = &quot;speed_s&quot;), {
  # Speed 1 Event Data
  speed_ed &lt;- event_data(&quot;plotly_relayout&quot;, source = &quot;speed_s&quot;)
  speed_sa &lt;- speed_ed[grepl(&quot;^shapes.*anchor$&quot;, names(speed_ed))]
  speed_ri &lt;- unique(readr::parse_number(names(speed_sa)) + 1)
  speed_pts &lt;- as.numeric(speed_sa)

  # Speed 1 Point Updates
  temp_matrix &lt;- matrix(
    c(round(rv$time, 2), round(rv$speed, 2)),
    nrow = 7,
    ncol = 2,
    dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
  )
  temp_matrix[speed_ri, 1] &lt;- round(speed_pts[1], 2)
  temp_matrix[speed_ri, 2] &lt;- round(speed_pts[2], 2)
  temp_matrix &lt;- temp_matrix[order(temp_matrix[, 1], decreasing = FALSE), ]
  temp_matrix[1, 1] &lt;- 0
  temp_matrix[7, 1] &lt;- 70

  # Update reactive values
  rv$time &lt;- round(temp_matrix[, 1], 2)
  rv$speed &lt;- round(temp_matrix[, 2], 2)
  updateMatrixInput(session, &quot;rateInputs_mi&quot;, temp_matrix)
})</code></pre><p>And with that last addition, these two input methods should work together.
Give your app a preview and see if that’s the case!</p><p>The complete app should look something like this:</p><details><summary>Final app.R</summary><pre class="r"><code>### Load packages
library(shiny)
library(tidyverse)
library(plotly)
library(shinyMatrix)

### Define default matrix
rateInputs_m &lt;- matrix(
  c(0, 10, 15, 26, 29, 39, 70, 0.78, 1.05, 1.21, 0.67, 0.61, 0.67, 0.67),
  nrow = 7,
  ncol = 2,
  dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
)

### Define UI
ui &lt;- fluidPage(
  titlePanel(&quot;Plotly and Shiny Matrix Input Demonstration&quot;),
  column(
    4,
    radioButtons(
      &quot;toggleInputSelect&quot;,
      &quot;Input Method:&quot;,
      choices = c(&quot;Drag-and-Drop&quot; = &quot;dragDrop&quot;, &quot;Hand Typed&quot; = &quot;handTyped&quot;)
    ),
    br(),
    conditionalPanel(
      condition = &quot;input.toggleInputSelect==&#39;dragDrop&#39;&quot;,
      plotlyOutput(&quot;speed_p&quot;, height = &quot;250px&quot;)
    ),
    conditionalPanel(
      condition = &quot;input.toggleInputSelect==&#39;handTyped&#39;&quot;,
      matrixInput(
        &quot;rateInputs_mi&quot;,
        value = rateInputs_m,
        class = &quot;numeric&quot;,
        row = list(names = FALSE)
      )
    )
  ),
  column(
    8,
    tabsetPanel(
      id = &quot;tabs&quot;,
      tabPanel(
        &quot;Algorithm Tab&quot;,
        value = &quot;algorithmOutput&quot;,
        column(3, br(), tags$h4(&quot;Original Values&quot;), tableOutput(&quot;table1&quot;)),
        column(3, br(), tags$h4(&quot;Matrix Inputs&quot;), tableOutput(&quot;table2&quot;)),
        column(3, br(), tags$h4(&quot;Reactive Values&quot;), tableOutput(&quot;table3&quot;))
      )
    )
  )
)

### Define server logic
server &lt;- function(input, output, session) {
  output$table1 &lt;- renderTable({ rateInputs_m })
  output$table2 &lt;- renderTable({ input$rateInputs_mi })
  output$table3 &lt;- renderTable({
    req(rv$time)
    data.frame(rv$time, rv$speed)
  })

  # Creating Reactive Values
  rv &lt;- reactiveValues(
    time = rateInputs_m[, 1],
    speed = rateInputs_m[, 2]
  )

  # Speed 1&#39;s Plot and Table and Feedback
  output$speed_p &lt;- renderPlotly({
    speed_c &lt;- map2(
      rv$time,
      rv$speed,
      ~ list(
        type = &quot;circle&quot;,
        xanchor = .x,
        yanchor = .y,
        x0 = -4, x1 = 4, y0 = -4, y1 = 4,
        xsizemode = &quot;pixel&quot;,
        ysizemode = &quot;pixel&quot;,
        fillcolor = &quot;grey&quot;,
        line = list(color = &quot;black&quot;)
      )
    )
    plot_ly(source = &quot;speed_s&quot;) %&gt;%
      add_lines(x = rv$time, y = rv$speed, color = I(&quot;black&quot;)) %&gt;%
      layout(
        shapes = speed_c,
        xaxis = list(title = &quot;Time&quot;),
        yaxis = list(title = &quot;Speed&quot;),
        showlegend = FALSE
      ) %&gt;%
      config(edits = list(shapePosition = TRUE), displayModeBar = FALSE)
  })

  observeEvent(event_data(event = &quot;plotly_relayout&quot;, source = &quot;speed_s&quot;), {
    # Speed 1 Event Data
    speed_ed &lt;- event_data(&quot;plotly_relayout&quot;, source = &quot;speed_s&quot;)
    speed_sa &lt;- speed_ed[grepl(&quot;^shapes.*anchor$&quot;, names(speed_ed))]
    speed_ri &lt;- unique(readr::parse_number(names(speed_sa)) + 1)
    speed_pts &lt;- as.numeric(speed_sa)

    # Speed 1 Point Updates
    temp_matrix &lt;- matrix(
      c(round(rv$time, 2), round(rv$speed, 2)),
      nrow = 7,
      ncol = 2,
      dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
    )
    temp_matrix[speed_ri, 1] &lt;- round(speed_pts[1], 2)
    temp_matrix[speed_ri, 2] &lt;- round(speed_pts[2], 2)
    temp_matrix &lt;- temp_matrix[order(temp_matrix[, 1], decreasing = FALSE), ]
    temp_matrix[1, 1] &lt;- 0
    temp_matrix[7, 1] &lt;- 70

    # Update reactive values
    rv$time &lt;- round(temp_matrix[, 1], 2)
    rv$speed &lt;- round(temp_matrix[, 2], 2)
    updateMatrixInput(session, &quot;rateInputs_mi&quot;, temp_matrix)
  })

  observeEvent(
    req(input$rateInputs_mi &amp;
          input$toggleInputSelect == &quot;handTyped&quot;), {
    temp_matrix &lt;- matrix(
      input$rateInputs_mi,
      nrow = 7,
      ncol = 2,
      dimnames = list(NULL, c(&quot;Time&quot;, &quot;Speed&quot;))
    )
    temp_matrix[1, 1] &lt;- 0
    temp_matrix[7, 1] &lt;- 70
    temp_matrix &lt;- temp_matrix[order(temp_matrix[, 1], decreasing = FALSE), ]
    rv$time &lt;- temp_matrix[, 1]
    rv$speed &lt;- temp_matrix[, 2]
    updateMatrixInput(session, &quot;rateInputs_mi&quot;, temp_matrix)
  })
}

### Run the application
shinyApp(ui = ui, server = server)</code></pre></details><p>See the app on <a
href="https://colorado.rstudio.com/rsc/connect/#/apps/14a2ce4f-4248-4c05-ac80-8e7db0d7d6de/access">RStudio Connect</a>.</p></div><div id="things-to-remember" class="section level3"><h3>Things to Remember</h3><p>If there’s one thing I hope you learned from this tutorial, it’s that Shiny apps are powerful tools. Data scientists can build applications that allow engineers, manufacturers, and plant workers to use machine learning to improve results and save money. Shiny offers endless ways to deliver this functionality in a flexible, intuitive way for the end user. The only hard part is the programming. Invest the time in learning the observation and reactive functions. Find new and novel ways to use existing packages, such as plotly and shinyMatrix, to build something cool.</p><p><strong>Watch Taylor’s R in Manufacturing talk here:</strong></p><script src="https://fast.wistia.com/embed/medias/ozwqigkag9.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_ozwqigkag9 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/ozwqigkag9/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></div></description></item><item><title>RStudio 2021.09.0 Update: What's New</title><link>https://www.rstudio.com/blog/rstudio-2021.09.0-update-whats-new/</link><pubDate>Wed, 29 Sep 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-2021.09.0-update-whats-new/</guid><description><div
style="font-size:60%; display: flex; justify-content: center">Photo by Mick Fournier / HBI Producers of Fine Orchids</div><p>This post describes some of the improvements contained in the RStudio update 2021.09.0, code-named Ghost Orchid:</p><ul><li><a href="#calendar-versioning">Calendar Versioning</a></li><li><a href="#improved-usability-when-r-is-busy">Improved Usability When R is Busy</a></li><li><a href="#logging-changes">Logging Changes</a></li><li><a href="#high-dpi-retina-plots">High DPI Retina Plots</a></li><li><a href="#replay-local-jobs">Replay Local Jobs</a></li><li><a href="#load-balancing">Load Balancing</a></li><li><a href="#kubernetes---supplemental-groups">Kubernetes Supplemental Groups</a></li></ul><h3 id="calendar-versioning">Calendar Versioning</h3><p>One noticeable change in this update is the shift to a calendar-based versioning scheme for all RStudio products. This release supersedes v1.4.1717-3. See <a href="https://blog.rstudio.com/2021/08/30/calendar-versioning-for-commercial-rstudio-products/">this post</a> for details.</p><h3 id="improved-usability-when-r-is-busy">Improved Usability When R is Busy</h3><p>This release contains a change to improve how the IDE responds to the user when R is in the midst of some types of busy operations. Now you are able to save your changes, use the terminal, and open new files even when R is busy.</p><img align="center" style="padding: 35px;" src="r-is-busy.png"><p>To understand this change, it&rsquo;s helpful to understand a bit more about how R works under the hood. Like Python and many other languages, R uses a single thread of execution for access to data so that programmers do not have to worry about locking and concurrency. This design, however, prevents R from supporting preemptive multitasking, a feature that allows the operating system to interrupt a program in the midst of whatever it&rsquo;s doing at the request of the user.
Instead, most long-running R routines are written to periodically yield control back to the system. This allows the system to check whether the user has requested an interrupt, and to do background tasks for user interactivity. These checks allow the RStudio IDE to support interrupts, and to respond to clicks and commands from the user.</p><p>Some long-running R operations unfortunately do not make these yield calls, and when this happens in v1.4, it&rsquo;s not possible to open or save files or use the terminal, and the IDE may not respond at all. Further, when using the IDE to connect to a remote server, the browser only allows a small number of outstanding requests. Once those are all used up, it causes a logjam in which even the abort request is not received. In these extreme circumstances, the only option left for the user in v1.4 is to wait for the operation to complete or to refresh the browser.</p><p>Now in 2021.09.0, the RStudio IDE allows you to open and save files even when R is in one of these uncooperative busy states. It also allows you to use the terminal so you can monitor the CPU usage and see the state of files on the disk. Another change is that the memory usage statistic will update even when R is busy.</p><p>And to avoid the browser logjam with RStudio Server, a placeholder response is returned to free up the browser connection. This means that you&rsquo;ll always be able to send an interrupt or abort request to the IDE.</p><h3 id="logging-changes">Logging Changes</h3><p>System log files help administrators diagnose problems. In this release, we have improved both the logging infrastructure and the quality and quantity of logging messages for the open source products and RStudio Workbench.
As with the previous release, all logging configuration settings should be placed in <code>/etc/rstudio/logging.conf</code>.</p><h4 id="message-improvements">Message Improvements</h4><p>There have been a number of improvements to the messages logged by the system:</p><ul><li>Info messages logged at startup to detail which configuration files were used</li><li>Additional debug messages for the rserver to trace request handling and load balancing</li><li>Details added to a number of errors to provide more context</li></ul><h4 id="file-logging">File Logging</h4><p>The default setting for <code>logger-type</code> has changed to <code>file</code> from <code>syslog</code>. Even with this default, warnings and errors continue to be logged to syslog by default, but that can be disabled by setting <code>warn-syslog=0</code>. By default, log files are placed in the new directory <code>/var/log/rstudio</code>. There is an <code>rstudio-server</code> subdirectory that contains rserver and rsession logs. Those pertaining to the job launcher are placed in the <code>launcher</code> subdirectory. This is part of an effort to put all RStudio logs in one place.</p><p>Note that the RStudio monitor log, which combines server and session logs into one stream, is still created in the old location: <code>/var/lib/rstudio-server/monitor/log/rstudio-server.log</code>.</p><h4 id="json-format">JSON Format</h4><p>To support better integration with external log file indexing and search tools, log files can be formatted as JSON by setting:</p><p><code>log-message-format=json</code></p><h4 id="session-protocol-debugging">Session Protocol Debugging</h4><p>To diagnose problems with an R session, it can be helpful to see the sequence of requests that lead up to an error. There&rsquo;s a new IDE preference called <code>Session Protocol Debug</code>, settable from the <a href="https://blog.rstudio.com/2020/10/14/rstudio-v1-4-preview-command-palette/">Command Palette</a>.
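</p><p>As a concrete sketch, the file logging and JSON format options described in this post can be combined in <code>/etc/rstudio/logging.conf</code>. The fragment below is illustrative only, not a recommended configuration; it assumes the admin guide&rsquo;s convention that a <code>[*]</code> section applies settings to all RStudio programs:</p><pre><code>[*]
logger-type=file
log-message-format=json
warn-syslog=0
</code></pre><p>With these settings, logs are written as JSON to files under <code>/var/log/rstudio</code>, and warnings and errors are no longer duplicated to syslog.</p><p>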
When this preference is enabled, it turns on debug logging for the session and enables a special option to show messages before and after each request handled by the IDE. This information is placed by default in a log file in <code>/var/log/rstudio/rstudio-server/rsession-username.log</code>.</p><p>This log complements the session diagnostics option, providing an easier-to-read log than the existing &lsquo;strace&rsquo; option that logs all system calls at a low level.</p><h4 id="logging-details">Logging Details</h4><p>Read more about RStudio logging in the <a href="https://docs.rstudio.com/ide/server-pro/server_management/logging.html">admin guide</a>.</p><h3 id="high-dpi-retina-plots">High DPI Retina Plots</h3><p>Images generated for displays that have more dots per inch (DPI), such as Mac Retina displays, should be rendered at a higher resolution to match what&rsquo;s possible with those displays. The Mac Desktop version of RStudio has supported higher-resolution plots for a long time, but in 2021.09.0, this feature is supported for RStudio Server and the other desktop versions as well.</p><h3 id="replay-local-jobs">Replay Local Jobs</h3><p>The RStudio IDE has a new button that allows you to quickly re-run a local job with the same parameters:</p><img align="center" style="padding: 35px;" src="replay-job.png"><h3 id="load-balancing">Load Balancing</h3><p>RStudio Workbench now supports a more flexible way to configure nodes in the cluster. For details, see <a href="https://blog.rstudio.com/2021/09/21/rstudio-workbench-load-balancing-changes/">this post</a>.</p><h3 id="kubernetes---supplemental-groups">Kubernetes - Supplemental Groups</h3><p>In RStudio Workbench, when using Kubernetes to manage session instances, the <code>launcher-sessions-create-container-user</code> option allows you to create a Unix account for the session user on the fly when the session starts. This eliminates the need to provision user accounts on the Kubernetes session image.
Instead, the user ID and group information from the user&rsquo;s account on the RStudio Workbench system is used to create a matching account on the system image when the session is started. This allows the session to access the network file systems that use those IDs for permissions.</p><p>New in 2021.09.0, enabling this option will also create supplemental group IDs.</p><h3 id="more-info">More Info</h3><p>There&rsquo;s lots more in this release, and it&rsquo;s <a href="https://www.rstudio.com/products/rstudio/download/">available for download today</a>. You can read about all the features and bugfixes in the &ldquo;Ghost Orchid&rdquo; update in the <a href="https://www.rstudio.com/products/rstudio/release-notes/">RStudio Release Notes</a>, and we&rsquo;d love to hear your feedback about the new release on our <a href="https://community.rstudio.com/c/rstudio-ide/9">community forum</a>.</p></description></item><item><title>What's New on RStudio Cloud - September 2021</title><link>https://www.rstudio.com/blog/what-s-new-on-rstudio-cloud-september-2021/</link><pubDate>Tue, 28 Sep 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/what-s-new-on-rstudio-cloud-september-2021/</guid><description><p>Whether you want to do, share, teach, or learn data science, <a href="https://www.rstudio.com/products/cloud/" target = "_blank" rel = "noopener noreferrer">RStudio Cloud</a> is a cloud-based solution that allows you to do so online. The RStudio Cloud team has rolled out new features and improvements since our last post in <a href="https://blog.rstudio.com/2021/05/03/rstudio-cloud2/" target = "_blank" rel = "noopener noreferrer">May 2021</a>.
So what’s new?</p><ul><li>Create and work with Jupyter Notebook projects as easily as RStudio IDE projects (currently in beta — <a href="https://community.rstudio.com/c/rstudio-cloud/14" target = "_blank" rel = "noopener noreferrer">we&rsquo;d love your feedback!</a>)</li><li>Work with more flexibility on RStudio Cloud with more hours per plan, more disk space per project, and a lower price for additional hours</li><li>Stay up to date with an automatic upgrade to Ubuntu 20.04 for all projects currently running Ubuntu 16.04</li></ul><p>Let’s take a closer look at these updates.</p><p><strong>Expand your data science workbench with Jupyter projects</strong></p><p>Jupyter Notebook projects are now available to Premium, Instructor, or Organization account holders. Once you are in <a href="https://rstudio.cloud/" target = "_blank" rel = "noopener noreferrer">RStudio Cloud</a>, you can create and work with Jupyter projects as easily as RStudio IDE projects. Click on the New Project button, then select New Jupyter Project from the menu that appears. If you haven&rsquo;t yet joined the beta program, you will be prompted to fill out a brief form — submit that and we&rsquo;ll get you into the program ASAP.</p><p><img src="jupyter-start.png" alt="Screenshot of new Jupyter notebook project option in RStudio Cloud"></p><p>Doing so allows you to work in a Jupyter notebook:</p><p><img src="jupyter-notebook.png" alt="Screenshot of Jupyter notebook in RStudio cloud"></p><p>This functionality is currently in beta — <a href="https://community.rstudio.com/c/rstudio-cloud/14" target = "_blank" rel = "noopener noreferrer">we&rsquo;d love to hear your feedback</a>.</p><p><strong>Pay Less! Get More!</strong></p><p>Need more time for your analysis? Or have other projects that you’d like to run? In RStudio Cloud, we’ve provided you with more cost-effective options for your data science work. We have bumped up the number of projects available on the Free and Plus plans to fifty.
We have also provided more flexibility if you need more time by increasing the monthly project hours included in each plan and halving the cost per additional hour.</p><div class="table-responsive pt-4"><table class="table table-striped"><tr><td></td><td><b>Number of projects included</b></td><td><b>Monthly project hours included</b></td><td><b>Cost per additional project hour</b></td></tr><tr><td>Free plan</td><td><strong>50</strong> (was 15)</td><td><strong>25</strong> (was 15)</td><td>-</td></tr><tr><td>Plus plan</td><td><strong>50</strong> (was 15)</td><td><strong>75</strong> (was 50)</td><td><strong>10¢ </strong>(was 20¢)</td></tr><tr><td>Premium plan</td><td>Unlimited</td><td><strong>200</strong> (was 160)</td><td><strong>10¢ </strong>(was 20¢)</td></tr><tr><td>Instructor plan</td><td>Unlimited</td><td><strong>300</strong> (was 160)</td><td><strong>10¢ </strong>(was 20¢)</td></tr></table></div><p>In addition, each project can now store up to 20GB of files, data, and packages on disk, up from the prior 3GB for files and data, and 3GB for packages.</p><p><strong>Work with confidence with upgraded Ubuntu</strong></p><p>In RStudio Cloud, you can use the latest versions of R and Python packages with confidence that the underlying operating system has the features they require. Starting on September 13, 2021, all projects still running Ubuntu 16.04 (Xenial) will be automatically upgraded to Ubuntu 20.04 (Focal) the next time they are opened. For more information, please visit this <a href="https://community.rstudio.com/t/rstudio-cloud-ubuntu-16-04-eol-on-sept-13-2021/113409" target = "_blank" rel = "noopener noreferrer">community article</a>.</p><p><strong>Learn more about RStudio Cloud</strong></p><p>We are excited to provide you with more capabilities so that you can jump right into your data science work. 
For more information and resources, please visit:</p><ul><li><a href="https://www.rstudio.com/products/cloud/" target = "_blank" rel = "noopener noreferrer">RStudio Cloud Product Page</a></li><li><a href="https://rstudio.cloud/learn/whats-new" target = "_blank" rel = "noopener noreferrer">What&rsquo;s New on RStudio Cloud</a></li><li><a href="https://community.rstudio.com/c/rstudio-cloud/14" target = "_blank" rel = "noopener noreferrer">RStudio Cloud Page on RStudio Community</a></li></ul></description></item><item><title>Curating for @WeAreRLadies on Twitter</title><link>https://www.rstudio.com/blog/curating-for-wearerladies-on-twitter/</link><pubDate>Thu, 23 Sep 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/curating-for-wearerladies-on-twitter/</guid><description><sup>Photo by <a href="https://unsplash.com/@_entreprenerd?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Arno Smit</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><div class="lt-gray-box"><p>This is a guest post from Shannon Pileggi, an enthusiastic professional educator and statistical consultant with over ten years of experience collaborating on data analysis with diverse partners in industry, tech, public health, and clinical research. Find this post on Shannon’s website <a href="https://www.pipinghotdata.com/" target = "_blank" rel = "noopener noreferrer">here</a>.</p></div><div id="tl-dr" class="level1"><h2>TL; DR</h2><p>In February 2021, I tweeted to a daunting &gt;20k followers by curating for <a href="https://twitter.com/WeAreRLadies" target = "_blank" rel = "noopener noreferrer"><code>@WeAreRLadies</code></a> on Twitter. This was a great opportunity to share knowledge, interact with others, and learn something in return, ultimately cultivating new connections and collaborations.
From preparation to fruition, I hope this post helps you confidently enroll as a curator!</p></div><div id="about-wearerladies" class="level1"><h2>About <code>@WeAreRLadies</code></h2><p>The <code>@WeAreRLadies</code> rotating Twitter curator exists to “encourage and maintain Twitter engagement within the R-Ladies community”, and to “spotlight female and minority genders” working with R. R-Ladies has a <a href="https://guide.rladies.org/rocur/about/" target = "_blank" rel = "noopener noreferrer">comprehensive guide</a> describing the program, procedures and protocols for the week, and tips for successful curation.</p></div><div id="overcoming-imposter-syndrome" class="level1"><h2>Overcoming imposter syndrome</h2><p>You may be hesitant to sign up as a curator due to imposter syndrome - I certainly was. I was on Twitter for three years before I gathered the courage. However, you do not need to know everything about R or Twitter in order to be a successful <code>@WeAreRLadies</code> curator - that is impossible! In fact,</p><center><img src="rcomm.png" width="450"></center><p>Every R user, new or experienced, has a valuable perspective to share. I was particularly impressed when <a href="https://twitter.com/daniebrant" target = "_blank" rel = "noopener noreferrer">Danielle Brantley</a> excellently curated in September 2020, after being an R user for one year! To help alleviate general R imposter syndrome, check out Caitlin Hudon’s blog post on <a href="https://caitlinhudon.com/2018/01/19/imposter-syndrome-in-data-science/" target = "_blank" rel = "noopener noreferrer">imposter syndrome in data science</a>; to increase comfort with Twitter, try <a href="https://www.t4rstats.com/" target = "_blank" rel = "noopener noreferrer">Twitter for R Programmers</a> by Oscar Baruffa and Veerle van Son.</p><p>My personal strategy for combating imposter syndrome is to prepare.
For my curating week, my preparation involved reflecting on past curators and creating some content in advance. I hope this post helps you to prepare and motivates you to sign up. 😉</p></div><div id="timeline" class="level1"><h2>Timeline</h2><p>Here is my personal timeline leading up to curation.</p><div class="table-responsive"><table class="table-striped-odd-bg"><thead><tr class="header"><th align="left">Time before curation</th><th align="left">Action taken</th></tr></thead><tbody><tr class="odd"><td align="left">3 years</td><td align="left">Became active on Twitter</td></tr><tr class="even"><td align="left">3 months</td><td align="left">Signed up to curate</td></tr><tr class="odd"><td align="left">3 weeks</td><td align="left">Notified manager; discussed work-related content</td></tr><tr class="even"><td align="left">2 weeks</td><td align="left">Researched R-Ladies fonts and colors</td></tr><tr class="odd"><td align="left">1 week</td><td align="left">Started drafting tweets</td></tr><tr class="even"><td align="left">1 day</td><td align="left">Fiddled with formats for code gifs</td></tr></tbody></table></div></div><div id="selecting-a-date" class="level1"><h2>Selecting a date</h2><p>You can view the <a href="https://docs.google.com/spreadsheets/d/13NwIphQ6o-3YJUbHtbDRf4texfMOCvhIDNZgDZhHv7U/edit#gid=1322160368" target = "_blank" rel = "noopener noreferrer">schedule</a> of upcoming curators to identify available dates; records of previous curators are also maintained there.</p><p>Being a curator will be time-intensive, so be kind to yourself. Choose dates when you will have time to invest and a flexible work schedule. I chose Feb 15-20 because I hoped by then I would be recovered from an intense Q4 work cycle; additionally, Feb 15 (President’s Day) was a company holiday.
You may want to select a date far enough in the future to allow you time to create content.</p><p>Another consideration is to schedule your curation to coincide with dates that align with your interests. For example, are you passionate about Black History Month in February, LGBT Pride Month in June, or Universal Human Rights Month in December? If so, take advantage of the <code>@WeAreRLadies</code> large platform as an opportunity to inform and educate others on issues that are important to you as they relate to the R community. <a href="https://www.diversitybestpractices.com/2021-diversity-holidays" target = "_blank" rel = "noopener noreferrer">Diversity Best Practices</a> has a comprehensive list of calendar holidays and observances.</p></div><div id="signing-up" class="level1"><h2>Signing up</h2><p>You sign up by <a href="https://docs.google.com/forms/d/e/1FAIpQLSepXaNf2z_hvekoNKIbS1hA1B8Z_3B9p5WK0Kk6wlHIDIw2Lg/viewform" target = "_blank" rel = "noopener noreferrer">submitting a form</a> - give yourself at least 30 minutes, as part of the form includes filling out details that complete your <a href="https://twitter.com/WeAreRLadies/status/1361139211819180032" target = "_blank" rel = "noopener noreferrer">curating profile</a>.</p><p>There was a gap between when I filled out the form and when I was confirmed as a curator, which I suspect was due to timing and holidays. Be kind, be patient - all R-Ladies organizers are volunteers.</p></div><div id="notifying-my-manager" class="level1"><h2>Notifying my manager</h2><p>About three weeks before my curation, I started planning my curating efforts a bit more seriously. I notified my manager that I was curating, and I discussed potential work-related content with her. One idea was approved and another was reasonably denied.
This honest conversation facilitated new awareness about my passions - my manager was not aware of R-Ladies, and she was enthusiastic and supportive.</p></div><div id="styling-content" class="level1"><h2>Styling content</h2><p>Additionally, I considered how to visually style content beyond text in a tweet. I asked on <a href="https://rladies-community-slack.herokuapp.com/" target = "_blank" rel = "noopener noreferrer">R-Ladies Slack</a> about R-Ladies styles, and I was directed to the xaringan R-Ladies <a href="https://github.com/yihui/xaringan/tree/master/inst/rmarkdown/templates/xaringan/resources" target = "_blank" rel = "noopener noreferrer">css</a> and the R-Ladies <a href="https://guide.rladies.org/organization/tech/brand/" target = "_blank" rel = "noopener noreferrer">branding</a> guides. You are not required to use R-Ladies style and branding, but it was convenient for me.</p><p>I developed two visual layouts using the Google slide <a href="https://docs.google.com/presentation/d/1sriC2biLPYza_TtGiZkrNDsv3AMg6dvhqw2yit4wNnA/edit" target = "_blank" rel = "noopener noreferrer">template</a> from R-Ladies branding (see tweets for <a href="https://twitter.com/WeAreRLadies/status/1363144545677017089" target = "_blank" rel = "noopener noreferrer">blogdown vs distill</a>) and <a href="https://twitter.com/WeAreRLadies/status/1362370580708790274" target = "_blank" rel = "noopener noreferrer">asking for help online</a>.</p><center><div class="py-4"><img src="distill.png" width="600"/></div><caption>Comparison of blogdown vs distill styled using R-Ladies Google slide template.</caption></center><p>I also created five R-Ladies styled code gifs with xaringan and flipbookr - methods and code are in this <a href="https://www.pipinghotdata.com/posts/2021-03-08-r-ladies-styled-code-gifs-with-xaringan-and-flipbookr/" target = "_blank" rel = "noopener noreferrer">blog post</a>.
Here is an example code gif:</p><center><div class="py-4"><img src="walrus.gif"/></div><caption>Example R-Ladies styled code gif.</caption></center></div><div id="drafting-content" class="level1"><h2>Drafting content</h2><p>Leading up to my curation week, I regularly jotted down brief notes of content ideas. The week before curation, I started fleshing out those ideas into actual tweets and wrote them down in a document. Not all of my ideas ended up in a draft, and rarely did the draft get tweeted out exactly as I had written it.</p><p>One challenge with drafting tweets in a document was being mindful of character limits and anticipating where the breaks would be for threads. I started copying content into a send tweet window to preview and then pasting it back into my draft document. There is software that facilitates drafting tweets - for example, <a href="https://twitter.com/DaphnaHarel" target = "_blank" rel = "noopener noreferrer">Daphna Harel</a> recommended <a href="https://getchirrapp.com/" target = "_blank" rel = "noopener noreferrer">getchirrapp.com</a> to me the week of my curation. I also kept <a href="https://emojipedia.org/" target = "_blank" rel = "noopener noreferrer">emojipedia</a> open all week to easily copy and paste emojis into drafts.</p><p>Not all content was premeditated - I also tweeted in the moment. For example, the <a href="https://twitter.com/WeAreRLadies/status/1362016896116219904?s=20" target = "_blank" rel = "noopener noreferrer">W.E.B.
Du Bois’</a> <code>#TidyTuesday</code> visualizations were incredible that week, and I also tweeted when I <a href="https://twitter.com/WeAreRLadies/status/1362509983368249346?s=20" target = "_blank" rel = "noopener noreferrer">realized</a> a new colleague wasn’t yet taking advantage of RStudio projects.</p></div><div id="content-inspiration" class="level1"><h2>Content inspiration</h2><p>As I approached my curating week, I recalled previous <code>@WeAreRLadies</code> curators that were memorable for me, my previous experience as an educator, and some reflection questions to inspire content.</p><div class="table-responsive"><table class="table-striped-odd-bg"><colgroup><col width="47%" /><col width="52%" /></colgroup><thead><tr class=""><th>Inspiration source</th><th>Example realization</th></tr></thead><tbody><tr class="odd"><td>1. Rotating curator Mine Çetinkaya-Rundel <a href="https://twitter.com/minebocek" target = "_blank" rel = "noopener noreferrer"><code>@minebocek</code></a> tweets awesome <a href="https://twitter.com/WeAreRLadies/status/1064704918102163457" target = "_blank" rel = "noopener noreferrer">gifs</a></td><td><a href="https://twitter.com/WeAreRLadies/status/1361802517735178243" target = "_blank" rel = "noopener noreferrer">Gifs</a> for R code demos</td></tr><tr class="even"><td>2. Rotating curator Megan Stodel <a href="https://twitter.com/MeganStodel" target = "_blank" rel = "noopener noreferrer"><code>@MeganStodel</code></a> tweets a <a href="https://twitter.com/WeAreRLadies/status/1313177623128944645" target = "_blank" rel = "noopener noreferrer">project</a> inspired by curating</td><td>An R project to <a href="https://twitter.com/WeAreRLadies/status/1361286341317779456" target = "_blank" rel = "noopener noreferrer">introduce myself</a> as a curator</td></tr><tr class="odd"><td>3.
Rotating curator Julia Piaskowski <a href="https://twitter.com/SeedsAndBreeds" target = "_blank" rel = "noopener noreferrer"><code>@SeedsAndBreeds</code></a> tweets a great technical thread on <a href="https://twitter.com/WeAreRLadies/status/1223790298024726528" target = "_blank" rel = "noopener noreferrer">ANOVA</a></td><td>A thread on <a href="https://twitter.com/WeAreRLadies/status/1363144545677017089" target = "_blank" rel = "noopener noreferrer">blogging resources</a></td></tr><tr class="even"><td>4. Prior experience as an educator</td><td>Starting discussion with a <a href="https://twitter.com/WeAreRLadies/status/1361332603274612739" target = "_blank" rel = "noopener noreferrer">question</a></td></tr><tr class="odd"><td>5. What am I passionate about lately?</td><td><a href="https://twitter.com/WeAreRLadies/status/1363144545677017089" target = "_blank" rel = "noopener noreferrer">Blogging</a></td></tr><tr class="even"><td>6. What did I have to overcome to be where I am today?</td><td>Learning how to <a href="https://twitter.com/WeAreRLadies/status/1362370580708790274" target = "_blank" rel = "noopener noreferrer">ask for help online</a></td></tr><tr class="odd"><td>7. What have colleagues or students asked me about?</td><td>What needs <a href="https://twitter.com/WeAreRLadies/status/1362114431090573315" target = "_blank" rel = "noopener noreferrer">updating</a> and when</td></tr><tr class="even"><td>8. What are some R functions or packages that have helped me recently?</td><td><a href="https://twitter.com/WeAreRLadies/status/1361418527870132226" target = "_blank" rel = "noopener noreferrer">sortable</a> package</td></tr><tr class="odd"><td>9. 
What are R-Ladies voices I can amplify while I have this large platform?</td><td><a href="https://twitter.com/WeAreRLadies/status/1362866281918128132" target = "_blank" rel = "noopener noreferrer">Quote tweeting questions</a></td></tr></tbody></table></div></div><div id="polls" class="level1"><h2>Polls</h2><p>Reminiscing about my days teaching in large lecture halls with students actively participating in polling questions through clickers, I planned three polls for the week. Polls on Twitter are open for 24 hours and allow up to four response options. The approach was to launch the poll, collect responses, and then discuss. Here are the three polls that I launched during my curation, with follow-up discussion:</p><div><img src="poll.png" /></div></div><div id="first-and-last-tweets" class="level1"><h2>First and last tweets</h2><p>The introduction and farewell tweets as a curator are important, as this is when you actually tell people your name or personal Twitter handle. To generate engagement, I aimed to create content-rich <a href="https://twitter.com/WeAreRLadies/status/1361286341317779456" target = "_blank" rel = "noopener noreferrer">first</a> and <a href="https://twitter.com/WeAreRLadies/status/1363286037846511616" target = "_blank" rel = "noopener noreferrer">last</a> tweets to give users more motivation to like or re-tweet, and I also connected the content with links to my blog so that users could easily learn more about me.</p></div><div id="tweetdeck" class="level1"><h2>TweetDeck</h2><p>When you serve as a curator, you will be tweeting from <a href="https://tweetdeck.twitter.com/" target = "_blank" rel = "noopener noreferrer">TweetDeck</a>, and it is hard to separate the curator experience from the technology.
Tweeting from TweetDeck can be overwhelming compared to the standard Twitter interface.</p><p>Moreover, there were limitations to the platform that added challenges to curating, which included:</p><ol style="list-style-type: decimal"><li>There was no <code>+</code> enabled to easily create threads (I had to send a tweet and then comment on the tweet, and it was initially hard to ensure the thread appeared in the correct order). Yes, I deleted many out-of-order tweets.</li><li>Consequently, I could not draft and save threads in TweetDeck to send later.</li><li>I could not send polls from the curator account on TweetDeck; polls were sent from my personal account and then re-tweeted from the curator account. Adding this context can help better frame polls.</li><li>Depending on the content already in the send tweet interface, sometimes other options in TweetDeck would disappear, like the emoji, gifs, and upload image buttons. I kept <a href="https://emojipedia.org/" target = "_blank" rel = "noopener noreferrer">emojipedia</a> open to easily copy and paste emojis into my tweets, and it took trial and error to get everything I wanted in a single tweet.</li><li>When uploading local content, you can add <a href="https://help.twitter.com/en/using-twitter/picture-descriptions" target = "_blank" rel = "noopener noreferrer">descriptions</a> to both gifs and images in the regular Twitter interface to create inclusive content for community members who use assistive reading technology; however, in TweetDeck, descriptions were enabled for images but not gifs.</li><li>With TweetDeck, you can tweet from both your personal account and your curator account. You can set the options to default to the curator account, <em>but</em> there were still some instances where I managed to inadvertently tweet from my personal account when I meant to tweet from the curator account.
(I did delete the tweet and re-tweet from the correct account.)</li></ol><p>I spent a lot of time my first couple of days as a curator getting used to TweetDeck, reaching out to other curators for tips, and researching alternative solutions and plug-ins that ultimately did not help. Twitter is targeting TweetDeck <a href="https://www.theverge.com/2021/3/9/22321991/twitter-tweetdeck-overhaul-redesign-product-changes" target = "_blank" rel = "noopener noreferrer">enhancements</a> later in 2021, so I don’t think it is worth documenting all of my methods and work-arounds. However, if you are serving as a curator and struggling with TweetDeck, please reach out - I am happy to share what ended up working for me. You can also prepare yourself by practicing tweeting from TweetDeck with your personal account prior to curating.</p></div><div id="what-i-would-have-done-differently" class="level1"><h2>What I would have done differently</h2><p>It was a whirlwind week! Here are a few things I would have done differently.</p><ol style="list-style-type: decimal"><li>Practice with TweetDeck in advance. Literally, force yourself to tweet from TweetDeck at least a week before your curation. TweetDeck is very different from the standard Twitter interface, and it took me a few days to get used to it.</li><li>Figure out how I wanted to style shared code in advance - my first couple of days would have gone more smoothly with this.</li><li>Preface polls tweeted from my personal account with the context that they are for curator rotation.</li><li>Prepare a tweet in honor of any holidays or significant events coinciding with your curation week. One regret that I do have from my curating week is failing to explicitly acknowledge Black History Month as I was tweeting in February.
I wish I had prepared at least one tweet or better amplified the voices of Black members of the R community while I had the large platform.</li></ol></div><div id="fleeting-fame" class="level1"><h2>Fleeting fame</h2><p>When curating, your tweets in the moment are highly visible. But what persists afterward is fairly anonymous, as your tweets are not linked to your personal profile unless you tag yourself. In a weird way, it actually becomes a safe place to put yourself out there with questions you might not have been comfortable asking from your own personal account. Take advantage of this fleeting fame not just to share your knowledge but also to ask your questions.</p></div><div id="supporting-your-curators" class="level1"><h2>Supporting your curators</h2><p>Just because a Twitter account has &gt;20K followers, the likes, re-tweets, and comments don’t come automatically. You still have to earn engagement with your content. Many of the tweets I sent had little engagement, and that is okay. Supporting your curators by engaging with their tweets or sending notes of encouragement is <em>much</em> appreciated. I thank <strong>everyone</strong> who engaged with me during my curation, with a special shout-out to Alison Hill, who re-energized me mid-week with comments on the R-Ladies bloggers <a href="https://twitter.com/WeAreRLadies/status/1362021673239785473" target = "_blank" rel = "noopener noreferrer">thread</a>. I cannot emphasize this enough: every like, re-tweet, comment, and direct message helps!</p><p>In addition, if you have curated in the past, consider sending new curators a personal welcome message and an invitation to ask you any questions.
Following my curation week, I offered camaraderie and tips to <a href="https://twitter.com/alehsegura13" target = "_blank" rel = "noopener noreferrer">Ale Segura</a>, and in return, she did the same for <a href="https://twitter.com/ShreyaLouis" target = "_blank" rel = "noopener noreferrer">Shreya Louis</a> following her.</p></div><div id="reflection" class="level1"><h2>Reflection</h2><p>Between prepared and ad-hoc content and discussions with followers, I tweeted a lot! (At least for me.) Here is a <a href="https://twitter.com/spcanelon/status/1363518469782843396" target = "_blank" rel = "noopener noreferrer">summary thread</a> of my tweets for the week. My tweets were not perfect, and that is okay. I messed up threads, had typos, and shared deprecated code, among other things. Check out my <a href="https://twitter.com/PipingHotData/status/1364183660744896513" target = "_blank" rel = "noopener noreferrer">blooper reel</a><a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a> for tweets that I bungled.</p><p>Serving as a curator was intimidating and time consuming, but I am very glad I did it. 
Many good things have happened as a direct result of that week, including:</p><ul><li><a href="https://twitter.com/WeAreRLadies/status/1363144545677017089" target = "_blank" rel = "noopener noreferrer">discussing</a> comparisons between <code>{blogdown}</code> and <code>{distill}</code> with Alison Hill.</li><li><a href="https://www.pipinghotdata.com/posts/2021-03-08-r-ladies-styled-code-gifs-with-xaringan-and-flipbookr/" target = "_blank" rel = "noopener noreferrer">collaborating</a> with Silvia Canelón to style code gifs.</li><li>engaging with new people on Twitter that I want to continue to engage with.</li><li>learning about valuable new-to-me packages, functions, and workflows.</li><li>being invited to <a href="https://www.pipinghotdata.com/talks/2021-04-22-growing-into-the-r-community/" target = "_blank" rel = "noopener noreferrer">speak for R-Ladies Miami</a>.</li><li>seeing my “Asking for help online” content re-used in Sharla Gelfand’s <a href="https://twitter.com/sharlagelfand/status/1365665149063987201" target = "_blank" rel = "noopener noreferrer">make a reprex… please</a> presentation.</li><li>co-developing a unit testing workshop with Gordon Shotwell for R-Ladies Philly.</li></ul><p>During my curating week, I tried to embody the tweets that I value: honest questions, thoughtful discussion, generous sharing, supportive community, and humorous exchanges. To borrow from Vicki Boykis in the rstudio::global(2021) <a href="https://rstudio.com/resources/rstudioglobal-2021/your-public-garden/" target = "_blank" rel = "noopener noreferrer">keynote</a>, I created my own public garden that cultivated new connections and collaborations.
And now, I am more confident in continuing these practices from my personal Twitter account.</p></div><div id="acknowledgements" class="level1"><h2>Acknowledgements</h2><p>Thank you to <a href="https://twitter.com/ma_salmon" target = "_blank" rel = "noopener noreferrer">Maëlle Salmon</a> and <a href="https://twitter.com/apreshill" target = "_blank" rel = "noopener noreferrer">Alison Hill</a> for encouraging me to write this - it might not have happened without you! Thank you also to Maëlle Salmon, <a href="https://twitter.com/alehsegura13" target = "_blank" rel = "noopener noreferrer">Ale Segura</a>, and <a href="https://twitter.com/ivelasq3" target = "_blank" rel = "noopener noreferrer">Isabella Velásquez</a> for your suggestions; I truly appreciate your sharp eyes and thoughtful feedback. 💜</p></div><div class="footnotes"><hr /><ol><li id="fn1"><p>A blooper is an embarrassing mistake, often sports-related, and humorous in retrospect; a blooper reel is a compilation of multiple bloopers.<a href="#fnref1" class="footnote-back">↩︎</a></p></li></ol></div></description></item><item><title>RStudio Package Manager 2021.09.0 - Capturing and Maintaining Working Repositories</title><link>https://www.rstudio.com/blog/rstudio-package-manager-2021-09-0-capturing-and-maintaining-working-repositories/</link><pubDate>Wed, 22 Sep 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-package-manager-2021-09-0-capturing-and-maintaining-working-repositories/</guid><description><caption>Photo by <a href="https://unsplash.com/@timmossholder?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Tim Mossholder</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Unsplash</a></caption><p>This release adds new management, service, and configuration options to RStudio Package Manager. 
Highlights include a more versatile repository calendar, more flexibility in serving multiple binary package versions, and more options for configuring git sources.</p><p>Packages are critical to data science, but keeping them in sync and working together at scale is often a challenging, frustrating task. <a href="https://www.rstudio.com/products/package-manager" target = "_blank" rel = "noopener noreferrer">RStudio Package Manager</a> is our pro offering that simplifies package management across your team, department, or entire organization for reproducible, maintainable, and secure code repositories. This release includes improvements to our management, service, and configuration options, such as:</p><ul><li>A new, more flexible repository calendar. Users can now freeze to any date in the repository&rsquo;s history, and frozen repository URLs now include the snapshot date in YYYY-MM-DD format.</li><li>More flexibility in serving multiple binary package versions. RStudio Package Manager can now serve binary packages for new R versions and operating systems without upgrading to a new version.</li><li>More configuration options when using git sources. You can now edit the SSH key, git URL, branch, and subdirectories of existing sources. You can also now use a file to watch for changes.</li><li>Improvements in logging standardization and usability.
Logs now work more like other RStudio team products, including being available through <code>journalctl</code>.</li><li>Support for the Bioconductor books repository.</li><li>Many important bug fixes.</li></ul><p>Check out our <a href="https://docs.rstudio.com/rspm/news/#rstudio-package-manager-2021090" target = "_blank" rel = "noopener noreferrer">release notes</a> for more details and click <a href="https://www.rstudio.com/products/package-manager/download-commercial/" target = "_blank" rel = "noopener noreferrer">here</a> to upgrade today.</p><blockquote><p>NOTE on versioning: As part of this release, we’ve moved to calendar-based versioning. <a href="https://blog.rstudio.com/2021/08/30/calendar-versioning-for-commercial-rstudio-products/" target = "_blank" rel = "noopener noreferrer">See this blog post</a> for details.</p></blockquote><p><strong>For more information</strong></p><ul><li>For an overview of best practices for open source package management, check out this free webinar, <a href="https://www.rstudio.com/resources/webinars/managing-packages-for-open-source-data-science/" target = "_blank" rel = "noopener noreferrer">Managing Packages for Open Source Data Science</a>, and <a href="https://blog.rstudio.com/2021/05/06/pkg-mgmt-admins/" target = "_blank" rel = "noopener noreferrer">this series of blog posts</a>.</li><li>For more information on RStudio Package Manager, see the <a href="https://www.rstudio.com/products/package-manager/" target = "_blank" rel = "noopener noreferrer">RStudio Package Manager product page</a>.</li></ul></description></item><item><title>RStudio Workbench Load Balancing Changes</title><link>https://www.rstudio.com/blog/rstudio-workbench-load-balancing-changes/</link><pubDate>Tue, 21 Sep 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-workbench-load-balancing-changes/</guid><description><sup>Photo by <a
href="https://unsplash.com/@davidclode?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">David Clode</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p>As we&rsquo;re putting the finishing touches on the RStudio Workbench 2021.09.0 &ldquo;Ghost Orchid&rdquo; release, we&rsquo;d like to share one of the new sets of features we&rsquo;re most excited about. We&rsquo;ve revisited and revamped the administration experience for load balancing clusters.</p><p>Specifically, we&rsquo;ve worked to improve cluster management and troubleshooting. To make this possible, cluster data is now stored within the internal database. The load balancing configuration file no longer requires a list of each node in the cluster. In fact, the file can be completely empty - though its presence is required. This means nodes can join and leave the cluster without bringing down and re-configuring every node - scaling your cluster has never been easier!</p><p>When provided an empty configuration file, RStudio Workbench determines the address at which other nodes can reach each node. For more complicated configurations, we&rsquo;ve included an escape hatch through the new <code>www-host-name</code> option, which can be included in the file to instruct RStudio Workbench to use a specified hostname. A detailed explanation of the approach taken to determine each node&rsquo;s address and the new option can be found in the <a href="https://docs.rstudio.com/ide/server-pro/latest/load_balancing/configuration.html" target = "_blank" rel = "noopener noreferrer">Admin Guide</a>.</p><p>Furthermore, we&rsquo;ve added several new commands to the <code>rstudio-server</code> admin tool to improve load balancing cluster management.</p><p>The first command, <code>rstudio-server list-nodes</code>, displays each node and information about its current status.
It is intended to be used in conjunction with the existing status endpoint (accessed through <code>curl http://localhost:8787/load-balancer/status</code>) to monitor the status of your nodes and aid in identifying and addressing issues.</p><p>The following is an example of this output:</p><pre><code>$ sudo rstudio-server list-nodes

Cluster
-------
Protocol
Http

Nodes
-----
ID  Host           IPv4           Port  Status                     Last Seen
1   rsw-primary    172.98.8.241   80    Online                     2021-Sep-20 17:08:53
2   rsw-secondary  172.98.14.255  80    Invalid secure cookie key  2021-Sep-20 17:10:25
3   rsw-tertiary   172.98.6.205   80    Offline                    2021-Sep-20 17:10:34</code></pre><p>Because load balancing now makes use of the internal database, each node validates its secure cookie key and configured protocol against the database before coming online. The first node online sets the values used for validation. The results of that validation are stored in the database and easily retrievable through the <code>rstudio-server list-nodes</code> command, allowing for easy troubleshooting when encountering unexpected issues with your cluster.</p><p>We&rsquo;ve added the command <code>rstudio-server reset-cluster</code> to reset the cluster&rsquo;s state used for validation. This should be run after replacing the secure cookie key on each node or after updating the protocol the cluster is using (<code>http</code>, <code>https</code>, or <code>https-no-verify</code>). Again, the first node brought online or restarted after this reset will determine the configuration used for validation.</p><p>Finally, the command <code>rstudio-server delete-node &lt;node-id&gt;</code> allows you to easily remove nodes from the cluster. The required <code>node-id</code> parameter can be retrieved from the output of the <code>rstudio-server list-nodes</code> command. When a node is deleted, the other nodes in the cluster will no longer try to contact that node; there is no need to restart the active nodes after running this.
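</p><p>Taken together, these commands support a simple administrative workflow. The session below is a hypothetical sketch, not output from a real cluster; the node ID and the assumption that the secure cookie key has just been rotated are illustrative:</p><pre><code>$ sudo rstudio-server list-nodes                    # inspect each node and its status
$ curl http://localhost:8787/load-balancer/status   # check the existing status endpoint

# After replacing the secure cookie key on every node:
$ sudo rstudio-server reset-cluster                 # clear the stored validation state
$ sudo rstudio-server restart                       # first node restarted sets the new baseline

$ sudo rstudio-server delete-node 3                 # remove a retired node (ID from list-nodes)
</code></pre><p>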
This command should only be used for nodes that are offline and will not be coming back online.</p><p>There are many more features coming with this release. If you&rsquo;re interested in giving them a try, check out the <a href="https://www.rstudio.com/products/rstudio/download/preview/" target = "_blank" rel = "noopener noreferrer">RStudio 2021.09.0 Preview</a> for the latest installers and release notes.</p></description></item><item><title>The Advantages of Code-First Data Science</title><link>https://www.rstudio.com/blog/the-advantages-of-code-first-data-science/</link><pubDate>Thu, 16 Sep 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/the-advantages-of-code-first-data-science/</guid><description><sup>Photo by <a href="https://unsplash.com/@cgower?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Christopher Gower</a> on <a href="https://unsplash.com/">Unsplash</a></sup><p>RStudio has worked with hundreds of different data science teams, and we&rsquo;ve seen three key strategies that help maximize their productivity and impact:</p><ul><li>Adopting open source as the core of their work</li><li>Leading with a Code-First approach</li><li>Implementing a centralized data science infrastructure</li></ul><p>Collectively, we call this approach <a href="https://www.rstudio.com/solutions/serious-data-science/" target="_blank" rel="noopener noreferrer">Serious Data Science</a>. 
In this post, we focus on the benefits of a Code-First approach.</p><p>A no-code approach to data science has some serious drawbacks, as described in this video:</p><script src="https://fast.wistia.com/embed/medias/32cf7r0kh4.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_32cf7r0kh4 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/32cf7r0kh4/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>As we discussed in depth <a href="https://www.rstudio.com/resources/why-your-enterprise-needs-code-first-data-science/" target="_blank" rel="noopener noreferrer">in a recent webinar</a>, a Code-First approach is important because:</p><ul><li>Code provides the flexibility to build and share the most valuable insights, tailored to the analytic problems and needs of your stakeholders</li><li>Code enables fast iteration and updates</li><li>Code by its nature is reusable, extensible, and inspectable</li></ul><script src="https://fast.wistia.com/embed/medias/3mptb802yl.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_3mptb802yl videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" 
style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/3mptb802yl/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>Code-First helps overcome the pitfalls of no-code approaches, as shown in the table below:</p><table class="table-striped-odd-bg"><thead><tr><th class="problem"> No-Code Problem </th><th class="solution"> Code-First Solution </th></tr></thead><tr><td><p>Difficulty in tracking changes and auditing work</p></td><td><p>Code, coupled with version control systems like git, can track what changed, when, by whom, and why.</p><p>Code can be logged when run for auditing and monitoring.</p></td></tr><tr><td><p>No single source of truth</p></td><td><p>Centralized tools can create a single source of truth for data, dashboards, and models.</p><p>Version control can track multiple versions of code separately without creating conflicts.</p></td></tr><tr><td><p>Difficulty in reproducing and extending work</p></td><td><p>Code can enable reproducibility by explicitly recording every step taken.</p><p>Open-source code can be deployed on many platforms and is not dependent on proprietary tools.</p><p>Code can be copied, pasted, and modified to address emergent problems as circumstances change.</p></td></tr><tr><td><p>Limitations on analysis techniques and presentation formats</p></td><td><p>Code can allow you to analyze and present all your data as you need to in the form of custom dashboards and reports.</p><p>Code can pull in new methods and open-source work without waiting for vendors to add proprietary features.</p></td></tr></table><h2 id="to-learn-more">To learn more</h2><p>If you’d like to learn more about a code-first approach to data science, you can <a 
href="https://www.rstudio.com/resources/why-your-enterprise-needs-code-first-data-science/" target="_blank" rel="noopener noreferrer">watch our recent webinar here</a> or read an overview of the webinar in <a href="https://blog.rstudio.com/2021/05/12/code-first-data-science-for-the-enterprise2/" target="_blank" rel="noopener noreferrer">this blog post</a>. For a broader view of Serious Data Science and links to more resources, <a href="https://www.rstudio.com/solutions/serious-data-science/" target="_blank" rel="noopener noreferrer">see this page</a>.</p></description></item><item><title>How do you use Shiny to communicate to 8 million people?</title><link>https://www.rstudio.com/blog/how-do-you-use-shiny-to-communicate-to-8-million-people/</link><pubDate>Tue, 14 Sep 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/how-do-you-use-shiny-to-communicate-to-8-million-people/</guid><description><p><a class="no-icon" href="https://www.youtube.com/watch?v=BmpnfLLrr4w" target="_blank"><img src="atlanta.gif" class="mb-0" alt="6 second video snippet of Atlanta, GA view of buildings from sky from GA Tech COVID app spotlight video" style="height:auto; width:100%;"/></a></p><div align="right" class="mb-4"><a href="https://www.youtube.com/watch?v=BmpnfLLrr4w." target="_blank">Full GA Tech Spotlight Video on YouTube</a></div><p>Data visualization is fundamentally an act of communication. 
While many discussions focus on the technical aspects of creating visualizations, communicating your insights in a clear, relevant, and accessible way is essential.</p><p>The Georgia Institute of Technology team shared some key lessons, based on their experience building the <a href = "https://covid19risk.biosci.gatech.edu/" target="_blank">COVID-19 Event Risk Assessment Planning Tool.</a> These lessons apply to visualizations across many different industries and use cases, whether you are communicating to a handful of executives at your company or out to the world.</p><ul><li>Make sure that you <strong>have a specific question in mind.</strong> What is the question that your app will answer? Think about who your audience is going to be and what they would use this for. For the COVID-19 Event Risk Assessment Planning Tool, this question was “what is the risk level of attending an event, given the event size and location?”</li><li><strong>View your audience through a lens of empathy.</strong> Think about metrics that people can really get a grip on and visualize. For example, the risk of attending a local event with 100 people in your own town vs. communicating this as cases per 100,000 people. If you want to communicate something that’s critical to the public, put it in the right terms.</li><li><strong>Balance the straightforwardness</strong> of your visualization. You don’t have to anticipate every single question. With every feature or piece of information included, ask yourself whether it supports your overall point.</li><li><strong>Keep the lines of communication open</strong> with your users. If you share a visualization, make sure that people have a clear way to contact you (email, Twitter, LinkedIn) with questions or feedback. Their team made an intentional effort to be available, particularly to local news outlets.
They were responsive to the kinds of decisions people were making and adjusted the app to match their needs with event sizes for example.</li></ul><p>In July 2020, Georgia Institute of Technology faculty, scientists, GIS specialists, and graduate students launched a tool that provided real-time, localized information on the estimated risk of COVID-19 exposure by attending an event.</p><blockquote><p>“Over a year ago we had been concerned as early as March 2020 that there were generally underappreciated risks associated with attending even medium to small events. Given that cases were spreading, it was hard to figure out how many cases there really were. All these questions of whether or not the cases were being documented, (and we were fairly certain they were under ascertained, underdocumented) <strong>we wanted to translate that in some way to communicate that out to the world</strong> - individuals as well as decision makers”</p></blockquote><blockquote><p>-Joshua Weitz PhD, Professor, Biological Sciences &amp; Physics, Georgia Institute of Technology</p></blockquote><p>Their team presented this risk out to the world through an interactive Shiny application, which allowed users to determine their own risk of encountering someone with COVID at an event in their given location.</p><p><a style="display: block; text-align:center;" href="https://www.youtube.com/watch?v=BmpnfLLrr4w" target="_blank"><img src="https://videoapi-muybridge.vimeocdn.com/animated-thumbnails/image/c15a7557-1c62-4a25-a722-e9db5da399b3.gif?ClientID=vimeo-core-prod&Date=1631289749&Signature=e36ca981c7b4431445d8b10e71bf45322382df69" alt="gif of interacting with COVID-19 Event Risk Assessment Planning Tool map of US with risk level percentage shown in range of yellow to red" style=" max-height:100%; max-width:100%;"/></a></p><div align="right"><a href="https://www.youtube.com/watch?v=BmpnfLLrr4w." 
target="_blank">Full spotlight video</a> and <a href="https://covid19risk.biosci.gatech.edu/" target="_blank">COVID-19 Event Risk Assessment Planning Shiny App</a></div><p>What if you were planning on having dinner at a restaurant with 20 people in Ontario, NY? A small wedding with 50 people in Teton, WY? Deciding to go back to your office?</p><p>The narrative became personal to the individual user by answering their specific question in a direct metric that they could not only understand but share with others.</p><p>In <a href="https://www.youtube.com/watch?v=BmpnfLLrr4w." target ="_blank">talking with the GA Tech team</a>, it was clear that their empathetic perspective of the audience and focus on communication were crucial to successfully sharing their insights with event planners, policy makers, various news outlets, and individuals - ultimately adding up to over <em>8 million</em> unique users around the world.</p><p>Thank you so much to the team at GA Tech for sharing their story with me:</p><ul><li><strong>Joshua Weitz PhD</strong>, Professor, Biological Sciences &amp; Physics, Georgia Institute of Technology</li><li><strong>Aroon Chande PhD</strong>, Scientific Advisor at the Applied Bioinformatics Laboratory</li><li><strong>Clio Andris PhD</strong>, Assistant Professor, City and Regional Planning &amp; Interactive Computing, Georgia Institute of Technology</li><li><strong>Stephen Beckett PhD</strong>, Research Scientist, Biological Sciences, Georgia Institute of Technology</li><li><strong>Seolha Lee</strong>, Graduate Research Assistant, City and Regional Planning, Georgia Institute of Technology</li><li><strong>Quan Nguyen</strong>, Undergraduate Research Assistant, Biological Sciences, Georgia Institute of Technology</li></ul><div align="center"><img src="GAtech.jpg" alt = "snapshot of COVID-19 Event Risk Assessment Planning Tool shiny application, map of US with risk level percentage per county shown in range of yellow to red"><font
size="2"></font></div><div align="right">Source: GA Tech, <a href="https://covid19risk.biosci.gatech.edu/" target="_blank">COVID-19 Event Risk Assessment Planning Tool</a></div><p>I’ve included a few other helpful communication resources below:</p><ul><li><strong>John Burn-Murdoch</strong> | RStudio Conference Keynote | <a href="https://www.youtube.com/watch?v=L5_4kuoiiKU" target = "_blank">Reporting on and visualising the pandemic</a></li><li><strong>Sophie Beiers</strong> | RStudio Conference Talk | <a href="https://www.rstudio.com/resources/rstudioglobal-2021/trial-and-error-in-data-viz-at-the-aclu/" target ="_blank">Trial and error in data vis at the ACLU</a></li><li><strong>Charlotta Früchtenicht, Diego Saldana, Mark Baille, Marc Vandemeulebroecke</strong> | Webinar | <a href="https://www.rstudio.com/resources/webinars/effective-visualizations-for-data-driven-decisions/?_ga=2.39720929.395829809.1631126074-1690468391.1610381620" target="_blank">Effective Visualizations for Credible, Data-Driven Decision Making Presented by Novartis and Roche</a></li><li><strong>Jason Milnes</strong> | Blog Post | <a href="https://blog.rstudio.com/2020/04/16/effective-visualizations-for-credible-data-driven-decision-making/" target="_blank">Effective Visualizations for Credible, Data-Driven Decision Making</a></li></ul></description></item><item><title>My Excel and R Journey in Financial Services</title><link>https://www.rstudio.com/blog/my-excel-and-r-journey-in-financial-services/</link><pubDate>Tue, 07 Sep 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/my-excel-and-r-journey-in-financial-services/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@claybanks?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Clay Banks</a> on <a href="https://unsplash.com/s/photos/journey-map?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></sup></p><div class="lt-gray-box"><p><em>RStudio
is dedicated to our mission to support <a href="https://www.rstudio.com/about/what-makes-rstudio-different/">open source data science</a>, and we believe that a <a href="https://www.rstudio.com/solutions/serious-data-science/">code-first approach</a> is uniquely powerful, because code provides the flexibility to build and share insights, tailored to the analytic problem and the needs of your stakeholders. However, different audiences often need different tools for different applications, and so we asked Pritam Dalal from our Customer Success team to share his perspective on the pros and cons of Excel vs. code-first data science.</em></p></div><h2 id="sometimes-excel-is-the-right-tool">Sometimes Excel is the right tool</h2><p>Prior to joining RStudio, I had a 15-year career in financial services. In particular, I held trading and research roles in areas ranging from exotic derivatives to mortgage-backed securities to option market making. And throughout all of these experiences, there was one data analysis application that was more ubiquitous than all the rest combined: Excel.</p><p>Excel is a critical tool in the financial services industry. It is not an exaggeration that each day hundreds of billions (perhaps trillions) of dollars get transacted on the basis of spreadsheet workflows. In contrast, there is a fair amount of antipathy towards Excel in the data science community. And while many of the criticisms are valid - a lack of reproducibility, severe limitations with data size, clunky visualizations - the negativity overlooks much of the exceptional utility of spreadsheets.</p><p>At a personal level, I have a great deal of affection for Excel. Spreadsheets are how I got my start in data analysis. They offered a visual and tactile approach to data-centric computation. Excel has a simple built-in programming language that served as my first foray into coding. I even backtested and implemented a profitable trading strategy with spreadsheets.
However, when I increased the complexity and scope of the strategy, my Excel analysis tools were not able to scale accordingly.</p><p>Data analysis programming languages such as Python and R are far more flexible and powerful tools that amply address many of Excel’s shortcomings. But much is lost as well. With code, you lose the visceral experience of traversing spreadsheet cells. And for tiny data problems (say a few hundred rows of data), the overhead of a programming language may not be worth it.</p><p>And what’s more, here is a dirty data science secret: programming is not for everyone. Many don’t have the interest, temperament, or time to learn. While there are no-code alternatives such as Tableau or PowerBI, for many of these non-coders, spreadsheets may in fact be the best tool.</p><p>If you work in financial services and are a champion of programming-centric data analysis tools, then your evangelism will be better received if it is not accompanied by rancor for Excel. Many citizen data analysts have come to rely heavily on spreadsheets, and change can be scary and painful.
Showing disrespect to such a pivotal tool is an easy way to erode goodwill, and to keep your message from being heard.</p><script src="https://fast.wistia.com/embed/medias/nm8wlv48dz.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:100.0% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_nm8wlv48dz seo=false videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/nm8wlv48dz/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><div align="center" style="font-size:12px;color:grey;">As another viewpoint, Frank Corrigan, Director of Decision Intelligence at Target recently shared his views on the value of Excel for business analysts in a recent Data Science hangout. 
“If you can think through the idea in Excel, you are going to be able to easily explain it to people.”</div><h2 id="for-more-information">For more information</h2><p>While Excel can sometimes be the right tool for the job, as mentioned above, Python and R provide more flexibility and power than Excel, and they address many of its shortcomings, including a lack of reproducibility and clunky visualizations.</p><ul><li>To learn more about how an open-source, code-first approach can provide unique value, check out this overview of <a href="https://www.rstudio.com/solutions/serious-data-science/" target="_blank">Serious Data Science</a>.</li><li>One of the key attributes of open-source tools is interoperability: the ability to connect to a wide variety of analytic tools, helping an organization leverage all of its analytic investments. Learn more about <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/" target="_blank">Interoperability</a>.</li></ul></description></item><item><title>Announcing Calendar Based Versioning for All Commercial RStudio Products</title><link>https://www.rstudio.com/blog/calendar-versioning-for-commercial-rstudio-products/</link><pubDate>Mon, 30 Aug 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/calendar-versioning-for-commercial-rstudio-products/</guid><description><p><sup>Photo by<a href="https://unsplash.com/@erothermel?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText"> Eric Rothermel</a> on<a href="https://unsplash.com/s/photos/calendar?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText"> Unsplash</a></sup></p><p>RStudio is shifting to a calendar-based versioning scheme for future releases of all our commercial products.</p><p>We are making this transition to deliver a more transparent experience for our customers:</p><ul><li><strong>The age of a given release will be self-evident</strong> from its version label.</li><li><strong>Customers can 
rely on support for a consistent and predictable time period</strong>. (Previously, the support window was heavily influenced by how rapidly new releases superseded prior ones.)</li><li><strong>Customers will be able to easily determine which releases contain new features</strong> (as opposed to bug fixes), based on consistent standards for releasing new editions across all our products.</li></ul><p>In this new scheme, version labels are derived from the date of release using the YYYY.MM.patch format, where:</p><ul><li>YYYY is the four-digit year</li><li>MM is the two-digit month</li><li>patch is an integer that starts at zero and is incremented each time non-functional improvements (e.g., bug fixes or performance improvements) are made to the product.</li></ul><p>We call the YYYY.MM portion of the version label the <strong>edition</strong> of the product. We will release a new edition of the product, with a .0 patch number, when new features are added and/or breaking changes have been made. Each <strong>edition</strong> will be supported for 18 months, and the most recent edition will be supported regardless of its age.</p><p>Our <a href="https://www.rstudio.com/about/support-agreement/">support agreement</a> has been revised to align with the new calendar versioning scheme. The changes will generally result in a more generous support window for both past and future versions. 
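<p>As a rough illustration (not RStudio code), calendar-based labels in this format sort chronologically when parsed into numeric parts; the sketch below shows one way to parse a YYYY.MM.patch label and recover its edition:</p>

```python
from typing import NamedTuple

class CalVer(NamedTuple):
    """A YYYY.MM.patch calendar version label, e.g. '2021.08.0'."""
    year: int
    month: int
    patch: int

    @classmethod
    def parse(cls, label: str) -> "CalVer":
        year, month, patch = (int(part) for part in label.split("."))
        return cls(year, month, patch)

    @property
    def edition(self) -> str:
        # The YYYY.MM portion is the product "edition".
        return f"{self.year}.{self.month:02d}"

a = CalVer.parse("2021.08.0")
b = CalVer.parse("2021.09.1")
# Tuple comparison orders releases by date: b is newer than a.
```

Because a <code>NamedTuple</code> compares field by field, a newer release always compares greater than an older one, which is exactly the "age is self-evident" property described above.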
Please see our <a href="https://www.rstudio.com/support/">Support page</a> for more details on these support windows.</p><p>To receive email notifications for RStudio professional product releases, patches, security information, and general product support updates, subscribe to the <strong>Product Information</strong> list by visiting the RStudio <a href="https://rstudio.com/about/subscription-management/">subscription management portal</a>.</p></description></item><item><title>RStudio Connect 2021.08.0 Custom Branding</title><link>https://www.rstudio.com/blog/rstudio-connect-2021-08-custom-branding/</link><pubDate>Mon, 30 Aug 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-2021-08-custom-branding/</guid><description><h2 id="custom-branding">Custom Branding</h2><p>Many organizations want to align RStudio Connect with their branding strategy. Whether you use RStudio Connect to deliver R and Python content to external clients, or internal stakeholders, branding and clear, consistent presentation is important. This release introduces greater control over aspects of RStudio Connect&rsquo;s look and feel so that your team&rsquo;s work will be front-and-center.</p><p>Using the new features, you can now do things like:</p><ul><li>Replace the RStudio logo and favicon with your own.</li><li>Organize groups of content in customized landing pages using <a href="https://blog.rstudio.com/2021/07/29/rstudio-connect-1-9-0/#introducing-connectwidgets"><code>connectwidgets</code></a>.</li><li>Customize what anonymous and logged-out users see when they visit your server.</li><li>Control how RStudio Connect&rsquo;s automated emails appear to recipients.</li></ul><p><img src="logo-customization.png" alt="Dashboard logo customization in RStudio Connect."></p><blockquote><p>NOTE on versioning: As part of this release, we&rsquo;ve moved to calendar-based versioning. 
<a href="https://blog.rstudio.com/2021/08/30/calendar-versioning-for-commercial-rstudio-products/">See this blog post</a> for details.</p></blockquote><h3 id="custom-branding-101">Custom Branding 101</h3><p>The RStudio Connect custom branding features must be set up by a server administrator with access to the Connect configuration file. This release introduces support for adding a new section to the configuration file called <code>[Branding]</code>. These settings allow you to remove elements of RStudio&rsquo;s brand from Connect and replace them with your own.</p><p>An example <code>Branding</code> configuration might look like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-ini" data-lang="ini"><span style="color:#60a0b0;font-style:italic">; /etc/rstudio-connect/rstudio-connect.gcfg</span>
<span style="color:#007020;font-weight:bold">[Branding]</span>
<span style="color:#4070a0">Enabled</span> <span style="color:#666">=</span> <span style="color:#4070a0">true</span>
<span style="color:#4070a0">Logo</span> <span style="color:#666">=</span> <span style="color:#4070a0">/path/to/logo.png</span>
<span style="color:#4070a0">Favicon</span> <span style="color:#666">=</span> <span style="color:#4070a0">/path/to/favicon.ico</span>
<span style="color:#4070a0">DisplayName</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;SuperPowers Inc.&#34;</span></code></pre></div><p>Custom branding settings include the ability to change the logo, favicon, and display name used throughout the platform.</p><p><img src="branding-dashboard.png" alt="Custom branding configuration settings and where they apply to the RStudio Connect dashboard."></p><p>The <code>DisplayName</code> and <code>Logo</code> customizations are also used in product dialog messages such as on log in, user role upgrade requests, content permission requests, jump start publishing instructions, content configuration 
settings labels, system emails, and more.</p><p>To learn more about <code>Branding</code> configuration in RStudio Connect, visit the <a href="https://docs.rstudio.com/connect/admin/appendix/branding/">Admin Guide</a>.</p><div align="center"><a class="btn btn-primary btn-lg" href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio">See RStudio Connect in Action</a></div><h3 id="additional-considerations">Additional Considerations</h3><p>In addition to the basic branding configuration options listed above, there are several settings (new and old) that we think could be useful for customizing the RStudio Connect user experience.</p><h4 id="customize-the-front-door">Customize the Front Door</h4><p>The page that anonymous and logged-out users see when they visit RStudio Connect can be customized with <code>Server.LandingDir</code>. This setting takes a path to a directory containing the <code>index.html</code> page you&rsquo;d like to serve in place of the &ldquo;Welcome to RStudio Connect&rdquo; page you see below:</p><p><img src="front-door.png" alt="Server.LandingDir customization example."></p><p>Make sure you combine this feature with <code>Branding.Logo</code>; otherwise, the header will still display the default RStudio Connect logo.</p><ul><li><p>To learn more about serving a custom landing page, visit the <a href="https://docs.rstudio.com/connect/admin/appendix/custom-landing/">RStudio Connect Admin Guide</a>.</p></li><li><p>To see an example of a custom &ldquo;Landing Site&rdquo; demo we use internally at RStudio, visit our GitHub: <a href="https://github.com/sol-eng/demo-landing-site">https://github.com/sol-eng/demo-landing-site</a>.</p></li></ul><h4 id="customize-the-front-entrance">Customize the Front Entrance</h4><p>What would you like users to see upon logging in to RStudio Connect?</p><p>Should they arrive at the default &ldquo;Connect Content Dashboard&rdquo; and be given free rein to search across all the content they have access to view, 
or should they be routed to a custom content showcase?</p><p><img src="front-entrance-options.png" alt="Server.RootRedirect options in RStudio Connect."></p><p><code>Server.RootRedirect</code> can be used to divert users to a URL other than the standard RStudio Connect dashboard. To create a landing page like the one above, we recommend working with a Publisher to make a content showcase with the <code>connectwidgets</code> R package. <code>connectwidgets</code> can be used to query an RStudio Connect server for your existing content items, then organize, subset, and style them with <code>htmlwidgets</code> components in an R Markdown document or Shiny application. This document or application can itself be hosted on RStudio Connect, and the URL of that content can be what you use for <code>RootRedirect</code>. To learn more about <code>connectwidgets</code>, visit the <a href="https://docs.rstudio.com/connect/user/curating-content/">RStudio Connect User Guide</a>.</p><p>If you choose to customize the <code>RootRedirect</code> URL, it will be important to notify publishers and other administrators about where they can access the content dashboard view of RStudio Connect. This URL can also be customized with the <code>Server.DashboardPath</code> setting. 
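<p>For example, the redirect settings described above might look like this in the configuration file (the showcase path below is illustrative, not a default):</p>

```ini
; /etc/rstudio-connect/rstudio-connect.gcfg
[Server]
; Send users who visit the root URL to a curated connectwidgets showcase
RootRedirect = /content-showcase/
; Keep the content dashboard reachable at its default path
DashboardPath = /connect
```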
By default, the content dashboard is available at <code>/connect</code>.</p><p>To learn more about serving a custom content showcase, visit the <a href="https://docs.rstudio.com/connect/admin/appendix/branding/#logged-in-users">RStudio Connect Admin Guide</a>.</p><h4 id="customize-the-viewer-experience">Customize the Viewer Experience</h4><p>If you serve content on RStudio Connect to external users or clients, you might also want to put some restrictions on what viewers can see and do within the platform.</p><ul><li><p><strong>Documentation Visibility</strong> Use the new <a href="https://docs.rstudio.com/connect/admin/appendix/configuration/#Server.HideViewerDocumentation"><code>Server.HideViewerDocumentation</code></a> setting to hide the documentation tab from Viewers in the RStudio Connect dashboard.</p></li><li><p><strong>Viewer Isolation</strong> Viewers on RStudio Connect can only see content to which they have explicitly been added. Many organizations additionally use the <a href="https://docs.rstudio.com/connect/admin/appendix/configuration/#Authorization.ViewersCanOnlySeeThemselves"><code>Authorization.ViewersCanOnlySeeThemselves</code></a> setting to ensure clients cannot discover other users on RStudio Connect.</p></li></ul><h4 id="customize-emails">Customize Emails</h4><p>Two new configuration settings have been added to increase the customization options for emails sent by RStudio Connect:</p><ul><li><p><strong>Sender Name Customization</strong> The <code>Server.SenderEmailDisplayName</code> setting has been added to allow customization of the server display name (alias) that is used when sending administrative emails.</p></li><li><p><strong>From and Sender Address Headers</strong> The <code>Server.EmailFromUserAddresses</code> setting indicates that outbound email messages sent on behalf of your users should specify both the Sender and From addresses. 
When enabled, the From field of an email message uses the name and email address associated with the sending user. The Sender field will be populated with the value from the <code>Server.SenderEmail</code> configuration setting. This setting is disabled by default. Not all email servers support this feature.</p></li><li><p><strong>Subject Prefix Customization</strong> Emails sent from RStudio Connect will be prefixed with &ldquo;[RStudio Connect].&rdquo; If you wish to change this prefix, use the <code>Server.EmailSubjectPrefix</code> setting.</p></li></ul><p>Emails sent from RStudio Connect are now highly customizable. The example below is a &ldquo;Request for viewing access&rdquo; notification email sent to a content owner. Each of the areas highlighted below is a value that can be controlled through configuration settings:</p><p><img src="branding-email.png" alt="Areas where RStudio Connect emails can be custom branded."></p><p>Note: When <code>Branding.Enabled = true</code>, the highlighted footer text is changed to &ldquo;Powered by RStudio Connect&rdquo;. No other footer text customization options are available at this time.</p><p>Learn more about email customization options in the <a href="https://docs.rstudio.com/connect/1.9.0/admin/email/#configuring-other-email-settings">Admin Guide</a>.</p><h3 id="upgrade-to-rstudio-connect-2021080">Upgrade to RStudio Connect 2021.08.0</h3><p>Before upgrading, please review the <a href="http://docs.rstudio.com/connect/news">full release notes</a>. This release contains additional features described in the Python Updates <a href="https://blog.rstudio.com/2021/08/30/rstudio-connect-2021-08-python-updates/">blog announcement</a>.</p><blockquote><h5 id="upgrade-planning">Upgrade Planning</h5><p>Upgrading RStudio Connect should require less than five minutes. 
If you are upgrading from a version earlier than 1.9.0.1, be sure to consult the <a href="http://docs.rstudio.com/connect/news">release notes</a> for the intermediate releases, as well.</p></blockquote><p>To perform an RStudio Connect upgrade, download and run the installation script. The script installs a new version of Connect on top of the earlier one. Existing configuration settings are respected.</p><pre><code># Download the installation script
curl -Lo rsc-installer.sh https://cdn.rstudio.com/connect/installer/installer-v1.9.3.sh
# Run the installation script
sudo bash ./rsc-installer.sh 2021.08.0</code></pre><div align="center"><a class="btn btn-primary btn-lg mt-5" href="https://rstudio.com/about/subscription-management/">Sign up for RStudio Professional Product Updates</a></div></description></item><item><title>RStudio Connect 2021.08.0 Python Updates</title><link>https://www.rstudio.com/blog/rstudio-connect-2021-08-python-updates/</link><pubDate>Mon, 30 Aug 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-2021-08-python-updates/</guid><description><h1 id="python-updates">Python Updates</h1><p>At RStudio we know that many data science teams leverage both R and Python in their work, so it&rsquo;s important that we build products to support the best tools available in both languages. For an overview of all the ways our pro products support data science teams using R and Python, check out our <a href="https://www.rstudio.com/solutions/r-and-python/">Single Home for R and Python</a> page.</p><p>APIs are key to integrating your data science results into other applications and processes (<a href="https://blog.rstudio.com/2021/05/04/rstudio-and-apis/">see this blog post</a> for more on APIs). With this RStudio Connect release, you can deploy, manage, and scale Python APIs built with FastAPI and several other ASGI-compliant frameworks. The list of supported Python content types has grown steadily over the last two years. 
This feature edition extends the capabilities even further with several options that enable asynchronous API development.</p><h3 id="rstudio-connect-supported-python-content-types-in-2021">RStudio Connect Supported Python Content Types in 2021</h3><table><thead><tr><th>Content Type</th><th>Framework</th></tr></thead><tbody><tr><td>Documents &amp; Notebooks</td><td>Jupyter Notebooks</td></tr><tr><td>Interactive Applications</td><td>Dash, Streamlit, Bokeh</td></tr><tr><td>WSGI Frameworks</td><td>Flask</td></tr><tr><td>ASGI Frameworks</td><td>FastAPI, Quart, Falcon, Sanic</td></tr></tbody></table><h3 id="additional-python-updates">Additional Python Updates:</h3><ul><li><p><strong>New Feature</strong> Support for hiding input code cells in Jupyter Notebooks.</p></li><li><p><strong>Announcement</strong> Support for Python 2 in RStudio Connect will end in January 2022. Planning and migration recommendations are described in the post below.</p></li></ul><blockquote><p>NOTE on versioning: As part of this release, we&rsquo;ve moved to calendar-based versioning. <a href="https://blog.rstudio.com/2021/08/30/calendar-versioning-for-commercial-rstudio-products/">See this blog post</a> for details.</p></blockquote><div align="center"><a class="btn btn-primary btn-lg mt-5" href="https://www.rstudio.com/products/connect/">Click through to learn more about RStudio Connect</a></div><h2 id="new-content-type-fastapi">New Content Type: FastAPI</h2><p><em>Support for Python ASGI frameworks in RStudio Connect.</em></p><p><a href="https://fastapi.tiangolo.com/"><img src="fastapi.png" alt="FastAPI Logo"></a></p><p><a href="https://fastapi.tiangolo.com/">FastAPI</a> is a Python <a href="https://asgi.readthedocs.io/en/latest/">ASGI</a> web API framework. Endpoints in FastAPI can be written as Python <code>async</code> functions, which means that multiple requests can be processed concurrently. 
This is useful when the response of a request depends on the results of other <code>async</code> functions.</p><p>Example: If you use an asynchronous database client to access a remote database, your FastAPI endpoint function can <code>await</code> the results of the database query. This means that rather than getting blocked waiting for a response, new requests can begin to be processed while earlier requests are awaiting their results.</p><p>Learn more about FastAPI in the <a href="https://docs.rstudio.com/connect/user/fastapi/">RStudio Connect User Guide</a>.</p><h3 id="additional-asgi-frameworks">Additional ASGI Frameworks</h3><p>Although ASGI is a standard, frameworks differ in the configuration settings required to support being deployed behind a proxy server (as is the case for APIs deployed within RStudio Connect). These frameworks have been validated for deployment in RStudio Connect:</p><ul><li><a href="https://gitlab.com/pgjones/quart">Quart</a></li><li><a href="https://falconframework.org/">Falcon</a></li><li><a href="https://sanicframework.org/en/">Sanic</a></li></ul><h3 id="new-jump-start-example">New Jump Start Example</h3><p>The new FastAPI jump start example should look familiar to those who experimented with Flask. We&rsquo;ve implemented the Stock Pricing Service example again in FastAPI so you can quickly see the differences between the two frameworks:</p><p><img src="fastapi-jumpstart.png" alt="RStudio Connect FastAPI Jump Start Example"></p><h3 id="get-started">Get Started</h3><p>FastAPI and other ASGI-compatible APIs can be deployed to RStudio Connect with the <code>rsconnect-python</code> package. 
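<p>To see why <code>async</code> endpoints help, here is a small framework-free sketch (plain <code>asyncio</code>, not FastAPI or Connect code; the ticker and timing are illustrative) in which ten simulated I/O-bound requests complete in roughly the time of one:</p>

```python
import asyncio
import time

async def fetch_price(ticker: str) -> str:
    # Hypothetical stand-in for an async database or HTTP call.
    await asyncio.sleep(0.1)  # yields control instead of blocking the worker
    return f"{ticker}: ok"

async def handle_requests(n: int) -> list:
    # Each coroutine awaits its "query", so all n requests overlap.
    return await asyncio.gather(*(fetch_price("AAPL") for _ in range(n)))

start = time.perf_counter()
results = asyncio.run(handle_requests(10))
elapsed = time.perf_counter() - start
# Ten 0.1 s "queries" overlap, so the total stays near 0.1 s, not 1 s.
```

With blocking (synchronous) handlers, the same ten requests would run one after another; the overlap shown here is the concurrency benefit the post describes.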
Follow the same <a href="https://docs.rstudio.com/connect/user/publishing/#publishing-python-apis">basic deployment steps</a> required for our other Python content types:</p><p>Pre-flight checks:</p><ul><li>Install (or upgrade) the <code>rsconnect-python</code> command-line interface using <code>pip</code>.</li><li>Add an RStudio Connect server for deployment by specifying the server URL and your API key.</li><li>Verify that you have activated the <code>virtualenv</code> environment that you want to reproduce on the server.</li><li>Ensure that you specify the <a href="https://docs.rstudio.com/connect/user/publishing/#publishing-rsconnect-python-entrypoint">correct app entrypoint</a>.</li></ul><p>Deploy your ASGI-compliant API with:</p><pre><code>rsconnect deploy fastapi -n myServer MyApiPath/</code></pre><div align="center"><a class="btn btn-primary btn-lg mt-5" href="https://docs.rstudio.com/rsconnect-python/#installation">Upgrade your rsconnect-python CLI</a></div><h2 id="jupyter-notebook-feature-hiding-code-cells">Jupyter Notebook Feature: Hiding Code Cells</h2><p><em>Introduced in RStudio Connect 1.9.0.</em></p><p>Hiding input code cells can be useful when preparing notebooks for audiences where a cleaner or less code-heavy presentation would be more appreciated.</p><p>There are two options for hiding input code cells in Jupyter Notebooks published to RStudio Connect:</p><ul><li>Hide all input code cells</li><li>Hide only selected input code cells</li></ul><p><a href="https://docs.rstudio.com/connect/user/jupyter-notebook/#hide-input"><img src="input-shownvhidden.png" alt="Hide input code cells"></a></p><p>If you&rsquo;ve already set up the push-button publishing plugin for Jupyter Notebooks, make sure to upgrade <a href="https://docs.rstudio.com/rsconnect-jupyter/upgrading/">rsconnect-jupyter</a> and <a href="https://docs.rstudio.com/rsconnect-python/#installation">rsconnect-python</a> so you can access the new publishing features.</p><p>Learn more in 
the <a href="https://docs.rstudio.com/connect/user/jupyter-notebook/#hide-input">RStudio Connect User Guide</a>.</p><h2 id="ending-support-for-python-2">Ending Support for Python 2</h2><p><em>Starting January 2022, RStudio Connect will no longer support Python 2.</em></p><p>Python 2.7 has reached end of life maintenance status. Support from the Python language governing body ended on January 1, 2020 and it is no longer receiving security patches.</p><p>RStudio Connect has continued to support Python 2.7 beyond its EOL status, but we will join the community in ending support as of January 2022.</p><p>Factors that have gone into our decision include the following:</p><ul><li>Python 3 is now widely adopted and is the actively-developed version of the Python language.</li><li>In January 2021, the <code>pip</code> 21.0 release officially dropped support for Python 2.</li><li>A large number of projects pledged to drop support for Python 2 in 2020 including TensorFlow, scikit-learn, Apache Spark, pandas, XGBoost, NumPy, Bokeh, Matplotlib, IPython, and Jupyter notebook.</li></ul><h3 id="steps-for-planning-your-migration">Steps for planning your migration:</h3><p><strong>Administrators</strong> should determine whether Python 2 content exists on your RStudio Connect server today.</p><ul><li>Customers who have RStudio Connect 1.8.6 or higher can audit the complete list of content items on their server and which versions of R/Python they use by deploying <a href="https://github.com/sol-eng/rsc-audit-reports/blob/main/environment-audit/environment-audit-report.Rmd">this report</a>.</li></ul><p><strong>Publishers</strong> should review the official porting guide and redeploy any mission critical content that currently relies on Python 2.</p><ul><li>Read the official <a href="https://docs.python.org/3/howto/pyporting.html">&ldquo;Porting Python 2 Code to Python 3&rdquo; guide</a> and the <a href="https://python3statement.org/practicalities/">Python 3 Statement Practicalities</a> 
for advice on how to sunset your Python 2 code.</li></ul><h2 id="upgrade-to-rstudio-connect-2021080">Upgrade to RStudio Connect 2021.08.0</h2><p>Before upgrading, please review the <a href="http://docs.rstudio.com/connect/news">full release notes</a>. This release contains additional features described in the Custom Branding <a href="https://blog.rstudio.com/2021/08/30/rstudio-connect-2021-08-custom-branding/">blog announcement</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>Upgrading RStudio Connect should require less than five minutes. If you are upgrading from a version earlier than 1.9.0.1, be sure to consult the release notes for the intermediate releases, as well. As noted above, this release has features that require updates to <a href="https://docs.rstudio.com/rsconnect-python/#installation"><code>rsconnect-python</code></a> and, if applicable, <a href="https://docs.rstudio.com/rsconnect-jupyter/upgrading/"><code>rsconnect-jupyter</code></a>.</p></blockquote><p>To perform an RStudio Connect upgrade, download and run the installation script. The script installs a new version of Connect on top of the earlier one. 
Existing configuration settings are respected.</p><pre><code># Download the installation script
curl -Lo rsc-installer.sh https://cdn.rstudio.com/connect/installer/installer-v1.9.3.sh
# Run the installation script
sudo bash ./rsc-installer.sh 2021.08.0</code></pre><div align="center"><a class="btn btn-primary btn-lg mt-5" href="https://rstudio.com/about/subscription-management/">Sign up for RStudio Professional Product Updates</a></div></description></item><item><title>Practical Advice for R in Production - Answering Your Questions</title><link>https://www.rstudio.com/blog/practical-advice-for-r-in-production-answering-your-questions/</link><pubDate>Fri, 27 Aug 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/practical-advice-for-r-in-production-answering-your-questions/</guid><description><p><em>This is a guest post by Colin Gillespie from <a href="https://www.jumpingrivers.com/">Jumping Rivers</a>, a Full Service RStudio Partner.</em></p><p>Earlier this month, Jack Walton and I delivered a webinar with RStudio on the benefits of putting R into production environments, and how to do it successfully. We received tons of questions from participants, ranging from package management to team organization and container best practices. 
Below is a summary of our answers to your questions.</p><h2 id="watch-both-webinars-here"><em>Watch both webinars here:</em></h2><h2 id="practical-advice-for-putting-r-in-production-part-1-whyhttpswwwrstudiocomresourceswebinarspractical-advice-for-r-in-production-1-why-and-part-2-howhttpswwwrstudiocomresourceswebinarspractical-advice-for-r-in-production-2-how"><em>Practical Advice for Putting R in Production, <a href="https://www.rstudio.com/resources/webinars/practical-advice-for-r-in-production-1-why/">Part 1: Why</a> and <a href="https://www.rstudio.com/resources/webinars/practical-advice-for-r-in-production-2-how/">Part 2: How</a></em></h2><ul><li><a href="#q1">Do you have a preferred tool or package for package version management or CRAN snapshots?</a></li><li><a href="#q2">It seems someone needs to take charge of the data engineering pipeline and process. Who would you put in charge of it? IT or DS?</a></li><li><a href="#q3">Are there any conversation-starters IT leaders cannot ignore?</a></li><li><a href="#q4">Do you think R in production is mature?</a></li><li><a href="#q5">How would you handle different R versions, packages etc., because a pipeline from 5 years ago still has to be reproduced? Docker?</a></li><li><a href="#q6">Which infrastructure do you usually use to put R into production in an organisation? I saw RStudio Connect, but how about Azure ML Studio? Experience with that tool?</a></li><li><a href="#q7">Thoughts on containers / Kubernetes instead of RStudio Connect?</a></li></ul><h3 id="a-nameq1colin-many-thanks-for-your-presentation-do-you-have-a-preferred-tool-or-package-for-package-version-management-or-cran-snapshotsa"><a name="q1">Colin, many thanks for your presentation! Do you have a preferred tool or package for package version management or CRAN snapshots?</a></h3><p>At <a href="https://www.jumpingrivers.com">Jumping Rivers</a> we use a combination of tools. 
For distributing and installing R packages, we use <a href="https://www.rstudio.com/products/package-manager/">RStudio Package Manager</a> (RSPM), in addition to the R package <code>drat</code>.</p><ul><li>RSPM is excellent for accessing particular CRAN snapshots and binary R packages. You simply <a href="https://packagemanager.rstudio.com/client/#/repos/1/overview">select the date</a>, and you’ve pinned your packages to that CRAN snapshot. For our day-to-day work, these features are essential.</li><li>The <a href="https://cran.r-project.org/web/packages/drat/index.html"><code>drat</code></a> R package is a handy little package that makes creating R repositories easy. Since <code>drat</code> is an R package, we have complete flexibility with customization. For example, we have an internal workflow that dynamically creates repositories based on a Git branch name. Dynamically creating repos allows us to work on a separate development stream efficiently. Our primary use case for dynamic repos is when a Shiny app depends on several internal packages.</li></ul><p>In terms of package versioning, we tackle this in multiple ways.</p><ul><li>For internal packages, when a package changes, the version number and NEWS file must be updated. The rule is enforced via continuous integration.</li><li>Where appropriate, we use <code>renv</code>. While this solves (some) reproducibility problems, it can cause other issues. We’ve recently taken on maintaining a few Shiny applications for clients. These applications had been pinned to R v3.5 and the associated packages for that R release. This pinning causes upgrade issues and potential security issues (JavaScript!).</li></ul><p>Finally, for our training material, the notes are always built with a current version of R and the current version of CRAN. When we run a course, participants are likely to use the latest versions of these packages. This can cause issues when the notes fail to build. 
But it’s better that the CI pipeline complains than course participants!</p><h3 id="a-nameq2it-seems-someone-needs-to-take-charge-of-the-data-engineering-pipeline-and-process-who-would-you-put-in-charge-of-it-it-or-dsa"><a name="q2">It seems someone needs to take charge of the data engineering pipeline and process. Who would you put in charge of it? IT or DS?</a></h3><p>Pragmatically, the person in charge is the person paying the bill! While R is free, nothing is “free” for large organisations. Everything takes time and resources.</p><p>Typically, IT does the bulk of the work. That is, installing, upgrading, and maintaining R/RStudio. But Data Scientists, as the end-users, should have input into what they want. Communication is the key. When we work with organisations, we often provide that translation layer. We convert DS requirements into IT deliverables.</p><p>I’m making a hard distinction between IT and DS, but I acknowledge that this isn’t clear-cut in many organisations. But my overall feeling is that many DS teams don’t (typically) do well maintaining, patching and upgrading systems. They are too busy building models, reports and dashboards!</p><img src="mechanic.png" alt="Jumping Rivers cloud mechanic to the rescue!" class="center"><h3 id="a-nameq3are-there-any-conversation-starters-it-leaders-cannot-ignorea"><a name="q3">Are there any conversation-starters IT leaders cannot ignore?</a></h3><p>That’s an excellent question, and I suspect I would be a rich man if I knew the answer! There isn’t any evidence to suggest R is less secure than other standard environments. When we work with an organisation, we always start with a scoping project. 
This exercise assesses the organisation’s needs and, more importantly, provides different options with associated costs.</p><p>For example, take the question: how much does it cost to deploy an RStudio IDE across an organisation?</p><ul><li><a href="https://www.rstudio.com/products/rstudio/">RStudio Open Source IDE</a>: Free, but it would take IT X hours (assuming experience) to deploy and maintain. Furthermore, scaling and security are much more complicated.</li><li><a href="https://www.rstudio.com/products/workbench/">RStudio Workbench</a> (Pro): £Y, but reduces the cost of implementing scaling and security.</li><li><a href="https://www.jumpingrivers.com/consultancy/managed-rstudio-rsconnect-cloud-production/">Maintained RStudio Workbench by Jumping Rivers</a>: £Y + £Z, but the cost to IT is now tiny.</li></ul><p>Each point has different implications and different costs. But the organisation needs to be in a position to make a choice.</p><h3 id="a-nameq4do-you-think-r-in-production-is-maturea"><a name="q4">Do you think R in production is mature?</a></h3><p>Yes! See <a href="https://github.com/ThinkR-open/companies-using-r">this list of companies</a>, as well as <a href="https://www.rstudio.com/about/customer-stories/">this list of RStudio customers</a>.</p><img src="quote.png" alt="What is production anyway? Mark Sellors at rstudio::conf(2019)" class="center"><div style="text-align:right;"><sup>Screenshot from inspirational-r-quotes.com</sup></div><h3 id="a-nameq5how-would-you-handle-different-r-versions-packages-etc-because-a-pipeline-from-5-years-ago-still-has-to-be-reproduced-dockera"><a name="q5">How would you handle different R versions, packages etc., because a pipeline from 5 years ago still has to be reproduced? Docker?</a></h3><p>I feel your pain! One of our regular roles for clients is to take over and maintain workflows. 
Typically, this means using Docker to ensure that an existing pipeline doesn’t break.</p><p>However, we also have our eye on the future. Five years is starting to get painful in terms of R maintenance. From the beginning, we’ll actively plan an upgrade strategy. This plan is always centred around continuous integration and unit testing. Once this is in place, we have the nascent framework of an upgrade pathway.</p><h3 id="a-nameq6which-infrastructure-do-you-usually-use-to-put-r-into-production-in-an-organisation-i-saw-rstudio-connect-but-how-about-azure-ml-studio-experience-with-that-toola"><a name="q6">Which infrastructure do you usually use to put R into production in an organisation? I saw RStudio Connect, but how about Azure ML Studio? Experience with that tool?</a></h3><p>The infrastructure we provide for an organisation is always carefully chosen to suit an organisation’s particular needs and use-cases. As such, we have experience deploying R solutions to several different production environments, including Azure-based environments.</p><p>Azure Machine Learning is, as the name suggests, first-and-foremost a platform for building machine learning pipelines, rather than a more general content hosting platform (as RStudio Connect is). Azure Machine learning supports several “drag-and-drop” no-code workflows (in addition to code-first workflows), making it an inclusive development platform for team members with low-code backgrounds.</p><p>We have also helped organisations migrate R pipelines onto the Databricks platform. Databricks makes it easy to scale R jobs across spark clusters which are created and scaled on demand. Both Azure and AWS support Databricks deployments, making it simpler to assimilate this tool into existing cloud-based environments.</p><blockquote><p>While R is free, nothing is “free” for large organisations. 
Everything takes time and resources.</p><p>- Colin Gillespie, Jumping Rivers</p></blockquote><h3 id="a-nameq7thoughts-on-containers--kubernetes-instead-of-rstudio-connect-a"><a name="q7">Thoughts on containers / Kubernetes instead of RStudio Connect? </a></h3><p>The two technologies are a bit tricky to compare. RStudio Connect takes the pain out of application deployment. With a single click (or CI process), applications magically appear on a server. The user doesn’t need to worry about servers, containers or deployment.</p><p>Containers/Kubernetes are something that the average user doesn’t need to know about. They’re lurking in the background, ready to aid deployment or scale up resources as needed. You can use containers to deploy Shiny applications, but that has to be combined with other technologies.</p><p><a class="btn btn-primary btn-lg mt-4" href="https://www.jumpingrivers.com/">Learn more about Jumping Rivers</a></p></description></item><item><title>Cheatsheet Updates</title><link>https://www.rstudio.com/blog/cheat-sheet-updates/</link><pubDate>Mon, 23 Aug 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/cheat-sheet-updates/</guid><description><sup>Photo by <a href="https://unsplash.com/@patrickperkins?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Patrick Perkins</a> on <a href="https://unsplash.com/s/photos/study-sheet?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup><p>Hello! This summer I worked as one of the education interns, collaborating with Mine Çetinkaya-Rundel and Garrett Grolemund on the <a href="https://www.rstudio.com/resources/cheatsheets/">RStudio Cheatsheets</a>. I&rsquo;m excited to share my work from this summer. Many RStudio cheatsheets have been updated or reworked based on recent package updates, and we&rsquo;ve updated the cheatsheet contribution process as well. 
You&rsquo;ll also see some small changes to the cheatsheet website reflecting these changes.</p><h2 id="rstudio-cheatsheet-updates">RStudio Cheatsheet Updates</h2><p>Cheatsheets for <a href="https://github.com/rstudio/cheatsheets/blob/master/data-transformation.pdf">dplyr</a>, <a href="https://github.com/rstudio/cheatsheets/blob/master/data-visualization.pdf">ggplot2</a>, <a href="https://github.com/rstudio/cheatsheets/blob/master/lubridate.pdf">lubridate</a>, <a href="https://github.com/rstudio/cheatsheets/blob/master/factors.pdf">forcats</a>, <a href="https://github.com/rstudio/cheatsheets/blob/master/reticulate.pdf">reticulate</a>, <a href="https://github.com/rstudio/cheatsheets/blob/master/rstudio-ide.pdf">the RStudio IDE</a>, <a href="https://github.com/rstudio/cheatsheets/blob/master/shiny.pdf">Shiny</a>, and <a href="https://github.com/rstudio/cheatsheets/blob/master/strings.pdf">stringr</a> have been updated to reflect the most recent package updates. This includes dplyr&rsquo;s row-wise grouping, the RStudio Visual Editor, and more.</p><table><tr><td style="background-color:#FFFFFF"><a href="https://github.com/rstudio/cheatsheets/blob/master/data-transformation.pdf"><img src="dplyr.png" alt="Data transformation with dplyr cheatsheet"></a></td><td style="background-color:#FFFFFF"><a href="https://github.com/rstudio/cheatsheets/blob/master/data-visualization.pdf"><img src="ggplot2.png" alt="Data visualization with ggplot2 cheatsheet"></a></td><td style="background-color:#FFFFFF"><a href="https://github.com/rstudio/cheatsheets/blob/master/lubridate.pdf"><img src="lubridate.png" alt="Dates and times with lubridate cheatsheet"></a></td><td style="background-color:#FFFFFF"><a href="https://github.com/rstudio/cheatsheets/blob/master/factors.pdf"><img src="forcats.png" alt="Factors with forcats cheatsheet"></a></td></tr><tr><td style="background-color:#FFFFFF"><a href="https://github.com/rstudio/cheatsheets/blob/master/reticulate.pdf"><img src="reticulate.png" 
alt="Python with R and reticulate cheatsheet"></a></td><td style="background-color:#FFFFFF"><a href="https://github.com/rstudio/cheatsheets/blob/master/rstudio-ide.pdf"><img src="ide.png" alt="RStudio IDE cheatsheet"></a></td><td style="background-color:#FFFFFF"><a href="https://github.com/rstudio/cheatsheets/blob/master/shiny.pdf"><img src="shiny.png" alt="Shiny cheatsheet"></a></td><td style="background-color:#FFFFFF"><a href="https://github.com/rstudio/cheatsheets/blob/master/strings.pdf"><img src="strings.png" alt="String manipulation with stringr cheatsheet"></a></td></tr></table><p><a href="https://github.com/rstudio/cheatsheets/blob/master/rmarkdown.pdf">R Markdown</a> and <a href="https://github.com/rstudio/cheatsheets/blob/master/purrr.pdf">Apply functions with purrr</a> received more substantial redesigns. R Markdown was updated to match the new hex sticker colors and include new features related to the RStudio IDE Visual Editor. With the addition of row-wise grouping to dplyr, the list-column workflow on the previous purrr cheatsheet also needed to be updated and was moved to a new cheatsheet featuring tidyr and nested data. The new purrr cheatsheet focuses more on the many different <code>map()</code> functions available in the package on the first page, and all of the more general list functions on the second page.</p><p><a href="https://github.com/rstudio/cheatsheets/blob/master/rmarkdown.pdf"><img src="rmarkdown.png" alt="rmarkdown cheatsheet"></a></p><p><a href="https://github.com/rstudio/cheatsheets/blob/master/purrr.pdf"><img src="purrr.png" alt="Apply functions with purrr cheatsheet"></a></p><p>Speaking of new cheatsheets, tidyr now has its own cheatsheet! 
<a href="https://github.com/rstudio/cheatsheets/blob/master/tidyr.pdf">Data tidying with tidyr</a> features an overview of tibbles and how to reshape and work with tidy data on the first page, and a redesign of the nested data and list-column workflow from the previous purrr cheatsheet on the second page. The new page provides an overview of creating, reshaping, and transforming nested data and list-columns with tidyr, tibble, and dplyr. Previously tidyr was featured on the second page of the Data import with tidyr cheatsheet, and with the space provided by this change <a href="https://github.com/rstudio/cheatsheets/blob/master/data-import.pdf">Data import with readr, readxl, and googlesheets4</a> now includes a second page covering spreadsheets, with readxl and googlesheets4.</p><p><a href="https://github.com/rstudio/cheatsheets/blob/master/tidyr.pdf"><img src="tidyr.png" alt="Data tidying with tidyr cheatsheet"></a></p><p><a href="https://github.com/rstudio/cheatsheets/blob/master/data-import.pdf"><img src="import.png" alt="Data import with readr, readxl, and googlesheets4 cheatsheet"></a></p><p>See all of the current RStudio cheatsheets, as well as user contributed cheatsheets and translations on the <a href="https://www.rstudio.com/resources/cheatsheets/">RStudio website</a>.</p><h2 id="new-contribution-process">New Contribution Process</h2><p>Another big project completed during my internship was reworking the process for handling user contributed cheatsheets. The <a href="https://github.com/rstudio/cheatsheets">Cheatsheet GitHub Repository</a> now includes a <a href="https://github.com/rstudio/cheatsheets/blob/master/.github/CONTRIBUTING.md">Contributing Guidelines page</a> outlining how to submit a new cheatsheet, or a new or updated translation. 
Both can now be submitted directly to GitHub via pull request, and you&rsquo;ll see a template outlining everything to include.</p><p>Questions on the cheatsheets can now be submitted as issues on the Cheatsheet GitHub Repository. We have included issue templates to help guide this process, which can be particularly helpful if you&rsquo;re new to GitHub. Just go to the <a href="https://github.com/rstudio/cheatsheets/issues">Issues tab</a> and choose the option that’s most relevant to your question!</p><h2 id="call-for-translations">Call for Translations</h2><p>If you’re interested in translating a cheatsheet, please feel free to submit any updates using the new process! With the changes to so many cheatsheets, many translations would benefit from updates as well.</p><p>We really appreciate the work and care that goes into these translations. The first eight cheatsheets mentioned could be great starting points if you’re new to the process, since the changes were much smaller, and for many languages will require updating existing translations, instead of starting from scratch. 
If you’re interested in translating a cheatsheet, but have limited time or aren’t sure where to start, we’ve listed cheatsheets in each language that we think would be good first contributions as <a href="https://github.com/rstudio/cheatsheets/issues">issues in the GitHub repo</a>.</p></description></item><item><title>pins 0.4.0: Versioning</title><link>https://www.rstudio.com/blog/pins-0-4-0-versioning/</link><pubDate>Mon, 23 Aug 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pins-0-4-0-versioning/</guid><description><p>A new version of <code>pins</code> is available on CRAN today, which adds support for <a href="http://pins.rstudio.com/articles/advanced-versions.html">versioning</a> your datasets and <a href="http://pins.rstudio.com/articles/boards-dospace.html">DigitalOcean Spaces</a> boards!</p><p>As a quick recap, the pins package allows you to cache, discover and share resources. You can use <code>pins</code> in a wide range of situations, from downloading a dataset from a URL to creating complex automation workflows (learn more at <a href="https://pins.rstudio.com">pins.rstudio.com</a>). You can also use <code>pins</code> in combination with TensorFlow and Keras; for instance, use <a href="https://tensorflow.rstudio.com/tools/cloudml">cloudml</a> to train models in cloud GPUs, but rather than manually copying files into the GPU instance, you can store them as pins directly from R.</p><p>To install this new version of <code>pins</code> from CRAN, simply run:</p><pre><code>install.packages(&quot;pins&quot;)</code></pre><p>You can find a detailed list of improvements in the pins <a href="https://github.com/rstudio/pins/blob/master/NEWS.md">NEWS</a> file.</p><h1 id="versioning">Versioning</h1><p>To illustrate the new versioning functionality, let&rsquo;s start by downloading and caching a remote dataset with pins. 
For this example, we will download the weather in London; this happens to be in JSON format and requires <code>jsonlite</code> to be parsed:</p><pre><code>library(pins)

weather_url &lt;- &quot;https://samples.openweathermap.org/data/2.5/weather?q=London,uk&amp;appid=b6907d289e10d714a6e88b30761fae22&quot;

pin(weather_url, &quot;weather&quot;) %&gt;%
  jsonlite::read_json() %&gt;%
  as.data.frame()</code></pre><pre><code>  coord.lon coord.lat weather.id weather.main weather.description weather.icon
1     -0.13     51.51        300      Drizzle light intensity drizzle      09d</code></pre><p>One advantage of using <code>pins</code> is that, even if the URL or your internet connection becomes unavailable, the above code will still work.</p><p>But back to <code>pins 0.4</code>! The new <code>signature</code> parameter in <code>pin_info()</code> allows you to retrieve the &ldquo;version&rdquo; of this dataset:</p><pre><code>pin_info(&quot;weather&quot;, signature = TRUE)</code></pre><pre><code># Source: local&lt;weather&gt; [files]
# Signature: 624cca260666c6f090b93c37fd76878e3a12a79b
# Properties:
#   - path: weather</code></pre><p>You can then validate the remote dataset has not changed by specifying its signature:</p><pre><code>pin(weather_url, &quot;weather&quot;, signature = &quot;624cca260666c6f090b93c37fd76878e3a12a79b&quot;) %&gt;%
  jsonlite::read_json()</code></pre><p>If the remote dataset changes, <code>pin()</code> will fail and you can take the appropriate steps to accept the changes by updating the signature or properly updating your code. The previous example is useful as a way of detecting version changes, but we might also want to retrieve specific versions even when the dataset changes.</p><p><code>pins 0.4</code> allows you to display and retrieve versions from services like GitHub, Kaggle and RStudio Connect. 
Even in boards that don&rsquo;t support versioning natively, you can opt in by registering a board with <code>versions = TRUE</code>.</p><p>To keep this simple, let&rsquo;s focus on GitHub first. We will register a GitHub board and pin a dataset to it. Notice that you can also specify the <code>commit</code> parameter in GitHub boards as the commit message for this change.</p><pre><code>board_register_github(repo = &quot;javierluraschi/datasets&quot;, branch = &quot;datasets&quot;)

pin(iris, name = &quot;versioned&quot;, board = &quot;github&quot;, commit = &quot;use iris as the main dataset&quot;)</code></pre><p>Now suppose that a colleague comes along and updates this dataset as well:</p><pre><code>pin(mtcars, name = &quot;versioned&quot;, board = &quot;github&quot;, commit = &quot;slight preference to mtcars&quot;)</code></pre><p>From now on, your code could be broken or, even worse, produce incorrect results!</p><p>However, since GitHub was designed as a version control system and <code>pins 0.4</code> adds support for <code>pin_versions()</code>, we can now explore particular versions of this dataset:</p><pre><code>pin_versions(&quot;versioned&quot;, board = &quot;github&quot;)</code></pre><pre><code># A tibble: 2 x 4
  version created              author         message
  &lt;chr&gt;   &lt;chr&gt;                &lt;chr&gt;          &lt;chr&gt;
1 6e6c320 2020-04-02T21:28:07Z javierluraschi slight preference to mtcars
2 01f8ddf 2020-04-02T21:27:59Z javierluraschi use iris as the main dataset</code></pre><p>You can then retrieve the version you are interested in as follows:</p><pre><code>pin_get(&quot;versioned&quot;, version = &quot;01f8ddf&quot;, board = &quot;github&quot;)</code></pre><pre><code># A tibble: 150 x 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          &lt;dbl&gt;       &lt;dbl&gt;        &lt;dbl&gt;       &lt;dbl&gt; &lt;fct&gt;
 1          5.1         3.5          1.4         0.2 setosa
 2          4.9         3            1.4         0.2 setosa
 3          4.7         3.2          1.3         0.2 setosa
 4          4.6         3.1          1.5         0.2 setosa
 5          5           3.6          1.4         0.2 setosa
 6          5.4         3.9          1.7         0.4 setosa
 7          4.6         3.4          1.4         0.3 setosa
 8          5           3.4          1.5         0.2 setosa
 9          4.4         2.9          1.4         0.2 setosa
10          4.9         3.1          1.5         0.1 setosa
# … with 140 more rows</code></pre><p>You can follow similar steps for <a href="http://pins.rstudio.com/articles/boards-rsconnect.html">RStudio Connect</a> and <a href="http://pins.rstudio.com/articles/boards-kaggle.html">Kaggle</a> boards, even for existing pins! Other boards like <a href="http://pins.rstudio.com/articles/boards-s3.html">Amazon S3</a>, <a href="http://pins.rstudio.com/articles/boards-gcloud.html">Google Cloud</a>, <a href="http://pins.rstudio.com/articles/boards-dospace.html">Digital Ocean</a> and <a href="http://pins.rstudio.com/articles/boards-azure.html">Microsoft Azure</a> require you to explicitly enable versioning when registering your boards.</p><h1 id="digitalocean">DigitalOcean</h1><p>To try out the new <a href="http://pins.rstudio.com/articles/boards-dospace.html">DigitalOcean Spaces board</a>, first you will have to register this board and enable versioning by setting <code>versions</code> to <code>TRUE</code>:</p><pre><code>library(pins)

board_register_dospace(
  space = &quot;pinstest&quot;,
  key = &quot;AAAAAAAAAAAAAAAAAAAA&quot;,
  secret = &quot;ABCABCABCABCABCABCABCABCABCABCABCABCABCA==&quot;,
  datacenter = &quot;sfo2&quot;,
  versions = TRUE
)</code></pre><p>You can then use all the functionality pins provides, including versioning:</p><pre><code># create pin and replace content in digitalocean
pin(iris, name = &quot;versioned&quot;, board = &quot;pinstest&quot;)
pin(mtcars, name = &quot;versioned&quot;, board = &quot;pinstest&quot;)

# retrieve versions from digitalocean
pin_versions(name = &quot;versioned&quot;, board = &quot;pinstest&quot;)</code></pre><pre><code># A tibble: 2 x 1
  version
  &lt;chr&gt;
1 c35da04
2 d9034cd</code></pre><p>Notice that enabling versions in cloud services requires additional storage space for each version of the dataset being stored:</p><p><img src="images/digitalocean-spaces-pins-versioned.png" alt=""></p><p>To learn more visit the <a 
href="http://pins.rstudio.com/articles/advanced-versions.html">Versioning</a> and <a href="http://pins.rstudio.com/articles/boards-dospace.html">DigitalOcean</a> articles. To catch up with previous releases:</p><ul><li><a href="https://blog.rstudio.com/2019/11/28/pins-0-3-0-azure-gcloud-and-s3/">pins 0.3</a>: Azure, GCloud and S3</li><li><a href="https://blog.rstudio.com/2019/09/09/pin-discover-and-share-resources/">pins 0.2</a>: Pin, Discover and Share Resources</li></ul><p>Thanks for reading along!</p></description></item><item><title>Announcing bookdown v0.23</title><link>https://www.rstudio.com/blog/2021-08-18-bookdown-release/</link><pubDate>Wed, 18 Aug 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2021-08-18-bookdown-release/</guid><description><p>Happy summer from the R Markdown family! We are proud to share that <strong>bookdown</strong> (<a href="https://pkgs.rstudio.com/bookdown/" class="uri">https://pkgs.rstudio.com/bookdown/</a>) version 0.23 is on CRAN. 
<strong>bookdown</strong> is a package that helps you write books and long-form articles/reports, knitting together content from single or multiple R Markdown files as input.</p><table><thead><tr class="header"><th align="center">Latest release</th></tr></thead><tbody><tr class="odd"><td align="center"><img src="https://img.shields.io/badge/CRAN-0.23-brightgreen" alt="Last bookdown release 0.23 cran badge" /></td></tr></tbody></table><p>You can install <strong>bookdown</strong> from CRAN with:</p><pre class="r"><code>install.packages(&quot;bookdown&quot;)
# or if the v0.23 binary package for your platform is not ready yet, try
# install.packages(&quot;bookdown&quot;, type = &quot;source&quot;)</code></pre><p>In this post, we’ll share some highlights from the latest release, but you might want to look at the <a href="https://pkgs.rstudio.com/bookdown/news/index.html#changes-in-bookdown-version-0-23-2021-08-13">release notes</a> for the full details.</p><div id="new-reference-site" class="section level2"><h2>New reference site</h2><p>Joining its R Markdown siblings like <a href="https://pkgs.rstudio.com/blogdown/">blogdown</a>, <a href="https://pkgs.rstudio.com/distill/">distill</a>, and <a href="https://pkgs.rstudio.com/rmarkdown/">rmarkdown</a>, <strong>bookdown</strong> has also gained a <a href="https://pkgs.rstudio.com/bookdown/">reference site</a>, built with <strong>pkgdown</strong>. 
There, you’ll find:</p><ol style="list-style-type: decimal"><li><p>📖 A <a href="https://pkgs.rstudio.com/bookdown/reference/index.html">reference section</a>,</p></li><li><p>🖼️ An <a href="https://pkgs.rstudio.com/bookdown/articles/articles/examples.html">example gallery</a>, plus</p></li><li><p>📣 The <a href="https://pkgs.rstudio.com/bookdown/news/index.html">latest news</a>.</p></li></ol></div><div id="new-html-book-format-based-on-bootstrap-4" class="section level2"><h2>New HTML book format based on Bootstrap 4</h2><p>This release includes a new HTML book output format called <code>bs4_book()</code>, contributed by <a href="https://github.com/hadley">Hadley Wickham</a> and <a href="https://github.com/maelle">Maëlle Salmon</a>. Based on Bootstrap 4, <code>bs4_book()</code> includes carefully crafted features to provide a clean reading experience whether you are on a phone, tablet, or desktop. On a full-size screen, the layout includes three columns of content so readers can quickly see all chapters on the left, the current chapter in the middle, and sections within the current chapter on the right. As an example, you can read a book using this format here: <a href="https://mastering-shiny.org" class="uri">https://mastering-shiny.org</a></p><div class="figure" style="text-align: center"><span style="display:block;" id="fig:unnamed-chunk-1"></span><a href="https://mastering-shiny.org/" target="_blank"><img src="https://bookdown.org/yihui/bookdown/images/bs4-book.png" alt="Home page for a bs4_book() showing the layout with a table of contents on the left, main chapter content in the center, and an 'on this page' sidebar on the right." 
/></a><p class="caption">Figure 1: Screenshot of a bs4_book home page.</p></div><p>Learn more about the unique features of this output format in the book <em>“bookdown: Authoring Books and Technical Documents with R Markdown”</em>: <a href="https://bookdown.org/yihui/bookdown/html.html#bs4-book" class="uri">https://bookdown.org/yihui/bookdown/html.html#bs4-book</a></p><p>Our package reference site also has a documentation page for <code>bs4_book()</code>: <a href="https://pkgs.rstudio.com/bookdown/reference/bs4_book.html" class="uri">https://pkgs.rstudio.com/bookdown/reference/bs4_book.html</a></p></div><div id="new-project-template" class="section level2"><h2>New project template</h2><p>To make it easier for users to start new <strong>bookdown</strong> book projects, we added two functions to <a href="https://pkgs.rstudio.com/bookdown/reference/create_book.html">create new bookdown projects</a>:</p><ul><li><code>create_gitbook()</code>, and</li><li><code>create_bs4_book()</code>.</li></ul><p>If you use RStudio, you can also access these two templates interactively from the <strong>New Project Wizard</strong> using <em>File &gt; New Project &gt; New Directory</em>.</p><div class="figure" style="text-align: center"><span style="display:block;" id="fig:new-bs4-book"></span><img src="https://bookdown.org/yihui/bookdown/images/new-bs4-book.png" alt="Screenshot showing the fields and dropdown selection menu in the RStudio New Project Wizard." /><p class="caption">Figure 2: Screenshot of the RStudio Project Wizard for creating a new bookdown project.</p></div><p>To help you build a new <strong>bookdown</strong> project faster, we also added some helpful pointers inside the template book itself to get you writing your book more quickly. 
You can think of the boilerplate content as a cheat sheet for the most useful features of <strong>bookdown</strong> so that you can easily access them if you are offline, or if you simply don’t have the docs right in front of you as you work. For example, you’ll find:</p><ol style="list-style-type: decimal"><li>How to use parts, chapters, sections, and subsections to organize your content.</li><li>How to use cross-references, including to captioned figures and tables.</li><li>How to add footnotes and citations.</li><li>How to use custom blocks for equations, theorems and proofs, and callouts.</li><li>How to prepare your book to be shared.</li></ol><p>We also included a <code>_common.R</code> script in the template project. By using <code>before_chapter_script</code> in your <code>bookdown.yml</code> file, this script is run at the beginning of each chapter:</p><pre class=".yaml"><code>before_chapter_script: _common.R</code></pre><p>Importantly, this works with <code>new_session: true</code> since <strong>bookdown</strong> v0.18 (see <a href="https://pkgs.rstudio.com/bookdown/news/index.html#bug-fixes-5">news</a>).</p><p>We hope these templates make it easier to start a book with bookdown. As always, with any template, you can also just cut out the template contents and start customizing and writing straight away too - the overall file structure and YAML configurations will still provide a useful skeleton for your next book.</p></div><div id="create-and-customize-404-pages" class="section level2"><h2>Create and customize 404 pages</h2><p>For all HTML book formats, bookdown now creates a default <code>404.html</code> page in your output directory using simple content (a header, and a body of 2 paragraphs). 
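</p><p>If you want to replace that default with your own content, the docs linked below describe adding a custom 404 file at the root of your book project (for example, a <code>_404.Rmd</code>; the leading underscore keeps it from being rendered as a chapter). A minimal sketch of such a file might look like:</p><pre class="markdown"><code># Sorry, page not found

The page you requested may have been moved or renamed.

Try searching for it, or browse the table of contents to find its new home.</code></pre><p>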
Learn more about 404 pages and how to create a custom page in our online docs: <a href="https://bookdown.org/yihui/bookdown/features-for-html-publishing.html#html-404" class="uri">https://bookdown.org/yihui/bookdown/features-for-html-publishing.html#html-404</a></p></div><div id="improved-search" class="section level2"><h2>Improved search</h2><p>For all HTML books, we now support an alternative search engine called <code>fuse.js</code>, which provides a better user experience and more nuanced search capabilities than <code>lunr.js</code>. To enable <code>fuse.js</code> for gitbook, set the search engine to be <code>fuse</code> in <code>_output.yml</code>:</p><pre class="yaml"><code>output:
  bookdown::gitbook:
    config:
      search:
        engine: fuse # lunr is the default
        options: null # can override, see: https://fusejs.io/api/options.html</code></pre><p>This is the only search engine supported by <code>bs4_book()</code> and, depending on user feedback, we may set <code>fuse</code> to be the default search engine in <code>gitbook()</code> as well. We would appreciate your testing and feedback!</p></div><div id="in-other-news" class="section level2"><h2>In other news</h2><ul><li><p>The <code>render_book()</code> function has a new default behavior, and will now search for an <code>index.Rmd</code> file in the current working directory. Previously, this function required users to specify the name of this file. Now, <code>render_book()</code> is equivalent to <code>render_book("index.Rmd")</code>.</p></li><li><p>The <code>render_book()</code> function can also now be used to render your book in a subdirectory of your project:</p><pre class="r"><code>render_book(&quot;book_in_a_folder&quot;)</code></pre></li><li><p>We updated the jQuery library to v3.x, which is now imported from the R package <strong>jquerylib</strong>.</p></li><li><p>Last but not least, we are continually working to update our documentation. 
For example, we have new instructions to help you deploy a bookdown book using Netlify Drop: <a href="https://bookdown.org/yihui/bookdown/netlify-drop.html" class="uri">https://bookdown.org/yihui/bookdown/netlify-drop.html</a></p></li></ul></div><div id="acknowledgements" class="section level2"><h2>Acknowledgements</h2><p>A big thanks to the 32 contributors who helped with this release by discussing problems, proposing features, and contributing code:</p><p><a href="https://github.com/aimundo">@aimundo</a>, <a href="https://github.com/apreshill">@apreshill</a>, <a href="https://github.com/AstrickHarren">@AstrickHarren</a>, <a href="https://github.com/avraam-1997">@avraam-1997</a>, <a href="https://github.com/briandk">@briandk</a>, <a href="https://github.com/cderv">@cderv</a>, <a href="https://github.com/CrumpLab">@CrumpLab</a>, <a href="https://github.com/danawanzer">@danawanzer</a>, <a href="https://github.com/DavidLukeThiessen">@DavidLukeThiessen</a>, <a href="https://github.com/dchiu911">@dchiu911</a>, <a href="https://github.com/debruine">@debruine</a>, <a href="https://github.com/edzer">@edzer</a>, <a href="https://github.com/GuillaumeBiessy">@GuillaumeBiessy</a>, <a href="https://github.com/hhmacedo">@hhmacedo</a>, <a href="https://github.com/hnguyen19">@hnguyen19</a>, <a href="https://github.com/johnbaums">@johnbaums</a>, <a href="https://github.com/jtbayly">@jtbayly</a>, <a href="https://github.com/judgelord">@judgelord</a>, <a href="https://github.com/LDSamson">@LDSamson</a>, <a href="https://github.com/maelle">@maelle</a>, <a href="https://github.com/malcolmbarrett">@malcolmbarrett</a>, <a href="https://github.com/N0rbert">@N0rbert</a>, <a href="https://github.com/pschloss">@pschloss</a>, <a href="https://github.com/rgaiacs">@rgaiacs</a>, <a href="https://github.com/robjhyndman">@robjhyndman</a>, <a href="https://github.com/salim-b">@salim-b</a>, <a href="https://github.com/shirdekel">@shirdekel</a>, <a 
href="https://github.com/ShixiangWang">@ShixiangWang</a>, <a href="https://github.com/Shuliyey">@Shuliyey</a>, <a href="https://github.com/strimmerlab">@strimmerlab</a>, <a href="https://github.com/thisisnic">@thisisnic</a>, and <a href="https://github.com/thosgood">@thosgood</a>.</p></div></description></item><item><title>Using Shiny in Healthcare: Examples from the 2021 Shiny Contest</title><link>https://www.rstudio.com/blog/using-shiny-in-healthcare/</link><pubDate>Tue, 17 Aug 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/using-shiny-in-healthcare/</guid><description><p>Did you know that Shiny is used every day to help healthcare organizations provide better care and health outcomes to people across the world? As a member of RStudio&rsquo;s Life Sciences &amp; Healthcare team, it&rsquo;s my privilege to learn about many of the ways that data science teams in healthcare innovate with Shiny. This year&rsquo;s Shiny Contest had a number of excellent submissions in healthcare and provides the perfect opportunity to share some of the amazing work being done with Shiny in the industry! We&rsquo;ll highlight three of these applications here. 
The apps are:</p><ul><li><p><a href="https://community.rstudio.com/t/vaccine-queue-simulator-shiny-contest-submission/103543">Vaccine Queue Simulator</a> by Mark Hanly, Oisín Fitzgerald, and Tim Churches</p></li><li><p><a href="https://community.rstudio.com/t/healthdown-shiny-contest-submission/104784">Healthdown</a> by Peter Gandenberger and Andreas Hofheinz</p></li><li><p><a href="https://community.rstudio.com/t/reviewr-shiny-contest-submission/104037">ReviewR</a> by Laura Wiley, Luke Rasmussen, and David Mayer</p></li></ul><p>We&rsquo;ll use these apps to demonstrate some of the use cases for Shiny in healthcare including:</p><ul><li><p><a href="#planning-for-healthcare-capacity">Planning for healthcare capacity</a></p></li><li><p><a href="#comparing-health-metrics-geographically">Comparing health metrics geographically</a></p></li><li><p><a href="#connecting-to-electronic-health-records-data">Connecting to electronic health records data</a></p></li></ul><h2 id="planning-for-healthcare-capacity">Planning for healthcare capacity</h2><p>Ensuring optimal supply, space, and staffing resources are available to meet any demands is critical for healthcare systems. Data Science teams can use R to facilitate this kind of planning, combining multiple data sources and building predictive models for expected resource needs.</p><p>The Vaccine Queue Simulator app simulates a vaccination clinic process to estimate the number of vaccinations given, wait times, and staff needed based on the assumptions of the model. The model takes into account many factors that can affect vaccination queues including how staff are distributed across different roles, how long it takes to prepare the vaccine, and whether people arrive on time for appointments. Putting this in a Shiny app makes it easy for anyone to quickly iterate and test out a range of different scenarios for their clinic or mass vaccination center. 
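</p><p>To make the idea concrete, here is a deliberately tiny, hypothetical sketch of this kind of simulation in R: a single queue with random arrivals and service times. It is an illustration only, not the model the app actually uses.</p><pre class="r"><code># Toy single-stream queue: people arrive at random and are served in order.
set.seed(2021)
n &lt;- 200                                  # number of people arriving
arrivals &lt;- cumsum(rexp(n, rate = 1 / 2)) # on average one arrival every 2 minutes
service &lt;- runif(n, min = 3, max = 6)     # each vaccination takes 3-6 minutes

start &lt;- end &lt;- numeric(n)
for (i in seq_len(n)) {
  # you start when you arrive, or when the person before you finishes
  start[i] &lt;- max(arrivals[i], if (i &gt; 1) end[i - 1] else 0)
  end[i] &lt;- start[i] + service[i]
}

mean(start - arrivals) # average wait in minutes under these assumptions</code></pre><p>Changing the arrival rate or service times (or adding more servers) quickly shows how sensitive waiting times are to staffing assumptions, which is exactly the kind of what-if exploration the app makes accessible.</p><p>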
There&rsquo;s also a great guided install that can help users quickly feel confident running the simulations in the app!</p><p><img src="images/VaccineQueueSimulator.png" alt="Screenshot of the Vaccine Queue Simulator Shiny app by Mark Hanly, Oisín Fitzgerald and Tim Churches App"></p><p>By Mark Hanly, Oisín Fitzgerald and Tim Churches</p><p><a href="https://cbdrh.shinyapps.io/queueSim/">App</a> - <a href="https://github.com/CBDRH/vaccineQueueNetworks">Code</a> - <a href="https://community.rstudio.com/t/vaccine-queue-simulator-shiny-contest-submission/103543">Community Post</a></p><h2 id="comparing-health-metrics-geographically">Comparing health metrics geographically</h2><p>Many of us have seen visualizations recently that show differences in health metrics like infection and vaccination rates across a map. This kind of visualization can be especially helpful for understanding public health trends and for healthcare systems looking to better understand the communities they serve.</p><p>The healthdown application won an honorable mention in the 2021 Shiny Contest and shows a great new way to interactively explore health metrics spatially. The app allows users to explore the University of Wisconsin Population Health rankings. 
The team uses the leafdown package to give users a lot of flexibility to drill down and explore direct comparisons across different states and counties.</p><p><img src="images/Healthdown.png" alt="Screenshot of the Healthdown Shiny app by Peter Gandenberger and Andreas Hofheinz"></p><p>By Peter Gandenberger and Andreas Hofheinz</p><p><a href="https://hoga.shinyapps.io/healthdown/">App</a> - <a href="https://github.com/hoga-it/healthdown">Code</a> - <a href="https://community.rstudio.com/t/healthdown-shiny-contest-submission/104784">Community Post</a></p><h2 id="connecting-to-electronic-health-records-data">Connecting to electronic health records data</h2><p>Getting connected to data in electronic health records (EHR) is the first step for many data science projects in healthcare. Then teams can build reports or visualizations that combine the data in new ways or make it easier for healthcare providers to quickly get insights from the data.</p><p>The ReviewR app provides a framework for connecting to EHR data and reviewing patient records in the application. While the app supports the data formats OMOP and MIMIC-III, the authors have also created a vignette to help others extend the use to other formats (<a href="https://reviewr.thewileylab.org/articles/customize_support_new_datamodel.html">see here</a>). Similarly, they also have a vignette to extend the database connection beyond the Google BigQuery and Postgres connections already built into the app (<a href="https://reviewr.thewileylab.org/articles/customize_support_new_rdbms.html">here</a>). 
This app can be a great resource for teams looking to start using or extend how they access EHR data with R, though you&rsquo;ll also want to talk with your IT/database teams to be sure all security standards are met!</p><p><img src="images/ReviewR.png" alt="Screenshot of ReviewR Shiny app by Laura Wiley, Luke Rasmussen, and David Mayer"></p><p>By Laura Wiley, Luke Rasmussen, and David Mayer<br><a href="https://thewileylab.shinyapps.io/ReviewR/">App</a> - <a href="https://github.com/thewileylab/ReviewR">Code</a> - <a href="https://community.rstudio.com/t/reviewr-shiny-contest-submission/104037">Community Post</a></p><h2 id="interested-in-learning-more-about-r-and-shiny-in-healthcare">Interested in learning more about R and Shiny in Healthcare?</h2><p>The 2021 Shiny contest gave us a chance to highlight three of the ways we see Shiny used in healthcare, but every day we get to see Shiny applied to solve new problems. If this sample has sparked your interest and you want to learn more about how Shiny and R are used across healthcare, check out some of these resources below:</p><ul><li><p><a href="https://r-medicine.org/">R in Medicine</a> – this virtual conference is happening August 24-27, 2021. The talks from the 2020 virtual conference are also available <a href="https://www.youtube.com/playlist?list=PL4IzsxWztPdljYo7uE5G_R2PtYw3fUReo">here</a>.</p></li><li><p>At a recent RStudio Enterprise Meetup for R in Healthcare, Chris Bumgardner, the data science program manager at Children&rsquo;s Wisconsin, shared how his team uses R and Shiny.
You can view the recording <a href="https://www.youtube.com/watch?v=pHZ8dsc0PhY">here</a> or <a href="https://blog.rstudio.com/2021/08/03/r-in-healthcare-meetup-q-a/">check out the Q&amp;A</a>.</p></li><li><p>Shiny was used to help with West Virginia&rsquo;s COVID vaccine distribution, <a href="https://www.youtube.com/watch?v=CYilc-rEgjg">learn about the role Shiny played</a> and <a href="https://www.youtube.com/watch?v=T2DzDs0ksZY">tour the app</a>.</p></li></ul><p>And if you&rsquo;re at a healthcare organization and currently wondering how to put Shiny in production and/or use R in a way that aligns with your team&rsquo;s security needs, feel free to send an email to the RStudio Life Sciences &amp; Healthcare team at <a href="mailto:life-sciences-healthcare@rstudio.com">life-sciences-healthcare@rstudio.com</a>. We love to meet new teams and support the work happening in healthcare!</p></description></item><item><title>RStudio Voices - Julia Silge</title><link>https://www.rstudio.com/blog/rstudio-voices-julia-silge/</link><pubDate>Thu, 12 Aug 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-voices-julia-silge/</guid><description><p>For the first piece in our new RStudio Voices series, we decided to interview one of our open source package developers as their work defines our organization’s focus on making data science tools available to everyone. We spoke with Julia Silge, who is a maintainer of the <a href="https://github.com/juliasilge/tidytext">tidytext</a> package, which uses tidy data principles to make text mining tasks easier and more effective for R developers.</p><p>Beyond simply a legal entity, any corporation is a collective. It is the sum of its individual employees’ work and principles weighted by their roles in the company. My name is Michael Demsko Jr. 
For the past three years, I have worked at RStudio, and my goal is to show you, the reader, developer, contributor, or customer, what RStudio is by showing you who RStudio is: to allow each agent of our mission to introduce themselves to you, one at a time, in their own voices.</p><p>I plan to ask three questions of everyone I interview for this series:</p><ul><li><strong>Who is your inspiration or role model?</strong></li><li><strong>What brings you the most joy at work?</strong></li><li><strong>Why do you work at RStudio?</strong></li></ul><p>Julia lives in Salt Lake City, UT with her husband, three kids, and two cats. She was kind enough to lend some time to talk about her journey from academia to data science, her thoughts on machine learning’s place in the data science life cycle, her experience with the open source data science community, and the ethical questions open source developers face in making their work available to the world. As of the writing of this article, tidytext has been downloaded via CRAN 1,538,977 times– a number I found using <a href="https://github.com/GuangchuangYu/dlstats">Guangchuang Yu’s dlstats package</a>.</p><p><b>Michael: Time and time again, I’ve seen this migration from working in academia to working in data science, and if I’m not mistaken, this was the case for you as well. What did that transition look like for you?</b></p><p>Julia: My academic background is in physics and astronomy. That’s what I did for my PhD, that’s what I did for my post-doc research, and that’s what I taught when I worked in academia. I didn’t go straight from my academic life into data science.</p><p>I worked at an ed tech startup doing content development for physics and astronomy courses, and then I actually spent a few years out of the workforce altogether doing the stay-at-home mom thing. Then, I worked as a contractor doing various writing and coding work.
It was during that time that I made the transition into data science.</p><p>I think there’s a lot of people now who are going straight from a post-doc or grad school to data science. My own path was a little more circuitous, I think, because of the particular age I was when data science roles were becoming more plentiful. When I was in grad school, people who were leaving for industry were going straight into finance, straight into quant roles or software engineering roles.</p><p>During my research years I was an observational astronomer, so I dealt with real world, messy data. I wrote code to analyze the data. I wrote code to make plots. I made presentations about this real-world data that was generated by “some process” and then had to try to communicate to people: “what is this data telling us? What can we learn from it?” When I started to see more data science roles becoming available, I thought, “wait, what? Is that a job? What is it that they’re doing in that job? Hey, that’s exactly what I do!” Those were my favorite parts of the research process. That’s what I found interesting and fun.</p><p><b>Michael: Did you begin working with R as a means to perform that physics and astronomy research?</b></p><p>Julia: No, when I was in physics and astronomy there were people who wrote their own code, and there were people who used what you might consider “closed source,” expensively licensed tools.</p><p>In the stats world, it would be similar to open source Python and R versus something like SPSS. There were analogs in the physics and astronomy world. Back when I was in it, people who wrote their own code wrote C and Fortran. That’s actually my computational background, just vanilla C. Not even C ++. 
In that circuitous path that I talked about, I learned a lot of random things like content development and various front end approaches for things like interactive homework or college courses and whatnot.</p><p>When I started to see these data science roles, I was certain that it was a good fit, but I didn’t know Python, I didn’t know R. I actually didn’t even know R existed. As a C programmer, I had never heard of R. I knew about front-end technologies, I knew about Python, and I started out learning Python and I thought, “okay, this is fine,” but I think because of the programming background I have, which is based on numerical recipes– how a physicist writes C and all that entails– I began thinking “I don’t know about this Python thing…”</p><p>Then there was basically a six month period where I took every data science MOOC [massive open online course] that exists, and that was how I got exposed to R. Actually, it was through the <a href="https://www.coursera.org/specializations/jhu-data-science?utm_source=gg&amp;utm_medium=sem&amp;utm_campaign=03-DataScience-JHU-US&amp;utm_content=03-DataScience-JHU-US&amp;campaignid=313639147&amp;adgroupid=121203872804&amp;device=c&amp;keyword=&amp;matchtype=b&amp;network=g&amp;devicemodel=&amp;adpostion=&amp;creativeid=507187136066&amp;hide_mobile_promo&amp;gclid=Cj0KCQjw6s2IBhCnARIsAP8RfAi73-YuP1myujGsUESycPud953mS3Y1r-KeiEb_ZmffghZofvaQYbgaAha6EALw_wcB">Johns Hopkins/Coursera MOOCs</a>.</p><p>At that point, <a href="https://rviews.rstudio.com/2017/06/08/what-is-the-tidyverse/">the Tidyverse</a> was pretty mature. <a href="https://purrr.tidyverse.org/">Purrr</a> existed, and so I was presented with these, very functional programming approaches to data analysis. I thought, “I love this. This is way better than the way you get introduced to doing data analysis in Python.”</p><p><b>Michael: You mentioned the Tidyverse. 
Was it the way that activities were segmented, the way that the development life cycle works out in that paradigm that clicked with you?</b></p><p>Julia: This is a little bit overly simplified, but being presented with a functional programming approach to data analysis, as opposed to an object-oriented programming approach, really clicked with me. I found it a good fit for how I think about what data is, how it works, and how I want to move from, “okay, I need to do something simple” to, “ah, now I need to do something really involved.”</p><p><b>Michael: In my head I’m compiling a list of attributes. So far we’ve got experience in a low-level programming language like C and an inclination toward a specific approach to data analysis. When you were performing observational astronomy work, it sounds like a lot of that data was unstructured or simply messy. Do you think that it was that confluence of ideas that led you to machine learning and specifically text analysis?</b></p><p>Julia: When I came up through physics and astronomy, I would say it was very rare for people to use modern machine learning methods on astronomical data. People were using traditional statistical methods. I was very familiar with some of these ideas as they related to observational data. You can’t do a randomized controlled trial with observational data. You have to think about random, natural experiments. You have to think about how we use observational data. You have to think about the biases inherent in the data we’re observing because that comes with the territory when you’re talking about astronomy. It’s core to what you’re doing.</p><p>I think that was very formative in how I think about methods for analyzing data. 
I didn’t have a lot of exposure to modern machine learning methods until I thought, “okay, I want to think about data science as a career.” I am interested– very interested– in different quantitative approaches or machine learning methods, but I’m interested in them as a means to an end. I’m interested in how people use them in the real world. I mean, I love math, but what I’m most interested in is people’s real world problems and how they use these tools to solve them.</p><p><b>Michael: I’ve read your blog, and that’s apparent. From what I can tell, it seems like there’s no desire to do math for the sake of doing math. It’s not just for sport– it’s for a purpose. Do you think that you were attracted to machine learning specifically because that area gives you the most opportunity to solve the most problems?</b></p><p>Julia: I don’t actually know that it does offer the most answers to the most people’s questions. Monica Rogati, an important data science leader and writer, has what she calls <a href="https://hackernoon.com/the-ai-hierarchy-of-needs-18f111fcc007">“The AI Hierarchy of Needs.”</a> At the bottom, you have logging and collecting data, and then you move up to things like analytics in the middle. Topics like machine learning are way at the top of this pyramid. You get a lot of value from analytics, from simply counting things. You get a lot of value from the first models you train– simple models, linear models. You have to be pretty mature, pretty high up on your hierarchy of needs to get value from a fancy machine learning model that’s learning non-linear relationships or taking advantage of complicated math. So I wouldn’t say I’m motivated to work on machine learning because it’s the thing that offers the most value, or because it’s the thing that solves the most problems. I think what solves the most problems is making data accessible to people in their workplaces, and that they have the tools to be able to do basic analytics or train that first model.
You get the biggest wins at those lower levels.</p><p><img src="images/Data%20Science%20Hierarchy%20of%20Needs.png" alt="Monica Rogati’s “AI Hierarchy of Needs”" /><em>Monica Rogati’s “AI Hierarchy of Needs” via Hackernoon</em></p><p>I think machine learning tooling is interesting because it’s complex and people enter into it with different levels of experience. Being able to build tools to make that process more streamlined is an interesting challenge.</p><h2><em>“What solves the most problems is making data accessible to people in their workplaces, and that they have the tools to be able to do basic analytics or train that first model. You get the biggest wins at those lower levels.”</em></h2><p>— Julia Silge</p><p><b>Michael: As far as personal challenges and real-world applications go, I want to briefly talk about the analysis you did on the collected Jane Austen works. In my own experience, and in speaking with other professionals that have made a similar career change, data science is like Florida. Nobody’s from Florida. Everybody moves there from somewhere else, and there’s usually an interesting story for how they got there. Was that project a way of motivating your learning for making that move?</b></p><p>Julia: As I started making that transition [from academia to data science], I was thinking, “I need to learn the right skills to be marketable and I need to demonstrate that I have these skills.” I had a lot of confidence that I was going to be able to do the job, but I needed to show people that I was going to be able to do the job. So part of my plan was to publish projects that people could interact with.</p><p>I kind of had a weird resume at that point. I had been out of school awhile, I had done other random things, I had a gap in my working life, and I had all these ideas for projects. My vision for those projects was that they would be hard for people to forget after a conversation or an interview. 
If you look back, a lot of those early posts on my blog are working with data specific to where I live– Utah– because I pictured myself interviewing for jobs in my city.</p><p>I was just looking around for data to analyze. I knew I didn’t want to use any super clean data sets that are available– that you see a million times in people’s posts, projects, or demos– because it gets pretty boring, you know? I knew I didn’t want to do that. I saw that you can get the full text of books that are in the public domain from <a href="https://www.gutenberg.org/">Project Gutenberg</a> if you follow their rules. I’ve always been a big reader.</p><p>“Let’s see what they have here. Do they have my favorite book?” They did– Pride and Prejudice is in the public domain. “Oh look, all of Jane Austen’s works are in the public domain.” It really started out motivated by the desire to demonstrate my abilities to people who may want to hire me. Those first couple analyses were written with existing tools for text analysis at the time.</p><p>Some of it was kind of annoying. I wished that some of it were a little easier and that I could use some other approaches that I really liked. I was really fortunate to go to the <a href="https://ropensci.org/">rOpenSci</a> Unconference in 2016 and there I met <a href="https://www.rstudio.com/authors/david-robinson/">Dave Robinson</a> for the first time. Dave asked, “would you want to write a package to do text analysis, but with tidyverse principles?”</p><p>“I would love to do that. I would love to do that.”</p><p>During the Unconference hackathon we got the skeleton of it done, and tidytext was on CRAN within a couple months after that. Honestly, it was life-changing. I ended up getting jobs, I ended up publishing a book– it hugely changed the course of my career and my life.</p><p><b>Michael: That’s really special.</b></p><p>Julia: It is really special. The Tidyverse itself was at a point in its maturity where it could be built on.
People were feeling the frustration of dealing with text data and it was easier to turn to other programming languages or other environments to deal with text data at that time. No one had yet come in to say, “why don’t we just make it better?”</p><p>All the code I had ever written to that point was for myself, and I think, without knowing it, I had used open source software back in my physics and astronomy days, but I didn’t realize what the open source community was like, and I certainly didn’t see myself as someone who could contribute. What happened with tidytext was a real combination of circumstances. Right place, right time.</p><p>Those circumstances included the support of a community. Specific people and the community overall. People tell you, “you should start here, you should do this, and this,” and, eventually, you get to a point where you can say, “oh, I understand. I understand what maintaining an open source package is now,” which is quite the path.</p><p><b>Michael: Can you describe what it’s like to be a part of that community for some time now – specifically as a maintainer of a widely-adopted package?</b></p><p>Julia: Thinking back to that transitional time again, I did think “alright, I’m going back in there, time to grow my thick skin again. Time to get tough again. Time to get ready for all the crap.” Yet my experience was so the opposite. Entering into data science, entering into the open source community, of course there are problems, but it was so different from my experience in the academic world. People are happy to have you contribute, people are excited to hear your ideas. I was floored. I was shocked. I had no idea it could be like this. I was entirely unprepared– in a good way.</p><p><b>Michael: So many folks– yourself included– are beneficiaries of what the open source community provides. In a unique way, you were able to provide back almost from the get-go. 
I’m curious, because I’ve never had the opportunity to ask one of our package developers before, do you keep any running list of projects that have used your packages in mind? Positively or negatively speaking.</b></p><p>Julia: I don’t think I keep a running list, but as you ask the question, one really positive project comes to mind. There’s a data journalist at Buzzfeed named <a href="https://www.buzzfeednews.com/author/peteraldhous">Peter Aldhous</a>, who has used tidytext in a couple of different analyses. One was about <a href="https://www.buzzfeednews.com/article/peteraldhous/trump-state-of-the-union-words">State of the Union addresses</a>. He showed that the reading [comprehension] level for the address has been dropping over time, and that the sentiment has remained fairly stable.</p><p>On the negative side, I think what has bothered me the most have been the cases where people used data that raise ethical issues. Things can become fraught when it comes to text data. Especially if you are a technically savvy person, you can decide to go out and get text data that it feels yucky for one to have access to. A lot of these cases are not illegal, but you have to think, “is it a good choice for you to be dealing with that information?”</p><p><b>Michael: The open source world is, in many ways, a double-edged sword and I think machine learning libraries are at the tip of that sword right now. That removal of any barrier to entry, it allows for opportunity for folks like you and I, for the entire community, but that’s just it– it grants use for whatever purposes. Do you feel as though there should be some sense of accountability for how those libraries are used?</b></p><p>Julia: This is something that we, as a community, need to think through. Where are we going to land on this? FOSS, free and open-source software, arose in the eighties– or sometime around then– with the idea that you should be able to open it up and do what you want with it. 
The licensing that this was built on, has it ever been challenged? Not really. At least it has never been tested so well that we know what the rules really are. As people generally understand it, someone can do anything with a package, and I– as the developer– cannot tell them otherwise if I am morally opposed to what they do. They are free to use it. In fact, most of the licenses that we put on our software make it so people can use it to make money. People can use it to do good and bad things, which I think is very challenging.</p><p>The framework that we have right now doesn’t give developers any say. With professional software, you have a little more say, because you can decide to whom you will or won’t sell your software. That was the point of it originally– you can do what you want with it. That was the goal, but you start to think, “do I then have to accept all of that?”</p><p>More recently, there’s been discussion around that. People have found out that their software was being used by particularly morally reprehensible groups, and they ask, “Do I have any options? Do I have any freedom in licensing?” For example, there’s a new-ish license called the <a href="https://firstdonoharm.dev/">Hippocratic license</a>. It basically lays out the accepted uses– different purposes, including commercial purposes– for which you can use a package in your business. It also has a list of things that the package cannot be used for, such as human rights violations. This is a new license, and the old licenses still haven’t really been legally tested, so this one, of course, hasn’t been legally tested at all. Regardless, some people have started to use it, and I think it’s a statement of value.</p><p><b>Michael: I’d like to ask three questions that we’ll ask everyone that’s a part of this series. The first of those is who would you consider to be your inspiration or role model?</b></p><p>Julia: I have a hard time with questions about inspiration and role models.
Partly because I have a core belief that we’re all a little amazing and a little messed up, but, if I were to list some people, I would list the writer Madeleine L’Engle, the writer and activist Dorothy Day, and one of my kids who has special needs. It’s incredible how hard she works in so many areas of her life.</p><p><b>Michael: The next question: what brings you the most joy at work?</b></p><p>Julia: When I think about joy at work, I think about two different modes of joy. One is a little bit of a high– when something that I’ve built or released gets a response. It helps someone and they tell me, “I used this to do ‘x’… I couldn’t do ‘x’ before and now I can!”</p><p>The other mode of joy that I often get at work is a little bit more meditative, a little bit more “in the zone.” When something’s not working well in code that I’m writing and I’m able to get it working better, more organized, or aligned with what we’re trying to accomplish.</p><p><b>Michael: Lastly, why do you work at RStudio?</b></p><p>Julia: I work at RStudio because it gives me this incredible opportunity to have a real impact on the lives of a lot of people who work with data, and to work on interesting problems with amazing coworkers.</p><p>If you would like to see more of Julia’s work:</p><ul><li><a href="https://juliasilge.com/">Julia’s Blog</a></li><li><a href="https://twitter.com/juliasilge">Julia’s Twitter</a></li><li><a href="https://www.youtube.com/juliasilge">Julia’s YouTube Channel</a></li></ul></description></item><item><title>Democratizing Data with R, Python, and Slack</title><link>https://www.rstudio.com/blog/r-in-marketing-meetup/</link><pubDate>Tue, 10 Aug 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-in-marketing-meetup/</guid><description><div align="right"><font size="2">View the R in Marketing meetup recording <a href="https://youtu.be/Y2zoRCXgPwk" target="_blank">here.</a></font></div><div class="lt-gray-box mt-4"><em>This is a guest post by Matthias Mueller, Director of Marketing Analytics at CM Group. Matthias oversees a team of data scientists, engineers, and analysts tasked with optimizing the marketing mix, building customer lifetime value models, and enhancing the prospect’s buying journey.</em></div><p>As a global Marketing Technology company, CM Group and its family of brands generate countless data points that, from an analytics perspective, are a true treasure trove. Functioning as a de facto analytics agency to each of the brands under the CM Group umbrella, my team is tasked with analyzing, reporting on, and modeling these data, so that ultimately stakeholders across the organization can make data-driven decisions.</p><p>While having a plethora of data is a great position to be in, democratizing information across the business isn’t always easy. In fact, it is precisely this overabundance of data, and the sharing thereof, that can create challenges hampering growth in many organizations. In particular:</p><ul><li><strong>Cognitive overload</strong> in the end-user of data products due to the overwhelming number of dashboards for individuals to track on a daily, weekly, monthly, or quarterly basis. Dashboard fatigue is a real thing, and overloading your stakeholders with data can lead to feeling overwhelmed, severely stifling the efficacy of even providing the data in the first place.</li><li>Lack of annotations and explanations for non-technical marketers can make information <strong>hard to digest and understand</strong> (after all, isn&rsquo;t it our primary responsibility as analytical folks to make complex data sets digestible to people that might not have a background in stats or mathematics?
I’d like to believe so.)</li><li>While sophisticated BI tools are available, they <strong>require stakeholders to be self-driven</strong> and knowledgeable in using such tools.</li></ul><p><strong>So what can we do?</strong></p><p>By using communication channels that are already used and available (Slack) in connection with R, Python and RStudio Connect, we built a system that serves custom, individualized insights directly to stakeholders, at the right time, where work already happens.</p><blockquote><p>If we want to democratize data across the business, we need to find a way to serve insights to the stakeholders directly, in an easily digestible way.</p></blockquote><p>Having bespoke analytics surfaced to you in an environment where you already talk about business performance allows my team to send insights without forcing you to leave the platform. It also democratizes the information with the stakeholders it is shared with: for example, I know that all marketing stakeholders will be in the #marketing channel, so I can push marketing analytics directly into that channel and spur immediate discussion. Naturally, I could curate different slack notifications for different channels; perhaps #marketing is interested in lead volume, while our internal team channel #marketing-analytics would be a good place to surface code breakages, etc.</p><img src="predicto.jpg" alt = "CM Group Slack"><p><strong>Data Process</strong></p><p>The process is simple: our data is centralized inside of an AWS Redshift instance, from which we pull whatever data set we are interested in using either an R or Python script and SQL. Then, after whatever analysis has been performed, we are pushing that insight into Slack using <a href="https://api.slack.com/" target="_blank">Slack’s API</a> and the <code>httr</code> package. Utilizing RStudio Connect, we can then schedule that process at whatever cadence we need it to run to programmatically serve insights to stakeholders. 
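For readers who want a concrete starting point, the push step can be sketched in a few lines of Python (the team uses R and Python; this sketch uses only the Python standard library against Slack’s chat.postMessage endpoint). The channel name, message content, and the SLACK_BOT_TOKEN environment variable below are illustrative assumptions, not CM Group’s actual setup:

```python
import json
import os
import urllib.request

SLACK_API_URL = "https://slack.com/api/chat.postMessage"

def build_message(channel, headline, lines):
    """Assemble a chat.postMessage payload: a bold headline plus bullet points."""
    text = "*{}*\n{}".format(headline, "\n".join("• " + line for line in lines))
    return {"channel": channel, "text": text}

def post_message(payload, token):
    """POST the payload to Slack; raise if Slack reports an error in its response."""
    req = urllib.request.Request(
        SLACK_API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json; charset=utf-8",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    if not body.get("ok"):
        raise RuntimeError("Slack API error: " + str(body.get("error")))
    return body

if __name__ == "__main__":
    # Hypothetical example: a daily anomaly summary for a marketing channel.
    payload = build_message(
        "#marketing-analytics",
        "Daily paid search check",
        ["Clicks: 12,431 (forecast: 11,900)", "Impressions within expected range"],
    )
    token = os.environ.get("SLACK_BOT_TOKEN")
    if token:  # only hit the network when a bot token is configured
        post_message(payload, token)
    else:
        print(payload["text"])
```

The same pattern works from R with httr (an authorized POST of a JSON body to the same endpoint), and RStudio Connect then runs the script on whatever schedule the insight requires.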
Once the bot is established, the possibilities of use are virtually limitless. At CM Group, we have successfully implemented this process for many different purposes, including, but not limited to:</p><ul><li>Building a notification system that pushes insights to the team-internal #marketing-analytics channel highlighting code breaks, anomalies in data, automated reporting, etc.</li><li>Paid Search Forecasting/Anomaly detection: daily tracking of clicks and impressions against forecasted values to notify the Performance Marketing Team of any aberrations as soon as they happen</li><li>Organization-wide notifications to announce data and/or report availability. For example, notifying the Marketing channel when the Monthly Marketing Council Report has posted</li></ul><p>In the example of anomaly detection for lead volume of specific lead sources, the use of the Slack bot allows our marketing team to quickly gauge performance and determine if any action needs to be taken. In this case, it took just 7 minutes for people to start talking about what was going on, compared to sending out a dashboard or report.</p><img src="slack.jpg" alt = "CM Group Slack"><p>In summary, this is a win-win situation for the organization and the analytics team:</p><ul><li><p><strong>For the organization:</strong> Serving stakeholders data in the moment they need it in a fully automated manner has transformed our organization into a more data-driven one. Utilizing this process, we are able to reach both technical and non-technical marketers, and make sure they are armed with all the data they need to make savvy decisions.</p></li><li><p><strong>For the analytics team:</strong> As a lean analytics team, bandwidth is one of our biggest challenges. Building this process has helped us avoid/deprecate many redundant reporting tasks, while democratizing data across the organization.
With the time saved on reporting, we were able to re-allocate this time to even more impactful analyses.</p></li></ul><p>I’ve included the full Q&amp;A from the meetup below. We have paraphrased and distilled portions of the responses for brevity and narrative quality.</p><h2 id="meetup-qa">Meetup Q&amp;A:</h2><p><strong>Does the Slack Bot need to be hosted somewhere?</strong></p><p>No, essentially, Slack bots are hosted inside of Slack themselves, so you don’t need to find a hosting platform or server to make this work. It’s as easy as going to api.slack.com and setting up your own bot there, which is already inside the realm of Slack. Think of the Slack bot itself as a mailman that you can ask to deliver a message for you inside of Slack. You’ll need to find a way to send the message; however, the mailman is part of the Slack ecosystem already.</p><p><strong>Are you running R in the cloud today? What does your workflow look like under the hood?</strong></p><p>We do use RStudio Connect in the cloud today. What I love about it is that it has this thing called git-backed deployment. Basically, if some of the folks on my team write code in R that is productionalized, we are already on the version control system. We use GitHub for that and our RStudio Connect installation is hooking directly into that Git instance. What happens is that as soon as you deploy your code once in RStudio Connect and that original link is active, anytime someone actually commits new code into Git, that is automatically recognized by RStudio Connect and it updates your production code. Ultimately, the data scientists can focus on writing code and pushing that to Git. The entire process of having to moonlight as an IT person to put something into production is taken away because that’s all handled by RStudio Connect.</p><p><strong>Outside the use of Slack, how else do you use R on your team?</strong></p><p>R in itself is inherent in most of the analytics that we do. 
We are writing a lot of our work in RMarkdown and leveraging parameterized RMarkdown to give people the reporting that they need. Also, when working on more advanced analytics problems, you really cannot get around using code-based solutions, so we use R (and Python) for things like forecasting, customer LTV models, multi-touch attribution and so on. Ultimately, it’s the primary analytics language that we use here aside from SQL and Python.</p><p><strong>Do you now have to manage Slack Bot fatigue?</strong></p><p>Ha, that is an interesting question. The quick answer is no. The longer answer is that I think there have been some things that we have set up and found that we probably had too frequent a cadence for pushing those insights. If you bombard people with these notifications then they can also become something that you see in Slack every day, so you stop reading it. The ways that we have countered this are:</p><ol><li><p>If there’s something with a high recurring cadence, someone really needs to convince me that there is a definite need for this script to execute this often. So we’re very careful with that.</p></li><li><p>Instead of posting generalized analytics to large groups, we’ve actually shifted to more specific insights for teams that are interested in exactly that piece of information. For example, instead of posting something about paid search that is relevant to 3 people to a channel that has 600 people in it, we would post it to the smaller group that actually utilizes it and needs that data.</p></li></ol><p><strong>I’m interested in how many global emails your company might have and how “fresh” those emails are? Also, how does your team measure the value of an email address?</strong></p><p>I know this is a bigger discussion, but the summary is two-fold. We actually sit on the acquisition marketing end, so when I reference the 180 billion emails - that is the emails that our customers are sending with our platforms. 
It’s an interesting problem from a data perspective because you don’t have that one-to-one relationship, it’s a many-to-one relationship where a prospect either converts into a lead or not. With an email, you know you can convert after getting one email or five emails or never. It’s an interesting data science problem, but I would love to chat about this more.</p><p><strong>I think your Slack bot is brilliant but also seems like a &ldquo;with great power comes great responsibility&rdquo; situation. Has it lightened or increased your workload?</strong></p><p>When I went through the win-win section in the end, it has definitely lightened the things that we used to have to do like the recurring reporting tasks that most analysts don’t love doing. It has increased our work in some ways, where other teams have realized how neat this is. We’ve gotten recognition internally and have been asked by teams outside of marketing to build them similar processes, which in my book is a good thing. I love collaborating with other teams.</p><p><strong>I have an outdated Marketing Analytics book that touches on martech studies. Do you recommend any books or resources to learn applied marketing analytics in R?</strong></p><p>I think some of the reasons why these Enterprise meetups and other talks are so important is because you can learn so much from the community and people that are actually doing it today. Full disclosure, I don’t think I’ve read a marketing analytics book in a long time. I think a lot of the underlying concepts that you deal with in marketing aren’t necessarily only used in marketing. I definitely subscribe to the thought that data science is a bit of a toolkit that you can apply to different disciplines. For example, my lead data scientist who works on lifetime value models and multi-touch attribution has a PhD in physics and came to us from doing neuroscience. 
I think it’s more so the concepts that you already know and work with on the data science basis that are translatable to the marketing discipline.</p><p>(Bryan Butler also shared that he found this book helpful: Discrete Data Analysis with R: Visualization and Modeling Techniques for Categorical and Count Data)</p><p><strong>You made a great point about taking your organization in the direction of code-based solutions versus GUI-based solutions. This is something I’ve encountered a lot and am wondering if you have seen any good posts to help crystalize that?</strong></p><p>I think it comes back to some of those issues we’ve discussed. I think there are use cases for both. It’s always this trade-off of where we feel like we need to be. If you just want to have a simple bar chart with revenue, maybe cut by a certain region, why not build that in a BI tool. There can be a benefit of having something that’s simple that more people can use.</p><p>With R, the power really comes with customizability. I enjoy having full control over what we build internally from the ground up. I like that with R you can amend it however you want to and I just don’t have the capability with a lot of the BI tools that are out there. The biggest benefit is really being able to reproduce work and change things on the fly. Although the initial set-up is why tools like Excel are so inherent, because there isn’t as great a learning curve in the beginning. I’ve found that if you go that route though, there comes a point where there’s a ceiling and you can’t go any further. That ceiling doesn’t exist in code-based solutions. I think about my earlier career and if I had used code-based solutions rather than a spreadsheet, I would have been able to have a lot less headache to say, update certain things. For me, the savings are not necessarily in the beginning. 
I think this is a greater discussion though that could warrant a whole talk about what the use cases are for using either.</p><p><strong>Let’s keep the conversation going:</strong></p><p>If you have follow-up questions, are interested in speaking at a future meetup, and/or would like to be a part of this R in Marketing Community, join us in the #chat-marketing channel of the <a href="r4ds.io/join" target = "_blank">R for Data Science Online Learning Community Slack.</a></p></description></item><item><title>RStudio Cloud: An inclusive solution for learning R</title><link>https://www.rstudio.com/blog/rstudio-cloud-an-inclusive-solution-for-learning-r/</link><pubDate>Thu, 05 Aug 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-cloud-an-inclusive-solution-for-learning-r/</guid><description><p><sup>Patricia’s own cloud photo from Antarctica</sup></p><div class="lt-gray-box">*This is a guest post from Dr. Patricia Menéndez, Department of Econometrics and Business Statistics at Monash University, Melbourne, Australia*</div><p>In 2019 I was assigned the task to teach two code-focused units at Monash University:</p><ul><li><strong>Introduction to Data Analysis:</strong> delivered to an audience of both undergraduate and graduate students, with a class size of approximately 300 students.</li><li><strong>Collaborative and Reproducible Practices:</strong> offered to students enrolled in the Master of Business Analytics at Monash University with an average of 65 students.</li></ul><p>When I started to develop the materials for the two units, I realised the large number of students and different operating systems they use in class would be a challenge to manage. I was familiar with these challenges from my experience working outside of academia with people from various backgrounds and different proficiency levels in R. 
With that in mind, I decided to trial RStudio Cloud in my classrooms and have been using it ever since.</p><p>The two units began just before the COVID-19 pandemic hit Australia. After the first two weeks of the semester, we went into a hard lock-down. With the prospect of teaching on campus eliminated, I could not have been happier with my decision to use RStudio Cloud.</p><h4 id="getting-started-with-rstudio-cloud-at-monash">Getting Started with RStudio Cloud at Monash</h4><p>RStudio Cloud is a platform that allows you to use R within RStudio by just logging in to the system using your preferred browser. This was a benefit for our team at Monash because students did not need to install any software on their own machines. Administrators can simply create a space for the classroom and invite participants to join. It could not be easier.</p><p>RStudio Cloud Roles at Monash:</p><ul><li><em>Administrator:</em> In our case, the head teaching assistant and I were nominated as administrators of the space.</li><li><em>Moderator:</em> Other teaching assistants were delegated the role of Moderator so they could view and access any students’ projects.</li><li><em>Contributor:</em> Students and tutors were contributors in the space, which also ensured that they could not change the original exercise.</li><li><em>Viewer:</em> I have not explored this option but this could be useful if you want to ensure that some users can only see the materials.</li></ul><p>As an administrator of the RStudio Cloud spaces for my units, I had the capacity to:</p><ul><li>Create new spaces for the classrooms where the weekly RStudio projects and corresponding solutions were made available to students</li><li>Decide which version of R to use</li><li>Determine the visibility of each project</li><li>Customise the specific resources required for each project (RAM and CPU allocations)</li><li>Manage users</li></ul><p>With RStudio Cloud, projects can be uploaded directly from GitHub or a 
zip file containing an R project. At the beginning of the semester, all the R projects for the 12-weeks were uploaded onto the classroom space and made available in the cloud. The corresponding projects were made visible to the students only at the beginning of each week, while the solutions were made available at the end of the week.</p><p>With the centralised capacity, I could keep myself informed of how the students interacted with R and RStudio in the cloud. This allowed the lecture and tutorials to run smoothly through the semester without hiccups.</p><p>In addition to creating an RStudio Cloud space for both of the units, I created another RStudio Cloud space for my teaching teams. The ability to make the projects visible to other members within the space proved to be extremely useful as it provided administrators the capacity to trial and test each project before making it visible to the users.</p><p>During the implementation process, we worked closely to easily troubleshoot any issues before students encountered them. In this test space, the unit materials were uploaded so my teaching assistants could try the code and go over the solutions. This has worked perfectly by affording them the opportunity to test all the projects before conducting the tutorials. To further facilitate the interactions among members of the teaching teams, we also set up a Slack channel through which we were able to communicate at any time.</p><p>Administrators and teaching assistants could also log in to students’ projects so that we could promptly help students with any issues. To support any further technical issues, we set up a few Zoom meetings for students to join. Technical issues were resolved by us logging into their project as the administrator and rectifying the issues.</p><blockquote><p>The versatility of RStudio Cloud is simply amazing. 
I used RStudio Cloud for weekly tutorials and assignments during the semester and have even run a timed test with 300 students logging in simultaneously.</p></blockquote><p>For the test, students were given access to an RStudio Cloud project which contained data and an RMarkdown file with questions they needed to write code to answer. The exercise was made visible to the students in the course space where they could be timed while working on it. Students were required to download their RStudio Cloud project, which could be done easily, and upload it to the learning management system as an assignment.</p><p>For the Introduction to Data Analysis, RStudio Cloud eliminated any potential challenges with software installation. Students with varying levels of coding knowledge could work under the same setup in the online environment, ensuring a smooth semester.</p><p>The Collaborative &amp; Reproducible Practices unit focused primarily on reproducible reporting using RMarkdown, the use of Git via the command line terminal and GitHub as a remote repository. In this unit, we used RStudio Cloud for the first four weeks of the semester and then moved on to using R and RStudio local installations. RStudio Cloud allowed me to quickly give students instructions on installing R and RStudio on their local machines and was instrumental in getting students up and running with RMarkdown, git and GitHub.</p><p><img style="float:left;margin:0 10px 10px 0;" src="patricia.jpg" alt = "Dr. Patricia Menéndez" width = 25%> The experience using RStudio Cloud in my units over the last two years has been truly wonderful. I would like to express appreciation to RStudio for their support and gratitude to my teaching team for enthusiastically jumping in the boat with me and providing input throughout the semesters. Last but not least, thank you to all the students whose feedback and input help make these units better each year! 
If you have any questions, feel free to reach out on Twitter! @PM_maths</p><p> </p><blockquote><h4 id="helpful-resources">Helpful resources:</h4></blockquote><blockquote><ul><li><a href="https://www.rstudio.com/resources/webinars/teaching-r-online-with-rstudio-cloud/?_ga=2.169953248.1183963645.1628082102-748609636.1627930340" target = "_blank">Teaching R Online with RStudio Cloud Webinar</a></li><li><a href="https://rstudio.cloud/learn/guide" target="_blank">RStudio Cloud Guide - Getting Started</a></li><li>Webinar for Instructors Getting Started with RStudio Cloud, <a href="https://www.youtube.com/channel/UC3xfbCMLCw1Hh4dWop3XtHg/videos" target="_blank">YouTube Premiere on August 18th at 1 pm ET</a></li></ul></blockquote></description></item><item><title>R in Healthcare Meetup Q&A</title><link>https://www.rstudio.com/blog/r-in-healthcare-meetup-q-a/</link><pubDate>Tue, 03 Aug 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-in-healthcare-meetup-q-a/</guid><description><div class="lt-gray-box"><i>This is a guest post from Chris Bumgardner, Data Science Program Manager at Children’s Wisconsin. The RStudio Enterprise Community group recently hosted an <a href="https://www.youtube.com/watch?v=pHZ8dsc0PhY" target="_blank">R in Healthcare meetup</a> highlighting the powerful work that Chris and his team are doing to help ensure Wisconsin’s kids are healthy, happy, and safe.</i></div><p>An active academic healthcare organization requires tools and practices that enhance the application of statistical and algorithmic approaches.</p><p>To positively impact care, system operations, or even well-being at the community-level, these tools need to support solutions which can be rapidly deployed and communicated as well as reproduced when studying longitudinal trends.</p><p>At Children’s Wisconsin, we are using R and RStudio’s suite of tools to enable forecasting, modeling, and data mining among other data science activities. 
We communicate the results of our efforts using interactive applications built with Shiny as well as reports and push analytics created using RMarkdown.</p><p>During the <a href="https://www.youtube.com/watch?v=pHZ8dsc0PhY" target="_blank">R in Healthcare meetup</a> on June 30th, I shared how we have developed this capability and provided a few examples of the applications that have been created to support our vision that the kids of Wisconsin will be the healthiest in the nation.</p><p>I’ve included the full Q&amp;A below, which also includes the response to any questions that went unanswered.</p><h2 id="meetup-qa">Meetup Q&amp;A:</h2><p><strong>How is data collected, cleaned, stored, retrieved that fuels this great work?</strong></p><p>The data science team is a small team and we sit within the analytics group. We have data engineers in the larger analytics team that are shared. The data engineers help bring in a lot of this data and we store it in our data warehouse. For this application, we pull the processed data into an R data file for Shiny to use. This results in much better performance for our users. The data engineers perform a lot of the data cleaning and scrubbing, but we collaborate with them. The data scientists will usually take the first pass with a rough cut of the data, think about what we need, look for quality issues as well as other statistical concerns, and then the data engineers will automate our process to productionalize that.</p><p><strong>What are you using to link igraph to the selected youth info panel? 
Is it crosstalk or something else?</strong></p><p>To update the child details table when a node in the network is clicked, the app uses the <strong>selected</strong> event fired from the outputted visNetwork object.</p><p>More details can be found in the <a href = "https://cran.r-project.org/web/packages/visNetwork/vignettes/Introduction-to-visNetwork.html" target = "_blank">Introduction to visNetwork</a> vignette in the <strong>Use with Shiny</strong> section.</p><p>In the example code below, I watch for the selection to occur (the output is named “full_network”, so visNetwork creates an input with the suffix of “_selected” when used in a Shiny application), filter to the correct child, and then pass the information to a helper function to create the actual informational panel.</p><pre><code># Update the node_info table with child details on the full network
output$node_info &lt;- renderTable({
  req(input$full_network_selected)
  dat &lt;- placement_df %&gt;%
    filter(ChildID == input$full_network_selected)
  return(createChildInformation(dat))
})</code></pre><p><strong>What is the organizational structure of your data science program within the organization and how have you gotten buy-in to build R into the analytics infrastructure?</strong></p><script src="https://fast.wistia.com/embed/medias/w03jvpjn3n.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_w03jvpjn3n videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/w03jvpjn3n/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" 
aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p><font size="2" skip=0pt><div align="right">Full meetup recording <a href="https://youtu.be/pHZ8dsc0PhY" target="_blank">here</a></div></font></p><p>Back in 2012-2013, we brought the majority of the analysts together into an enterprise analytics group. Instead of having them dispersed among the departments within Children’s Wisconsin, we brought them together to try and build a shared understanding of our data, tools, and common data definitions and rules. At the same time we were rolling out a new electronic health record (EHR). It was a perfect time to bring everyone together and level-set on data and data sources. Then at some point, we would possibly spin individual analysts back out into the business.</p><p>For analysts not officially included on the common team, we created an Analytics Center of Excellence to reach out to power users to help disperse this information, but the majority of the analysts still sit within the enterprise team and that’s the team that data science is a part of. We have plenty of analysts and 2-3 data scientists who work on the more strategic, longer-term projects.</p><p>To get buy-in, I think it’s those small wins where you can actually get in, deep-dive, learn all you can and produce an insight that will “wow” the team. I know it’s hard to predict when that’s possible, but I think we’ve been lucky in that we’ve had very curious counterparts on the business side and at the hospital who work with us to help achieve that initial success. Once we garnered the first win, they wanted to share it, and so we initially used the open-source Shiny Server. 
Once we got to a point where we needed to have our solutions tied into Active Directory and share outputs securely, that’s where we really started to gain the necessary toehold to integrate R more fully.</p><p><strong>How open is data access for your data science team within your organization?</strong></p><p>Our data science team and analysts have access to just about everything we need. In our data warehouse we have the ability to apply user-level security to the included data sets and if we require additional access for a certain project we can usually get the permissions necessary. Overall, it hasn’t been an issue within our organization.</p><p><strong>How long does it take to go from concept to production for a project like this? Does the engineering work occur before or alongside your build?</strong></p><p>I would say it’s very iterative. With the COVID app, we brought it up in a week or two because it could build on the other dashboards we have created. Once we have something scripted, we then have these template or skeletal apps where we can pull something together quickly. We are definitely leveraging our great analysts and data engineers to pull a lot of the data together for us before we even have to work on a problem. The Inpatient Modeling app I shared started for me at the end of April, so that app, including the simulations and scenarios came together in two months. I think it can be very rapid and that’s really the power of R. We can be so iterative and responsive in a short amount of time - once you’re fluent in it of course.</p><p><strong>In your organization, how do you determine whether ethics approval is required for exploratory data analysis or modeling?</strong></p><p>While we do not have a dedicated ethics review panel, we do have an Institutional Review Board (IRB) for research requests involving human subjects. 
With approved requests there are well-defined pathways for what is to be included and how to handle modifications to the request.</p><p>For operational or performance improvement requests we have a multi-disciplinary team that triages data-related project requests and considers ethical factors in addition to time and resources when deciding when and if a project will be initiated.</p><p>Most importantly, as an organization we understand the importance of fairness and limiting bias in our models. This awareness has led us to not use outputs from models lacking in transparency or that give us hesitation when considering equitable care. We still have much to do in this area!</p><p><strong>In your application, are “confirmed” children those who have been trafficked, or are participating as actors in the trafficking of others?</strong></p><p>Those children could be either a victim or a participant. The state of Wisconsin has an indicator guide for child trafficking and we follow those definitions. There are three tiers included in the state’s guide that we also use in the application.</p><p>It is important to note that the factors associated with each tier are used to guide the classification, and the tier status is not a quantitative score that is simply totalled to arrive at a category. Human intervention is still required before any of the tier levels are assigned.</p><p><strong>‘Anon’ v 0.7.0 - is that package available externally?</strong></p><p>This package is something that I created for use within our Shiny applications. It’s something that we could definitely consider making publicly available on GitHub. It’s enabled via the switch on the introductory page of the Shiny app that then runs the code to anonymize all the names shown in the application. We use baby names from the Social Security Administration and surnames from the U.S. 
Census Bureau to create reasonable names that are similar by year of birth and sex.</p><p><strong>How do you decide when to use Shiny vs. flexdashboard? What about R vs. BI tools?</strong></p><p>I have considered this question myself when thinking about how much we want to take on as a data science team and about the analytics program as a whole. The larger team has BI developers who currently specialize in QlikView. I think the choice usually comes down to what our analysis will be used for, how long the organization will need those results, and how they will be shared. At first, we will do mostly ad-hoc analysis with static outputs and when we start to see that there is a longer term need to have some interactivity, we’ll bring in Shiny. Usually it starts with R Markdown or a flexdashboard that is a little lighter weight. If the request is something that’s truly servicing our service line needs, such as counts, volumes, or something we have an existing QlikView app for, we’ll push it to the appropriate resources within the analytics team. The data scientists are a scarce resource and we want to ensure they are having the most impact they can within the list of prioritized work.</p><p><strong>Does your team validate R Packages or is it a requirement at all?</strong></p><p>Yes, we do make efforts to validate the R packages we use regularly and maintain a standard list of “trusted” packages that we recommend to other R users in the organization. While we do not technically limit what packages can be used, we encourage best practices via lunch and learn sessions, code reviews, and an internal Slack channel. 
We are also piloting the use of the RStudio Package Manager, which we are using to help make package management easier and more consistent across the organization and within the Data Science environment.</p><p><strong>What type and/or amount of clinician involvement do you have on your team?</strong></p><p>We are definitely partnering every step of the way. We have a CMIO (Chief Medical Information Officer) who works very closely with us to understand and support our analytic efforts. Our Enterprise Analytics team is not within what would be considered the traditional IS (Information Systems) organizational structure. Instead, we’re located within our Health Management group which is similar to a population health team. We actually have nurses/RNs on that team as well which enables us to be more engaged with the various entities across the health system. For any kind of request that comes in, we’re always working with providers side by side which is very enjoyable. This collaboration is a great thing about Children’s. We get a lot of involvement, which helps make us successful. 
This is something that we look for as we take on new projects too - how invested is the business side before we go too far with it.</p><h2 id="keeping-the-conversation-going">Keeping the conversation going:</h2><p>If you have follow-up questions, are interested in speaking at a future meetup, and/or would like to be a part of this R in Healthcare Community, join the <a href="https://join.slack.com/t/rinhealthcare/shared_invite/zt-sc7lc4k6-K9zb~kX826dOXMcaj~Wt~w" target="_blank">Slack group here</a>.</p></description></item><item><title>RStudio Connect 1.9.0 - Content Curation Tools</title><link>https://www.rstudio.com/blog/rstudio-connect-1-9-0/</link><pubDate>Thu, 29 Jul 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-9-0/</guid><description><p>As publishers add more content to RStudio Connect, content organization, distribution, and discovery can become a challenge. Distributing individual links to all your most important content is tiresome, and the default Connect dashboard contains more information than end users often want or need.</p><p>This release of RStudio Connect introduces tools for addressing these common content curation concerns:</p><blockquote><p>How do you make sure your audience finds what they need on RStudio Connect without paging through the dashboard, remembering the right search terms, or bookmarking every content item you share?</p></blockquote><blockquote><p>After deploying many pieces of related content, how do you share them as a cohesive project?</p></blockquote><p>You might be interested in these content curation tools if you&rsquo;ve ever wanted to create:</p><ul><li>A summary/reference page for a complex project.</li><li>A content hub or knowledge repository for work belonging to a team or objective.</li><li>A customized entry point into RStudio Connect for stakeholders.</li><li>A presentation layer for any curated list of notable content items.</li></ul><p>We&rsquo;ve seen some impressive solutions 
to these problems from our advanced user community, but our ultimate goal has been to make content curation and distribution easy for all RStudio Connect publishers. With this in mind, content curation for RStudio Connect is structured around the following design principles:</p><ul><li>Code-based and reproducible</li><li>Built for existing, familiar RStudio Connect content types</li><li>Polished, presentable defaults</li><li>Customizable, easy to brand</li><li>No RStudio Connect Server API experience required</li></ul><h2 id="introducing-connectwidgets">Introducing <code>connectwidgets</code></h2><p><a href="https://github.com/rstudio/connectwidgets/"><code>connectwidgets</code></a> is an RStudio-maintained R package that can be used to query a Connect server for a subset of your existing content items, then organize them within <code>htmlwidget</code> components in an R Markdown document or Shiny application.</p><div align="center"><img src="connectwidgets-explained.png" alt = "Moving from Connect dashboard to connectwidgets"><font size="2",skip=0pt></div><div align="right">This example makes use of a free banner image from <a href="https://www.canva.com/" target="_blank">Canva</a></div></font><p>The package provides organization components for card, grid, and table views:</p><ul><li>Card and grid components display metadata about each piece of content. 
The title, description, and preview image can be set from the RStudio Connect dashboard.</li><li>Table components display a fixed set of content metadata: Name, Owner, Type, and Updated.</li><li>Each card, grid, or table row item links to the &ldquo;open solo&rdquo; version of the associated content item on RStudio Connect.</li><li>Search and Filter components can be applied to the table view (as shown below) or the grid view.</li></ul><p>Visit the package <a href="https://rstudio.github.io/connectwidgets/">documentation site</a> for a full set of code examples.</p><h3 id="theming">Theming</h3><p><code>connectwidgets</code> components support styling in <code>rmarkdown::html_document</code> via the <code>bslib</code> package. You can supply a Bootswatch theme in the YAML header, or pass a custom theme consistent with your organization&rsquo;s style.</p><p>Bootswatch theme example:</p><pre><code>---
output:
  html_document:
    theme:
      bootswatch: minty
---</code></pre><p><img src="connectwidgets-themes.png" alt="Use a Bootswatch or custom theme with connectwidgets"></p><h3 id="get-started">Get Started</h3><p>To start using <code>connectwidgets</code> with your own RStudio Connect server content, you must first upgrade your server to version 1.9.0.</p><p>Install <code>connectwidgets</code> from CRAN and load the library:</p><pre><code>install.packages(&quot;connectwidgets&quot;)
library(connectwidgets)</code></pre><p>Use the package template to learn about each of the components:</p><pre><code>rmarkdown::draft(&quot;example-page.Rmd&quot;, template = &quot;connectwidgets&quot;, package = &quot;connectwidgets&quot;)</code></pre><p>Alternatively, follow the RStudio Connect Jump Start example directions, or explore the code examples available on the package <a href="https://rstudio.github.io/connectwidgets/">documentation site</a>.</p><h3 id="contribute">Contribute</h3><p><code>connectwidgets</code> is an open source R package. We would love to hear your thoughts and feedback. 
Make a feature request by opening an issue on the <a href="https://github.com/rstudio/connectwidgets">package repository</a>. Contribute code by submitting a pull request.</p><h3 align="center"><a href="https://docs.rstudio.com/rsc/upgrade/">Upgrade to Start Curating</a></h3><h2 id="additional-relevant-features">Additional Relevant Features</h2><h3 id="streamlined-publishing">Streamlined Publishing</h3><p><em>Introduced in RStudio Connect 1.8.8</em></p><p>By default, RStudio Connect will now automatically provision a server address (<code>CONNECT_SERVER</code>) and an API key (<code>CONNECT_API_KEY</code>), scoped to the publisher, so that items are not published in a broken state.</p><p>If you&rsquo;ve ever published content to RStudio Connect that relies on the presence of environment variables, you will know to expect errors on the initial deployment. Content items that make use of the new <code>connectwidgets</code> package need two environment variables to run: an API key and a server address. This feature isn&rsquo;t only relevant for <code>connectwidgets</code>; any workflow that expects a publisher&rsquo;s API key to be passed to the content runtime (e.g. 
updating a Pin, or pulling audit information from the RStudio Connect Server API), will benefit from this change.</p><ul><li><code>CONNECT_SERVER</code> and <code>CONNECT_API_KEY</code> are available across all content runtimes except TensorFlow.</li><li>These variables can be overwritten if necessary by explicitly setting them in the <a href="https://docs.rstudio.com/connect/user/content-settings/#content-vars">Vars settings pane</a>.</li><li>This feature is enabled by default, but can be <a href="https://docs.rstudio.com/connect/admin/appendix/configuration/#Applications.DefaultServerEnv">globally disabled</a>.</li></ul><h3 id="content-access-requests">Content Access Requests</h3><p><em>Introduced in RStudio Connect 1.8.8</em></p><p>Since <code>connectwidgets</code> components are rendered with the same permissions you have on the RStudio Connect server, viewers of your pages may discover content they don&rsquo;t otherwise have access to. If a viewer follows a link to a content item they don&rsquo;t have permission to view, they will be directed to request access.</p><p>The example below shows what a <code>connectwidgets</code> grid view content item looks like to someone who doesn&rsquo;t have access permissions to view it. Note that the &ldquo;preview image&rdquo; for that content item has been replaced with the generic placeholder and a &ldquo;Request Access&rdquo; overlay. The access permissions dialog prompts the requesting user to select the level of access desired. In this example, the requesting user is a publisher, so they can choose either Collaborator or Viewer permissions. 
This triggers an email to be sent to the content owner and collaborators who can confirm or deny the request.</p><p><img src="access-request.png" alt="New permissions access request workflow in RStudio Connect."></p><h3 align="center"><a href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio">See RStudio Connect in Action</a></h3><h2 id="rstudio-connect-administrator-digest">RStudio Connect Administrator Digest</h2><h3 id="new-email-configuration-settings">New Email Configuration Settings</h3><p>Two new configuration settings have been added to increase the customization options for emails sent by RStudio Connect through your email server:</p><ul><li><strong>Sender Name Customization</strong> The <code>Server.SenderEmailDisplayName</code> setting has been added to allow customization of the server display name (alias) that is used when sending administrative emails.</li><li><strong>From and Sender Address Headers</strong> The <code>Server.EmailFromUserAddresses</code> setting indicates that outbound email messages sent on behalf of your users should specify both the Sender and From addresses. When enabled, the From field of an email message uses the name and email address associated with the sending user. The Sender field will be populated with the value from the <code>Server.SenderEmail</code> configuration setting. This setting is disabled by default. Not all email servers support this feature.</li></ul><p>Learn more about these new configuration settings in the <a href="https://docs.rstudio.com/connect/1.9.0/admin/email/#configuring-other-email-settings">Admin Guide</a>.</p><h3 id="configurable-unix-group-for-runas-users">Configurable Unix Group for <code>RunAs</code> Users</h3><p>RStudio Connect now allows server administrators to configure a shared Unix group via the <code>Applications.SharedRunAsUnixGroup</code> setting. If unset, the default is the primary Unix group of the <code>Applications.RunAs</code> user. 
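</p><p>For example, a hypothetical snippet from the server configuration file might look like the following (the group name <code>appusers</code> is purely illustrative; see the Admin Guide for the exact file location and syntax):</p><pre><code>[Applications]
SharedRunAsUnixGroup = appusers</code></pre><p>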
Previously, the shared Unix group was not configurable and the primary group of the <code>Applications.RunAs</code> user was always used. All <code>RunAs</code> users must be members of this shared Unix group. Learn more about <code>RunAs</code> user process management in the <a href="https://docs.rstudio.com/connect/1.9.0/admin/process-management/#runas-current">Admin Guide</a>.</p><h3 id="metrics-listing-improvements">Metrics Listing Improvements</h3><p>The process listing on the Admin Metrics page of the RStudio Connect dashboard has been updated:</p><ul><li>Enumerates jobs across all hosts in a cluster, not only those on the responding host.</li><li>Includes process age and owning hostname.</li><li>Entries link to the logs for that job.</li><li>Removes PID, which is available on the linked-to logs page.</li><li>Columns can be sorted by clicking on the headers.</li></ul><h3 align="center"><a href="https://www.rstudio.com/products/connect/">Click through to learn more about RStudio Connect</a></h3><h2 id="deprecations--breaking-changes">Deprecations &amp; Breaking Changes</h2><p>In order to increase the supportability of RStudio Connect installations, the following breaking changes have been introduced in this release:</p><ul><li><p><strong>Breaking Change</strong> RStudio Connect will not launch if the following configurable directories are located inside its installation directory (<code>/opt/rstudio-connect</code> by default): <code>Server.DataDir</code>, <code>SQLite.Dir</code>, <code>Server.TempDir</code>, <code>Server.LandingDir</code>, <code>Database.Dir</code>, <code>Application.Pandoc1Dir</code>, <code>Application.Pandoc2Dir</code>, <code>Application.Pandoc211Dir</code>. Connect will produce an error message that identifies any directories in violation of this condition. You will be directed to relocate the directory and update your configuration file. 
Review the <a href="https://docs.rstudio.com/connect/1.9.0/admin/directories/#relocating-variable-data">Admin Guide</a> for additional information.</p></li><li><p><strong>Breaking Change</strong> RStudio Connect confirms at startup that a configured <code>Applications.Supervisor</code> script does not reside under certain protected directories, including <code>Server.DataDir</code>, <code>Server.TempDir</code>, <code>SQLite.Dir</code>, and <code>/etc/rstudio-connect/</code>. Connect will produce an error message if the configured supervisor script is detected in a protected directory. You will be directed to relocate the script and update your configuration file. Review the <a href="https://docs.rstudio.com/connect/1.9.0/admin/process-management/#program-supervisors">Admin Guide</a> for additional information.</p></li></ul><p>Please review the <a href="http://docs.rstudio.com/connect/news">full release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>Aside from the breaking changes listed above, there are no other special considerations, and upgrading should require less than five minutes. If you are upgrading from a version earlier than 1.8.8.2, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><p>To perform an upgrade, download and run the installation script. The script installs a new version of RStudio Connect on top of the earlier one. 
Existing configuration settings are respected.</p><pre><code># Download the installation script
curl -Lo rsc-installer.sh https://cdn.rstudio.com/connect/installer/installer-v1.9.2.sh

# Run the installation script
sudo bash ./rsc-installer.sh 1.9.0</code></pre><h3 align="center"><a href="https://rstudio.com/about/subscription-management/">Sign up for RStudio Professional Product Updates</a></h3></description></item><item><title>Shiny Apps from Concept to Production - An RStudio Community X-Session with Appsilon</title><link>https://www.rstudio.com/blog/shiny-apps-from-concept-to-production-rsutdio-community-x-session/</link><pubDate>Tue, 27 Jul 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-apps-from-concept-to-production-rsutdio-community-x-session/</guid><description><p>We’re excited to announce that the upcoming RStudio Community X-Session with Appsilon is open for registration. <a href="https://www.rstudio.com/registration/shiny-from-concept-to-production/" target="_blank"><strong>Sign up now for the Tuesday, August 10th event</strong></a>.</p><div id="are-you-a-shiny-developer-looking-to-level-up" class="level1"><h2>Are you a Shiny developer looking to level up?</h2><p>Perhaps you’ve built and shared a few Shiny applications, but the <a href="https://github.com/rstudio/shiny-gallery">Shiny Gallery</a> and <a href="https://blog.rstudio.com/2021/06/24/winners-of-the-3rd-annual-shiny-contest/">Shiny Contests</a> have shown you the enormity of what is possible with Shiny.</p><p>How do you write apps that require complex functionality while keeping your code tidy and easy to follow and maintain? How do experienced Shiny developers debug, speed up, and scale their deployed apps? 
How do you deliver beautifully styled Shiny apps all written in your favorite statistical programming language?</p><p><strong>Shiny Apps from Concept to Production</strong> brings the experts from Appsilon to help you take your skills to the next level.</p><p>Speakers include two Shiny Contest winners: Marcin Dubel, <a href="https://blog.rstudio.com/2021/06/24/winners-of-the-3rd-annual-shiny-contest/">grand prize winner of the 2021 Shiny Contest</a>, and Pedro Silva, <a href="https://blog.rstudio.com/2020/07/13/winners-of-the-2nd-shiny-contest/">whose Shiny Decisions app won the grand prize in 2020</a>. <a href="https://appsilon.com" target="_blank">Appsilon</a> has developed some of the world’s most advanced Shiny dashboards. That’s why Fortune 500 companies routinely approach them to create enterprise Shiny apps.</p></div><div id="the-format" class="level1"><h2>The Format</h2><ul><li><strong>Two hours of talks from four speakers</strong> with years of experience building enterprise Shiny applications.</li><li><strong>Panel Q&amp;A</strong> to close the event.</li><li><strong>Shiny Dev Community Hangout</strong> - For the hour before and after the scheduled talks, join us for a freeform community hangout. Think of this as the virtual analog of the hallway conversations between and after talks at a live conference. We encourage you to join early and stay late to chat with other Shiny developers like yourself.</li></ul></div><div id="the-agenda" class="level1"><h2>The Agenda</h2><p>Tuesday, August 10th, with talks running from 12pm to 2pm ET. 
That’s 6-8pm CEST.</p><style type="text/css">.tg {border-collapse:collapse;border-spacing:0;}.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;overflow:hidden;padding:10px 5px;word-break:normal;min-width:110px;}.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}.tg .tg-nzxa{background-color:#f6f6ff;border-color:#ffffff;color:#222;font-weight:bold;text-align:left;vertical-align:top}.tg .tg-zv4m{border-color:#ffffff;text-align:left;vertical-align:top}.tg .tg-vk3u{background-color:#f6f6ff;border-color:#ffffff;color:#000;text-align:left;vertical-align:top}.tg .tg-33nf{background-color:#f6f6ff;border-color:#ffffff;color:#000;font-weight:bold;text-align:left;vertical-align:top}.tg .tg-bc9u{background-color:#4d8dc9;border-color:#ffffff;color:#ffffff;font-weight:bold;text-align:center;vertical-align:top}.tg .tg-y4iu{background-color:#f6f6ff;border-color:#ffffff;font-weight:bold;text-align:left;vertical-align:top}</style><table class="tg"><thead><tr><th class="tg-bc9u">Schedule</th><th class="tg-bc9u">Item</th></tr></thead><tbody><tr><td class="tg-vk3u"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">11 AM - Noon ET (Optional)</span></td><td class="tg-33nf"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">Shiny Developer Community Hangout</span><br><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Join early and connect with fellow Shiny Developers. 
</span></td></tr><tr><td class="tg-vk3u"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Noon ET</span></td><td class="tg-nzxa"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#222">Introductions</span><span style="font-weight:400;font-style:normal;text-decoration:none;color:#222"> from Jesse Mostipak</span></td></tr><tr><td class="tg-vk3u"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Talk 1</span></td><td class="tg-y4iu"><div id="shepherding-your-shiny-app-from-proof-of-concept-to-production" class="level2"><h2><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">Shepherding your Shiny App from Proof of Concept to Production</span></h2><span style="font-weight:400;font-style:italic;text-decoration:none;color:#000">Marcin Dubel will present Appsilon’s preferred workflow for creating production quality Shiny apps. He’ll touch on key topics in the development process that lead to apps which users enjoy and developers find easy to maintain and extend. <br> <br><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">A great advantage of Shiny applications is that a proof of concept can be created quickly and easily. It is a great way for subject matter experts to present their ideas to stakeholders before moving on to production. However, to take the next step to a production application requires help from experienced software developers. The actions should be focused on two areas: to make the application a great experience for users and to make it maintainable for future work. Focusing on these will assure that the app will be scalable, performant, bug-free, extendable, and enjoyable. 
Close collaboration between engineers and experts paves the way to many successful projects in data science and is Appsilon&rsquo;s confirmed path to production-ready solutions.</span><br><br><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">The very first step should always be to build a comfortable and (importantly) reproducible workflow, thus </span><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">setting up the development environment</span><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000"> and </span><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">organizing the folder structure </span><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">[renv + docker]. Once this is done, engineers should proceed to limiting the codebase by </span><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">cleaning the code – </span><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">i.e., removing redundant comments, extracting the constants and inline styles [ymls + styler]. Now the fun begins: </span><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">extract the business logic</span><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000"> into separate functions, modules and classes [packages/R6 + plumber]. Restrict reactivity to a minimum. </span><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">Check the logic </span><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">[data.validator + drake]. </span><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">Add tests </span><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">[testthat + cypress/shinytest]. 
Organize your /www and </span><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">move actions to the browser </span><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">[shiny + css/js]. Finally, </span><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">style the app </span><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">[sass/bslib + shiny.fluent]. And, voila! A world-class Shiny app.</span><br> <br><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">Marcin Dubel / Software Engineer at Appsilon</span><br><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Marcin is a software engineer with a strong R and data analysis background. Over 4 years of experience in developing applications in such areas as finance, insurance or genomic studies. </span><br></td></tr><tr><td class="tg-vk3u"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Talk 2</span></td><td class="tg-y4iu"></div><div id="improve-your-code---best-practices-for-durable-code" class="level2"><h2><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">Improve Your Code - Best Practices for Durable Code</span></h2><p><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000"><em>Anna Skrzydło discusses code maintenance, and the strategies you can employ to create enduring code. Avoid future delays and growing pains for your Shiny app by using better coding practices.</em></p><p><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Shiny applications often start as small projects and grow as they get noticed. This is when the problems usually begin, as it quickly turns out that implementing one small change takes two days, and adding a new requirement leads to rebuilding the whole application. 
Assuming, of course, that further development is done by the same team, any new developer joining the team first needs a one-month onboarding…</p><p>If that sounds familiar, join the session, to learn more about:</span></p><ol><li><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">modules</span><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000"> (Shiny modules, wahani/modules, R6 classes)</span></li><li><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">tests</span><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000"> (unit tests, but also other types of tests)</span></li><li><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">strategies</span><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000"> (code structure, automation)</span><br><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">to make your code easier to develop and maintain. </span></li></ol><p><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Each of the three topics will be divided into two parts so that everyone can find something useful:</span><br></p><ul><li><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Quick start - dedicated for those for whom the topic is completely new. It aims at giving you the basic concept so that you can begin using it immediately.</span><br></li><li><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Further steps - dedicated for those who are already familiar with the concept and use it frequently. 
It aims at giving you advanced steps to get even more out of your coding practices.</span></li></ul><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">Anna Skrzydło / Project Leader at Appsilon</span><br><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Anna has passion for coding and Data Science. Over 5 years of experience in managing various IT and analytical projects.</span><br></td></tr><tr><td class="tg-vk3u"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">12:50 - 1:00 PM ET</span></td><td class="tg-vk3u"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Break</span></td></tr><tr><td class="tg-vk3u"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Talk 3</span></td><td class="tg-33nf"></div><div id="scaling-infrastructure---why-is-my-shiny-app-slow" class="level2"><h2><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">Scaling &amp; Infrastructure - Why is My Shiny App Slow?</span></h2><p><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000"><em>Pedro Silva will explore scaling and infrastructure issues common when deploying large scale Shiny apps, and how to overcome them.</em></p><p><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">For a Data Scientist, Shiny can be an amazing tool when it comes to creating fast and powerful prototypes and dashboards. 
But what to do when your application becomes TOO popular and more and more people want to use it?</p><p><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">As the number of users grows, keeping up with the demand of a Shiny application can be tricky, and there is only so much you can do to improve performance at the code level.</p><p><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">In this presentation I will be giving an overview of our approach to improving Shiny dashboards performance on an infrastructure level, as well as tips to scaling Shiny dashboards to hundreds of concurrent users while keeping your budget under control.</span></p><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">Pedro Silva / R/Shiny Developer at Appsilon</span><br><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Pedro has worked in many technologies over the years and has an extensive background in web, developing both websites and applications. In his free time he is both an Open Source enthusiast and a practitioner of the JavaScript dark arts. 
</span><br></td></tr><tr><td class="tg-vk3u"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Talk 4</span></td><td class="tg-33nf"></div><div id="uiux-in-shiny-apps-and-live-coding-session-shiny.fluent-and-shiny.react" class="level2"><h2><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">UI/UX in Shiny Apps and Live Coding Session (<code>shiny.fluent</code> and <code>shiny.react</code>)</span></h2><p><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000"><em>Kamil Żyła will demonstrate how to use shiny.fluent to build beautiful Shiny apps with Microsoft’s Fluent UI, and explain how to integrate React and Shiny.</em></p><p><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">In this talk we will present the functionality and ideas behind a new open source package we have developed called <code>shiny.fluent</code>.</p><p><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">UI plays a huge role in the success of Shiny projects. <code>shiny.fluent</code> enables you to build Shiny apps in a novel way using Microsoft’s Fluent UI as the UI foundation. It gives your app a beautiful, professional look and a rich set of components while retaining the speed of development that Shiny is famous for.</p><p><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Fluent UI is based on the Javascript library React, so it’s a challenging task to make it work with Shiny. 
We have put the parts responsible for making this possible in a separate package called <code>shiny.react</code>, which enables you to port other React-based components and UI libraries so that they work in Shiny.</p><p><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">During the talk, we will demonstrate how to use <code>shiny.fluent</code> to build your own Shiny apps, and explain how we solved the main challenges in integrating React and Shiny.</p><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">Kamil Żyła / Full Stack Engineer at Appsilon</span><br><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Kamil is passionate about working across various technologies, leading projects and developing internal tooling.</td></tr><tr><td class="tg-vk3u"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Up to 2 PM ET</span></td><td class="tg-vk3u"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Panel Q&amp;A</span></td></tr><tr><td class="tg-vk3u"><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">2 - 3PM ET (Optional)</span></td><td class="tg-33nf"><span style="font-weight:700;font-style:normal;text-decoration:none;color:#000">Shiny Developer Community Hangout</span><br><span style="font-weight:400;font-style:normal;text-decoration:none;color:#000">Join the hangout after the session and connect with fellow Shiny Developers. 
</span></td></tr></tbody></table><p><a href="https://www.rstudio.com/registration/shiny-from-concept-to-production/" target="_blank">Register now for <strong>Shiny Apps from Concept to Production - An RStudio Community X-Session with Appsilon</strong> on Tuesday, August 10th at 12pm-2pm ET / 6-8pm CEST</a></p></div></div></description></item><item><title>Top 3 Coding Best Practices from the Shiny Contest</title><link>https://www.rstudio.com/blog/three-shiny-best-practices-seen-in-the-shiny-contest/</link><pubDate>Thu, 22 Jul 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/three-shiny-best-practices-seen-in-the-shiny-contest/</guid><description><p>Recently we <a href="https://www.rstudio.com/blog/winners-of-the-3rd-annual-shiny-contest/">wrapped up another round of the Shiny Contest</a>, and, as always, the entries were terrific. A previous post announced and discussed the winners, but we wanted to take a moment to highlight some of the examples of fantastic code we saw in the entries. In this post, we have selected three apps that demonstrate Shiny best practices. 
The apps are <a href="https://community.rstudio.com/t/restor-shiny-contest-submission/104903">RestoR</a> by Luka Negoita and Anna Calle, the <a href="https://community.rstudio.com/t/commute-explorer-shiny-contest-submission/104651">Commute Explorer</a> by Stefan Schliebs, and <a href="https://community.rstudio.com/t/wedding-a-shiny-app-to-help-future-grooms-shiny-contest-submission/104657">{wedding}: a Shiny app to help future grooms</a> by Margot Brard.</p><p>While many of the apps submitted implement every one of the best practices we touch on (along with many other best practices), the three we&rsquo;re highlighting illustrate their respective best practices excellently.</p><p>The three points we will discuss are</p><ul><li><a href="#modules">Modules</a></li><li><a href="#custom-styles">Using custom styles</a></li><li><a href="#reactive-variables">Smart organization of reactive variables</a></li></ul><h2 id="modules">Best practice #1: Modules</h2><h3 id="commute-explorer">Commute Explorer</h3><p><img src="https://community.rstudio.com/uploads/default/original/3X/6/c/6c271876439a87ec22caa52774262ade91dec369.jpeg" alt="Screenshot of Commute Explorer app"><em>Links:</em> <a href="https://nz-stefan.shinyapps.io/commute-explorer-2/">App</a> - <a href="https://github.com/nz-stefan/commute-explorer-2/">Code</a> - <a href="https://community.rstudio.com/t/commute-explorer-shiny-contest-submission/104651">Community Post</a></p><p><a href="https://shiny.rstudio.com/articles/modules.html">Modules</a> are used to abstract complicated sections of an app into self-contained and reusable UI and server functions. 
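</p><p>In code, a module pairs a namespaced UI function with a matching server function. A minimal sketch (with invented names, not code from any contest entry) might look like this:</p><pre><code>library(shiny)

# UI half: wrap each output ID in the namespace returned by NS()
counterUI = function(id) {
  ns = NS(id)
  textOutput(ns('count'))
}

# Server half: accept a reactive, data_r, instead of reading global state
counterServer = function(id, data_r) {
  moduleServer(id, function(input, output, session) {
    output$count = renderText(nrow(data_r()))
  })
}</code></pre><p>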
As your app grows larger and more complicated, well-utilized modules result in cleaner code that is easier to read and understand, both for the original author and for any collaborators interacting with your code.</p><p>Commute Explorer does a great job not only of using modules but of simplifying the interfaces to those modules by passing reactive variables into the module&rsquo;s server function. A common mistake in using modules is to call them inside of some other reactive statement like <code>observe()</code>&hellip;</p><pre><code># Bad module calling example
observe({
  # Process data before sending it into the module
  if (input$filterTo != &quot;special&quot;) {
    myModuleServer(data %&gt;% filter(val == input$filterTo))
  } else {
    # Handle special case
    myModuleServer(data %&gt;% ...)
  }
})</code></pre><p>Compare this to the (edited-for-brevity) <a href="https://github.com/nz-stefan/commute-explorer-2/blob/master/app/server.R">server function from the Commute Explorer app</a>. (The modules used are prefixed with <code>mod_</code>.)</p><pre><code># Server function from Commute Explorer app
server &lt;- function(input, output, session) {
  # initialise the app state
  ...
  app_state &lt;- reactiveValues(...)
  ...

  # add server logic for the commute explorer
  mod_commute_mode(&quot;mode&quot;, app_state)
  mod_commute_map(&quot;map&quot;, app_state)
  mod_commute_table(&quot;table&quot;, app_state)
  mod_commute_filter(&quot;filter&quot;, app_state)
}</code></pre><p>By using reactives as inputs to its modules, the app&rsquo;s code makes it clear what each module depends on, leaving the logic for how it depends on that thing abstracted away from your main app script. 
See the <a href="https://shiny.rstudio.com/articles/communicate-bet-modules.html">article on communication between modules</a> for a more in-depth exploration into this concept.</p><h2 id="custom-styles">Best practice #2: Using custom styles</h2><h3 id="wedding-a-shiny-app-to-help-future-grooms">{wedding}: a Shiny app to help future grooms</h3><p><em>Links:</em> <a href="https://connect.thinkr.fr/wedding/">App</a> - <a href="https://github.com/ThinkR-open/wedding">Code</a> - <a href="https://community.rstudio.com/t/wedding-a-shiny-app-to-help-future-grooms-shiny-contest-submission/104657">Community Post</a></p><p><img src="thumbnail.jpeg" alt="Screenshot of the landing page for the {wedding} app. Custom fonts and a background image make for a striking first impression."> <em>Screenshot of the landing page for the {wedding} app. Custom fonts and a background image make for a striking first impression.</em></p><p>The {wedding} app is a fantastic example of an app that you would never know was written primarily in a statistical programming language. The app has an entirely bespoke look thanks to the use of custom style sheets.</p><p>A great way of getting the use of Shiny accepted, whether at your company or by your spouse, is to keep it &ldquo;on brand.&rdquo; The {wedding} app does this by leveraging custom CSS to beautifully match the style of a wedding invitation, creating a seamless experience for the guests/users.</p><p>The app uses the Shiny framework <a href="https://thinkr-open.github.io/golem/">Golem</a>&rsquo;s helpers to add the CSS to the page, but there are plenty of other ways to do it. 
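</p><p>For instance, one lightweight approach (a sketch of the general technique, not how the {wedding} app itself is wired) is to place a stylesheet in the app&rsquo;s <code>www/</code> folder and reference it from the UI; <code>custom.css</code> here is a hypothetical file name:</p><pre><code># Loads www/custom.css into the page's &lt;head&gt;
ui &lt;- fluidPage(
  tags$head(
    tags$link(rel = &quot;stylesheet&quot;, type = &quot;text/css&quot;, href = &quot;custom.css&quot;)
  ),
  titlePanel(&quot;Styled app&quot;)
)</code></pre><p>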
To learn more about customizing the look and feel of your Shiny apps, see the <a href="https://rstudio.github.io/bslib/">bslib package</a> and the article <a href="https://shiny.rstudio.com/articles/css.html">Using custom CSS in your app</a>.</p><h2 id="reactive-variables">Best practice #3: Smart organization of reactive variables</h2><h3 id="restor">RestoR</h3><p><img src="https://community.rstudio.com/uploads/default/original/3X/e/d/ed7e4ca1d7d37a641c775bedc3ffe89b5a7ca53d.jpeg" alt="Screenshot of RestoR app"></p><p><em>Links:</em> <a href="https://gv2050.shinyapps.io/gv2050-platform-submission/">App</a> - <a href="https://github.com/LukaNeg/gv2050-platform-submission">Code</a> - <a href="https://community.rstudio.com/t/restor-shiny-contest-submission/104903">Community Post</a></p><p>As R is a scripting language, one of the most challenging parts about learning Shiny is that the code doesn&rsquo;t just run down the script but executes as a series of &ldquo;reactive&rdquo; code chunks that listen to each other. A common source of poor Shiny performance and maintainability is placing a large amount of logic within a single reactive statement such as an <code>observe()</code>. Burdening your observe statements with lots of <code>renderPlot()</code> calls is tempting because it fits an R programmer&rsquo;s script-oriented mindset. However, once you learn to trust Shiny&rsquo;s reactive system and liberally use reactive variables, your code becomes much cleaner, faster, and more maintainable.</p><p>The RestoR app does a great job at this. One way to tell is that you see very few nested brackets when looking through the <code>server.r</code> file. 
Instead, you see nice compact reactive variables like&hellip;</p><pre><code>datasheet_df &lt;- reactive({
  sample_data %&gt;%
    filter(site %in% input$selectSiteDatasheets) %&gt;%
    ...
})

## Download button
output$download_datasheet &lt;- downloadHandler(
  filename = function() {
    paste(&quot;spreadsheet_&quot;, input$selectSiteDatasheets, &quot;.csv&quot;, sep = &quot;&quot;)
  },
  content = function(file) {
    write.csv(datasheet_df(), file, row.names = FALSE)
  }
)</code></pre><p>Here <code>datasheet_df</code> is a reactive variable that Shiny will always keep up to date. Therefore, the download button only needs to declare that it uses whatever the current value of that reactive is. This separation keeps the code easy to reason about and allows easy reuse of <code>datasheet_df</code> in contexts other than just the download button.</p><p>Contrast this with a naive implementation such as:</p><pre><code>observe({
  # Build datasheet_df based on current selectSiteDatasheets input
  datasheet_df &lt;- sample_data %&gt;%
    filter(site %in% input$selectSiteDatasheets) %&gt;%
    ...

  # Setup download button so it downloads filtered data
  output$download_datasheet &lt;- downloadHandler(
    filename = function() {
      paste(&quot;spreadsheet_&quot;, input$selectSiteDatasheets, &quot;.csv&quot;, sep = &quot;&quot;)
    },
    content = function(file) {
      write.csv(datasheet_df, file, row.names = FALSE)
    }
  )
})</code></pre><p>Here, changes in <code>input$selectSiteDatasheets</code> trigger an <code>observe()</code> statement that filters some data and then sets up a download button. This code is not ideal because it re-initializes the download button every time the input changes. In addition, this style makes it much harder to reuse the filtered <code>datasheet_df</code> anywhere else in your app, hindering future enhancements like showing <code>datasheet_df</code> in a table.</p><p>The compact and simple reactives result in a clean and easy-to-understand reactive dependency graph. 
Neat dependency graphs make your app&rsquo;s logic easier to parse both for humans reading the code and Shiny when executing that code. Want to see what the dependency graph looks like for your app? Try out the <a href="https://rstudio.github.io/reactlog/">React Log</a>.</p><h2 id="learning-more">Learning more</h2><p>The Shiny contest, both this year and in the past, was an awe-inspiring demonstration of Shiny excellence from all participants with something to learn from every app submitted. If you&rsquo;re eager to learn more about writing great Shiny apps by looking at outstanding examples, check out the winners from the <a href="https://www.rstudio.com/blog/winners-of-the-3rd-annual-shiny-contest/">announcement post</a>. Another resource for learning more is the <a href="https://shiny.rstudio.com/">Shiny website</a>, where you can look at <a href="https://shiny.rstudio.com/articles/">articles</a> covering topics from <a href="https://shiny.rstudio.com/articles/basics.html">beginner</a> to <a href="https://shiny.rstudio.com/articles/building-inputs.html">advanced</a>, an <a href="https://shiny.rstudio.com/gallery/">app gallery</a> for more inspiration, and <a href="https://shiny.rstudio.com/app-stories/">&ldquo;app stories&rdquo;</a> that look at the decision processes behind building apps.</p></description></item><item><title>Advice to Aspiring Sports Analytics Professionals</title><link>https://www.rstudio.com/blog/advice-to-aspiring-sports-analytics-professionals/</link><pubDate>Tue, 20 Jul 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/advice-to-aspiring-sports-analytics-professionals/</guid><description><div style="text-align:right;"><sup>Photo by <a href="https://unsplash.com/photos/Yzef5dRpwWg" target="_blank">Ameer Basheer</a></sup></div><p>“How do I get started?” While not asked explicitly, this question summarizes the highest ranked inquiries we received from attendees of the <a 
href="https://community.rstudio.com/t/recording-of-r-in-sports-analytics-rstudio-enterprise-community-meetup/107551" target="_blank">“R in Sport Analytics”</a> discussion we hosted in mid-June. My contribution to RStudio’s ongoing <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target="_blank">Enterprise Community Meetups</a> highlighted the importance of data-driven frameworks and how they should influence decisions on the field, court, ice or pitch in the same way they impact decisions in the boardroom. Like many of the attendees, I was also unsure of the path to an NFL team as an analytics professional. This post aims to answer questions from the meetup while also providing recommendations for aspiring sports analytics professionals.</p><p>As a brief introduction, the following bullet points summarize roughly the past 20 years of my academic and professional experiences.</p><ul><li>Completed undergraduate degrees in Mathematics and Spanish from Monmouth College and an MBA from the University of Iowa</li><li>Played and coached football collegiately, chased a professional football playing career (think minor-league baseball equivalent but in football)</li><li>Held analyst, product management and leadership positions for multiple sports analytics organizations</li><li>Founded and led analytics departments for the Chicago Bears and Denver Broncos (member of Super Bowl 50 World Championship team)</li><li>Recently started in Customer Success at RStudio and am extremely fortunate to collaborate with data science teams across the globe</li></ul><p>That is more than enough about me. Now, about those questions…</p><ul><li><p><a href="#q1">Is Teamwork Online really the primary avenue into professional sports jobs, particularly the NFL? It seems like a black hole. 
Better suggestions?</a></p></li><li><p><a href="#q2">What are the best opportunities to widen options (job prospects, career) and network if based in a foreign country with few opportunities in sports analytics?</a></p></li><li><p><a href="#q3">I am a huge fan of sports, but I am new to data science. Which should I learn first, R or Python?</a></p></li><li><p><a href="#q4">I am thinking about data science in sports as a career. What classes would you recommend?</a></p></li></ul><div id="is-teamwork-online-really-the-primary-avenue-into-professional-sports-jobs-particularly-the-nfl-it-seems-like-a-black-hole.-better-suggestions" class="level2"><h2><a name="q1">Is Teamwork Online really the primary avenue into professional sports jobs, particularly the NFL? It seems like a black hole. Better suggestions?</a></h2><p>First, for those that are not familiar, <a href="https://www.teamworkonline.com/" target="_blank">TeamWork Online</a> is a recruiting platform/job board for sports teams, entities and leagues. I addressed this briefly during the meetup, but the best advice I can provide is the following: <a href="https://mitchtanney.github.io/R_in_spoRts_analytics/#13" target="_blank">“Do something.”</a> Outside of playing and coaching for several years after completing my undergraduate degree, my foray into sports analytics started as an independent study in graduate school. I had no real research experience, but my professors did. Thanks to <a href="https://tippie.uiowa.edu/people/jeffrey-w-ohlmann" target="_blank">Jeff Ohlmann</a> at the University of Iowa, I had a forum that allowed me to combine years of experience in sports with the problem-solving and technical skills I had developed in the classroom. There is simply no substitute for identifying a problem that warrants further investigation, framing questions, developing hypotheses, collecting/wrangling/analyzing/modeling/visualizing data and then presenting your work. 
As someone who has hired technical roles and assembled a staff, one of the first questions I would almost always ask was “What have you done?” Do something.</p></div><div id="what-are-the-best-opportunities-to-widen-options-job-prospects-career-and-network-if-based-in-a-foreign-country-with-few-opportunities-in-sports-analytics" class="level2"><h2><a name="q2">What are the best opportunities to widen options (job prospects, career) and network if based in a foreign country with few opportunities in sports analytics?</a></h2><p>Time zone differences create scheduling challenges, but there are essentially no boundaries thanks to modern technology. Look no further than the <a href="https://operations.nfl.com/updates/football-ops/second-annual-big-data-bowl-competition-participants/" target="_blank">2020 NFL Big Data Bowl</a> as Philipp Singer and Dmitry Gordeev, two data scientists based in Austria, joined forces to win the Open Kaggle Competition. As a former small college football player who competed for roster spots and playing time with players from Power 5 schools, I always used to tell myself “If you can play, then you can play. Where you come from does not and should not matter.” The same concept applies here. If you are willing and talented as a data scientist to compete and perform well in open competitions, where you come from does not and should not matter.</p><p>Additionally, as I mentioned previously, I am extremely fortunate to work with data science teams across the globe. I am approaching the end of month 4 at RStudio, but I have already connected with data science teams on 4 continents. One of my early takeaways from my time at RStudio is that good data science work is good data science work regardless of the industry. Therefore, do not be discouraged if your day job finds you analyzing data that is completely unrelated to sports. 
Leverage the data skills you develop outside of sports to solve meaningful problems in sports.</p></div><div id="i-am-a-huge-fan-of-sports-but-i-am-new-to-data-science.-which-should-i-learn-first-r-or-python" class="level2"><h2><a name="q3">I am a huge fan of sports, but I am new to data science. Which should I learn first, R or Python?</a></h2><p>The priority at the start should not be memorizing the exact keystrokes of a particular language to simply pass a final exam or advance to the next class in the sequence of a program. Rather, the focus at the start should be learning fundamental programming concepts and data structures. The scripting language provides a syntax for applying higher-order thought. Conditional statements, loops and an appreciation for how rows and columns fit together are core tenets that apply to data analysis in either R or Python. I understood columns and lookups well before I understood vectors and joins, respectively. I also learned R first, and if someone asked me to complete a data analysis project in an hour, hello <code>library(tidyverse)</code>. However, when I have needed to work in Python, I adapted quickly because I understand the fundamentals outlined above.</p><p>Furthermore, my experiences learning a second language have heavily influenced my thoughts toward learning additional programming languages. My knowledge of Portuguese or Italian is virtually non-existent, but I have a feeling my experiences studying, reading and writing Spanish would greatly benefit my ability to learn Portuguese or Italian because of their similarities as Romance languages. 
The same concept applies to learning R or Python first.</p><p>In summary, emphasize fundamentals and structure over memorizing code at the start; a solid foundation in either R or Python will improve future learning.</p></div><div id="i-am-thinking-about-data-science-in-sports-as-a-career.-what-classes-would-you-recommend" class="level2"><h2><a name="q4">I am thinking about data science in sports as a career. What classes would you recommend?</a></h2><p>In alphabetical order so as to avoid any hint of prioritization…</p><ul><li>Communication/Public-Speaking</li><li>Computer Science</li><li>Marketing</li><li>Science</li><li>Statistics</li></ul><p>To be very clear, proficiency in every subject is not a prerequisite for early-career employment opportunities. Continued growth in these areas summarizes my own professional development. I highlighted them here because all disciplines from the list above significantly contribute to the success or failure of data science teams within organizations. Technical expertise with poor communication creates confusion and potentially even doubt in the minds of business leaders who are tasked with organizational decision-making. Conversely, persuasive campaigns that lack technical substance create high expectations that fall flat. Overpromising and underdelivering is a recipe for disaster, but it occurs when data science teams fail to clearly articulate the performance of a model or the limitations of a research study.</p><p>“Why science?” Two reasons come to mind. First, I am heavily influenced by the <a href="https://davidepstein.com/the-range/" target="_blank">work of David Epstein</a>, but individuals who think critically and understand complex problems such as colliding particles bring a unique perspective to the investigation of player movements and interactions in sports. 
Secondly, I also strongly agree with Adam Grant on the value of <a href="https://twitter.com/AdamMGrant/status/1371564806352822274?s=20" target="_blank">thinking like a scientist</a>.</p><hr /><p>Thanks for your time and thoughtful review if you have reached this point of the post. This is admittedly my first post in a public forum after years of working for teams, but I don’t think it will be my last. Please share your comments and feedback, and here are a few final items that also warrant your attention.</p><ul><li>You can find the full recording of the meetup <a href="https://community.rstudio.com/t/recording-of-r-in-sports-analytics-rstudio-enterprise-community-meetup/107551" target="_blank">here</a>.</li><li>To connect with other attendees or ask follow-up questions, please join the #chat-sports_analytics Slack channel on the <a href="http://r4ds.io/join" target="_blank">R4DS Online Learning Community Slack</a>.</li><li>Follow me <a href="https://twitter.com/MitchTanney" target="_blank">on Twitter</a> where I post thoughts on sports, data science, business and the intersection of those topics. If you have any questions, feel free to mention @<a href="https://twitter.com/MitchTanney" target="_blank">MitchTanney</a>. 
Increasing my activity on Twitter is a goal that I expressed during the meetup, and I look forward to hearing from you.</li></ul></div></description></item><item><title>A CEO’s View of Open Source Data Science in the Enterprise</title><link>https://www.rstudio.com/blog/a-ceo-s-view-of-open-source-data-science-in-the-enterprise/</link><pubDate>Thu, 15 Jul 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/a-ceo-s-view-of-open-source-data-science-in-the-enterprise/</guid><description><div id="sharing-customer-stories" class="level1"><h2>Sharing customer stories</h2><p>While the use and scope of open source data science continues to grow, we still sometimes hear from RStudio users and customers that they face some opposition, or at least questions, from IT or other stakeholders when championing a code-first, open source approach.</p><p>Of course, thousands of organizations have adopted open source data science as part of their analytics platform, and often the best way to reassure skeptics is to share their stories first hand. 
To help get these stories out, we feature RStudio customers in our <a href="https://www.rstudio.com/about/customer-stories/" target="_blank">Customer Stories</a>, encourage our users to share their <a href="https://www.trustradius.com/products/rstudio/reviews" target="_blank">RStudio reviews on TrustRadius</a>, organize <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target="_blank">RStudio Enterprise Community Meetups</a> on various industries, and share blog posts (like <a href="https://blog.rstudio.com/2021/05/18/managing-covid-vaccine-distribution-with-a-little-help-from-shiny/" target="_blank">this one</a> on using Shiny applications to optimize COVID vaccine distribution in West Virginia, and <a href="https://blog.rstudio.com/2021/06/24/strategic-analytics-at-monash-university-how-rstudio-accelerated-the-transformation/" target="_blank">this one</a> on Strategic Analytics at Monash University).</p><p>Recently, I had the pleasure of sitting down with Art Steinmetz, the former Chairman, CEO and President of OppenheimerFunds. 
In this interview, Art gave his unique perspective on the value and suitability of open source, code-first data science for the enterprise.</p></div><div id="highlights-from-my-interview-with-art-steinmetz" class="level1"><h2>Highlights from my interview with Art Steinmetz</h2><p>Art earlier shared an in-depth perspective on <a href="https://blog.rstudio.com/2020/10/13/open-source-data-science-in-investment-management/" target="_blank">Open Source Data Science in Investment Management</a> as a guest contributor to this blog, so I was curious to learn more about his experience, both as a CEO encouraging his teams to use open source data science, and as an R user himself.</p><div id="how-and-why-did-you-get-started-with-open-source-data-science" class="level2"><h3>How and why did you get started with open source data science?</h3><p>Art shared that he started using R, a major language for open source data science, when he became frustrated with the limitations of Excel. As he describes it,</p><blockquote><p>“One of the things that really bugged me was my current self had no idea what my past self did, when I opened a spreadsheet from a year or two prior,”</p></blockquote><p>and he was forced to puzzle through the obscure formulas and the critical dependencies between spreadsheets. 
He started using R more and more, because he found he was “getting answers faster, and with reusable code.”</p><script src="https://fast.wistia.com/embed/medias/k7piur5xjq.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_k7piur5xjq videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/k7piur5xjq/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><div align="right">Video: How did you get started with open source data science, and why?</div><p></font></p></div><div id="is-open-source-software-appropriate-for-enterprise-level-data-science" class="level2"><h3>Is open source software appropriate for enterprise-level data science?</h3><p>From Art’s perspective, it is absolutely appropriate, because it is “a great way to boost productivity, by empowering all the interested parties in the organization”. Art related that because of the reach and availability of open source, there were many different people at his organization working on analytic problems. 
Open source “lets a thousand flowers bloom”, but critically this can be done in a managed, curated way that addresses IT’s concerns, using platforms like <a href="https://www.rstudio.com/products/team/" target="_blank">RStudio Team</a> to support the full data science production life cycle.</p><script src="https://fast.wistia.com/embed/medias/9t4lgifb2o.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_9t4lgifb2o videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/9t4lgifb2o/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><div align="right">Video: Is open source software appropriate for enterprise-level data science?</div><p></font></p></div><div id="how-do-you-build-support-for-open-source-software-within-an-organization" class="level2"><h3>How do you build support for open source software within an organization?</h3><p>Finally, I asked Art for his advice on how to build support for open source with an organization. While he says this is much easier than it used to be, as open source software has become more accepted, his primary advice was:</p><ul><li>Start small, with quick projects to demonstrate value.</li><li>Inspire others, who will want the same power and flexibility.</li><li>Don’t “go rogue” and appear to be rejecting IT standards. 
Instead, work with IT as much as possible.</li></ul><script src="https://fast.wistia.com/embed/medias/8yvpa2d6i7.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_8yvpa2d6i7 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/8yvpa2d6i7/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><div align="right">Video: How do you build support for open source software within an organization?</div><p></font></p></div></div><div id="to-learn-more" class="level1"><h2>To Learn More</h2><ul><li>Watch the full interview with Art <a href="https://www.youtube.com/watch?v=yf_bu56DGYE" target="_blank">on YouTube here</a>.</li><li>Read Art’s previous blog post on <a href="https://blog.rstudio.com/2020/10/13/open-source-data-science-in-investment-management/" target="_blank">Open Source Data Science in Investment Management</a>, where Art relates how OppenheimerFunds struggled to get full value from their data until they adopted an open source data science approach.</li><li>To hear more stories of how organizations are driving change and impact with their open source data science, read some of our <a href="https://www.rstudio.com/about/customer-stories/" target="_blank">customer stories</a>, or join one of our <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target="_blank">RStudio Enterprise Meetups</a>.</li></ul></div></description></item><item><title>Shiny, Tableau, and 
PowerBI: Better Business Intelligence</title><link>https://www.rstudio.com/blog/shiny-tableau-and-powerbi-better-business-intelligence/</link><pubDate>Mon, 12 Jul 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-tableau-and-powerbi-better-business-intelligence/</guid><description><p><i>This is a guest post from Marcin Dubel, a 2021 Shiny Contest Grand Prize winner and Software Engineer at <a href="https://appsilon.com/" target="_blank" rel="noopener noreferrer">Appsilon</a>, a Full Service RStudio Partner.</i></p><h2 id="finding-the-right-tool-for-the-job">Finding The Right Tool For The Job</h2><p>With strong competition in the Business Intelligence market, choosing the best option for your project can be challenging. There are numerous options, and even though some tools outperform in key areas, there is no clear winner-take-all. To find the best fit, you must clarify your needs and identify project objectives. Some questions you might ask yourself:</p><ul><li>Will you integrate within other web applications?</li><li>What are your connectivity needs?</li><li>What’s the level of user input?</li><li>What level of data science or machine learning might be useful?</li></ul><p>We compared the features and capabilities of Shiny, Tableau, and PowerBI for delivering insights in enterprise organizations. This is our opinion on how they stack up.</p><blockquote><p>For in-depth 1 vs 1 analyses see <a href="https://appsilon.com/tableau-vs-r-shiny/" target="_blank" rel="noopener noreferrer">Tableau vs R Shiny</a> or <a href="https://appsilon.com/powerbi-vs-r-shiny/" target="_blank" rel="noopener noreferrer">PowerBI vs R Shiny</a>.</p></blockquote><p>We encourage you to read through Lou Bajuk’s series on <a href="https://blog.rstudio.com/tags/bi-tools/" target="_blank" rel="noopener noreferrer">Data Science and Business Intelligence</a>. 
<a href="https://blog.rstudio.com/2021/03/04/bi-and-ds-part1/" target="_blank" rel="noopener noreferrer">Part 1</a> provides valuable insights into <i>why</i> and <i>how</i> BI and Data Science tools can augment each other. Other posts in the series discuss important topics and <a href="https://blog.rstudio.com/2021/03/11/bi-and-ds2-strengths-challenges/" target="_blank" rel="noopener noreferrer">different approaches to BI tools</a>.</p><h2 id="quick-overview">Quick Overview</h2><h3 id="tableau">Tableau</h3><p>Since its inception in 2003, Tableau has amassed a large community of users and can provide end-to-end services from data prep to deployment. With Tableau you can connect to almost any data source and handle massive datasets - Tableau has no row limits and is designed to scale. Its point-and-click functionality makes use a breeze; however, it also means there is no source code to really dig into and replicate results.</p><h3 id="powerbi">PowerBI</h3><p>PowerBI is a collection of cloud-based apps and software services, rather than a single app or software package. As a Microsoft product, PowerBI excels with ease-of-use for beginners/non-technical users and has full integration with the Microsoft ecosystem. However, mastery of PowerBI means learning the entire suite of Microsoft tools, and just as with Tableau, it is read-only, so there is no access to source code, and good luck trying to maintain proper version control.</p><img src="paul-quote.png" alt="Paul Ditterline discusses how RStudio helped Brown-Forman achieve their BI goals" class="center"><p>Learn more about Paul&rsquo;s experience in the <a href="https://www.rstudio.com/about/customer-stories/brown-forman/" target="_blank" rel="noopener noreferrer">Brown-Forman customer story</a>.</p><h3 id="shiny">Shiny</h3><p>Shiny is a full web framework that allows R users to create interactive web applications from their preferred language. 
Shiny is the obvious choice for those looking for complete control over UI/UX and a single integrated web solution. For wider and faster adoption of an app, user experience (UX) is vital, and for a better experience, you need better-performing tools and complete control over customization. This is where Shiny shines. The drawback here is that a code-based solution like Shiny can be more challenging to create. But fear not - the RStudio <a href="https://shiny.rstudio.com/" target="_blank" rel="noopener noreferrer">Shiny developer center</a> and <a href="https://community.rstudio.com/c/shiny/8" target="_blank" rel="noopener noreferrer">Community</a> have plenty of free learning tools and resources. <a href="https://www.rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a> provides a secure, scalable way to deploy Shiny applications in your organization and is our preferred delivery tool. If time is of the essence, consider reaching out to Appsilon. Our engineers are leading experts in Shiny and can help you quickly implement a Proof-of-Concept application.</p><blockquote><p>To see why Shiny is the preferred option for enterprise applications: <a href="https://appsilon.com/why-you-should-use-r-shiny-for-enterprise-application-development/" target="_blank" rel="noopener noreferrer">Why You Should Use Shiny for Enterprise Application Development</a></p></blockquote><h2 id="categories-and-scoring">Categories and Scoring</h2><img src="bi_chart.png" alt="Scoring criteria and where each tool ranks" class="center"><h2 id="making-the-right-choice---defining-your-needs">Making the Right Choice - Defining Your Needs</h2><h3 id="shiny-1">Shiny</h3><p>At Appsilon we prefer total control over UI customization and R’s data handling capabilities, and our engineers have the skillset to handle the complexities of a code-friendly approach. 
We have developed some of the most advanced R Shiny dashboards and produced several open-source packages to help users create their apps. So to say we are biased towards R Shiny is fair. But the reason we do what we do is that at the enterprise level, self-service options simply don’t match the level of creative freedom afforded by open-source data science tools.</p><h3 id="powerbi-1">PowerBI</h3><p>Not everyone requires a higher level of customization and control. For those in need of a quick start, PowerBI is an excellent choice. With the best connectivity options available, great looks right out of the box, and full integration with the Microsoft ecosystem, PowerBI is your best budget-friendly option.</p><h3 id="tableau-1">Tableau</h3><p>If you can foot the bill, Tableau is a great option for those looking to combine big data (similar to the power of Shiny) and PowerBI’s ready, drag-and-drop designs for the inexperienced user. Tableau beats out PowerBI for ease of use, but you’ll have to decide if the benefits justify the cost.</p><h2 id="conclusion">Conclusion</h2><p>The clear winner here is you. With so many options, it’s a buyer’s market. Be wary of the one-stop-shop sellers because even the industry leaders don’t cover all the bases. Take the time to understand your goals and project requirements. There are plenty of strong, easy-to-use self-service options for producing simple dashboards and charts. But as project complexity increases, there is no substitute for code-friendly enterprise options like Shiny.</p><h2 id="shiny-from-concept-to-production-an-rstudio-community-x-session">Shiny from Concept to Production: An RStudio Community X-Session</h2><p>Level up your Shiny developer skills on August 10th with the leading Shiny experts at Appsilon. Discover how you can improve your Shiny app&rsquo;s performance, scale it for widespread adoption, and deploy it through RStudio Connect.
This collaborative webinar by RStudio and Appsilon was designed for all levels of Shiny developers - whether a beginner moving beyond simple dashboards or a senior developer needing to improve infrastructure. Register <a href="https://www.rstudio.com/registration/shiny-from-concept-to-production/" target="_blank" rel="noopener noreferrer">here</a> and be sure to check out presentations by two RStudio Shiny Contest Grand Prize winners.</p></description></item><item><title>RStudio Professional Drivers 1.8.0</title><link>https://www.rstudio.com/blog/pro-drivers-1-8-0-release/</link><pubDate>Mon, 28 Jun 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pro-drivers-1-8-0-release/</guid><description><h2 id="announcing-full-support-for-the-snowflake-driver">Announcing full support for the Snowflake driver</h2><p>In the <a href="https://blog.rstudio.com/2021/03/10/pro-drivers-1-7-0-release/">previous release</a> we included a preview of the Snowflake driver. This new release provides full ODBC, <code>dbplyr</code>, and Python support for Snowflake, and corrects some issues that occurred on new installations.</p><p>This release makes working with the Snowflake data cloud as easy as working with other <a href="https://docs.rstudio.com/pro-drivers/">supported data sources</a>. For example, the same <a href="https://dplyr.tidyverse.org/"><code>dplyr</code></a> syntax used with data in R will also work with data in Snowflake. In the code below, data from the <a href="https://cran.r-project.org/web/packages/nycflights13/index.html"><code>nycflights13</code></a> package were pre-loaded into a Snowflake data warehouse and then queried from R using the Snowflake ODBC driver. Notice that the same <code>dplyr</code> syntax used here will also work on an R data frame. The results from the query were then collected into R and visualized with <a href="https://ggplot2.tidyverse.org/"><code>ggplot2</code></a>.
For more information on using databases with R, see <a href="https://db.rstudio.com/">db.rstudio.com</a>.</p><pre><code>library(DBI)
library(dplyr)
library(ggplot2)

con &lt;- dbConnect(odbc::odbc(), &quot;Snowflake&quot;)

tbl(con, &quot;FLIGHTS&quot;) %&gt;%
  filter(distance &gt; 75) %&gt;%
  group_by(origin, hour) %&gt;%
  summarise(delay = mean(dep_delay, na.rm = TRUE)) %&gt;%
  collect() %&gt;%
  ggplot(aes(hour, delay, color = origin)) + geom_line()</code></pre><img align="center" style="padding: 35px;" src="viz-snowflake-flights.png"><h2 id="enhancements-for-microsoft-sql-server-and-ntlm-authentication">Enhancements for Microsoft SQL Server and NTLM authentication</h2><p>This release of the drivers includes an updated version of the SQL Server driver that supports environments using exclusively NTLM v2 authentication. Please refer to the <a href="https://docs.rstudio.com/pro-drivers/documentation/#version-180">documentation</a> for the SQL Server driver in this release for additional details.</p><h2 id="fixed-issues-with-mysql-commands">Fixed issues with MySQL commands</h2><p>The previous release of the drivers introduced a regression for certain MySQL commands such as <code>USE &lt;database&gt;</code> and <code>LOAD DATA INFILE</code>. This new release restores that functionality.</p><h2 id="updating-the-rstudio-pro-drivers">Updating the RStudio Pro Drivers</h2><p><em>We strongly encourage all customers to upgrade to the 1.8.0 release of the RStudio Professional Drivers</em>. This release contains important updates that will help keep your data connections secure and easy to manage. <a href="https://docs.rstudio.com/pro-drivers/upgrade/">Upgrading drivers</a> literally takes minutes and can help prevent future security and administrative issues.
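What makes the <code>dplyr</code> example above portable is that dbplyr translates the same verbs to SQL lazily for whatever backend the connection points at. A minimal sketch of that mechanism, using an in-memory SQLite database as a stand-in (chosen here only so the snippet runs without a Snowflake warehouse; the sample rows are made up):

```r
library(DBI)
library(dplyr)

# SQLite stands in for Snowflake here; with the Snowflake driver configured
# you would call dbConnect(odbc::odbc(), "Snowflake") instead.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Toy rows shaped like the FLIGHTS table used in the post
dbWriteTable(con, "FLIGHTS", data.frame(
  origin    = c("EWR", "EWR", "JFK"),
  hour      = c(5, 6, 5),
  distance  = c(80, 120, 50),
  dep_delay = c(2, 10, 4)
))

pipeline <- tbl(con, "FLIGHTS") %>%
  filter(distance > 75) %>%
  group_by(origin, hour) %>%
  summarise(delay = mean(dep_delay, na.rm = TRUE))

show_query(pipeline)          # inspect the SQL that dbplyr generates
result <- collect(pipeline)   # the query only executes here
dbDisconnect(con)
```

Only the `dbConnect()` line is backend-specific; the rest of the pipeline is unchanged whether it runs against SQLite, Snowflake, or a plain R data frame.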
For a full list of changes in this release refer to the <a href="https://docs.rstudio.com/drivers/1.8.0/release-notes/">release notes</a>.</p><h2 id="about-the-rstudio-pro-drivers">About the RStudio Pro Drivers</h2><p>RStudio offers ODBC database drivers to all current customers using our professional products at no additional charge, so that data scientists and organizations can take full advantage of their data. The drivers are an important part of our effort to promote <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/">interoperability</a> between systems and data science languages like R and Python. The <a href="https://rstudio.com/products/drivers/">RStudio Pro Drivers</a> are commercially licensed and covered by our <a href="https://www.rstudio.com/about/support-agreement/">support program</a>.</p></description></item><item><title>Strategic Analytics at Monash University: How RStudio Accelerated the Transformation</title><link>https://www.rstudio.com/blog/strategic-analytics-at-monash-university-how-rstudio-accelerated-the-transformation/</link><pubDate>Thu, 24 Jun 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/strategic-analytics-at-monash-university-how-rstudio-accelerated-the-transformation/</guid><description><p><sup>Photo credit: <a href="https://www.monash.edu/" target="_blank">Monash University</a></sup></p><div class="lt-gray-box">*This is a guest post from Dr. 
Behrooz Hassani-Mahmooei, Director of the Strategic Intelligence and Insights Unit, and his team at <a href="https://www.monash.edu/" target="_blank">Monash University</a>, Australia*</div><p>Similar to any other large and complex organisation, at Monash University we employ a wide range of data to inform decision making and monitor different aspects of our operations, such as student outcomes, academic performance, research outputs, human resources and financial records.</p><p>Following significant focus on people and system resources, as well as progress in data management and reporting, the university shifted its focus to build capacity to use these rich data sources to inform and support strategic decision making.</p><p>Specifically, we aimed to integrate the collection, routine reporting and analysis of the existing large internal datasets into the development of data-driven insights which inform strategic thinking and decision making at the senior executive level.</p><p><strong>Monash University&rsquo;s Approach:</strong></p><ul><li><a href="#Journey"><strong>The Journey to Strategic Analytics</strong></a><ul><li><a href="#Link">Linking internal and external data</a></li><li><a href="#Flexible">Creating flexible analytics relevant to strategic decisions</a></li><li><a href="#Accessible">Making the results accessible and understandable</a></li><li><a href="#Relevant">Making our analysis findings relevant to the world around us</a></li></ul></li><li><a href="#Workstreams"><strong>Three Key Workstreams</strong></a><ul><li><a href="#Establish">Establish a Platform for Data Linkage</a></li><li><a href="#Deliver">Deliver Analytics that are Flexible, Proactive and Integrated</a><ul><li><a href="#handling">Flexible data handling</a></li><li><a href="#proactive">Making analytics proactive, not reactive</a></li><li><a href="#collaborative">Fully integrated and collaborative analytic teams</a></li></ul></li><li><a href="#Translate">Translate Analytic Research through
Visualisations</a></li></ul></li><li><a href="#Strategic"><strong>Integrating analytics into the broader strategic conversation</strong></a></li></ul><h2 id="a-namejourneyathe-journey-to-strategic-analytics"><a name="Journey"></a>The Journey to Strategic Analytics</h2><p>In 2017, Monash University established the Strategic Intelligence and Insights Unit (InSight), composed of members with multidisciplinary and diverse backgrounds and a broad set of research expertise, analytic skills and solid experience in higher education. InSight was tasked to employ advanced and multidisciplinary approaches to deliver innovative data solutions, intelligence, and insights to support the University’s strategy and help the University secure its positioning globally.</p><p>We identified four main opportunities in our path to transition from day-to-day operational reporting toward strategic analytics:</p><ul><li><a name="Link"></a><strong>Linking internal and external data:</strong> We needed to bring together sources of internal data and establish reproducible processes to link internal and external data in a consistent, timely and reliable manner. We recognised early that there was no perfect single source of truth that combined all of the data sources across the university. Instead, we needed to create a flexible platform to run major data linkages with low cost and high accuracy.</li><li><a name="Flexible"></a><strong>Creating flexible analytics relevant to strategic decisions:</strong> It was key to use advanced techniques and innovative solutions to establish the most relevant measures and indicators that could be linked to the high-level strategic questions and decisions across the institution.
Ideally, this approach should minimise any manual interventions and instead deliver reproducible and reliable analyses.</li><li><a name="Accessible"></a><strong>Making the results accessible and understandable:</strong> Another opportunity was presenting the data and insights in a concise, informative and accessible way that could be easily adopted and utilised to inform high-level strategic conversations. We needed to establish our role as analytics translators to link data to domain knowledge without delivering unnecessary business context or complicated analysis.</li><li><a name="Relevant"></a><strong>Making our analysis findings relevant to the world around us:</strong> Our internal data is limited to what we collect and the scope of our functions and services. To deliver outputs that can help senior management make decisions, we need to reach beyond our own organisation and interrogate our local data within the broader evidence.</li></ul><h2 id="a-nameworkstreamsathree-key-workstreams"><a name="Workstreams"></a>Three Key Workstreams</h2><ol><li><strong>Establish a Platform for Data Linkage</strong></li><li><strong>Deliver Analytics that are Flexible, Proactive and Integrated</strong></li><li><strong>Translate Analytic Research through Visualisations</strong></li></ol><h3 id="a-nameestablishaestablish-a-platform-for-data-linkage"><a name="Establish"></a>Establish a Platform for Data Linkage</h3><p>One of the key steps in delivering strategic analytics is harnessing the richness of multiple complementary datasets through data linkage and making it accessible across the university. We established two programs of work in this area.</p><p>Firstly, we designed and delivered a framework for linking our internal data scattered across different environments and organisational systems using best-practice methods in R.</p><p>For example, in 2019, we linked a significant number of our internal data sources focused on student outcomes and experience.
This involved linking separate datasets from the time a student commences at Monash University (admissions), all their activities during their studies (grades, extracurricular activities, student experience surveys, unit evaluations) through to life after graduation (graduate outcomes).</p><p>This platform enabled us to run a wide range of analyses to understand how different programs and interventions impact the academic and non-academic outcomes of students as well as their careers post-graduation. Secondly, it enabled us to link some of our internal data to available datasets outside our environment.</p><p>For example, one of our projects involved cleaning, structuring and recalculating a large number of performance indicators (learning and teaching, research, finance, human resources) from the top 200 universities. We used this data to analyse the key drivers of research expenditure as well as emerging patterns in research outputs and partnerships across the world.</p><p>Key features of R, such as its rich capabilities in text analysis and data reshaping, helped us link these sources of information, despite the lack of proper linkage keys and inconsistency across datasets.</p><h3 id="deliver-analytics-that-are-flexible-proactive-and-integrated">Deliver Analytics that are Flexible, Proactive and Integrated</h3><p>One of the key challenges that many organisations face today is how to link the outcomes of their data science and analytics teams to decisions and interventions that lead to tangible results for their business.</p><p>In the past few years, many Australian public and private organisations have established and then disbanded data science teams due to a range of barriers and failures such as lack of rich structured data to inform advanced analysis, unclear connection between the analysis and the business targets and outcomes, and lack of role clarity between business intelligence (BI), data analytics and data science teams.
Another challenge that many public organisations such as universities face is the transition from data reporting frameworks that are mostly driven by compliance (such as reporting student numbers to the government) toward designing data-driven strategies.</p><p>Strategic analytics, in our experience, is driven by three key changes to how things are done traditionally:</p><ol><li><p><a name="handling"></a><strong>Flexible data handling:</strong> Administrative data frequently arrives in different formats and structures, disrupting the data preparation process. The traditional response requires manual interventions to address these issues.</p><p>Our approach is designed to deal with the uncertainty in the data automatically. If there are changes in the data format, the scripts can fix them with minimum input from the analysts, minimizing the time needed for manual preparation and analysis of the data.</p><p>For example, if we are analysing a time series dataset that has a structural break that can occur unexpectedly and needs to be identified and corrected for, we embed processes in the code that automatically identify where the break happens and correct for it. This means that in most cases we only need to deal with a data wrangling, transformation or quality issue once.</p></li><li><p><a name="proactive"></a><strong>Making analytics proactive, not reactive:</strong> Historically, data analysis has been a responsive activity. For example, a strategic conversation at an executive meeting or a brainstorming session concludes with a data analysis request based on the direction of the meeting. The results from the analysis are then used to inform future conversations to validate or clarify an issue.</p><p>However, using the platforms that we established, we successfully prepared the data to be part of that first conversation. This allowed the analysis to contribute to the direction of these conversations as early as possible rather than being done in response to them. 
Data analytics experts were made available, or even invited to participate, in the strategic discussions, highlighting the reliance on data-informed strategic thinking. The data could be interrogated throughout the strategic discussion to inform strategic thinking and progress the conversation beyond the initial context queries to explore the deeper underlying strategic issues.</p></li><li><p><a name="collaborative"></a><strong>Fully integrated and collaborative analytic teams:</strong> Traditionally, different stages of analytics (for example descriptive, diagnostic, predictive and prescriptive) are done by different people using different tools and as separate steps.</p><p>We built capacity within the university by upskilling people across the organisation in using core methodologies and systems to consider analytics from an integrated perspective whereby each step informed the next. Critically, the prescriptive step, which aims to inform an intervention, was not conducted by a separate data science team in isolation, but instead by analysts with business knowledge who had conducted the descriptive, diagnostic and predictive phases. This added an important contextual element that enhanced the strategic relevance of the analysis.</p></li></ol><h3 id="a-nametranslateatranslate-analytic-research-through-visualisations"><a name="Translate"></a>Translate Analytic Research through Visualisations</h3><p>Members of the InSight Unit have a strong background in research and about half of the team have completed a PhD in areas such as econometrics, applied economics, business information systems, and computational modelling. As a result of this expertise, the team was able to apply many core concepts of designing, planning, and undertaking a research project into the process of delivering analytics services.
This helped the team deliver uniquely valuable insights and outputs.</p><p>This is most evident in the way that the team utilises data visualisation to communicate the findings of its analysis. Using the rich features of R and relying on the Connect Server, we were able to share a wide range of outputs from the team. These ranged from a simple descriptive statistic on student outcomes to the results of a complex regression analysis on co-authorship patterns. These results were visualised, communicated and translated so that they could be used by the senior management team to design interventions, policies and strategies across a wide range of areas.</p><p>As part of this communication, we considered using existing reporting tools. We found that BI tools are very useful when you want to start from data and generate information. However, when you have a specific decision that you are expected to inform on, especially a strategic decision, you need tools that enable you to start from that decision and reverse engineer back to the data. That is where R helped give us a competitive advantage, providing maximum flexibility and reproducibility as well as clarity for communication and translation.</p><h2 id="a-namestrategicaintegrating-analytics-into-the-broader-strategic-conversation"><a name="Strategic"></a>Integrating Analytics into the Broader Strategic Conversation</h2><p>The InSight team set a goal to deliver informed recommendations of effective business models to improve the University’s overall positioning and competitiveness. To achieve this, the unit was equipped with strong capabilities in areas such as evidence reviews, environmental and horizon scanning, and state analysis. This helped to deliver outputs that extended beyond the scope of internally collected data to incorporate current best practice approaches and build upon the evidence from other institutions and sectors.
In our experience, the addition of contextual intelligence informed and extended our analytics beyond the operational reporting, facilitated the delivery of actionable recommendations and supported rapid implementation and translation by the senior management. Moreover, the process of strategic analytics is not a linear process where a question is asked and an answer is provided. Rather, it starts from a high-level conversation between executives and data scientists about a strategy and a set of potential scenarios.</p><p>The first step is primarily focused on analysing data through an iterative engagement process which allows the senior executive team to clarify their thinking and the strategic issue they are addressing, refine those scenarios and arrive at some early structured questions.</p><p>Therefore, it is extremely important that the analytical techniques and team are agile, efficient and adaptable to the constantly evolving questions. A code-first approach delivered on RStudio’s professional products is a critical part of this sort of adaptable and agile approach.</p><p>Strategic analytics teams need to be collaborative with decision-makers to arrive at an evidence-based strategic question that can be translated into a targeted in-depth data analysis project. Throughout this process, flexibility and reproducibility are critical components in delivering a methodologically robust, transparent and ultimately trusted strategic function. The technically advanced yet intuitive platform that supported widespread uptake and adoption across the University was instrumental in achieving our transition from operational to strategic analytics.</p><hr><h2 id="about-monash-university">About Monash University</h2><p>From a single campus at Clayton with fewer than 400 students, Monash has grown into a network of campuses, education centres and partnerships spanning the globe.
With approximately 60,000 students (and 350,000 alumni) from over 170 countries, we are today Australia&rsquo;s largest university.</p><p>The University now offers a broad selection of courses within 10 faculties: Art, Design and Architecture; Arts; Business and Economics; Education; Engineering; Information Technology; Law; Medicine, Nursing and Health Sciences; Pharmacy and Pharmaceutical Sciences; and Science.</p></description></item><item><title>Winners of the 3rd annual Shiny Contest</title><link>https://www.rstudio.com/blog/winners-of-the-3rd-annual-shiny-contest/</link><pubDate>Thu, 24 Jun 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/winners-of-the-3rd-annual-shiny-contest/</guid><description><p>Once again the Shiny community has wowed us with their contributions to the 3rd annual Shiny Contest that we <a href = "https://blog.rstudio.com/2021/03/11/time-to-shiny/">announced back in March 2021</a>.</p><p>We had 179 submissions from 164 unique Shiny developers to the contest this year over the contest&rsquo;s two-month submission period, with over 40% of participants indicating that they have less than one year of experience with Shiny!</p><div id="evaluation-and-judging" class="level1"><h2>Evaluation and judging</h2><p>Apps were evaluated based on technical merit and artistic achievement. Some apps excelled in one of these categories, some in the other, and some in both. Evaluation also took into account the narrative on the contest submission post on RStudio Community.</p><p>Before we go on to announce the winners, we would like to thank our esteemed judges who helped us evaluate the submissions. A huge thanks to <a href='https://community.rstudio.com/u/csoneson' target = '_blank'>Charlotte Soneson</a>, <a href='https://community.rstudio.com/u/colin' target = '_blank'>Colin Fay</a>, <a href='https://community.rstudio.com/u/EconomiCurtis' target = '_blank'>Curtis Kephart</a>, <a href='https://community.rstudio.com/u/dgranjon' target =
'_blank'>David Granjon</a>, <a href='https://community.rstudio.com/u/committedtotape' target = '_blank'>David Smale</a>, <a href='https://community.rstudio.com/u/rpodcast' target = '_blank'>Eric Nantz</a>, <a href='https://community.rstudio.com/u/federicomarini' target = '_blank'>Federico Marini</a>, <a href='https://community.rstudio.com/u/grrrck' target = '_blank'>Garrick Aden-Buie</a>, <a href='https://community.rstudio.com/u/rstudiojoe' target = '_blank'>Joe Rickert</a>, <a href='https://community.rstudio.com/u/johncoene' target = '_blank'>John Coene</a>, <a href='https://community.rstudio.com/u/kevinrue' target = '_blank'>Kevin Rue</a>, <a href='https://community.rstudio.com/u/kneijenhuijs' target = '_blank'>Koen Neijenhuijs</a>, <a href='https://community.rstudio.com/u/mayagans' target = '_blank'>Maya Gans</a>, <a href='https://community.rstudio.com/u/nsgrantham' target = '_blank'>Neal Grantham</a>, <a href='https://community.rstudio.com/u/nicohahn' target = '_blank'>Nico Hahn</a>, <a href='https://community.rstudio.com/u/pedrocoutinhosilva' target = '_blank'>Pedro Silva</a>, <a href='https://community.rstudio.com/u/rajk' target = '_blank'>Raj Kumar</a>, <a href='https://community.rstudio.com/u/parmsam' target = '_blank'>Sam Parmar</a>, <a href='https://community.rstudio.com/u/samanthatoet' target = '_blank'>Sam Toet</a>, and <a href='https://community.rstudio.com/u/winston' target = '_blank'>Winston Chang</a> for their help in evaluating the submissions and their thoughtful comments.</p><p>A few of these judges have submitted apps to the contest as well, which have been omitted from the evaluation but we’d love to take a moment to highlight them here!</p><ul><li><a href = "https://nicohahn.shinyapps.io/spotify_habits/" target = "_blank">Spotify Habits</a> by Nico Hahn: A small app showing what music Nico listens to on Spotify in general and what they have been listening to more recently. The app also presents analyses of the artists and songs.
Read more about the app <a href = 'https://community.rstudio.com/t/spotify-habits-shiny-contest-submission/102934' target = '_blank'>here</a>.</li><li><a href = "https://parmsam.shinyapps.io/MixThingsUp/" target = "_blank">Mix Things Up</a> by Sam Parmar: An app to quickly generate random workout plans from a list of exercises. Features the use of the <a href = "https://gt.rstudio.com/" target = "_blank">gt</a> and <a href = "https://rstudio.github.io/bslib/" target = "_blank">bslib</a> packages as well as publishing Google Sheets for import into Shiny. You can use the app or source code to generate your random workout plan (based on a preset list of exercises). Get after it! Read more about the app <a href = 'https://community.rstudio.com/t/mix-things-up-a-random-workout-plan-generator-shiny-contest-submission/103530' target = '_blank'>here</a>.</li><li><a href = "https://rpodcast.shinyapps.io/hotshot_dashboard" target = "_blank">The Hotshots Racing Dashboard!</a> by Eric Nantz: In Eric’s words, “a completely over-the-top” dashboard summarizing the Wimpy’s World Hotshot Racing 2021 spring virtual racing league. The dashboard contains interactive tables powered by <a href = "https://glin.github.io/reactable/" target = "_blank">reactable</a> and interactive visualizations from <a href = "https://echarts4r.john-coene.com/" target = "_blank">echarts4r</a> and <a href = "https://plotly.com/r/getting-started/" target = "_blank">plotly</a>! Read more about the app <a href = 'https://community.rstudio.com/t/the-hotshots-racing-dashboard-shiny-contest-submission/104925' target = '_blank'>here</a>.</li><li><a href = "https://rpodcast.shinyapps.io/hotshot_random" target = "_blank">Hotshots Racing Random Driver &amp; Car App!</a> also by Eric Nantz: A random car and driver selector used in the Wimpy’s World Hotshot Racing 2021 spring virtual racing league. Simply click the Launch button to kick off a Vegas-style carousel!
Read more about the app <a href = 'https://community.rstudio.com/t/hotshots-racing-random-driver-car-app-shiny-contest-submission/104927' target = '_blank'>here</a>.</li></ul><p>All winners of the Shiny Contest 2021 will get one year of shinyapps.io Basic Plan, a bunch of hex stickers of RStudio packages, and a spot on the Shiny User Showcase. Runners up will additionally get any number of RStudio t-shirts, books, and mugs (worth up to $200) where mailing is possible. And, finally, grand prize winners will additionally receive special and persistent recognition by RStudio in the form of a winners page and a badge that will be publicly visible on their RStudio Community profile, as well as a half-hour one-on-one with a representative from the RStudio Shiny team for Q&amp;A and feedback!</p><p>Without further ado, here are the winners! We have categorized runners up and grand prize winners based on whether they have less or more than one year of experience with Shiny. Note that winners are listed in no specific order within each category.</p></div><div id="grand-prizes" class="level1"><h2>Grand prizes</h2><div id="less-than-one-year-experience" class="level2"><h3>Less than one year experience</h3><div id="wedding-a-shiny-app-to-help-future-grooms" class="level3"><h4>🏆 <a href = "https://connect.thinkr.fr/wedding" target = "_blank">{wedding}: a Shiny app to help future grooms</a></h4><p>by <a href = 'https://community.rstudio.com/u/MargotBrd' target = '_blank'>Margot Brard</a></p><p><img src="images/wedding.gif" align="center" width="80%" alt="GIF of the wedding app that browses the wedding app and shows various screens of the app and the functionality like adding guests."></p><p>Guests can get logistical information about the wedding, confirm their attendance, and indicate their menu choice. The future bride and groom have access to the wedding dashboard (visualization of expenses, number of confirmations, seating charts, …). Read more about the app <a href =
'https://community.rstudio.com/t/wedding-a-shiny-app-to-help-future-grooms-shiny-contest-submission/104657' target = '_blank'>here</a>.</p><p>The judges loved this novel approach to a very real problem (terrible wedding websites!) and especially enjoyed the table setting plans in the bride-and-groom-only area.</p></div><div id="math-eagle-game" class="level3"><h4>🏆 <a href = "https://sharukkhanstat777.shinyapps.io/MathEagleGame/" target = "_blank">Math Eagle Game</a></h4><p>by <a href = 'https://community.rstudio.com/u/Sharukkhan' target = '_blank'>Sharukkhan</a></p><p><img src="images/math-eagle.png" align="center" width="80%" alt="Screenshot of the Math Eagle Game. Sidebar shows the answer status and the main panel shows the game where the eagle flies across the screen and the user needs to guess the right answer to the math question for the eagle to keep flying."></p><p>When the user clicks on START, the page directs to a mathematical game. The objective of the game is to make the math eagle reach outer space. For each correct or incorrect answer, the user is notified and the speed changes accordingly. After 2 minutes, the page directs to the Scoreboard, where the player can view the scores as well as highlights. Read more about the app <a href = 'https://community.rstudio.com/t/math-eagle-game-shiny-contest-submission/104836' target = '_blank'>here</a>.</p><p>The judges thought this was a very fun take on a math game for kids. They loved the technical implementation of this application, with proper use of functions to reduce the codebase and careful use of reactivity, and remarked that this was particularly impressive for someone relatively new to Shiny!</p></div><div id="shiny-app-for-climate-change-informed-tree-species-selection" class="level3"><h4>🏆 <a href = "https://neoxone.shinyapps.io/BCGOV_CCISS" target = "_blank">Shiny app for climate change informed tree species selection</a></h4><p>by <a href = 'https://community.rstudio.com/u/meztez' target =
'_blank'>Bruno Tremblay</a></p><p><img src="images/climate-change.png" align="center" width="80%" alt="Screenshot of the Shiny app for climate change informed tree species selection. Sidebar shows the data and main panel shows a map with information on a point on the map."></p><p>Use the map or table to enter points of interest. Information for each point is retrieved from a PostGIS backend. Click Generate to produce detailed information on tree species feasibility for each site, including modelled future predictions. Read more about the app <a href = 'https://community.rstudio.com/t/shiny-app-for-climate-change-informed-tree-species-selection-shiny-contest-submission/99916' target = '_blank'>here</a>.</p><p>The judges thought the app showed high technical rigor, and especially highlighted that it’s structured as a package. Not only is the reactive functionality in the app smooth, it’s also robust to errors, failing gently rather than crashing!</p></div></div><div id="more-than-one-year-experience" class="level2"><h3>More than one year experience</h3><div id="racetrack-2-electric-boogaloo" class="level3"><h4>🏆 <a href = "https://jpd527.shinyapps.io/racetrack/" target = "_blank">Racetrack 2: Electric Boogaloo</a></h4><p>by <a href = 'https://community.rstudio.com/u/jdeweese' target = '_blank'>Jackson DeWeese</a> and <a href = 'https://community.rstudio.com/u/dfrye' target = '_blank'>Darren Frye</a></p><p><img src="images/racetrack2.png" align="center" width="80%" alt="Screenshot of the Racetrack 2 app, showing the game controls on the sidebar and the race track in the main panel."></p><p>A digital version of the paper &amp; pencil game Racetrack taken from its start on graph paper in the ’60s to an online click-to-drive game, Racetrack 2: Electric Boogaloo uses Shiny’s reactivity with global variables to create a multiplayer experience. 
Read more about the app <a href = 'https://community.rstudio.com/t/racetrack-2-electric-boogaloo-shiny-contest-submission/104522' target = '_blank'>here</a>.</p><p>The judges loved the fun factor of this app, the multi-player logic, and the slick authentication!</p></div><div id="commute-explorer" class="level3"><h4>🏆 <a href = "https://nz-stefan.shinyapps.io/commute-explorer-2/" target = "_blank">Commute Explorer</a></h4><p>by <a href = 'https://community.rstudio.com/u/nz-stefan' target = '_blank'>Stefan Schliebs</a></p><p><img src="images/commute-explorer.gif" align="center" width="80%" alt="GIF showing the functionality of the Commute Explorer app."></p><p>This project explores the commuting behavior of New Zealanders based on the Stats NZ Census 2018 data set. The app uses a custom HTML template to present commuting figures on a map, mode of travel visualizations, work and education related commuting and various filtering options. Read more about the app <a href = 'https://community.rstudio.com/t/commute-explorer-shiny-contest-submission/104651' target = '_blank'>here</a>.</p><p>The judges remarked that “this app is GORGEOUS”! Via the use of an HTML template, the app looks nothing like a standard Shiny app. Plug and play: it’s very simple to understand what it does!</p></div><div id="shark-attack" class="level3"><h4>🏆 <a href = "https://mdubel.shinyapps.io/shark-attack/" target = "_blank">Shark Attack</a></h4><p>by <a href = 'https://community.rstudio.com/u/mdubel' target = '_blank'>Marcin Dubel</a></p><p><img src="images/shark-attack.gif" align="center" width="80%" alt="GIF of the Shark Attack app showing the diver swimming through the screen to collect trash."></p><p>The ocean is the origin and the engine of all life on this planet — and it is under threat. The goal of the game is to spread awareness of environmental issues in an enjoyable way. The user will try to collect as much trash as possible while avoiding sharks. 
Read more about the app <a href = 'https://community.rstudio.com/t/shark-attack-shiny-contest-submission/104695' target = '_blank'>here</a>.</p><p>The judges thought this was a really neat implementation of a motivating and fun teaching tool. (Side note: My four-year-old LOVES this app! Unfortunately, now he thinks all I do on my computer when I’m working is playing the shark game…)</p></div><div id="dinnr" class="level3"><h4>🏆 <a href = "https://jpd527.shinyapps.io/racetrack/" target = "_blank">dinnR</a></h4><p>by <a href = 'https://community.rstudio.com/u/koderkow' target = '_blank'>Kyle Harris</a> and <a href = 'https://community.rstudio.com/u/actualtoilet' target = '_blank'>Alexis Meskowski</a></p><p><img src="images/dinnr.png" align="center" width="80%" alt="Screenshot of the dinnR app showing meal and grocery store shopping list planning for a week of dinners."></p><p>The dinnR app is a weekly meal planning app for dinner. Simply pick your meals from a community-driven database of recipes, and dinnR will generate a list of items needed for the week. Remove ingredients you have at home, and the remaining ingredients can be used as a shopping list. Links are provided to each recipe along with credit to the user who submitted them. The options tab allows you to change the planning dates, set a measurement system, and filter by dietary restrictions. The “Submit a Recipe” tab lets the user submit recipes they enjoy to the app, where we credit them. Read more about the app <a href = 'https://community.rstudio.com/t/dinnr-shiny-contest-submission/104773' target = '_blank'>here</a>.</p><p>The judges thought the idea of the app is really fun and were impressed by the participation from the Twitch audience filling in the data – the developers stream on <a href = "https://www.twitch.tv/theeatgamelove" target = "_blank">twitch.tv</a>, and among other things, they stream coding sessions on Saturday mornings! 
The judges also commented that the code is super clean and very impressive (with the use of Shiny modules and organization as an R package).</p></div><div id="a-minimalist-markdown-organiser" class="level3"><h4>🏆 <a href = "https://jksserver.shinyapps.io/shiny_markdown_organiser/" target = "_blank">A minimalist Markdown organiser</a></h4><p>by <a href = 'https://community.rstudio.com/u/jacksonkwok' target = '_blank'>Chun Fung Kwok</a></p><p><img src="images/markdown-organiser.png" align="center" width="80%" alt="Screenshot of the Markdown organiser app, showing the markdown code on the sidebar and the rendered output in the main panel."></p><p>Type Markdown in the textbox on the left and see the rendered organiser on the right. “Click” an item/card on the board to edit the text, and “Drag” to move the items/cards around to the desired location. Updates are bidirectional. Read more about the app <a href = 'https://community.rstudio.com/t/a-minimalist-markdown-organiser-shiny-contest-submission/104025' target = '_blank'>here</a>.</p><p>The judges loved the minimalistic design of this app and how the developers integrated lots of cool JS!</p></div></div></div><div id="runners-up" class="level1"><h2>Runners up</h2><ul><li><p><a href = "https://jiddualexander.shinyapps.io/svg_input/" target = "_blank">SVG Input</a> by <a href = 'https://community.rstudio.com/u/JidduAlexander' target = '_blank'>Jiddu Alexander</a>. The judges really loved the functionality and the layout of this app as well as the well-organized code. A very neat tool for all Shiny developers!</p></li><li><p><a href = "https://rosemarysu.shinyapps.io/bolivia_unpaid_labor/" target = "_blank">Living a life of labor in Bolivia</a> by <a href = 'https://community.rstudio.com/u/rosemarysu'>Rui Su</a> and Carla Cristina Solis Uehara. The judges highlighted the ease of use of this app and the inspiring topic.</p></li><li><p><a href = "https://tgirke.shinyapps.io/systemPipeShiny/" target = "_blank">systemPipeShiny</a> by <a 
href = 'https://community.rstudio.com/u/lz100' target = '_blank'>Le Zhang</a>. The judges thought this app was brilliantly put together and commented, “I can’t wrap my head around how much functionality this app has implemented!” Read more about the app <a href = 'https://community.rstudio.com/t/systempipeshiny-shiny-contest-submission/103392' target = '_blank'>here</a>.</p></li><li><p><a href = "https://niels-van-der-velden.shinyapps.io/shinyNGLVieweR/" target = "_blank">Three dimensional (3D) interactive visualization of protein structures</a> by <a href = 'https://community.rstudio.com/u/noveld' target = '_blank'>Niels van der Velden</a>. The judges thought the app was very well explained in the contest submission and really polished. They also commented on the neat organization of the code and the use of Shiny modules.</p></li><li><p><a href = "https://2exp3.shinyapps.io/mapa-ciclista/" target = "_blank">Bikemapp</a> by <a href = 'https://community.rstudio.com/u/agus' target = '_blank'>Agustin Perez Santangelo</a>. They were impressed by the snazzy UI of this app and how the author collated data from different sources to give cyclists a map with all the essential information they need while riding. They commented that the concept behind the app is unique, the UI is pleasant and enhances the overall experience, and the code follows best practices.</p></li></ul></div><div id="honorable-mentions" class="level1"><h2>Honorable mentions</h2><ul><li><a href = 'https://community.rstudio.com/t/greent-shiny-contest-submission/104204' target = '_blank'>greenT</a> by Kaija Gahm.</li><li><a href = 'https://community.rstudio.com/t/wildlift-an-open-source-tool-to-guide-decisions-for-wildlife-conservation-shiny-contest-submission/102498' target = '_blank'>WildLift: An open-source tool to guide decisions for wildlife conservation</a> by Sólymos, P., Nagy-Reis, M., Dickie, M., Gilbert, S. 
and Serrouya, R.</li><li><a href = 'https://community.rstudio.com/t/chess-com-dashboard-shiny-contest-submission/98735' target = '_blank'>Chess(.com) Dashboard</a> by Claudio Paladini.</li><li><a href = 'https://community.rstudio.com/t/fairsplit-shiny-contest-submission/104752' target = '_blank'>FairSplit</a> by Douglas R. Mesquita Azevedo and Luís Gustavo Silva e Silva.</li><li><a href = 'https://community.rstudio.com/t/geomapx-shiny-contest-submission/104901' target = '_blank'>geoMapX</a> by Camill.</li><li><a href = 'https://community.rstudio.com/t/healthdown-shiny-contest-submission/104784' target = '_blank'>healthdown</a> by Peter Gandenberger and Andreas Hofheinz.</li><li><a href = 'https://community.rstudio.com/t/hong-kong-district-councillors-shiny-contest-submission/104810' target = '_blank'>Hong Kong District Councillors</a> by Avision Ho, Martin Chan, and Gabriel Tam.</li><li><a href = 'https://community.rstudio.com/t/science-pulse-shiny-contest-submission/104880' target = '_blank'>Science Pulse</a> by Sérgio Spagnuolo, Lucas Gelape, Rodolfo Almeida, Renata Hirota, Jade Drummond, and Felippe Mercurio.</li><li><a href = 'https://community.rstudio.com/t/cbm-at-home-shiny-contest-submission/102596' target = '_blank'>CBM at Home</a> by Lillian Durán, Norma Medina, Yaacov Petscher, Marissa Suhr, and A.J. 
Torgesen.</li><li><a href = 'https://community.rstudio.com/t/steam-explorer-shiny-contest-submission/104606' target = '_blank'>Steam Explorer</a> by Raphael Guyot.</li><li><a href = 'https://community.rstudio.com/t/piano-journal-past-present-and-future-3-5-years-shiny-contest-submission/104834' target = '_blank'>Piano Journal: Past, Present and Future - 3.5 years</a> by Peter Hontaru.</li></ul></div><div id="all-submissions-to-shiny-contest-2021" class="level1"><h2>All submissions to Shiny Contest 2021</h2><p>Feel free to peruse the full list of all submissions to the contest on <a href = "https://community.rstudio.com/tag/shiny-contest-2021" target = "_blank">RStudio Community</a>. Note that data and code used in the apps are all publicly available and/or openly licensed. We hope that they will serve as inspiration for your next Shiny app!</p></div></description></item><item><title>R in Supply Chain Management: Meetup Q&amp;A</title><link>https://www.rstudio.com/blog/r-in-supply-chain-management-meetup-q-a/</link><pubDate>Thu, 17 Jun 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-in-supply-chain-management-meetup-q-a/</guid><description><p>Supply chain management presents several unique challenges, with supply chain resilience particularly relevant due to disruptions from the pandemic. With thousands of SKUs and multi-tiered distribution networks, teams are using R and Python to improve forecasts, speed up operational planning, simulate variability, and design efficient supply chains.</p><p>The RStudio Enterprise Community group recently hosted an R in Supply Chain Management <a href="https://community.rstudio.com/t/recording-of-r-in-supply-chain-management-rstudio-enterprise-virtual-meetup/104459" target="_blank">meetup</a> featuring Nicolas Nguyen, Digital Supply Chain and Global S&amp;OP Leader at Carl Zeiss Meditec. Nicolas shared the work that he is doing with Shiny to balance demand and supply to support sales. 
Using Shiny, he has designed powerful, scalable, and reproducible applications for the business.</p><p><a style="display: block; text-align:center;" href="https://www.youtube.com/watch?v=rzs6aSr4XoU" target="_blank"><img src="https://videoapi-muybridge.vimeocdn.com/animated-thumbnails/image/a3948c82-214b-4b6a-b348-fcdad00cf415.gif?ClientID=vimeo-core-prod&Date=1623431854&Signature=b83235c85c3666fcc6ff134adb0f8df8c6e9db2b" alt="R in Supply Chain Management with Nicolas Nguyen | Carl Zeiss Meditec" style=" max-height:100%; max-width:100%;"/></a></p><div align="right">Full meetup recording <a href="https://www.youtube.com/watch?v=rzs6aSr4XoU" target="_blank">here</a></div><p>It was awesome to see the beautiful applications that Nicolas built, as well as the reactions and support among the supply chain community that came together for the event. His Shiny applications served as inspiration for many of us who attended.</p><p>While we had time for a handful of questions during the session, I’ve included the full Q&amp;A below, which also includes Nicolas’ response to any questions that went unanswered.</p><p>We have paraphrased and distilled portions of the responses for brevity and narrative quality.</p><h2 id="meetup-qa">Meetup Q&amp;A:</h2><p><strong>How long did this take to create and what are the libraries used?</strong></p><p><strong>Nicolas:</strong> I created the R script to calculate the projected inventories and replenishment plan in 2017 over a few days (with trial &amp; error). Since then, I have been reusing the same code with a few improvements. 
The libraries used are dplyr, DT, tidyverse, and lubridate.</p><p>Depending on the complexity, the Shiny applications took anywhere from 4 hours to a couple of days.</p><p><strong>Can you provide additional background about the algorithms for the prescriptive analytics, how you build supply chain scenarios, and how you find recommendations?</strong></p><p><strong>Nicolas:</strong> Yes, I will also create some examples of prescriptive analytics to share with the group within a few months. The idea is to create a “virtual supply planning assistant” that is going to analyze a situation (based on our code &amp; parameters we defined) to propose potential solutions.</p><p>A few examples of supply chain scenarios are:</p><ul><li>If sales grow 20% above the current level in the next 6-12 months, will we have enough production capacities to answer this demand?</li><li>Should we anticipate production and build stock to be prepared?</li><li>Could we play on some safety stock levels in some markets, or shorter delivery lead time (shipping by air instead of sea), to reduce the production demand or accelerate our supplies?</li><li>Could we play on the product mix and use the machines on a certain type of “similar” product to increase the overall output?</li><li>In case of shortages, can we supply earlier to the market by doing a partial air shipment? Can we ship from another market?</li></ul><p><strong>Do you have any suggestions to start developing similar tools? Any material or examples related to Supply &amp; Demand?</strong></p><p><strong>Nicolas:</strong> Perhaps the easiest way to get started is to look at the existing tools (Excel, APS, SAP, etc.) used by the supply chain team in your company.</p><p>Looking at the way they are used for data transformation, analysis, and charts - what are the problems faced (if any)? What is the degree of automation? 
Then, do the same thing using R &amp; Shiny, and see if it addresses some of those challenges.</p><p>I will create a GitHub repository in July/August to share a few examples:</p><ul><li>Basic Projected Inventories &amp; Coverages</li><li>Single level DRP @ Monthly and Weekly Buckets</li><li>Multi-Level DRP</li><li>BOM (Bill Of Material) calculation</li><li>Calculation of Dependent Demand (ex.: Kits / Promo Packs)</li><li>Discontinuation of Products and NPL (New Product Launch) to replace them</li><li>FeFo (First Expired First Out) calculation</li><li>S&amp;OP (Sales &amp; Operations Planning) process app: illustrating Production Capacities</li></ul><p><strong>Do you have an example of using R for the right inventory in distribution channels and warehouses across geography depending on demand sensing?</strong></p><p><strong>Nicolas:</strong> I currently do not have this, but this is to be created! :)</p><p><strong>For products that have 0 demand/supply for a time-point, would the forecasting algorithm impute or take actual 0?</strong></p><p><strong>Nicolas:</strong> The current algorithm works on projected coverages. If there is 0 demand next month but some demand the following months, it will maintain a coverage to be aligned with the demand of the following months. For any specific need, it’s always possible to modify the code to get the expected behaviour.</p><p><strong>What did you use for the plots, graphs, and tables you are showing?</strong></p><p><strong>Nicolas:</strong> The tables are <a href="https://rstudio.github.io/DT/" target="_blank">DT</a> and <a href="https://github.com/kcuilla/reactablefmtr" target="_blank">reactablefmtr</a>, and the plots are <a href="https://github.com/tidyverse/ggplot2" target="_blank">ggplot2</a> and <a href="https://jkunst.com/highcharter/">highcharter</a>. The Sankey diagrams are <a href="https://christophergandrud.github.io/networkD3/" target="_blank">networkD3</a>. 
I also used <a href="https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html" target="_blank">kableExtra</a>, <a href="https://github.com/renkun-ken/formattable" target="_blank">formattable</a>, and <a href="http://shinyapps.dreamrs.fr/shinyWidgets/" target="_blank">shinyWidgets</a>.</p><p><strong>Did you calculate the value you added to your company or the money you saved them? If so, how?</strong></p><p><strong>Nicolas:</strong> The easiest calculation was on the automation of the process. People could spend 2 or 3 months per year just on data processing. This, multiplied by the number of planners, indicates the money saved only by automating the process. Considering that R &amp; Shiny are free, this is a quick win!</p><p>We also invested in RStudio Connect to be able to share those apps with different users (Supply Chain, Sales, Finance, Country Managers). The cost is really low compared to the savings done through automation and the time saved in sharing the analysis with different stakeholders.</p><p>Then there were all the benefits around a proper calculation of demand &amp; supply planning (reduction of shortages, management of allocations, etc.) that we did not quantify but mentioned.</p><p><strong>Is there any interaction with a database with your applications?</strong></p><p><strong>Nicolas:</strong> Yes, for some apps using <a href="https://jrowen.github.io/rhandsontable/" target="_blank">rhandsontable</a> (I should also try <a href="https://github.com/DillonHammill/DataEditR" target="_blank">DataEditR</a>). The user can input/change some data and launch a new calculation.</p><p><strong>Where do you host your app? What about the security parameters and authorization matrix?</strong></p><p><strong>Nicolas:</strong> We are using RStudio Connect to host the various Shiny applications (and R Markdown files) behind the company firewall. 
Using RStudio Connect, we can decide which users can see which app and we then define a list of users per app.</p><p><strong>Did you build an API to communicate the data?</strong></p><p><strong>Nicolas:</strong> As of today, we haven’t built an API to communicate the data.</p><h2 id="join-the-supply-chain-community-conversation">Join the Supply Chain Community Conversation:</h2><p>Thank you to Nicolas for an awesome presentation on how he is using Shiny today. With all the interest from the community in this topic, we’d love to continue the discussion and connect us all through:</p><ul><li><a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup/" target="_blank">Future meetups</a> - Join us on July 26th for another R in Supply Chain Management event (details to be provided soon)</li><li><a href="https://r4ds.io/join" target="_blank">R for Data Science Online Learning Community</a> (Slack channel: #chat-supply_chain)</li><li>RStudio Community</li></ul><p>It would be great to hear from you on <a href="https://community.rstudio.com/t/recording-of-r-in-supply-chain-management-rstudio-enterprise-virtual-meetup/104459/2" target="_blank">RStudio Community</a> about your goals, challenges, and successes that you are having in this space.</p><ul><li>How did code-based solutions help you pivot and respond to the pandemic?</li><li>How are you using R to succeed in inventory management, inventory reduction and replenishment, delivery/shipments optimization, or demand forecasting?</li><li>If you had a magic wand to deliver unlimited resources and budget, what projects would you be working on?</li></ul></description></item><item><title>Debunking the Myths of R vs. 
Python</title><link>https://www.rstudio.com/blog/debunking-the-myths-of-r-vs-python/</link><pubDate>Tue, 15 Jun 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/debunking-the-myths-of-r-vs-python/</guid><description><script src="index_files/header-attrs/header-attrs.js"></script><style type="text/css">.vidcontainer {text-align:center;}.vidcapcontainer {width: 560px;margin: auto;}</style><p>Data science teams sometimes believe that they must standardize on R or Python for efficiency, at the cost of forcing individual data scientists to give up their preferred, most productive language. RStudio’s professional products provide the best single home for R and Python data science, so teams can optimize the impact their team has, not the language they use.</p><div id="r-python-and-serious-data-science" class="level1"><h2>R, Python and Serious Data Science</h2><p>RStudio’s mission is to create free and open-source software for data science, analytic research, and technical communication. This mission is expressed in <a href="https://blog.rstudio.com/2020/01/29/rstudio-pbc/" target="_blank">our charter as a Public Benefit Corporation</a>, and funded by the revenue from our professional products. These products, such as <a href="https://www.rstudio.com/products/team/" target="_blank">RStudio Team</a>, enable teams and organizations to scale, secure and operationalize their open source data science.</p><p>In working with many different organizations that want to maximize the impact of their data science work, we’ve seen three recurring attributes that contribute to success, which collectively we call Serious Data Science:</p><ul><li><strong>Open source</strong>: It’s better for everyone if the tools used for data science are free and open. This enhances the production and consumption of knowledge and facilitates collaboration. 
The widespread use of open source software makes recruiting, retention and training of data science team members easier, and comprehensive open source ecosystems ensure you have the right tool for any analytic challenge.</li><li><strong>Code-first</strong>: Coding is the most powerful and efficient path to tackle complex, real-world data science challenges. It gives data scientists superpowers to tackle the hardest problems because code is flexible, reusable, inspectable, and reproducible. With code, the answer is always yes.</li><li><strong>Centralized</strong>, on premises or in the cloud: Centralizing the infrastructure for data science work reduces unnecessary headaches for data science teams, promotes collaboration and sharing self-service applications, supports reproducibility and eases administration.</li></ul><p>RStudio’s professional products deliver a platform on which to centralize, secure and scale your data science, but there are two prominent choices for open source, code-first environments: R and Python. Teams sometimes believe that they must standardize on one or the other for efficiency, at the cost of forcing individual data scientists to give up their preferred, most productive language.</p></div><div id="myths-about-r-vs.-python" class="level1"><h2>Myths about R vs. Python</h2><p>There are a few common myths that we frequently hear from different organizations struggling with the decision of R vs. 
Python:</p><ul><li><strong>Cognitive overload for Data Scientists</strong>: Practitioners often fear that using more than one language will add overhead and context switching, forcing them to use different development environments.</li><li><strong>Unnecessary burden on IT</strong>: The DevOps and IT teams are concerned that supporting two languages will mean supporting twice the infrastructure for development and deployment, and answering twice as many support tickets for help.</li><li><strong>Blockers to collaboration, reuse and sharing</strong>: The leaders of data science teams worry that allowing multiple different languages will make it harder for the team to collaborate, re-use each other’s work, and deliver that work to the rest of the organization.</li></ul></div><div id="debunking-the-myths" class="level1"><h2>Debunking the Myths</h2><p>While these myths are common, they are nonetheless myths. Advancements in tools in the last few years have made it far easier for a data science team to use both R and Python, side by side.</p><ul><li><strong>Data scientists can easily combine R and Python</strong>: The RStudio IDE makes it easy to combine R and Python in a single data science project. The <a href="https://rstudio.github.io/reticulate/" target="_blank">reticulate package</a> provides a comprehensive set of tools for interoperability between Python and R, and the RStudio IDE has added new capabilities to make Python coding easier, including the display of Python objects in the Environment pane, viewing of Python data frames, and tools for configuring Python versions and conda/virtual environments. 
(See this <a href="https://blog.rstudio.com/2020/10/07/rstudio-v1-4-preview-python-support/" target="_blank">blog post on RStudio 1.4</a>, and the <a href="https://blog.rstudio.com/2021/06/09/rstudio-v1-4-update-whats-new/" target="_blank">recent RStudio 1.4 update</a>, for more information).</li></ul><div class="vidcontainer"><script src="https://fast.wistia.com/embed/medias/gbgej8p99s.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_gbgej8p99s videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/gbgej8p99s/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></div><div class="vidcapcontainer"><p><em>Video: Recent improvements to Python integrations in the RStudio 1.4 release.</em></p></div><ul><li><p><strong>Common infrastructure can support multiple languages and reduce support costs</strong>: By using a platform that supports both R and Python, such as <a href="https://www.rstudio.com/products/team/" target="_blank">RStudio Team</a>, DevOps and IT teams can enable data scientists to use their preferred languages and development environments, while supporting a single infrastructure for both development and deployment. 
For example, RStudio Workbench (<a href="https://blog.rstudio.com/2021/06/02/announcing-rstudio-workbench/" target="_blank">recently renamed from RStudio Server Pro</a>) allows data science teams to use the RStudio IDE, Jupyter or <a href="https://blog.rstudio.com/2021/06/02/rstudio-workbench-vscode-sessions/" target="_blank">VS Code</a> on the same infrastructure, so data scientists can use their IDE of choice without putting an additional burden on IT.</p></li><li><p><strong>Optimize your team’s impact, not the language they use</strong>: Data science teams are most effective when they are sharing work with their fellow team members and with their key stakeholders, as was discussed <a href="https://www.rstudio.com/resources/webinars/building-effective-data-science-teams/" target="_blank">in this recent panel webinar</a> with leaders of data science teams. By supporting both languages, teams have access to more tools for distributing work and making an impact. Frameworks like Shiny, Dash, Streamlit, plumber, Flask, and R Markdown allow data scientists to focus on communication regardless of the language they use.</p></li></ul><div class="vidcontainer"><p><img src="serious-data-science.png" alt="Serious Data Science"></p></div><div class="vidcapcontainer"><p>Figure: RStudio Team provides a single infrastructure for data science teams to develop, share and manage their work, whether it is built in R or Python.</p></div></div><div id="for-more-information" class="level1"><h2>For More Information</h2><ul><li>If you’d like to learn more about how RStudio provides a single home for R and Python Data Science, watch <a href="https://www.rstudio.com/resources/webinars/rstudio-a-single-home-for-r-and-python/" target="_blank">this recent webinar</a>, or get an overview on our website at the <a href="https://www.rstudio.com/solutions/r-and-python/" target="_blank">R and Python Solutions</a> page.</li><li>If you’d like to catch up on all the product feature details, check out 
<a href="https://blog.rstudio.com/2021/01/13/one-home-for-r-and-python/" target="_blank">this overview of RStudio’s ongoing efforts around Python</a>, as well as the latest Python features in the <a href="https://blog.rstudio.com/2021/01/19/announcing-rstudio-1-4/" target="_blank">RStudio IDE in this post</a>, <a href="https://blog.rstudio.com/2021/06/09/rstudio-v1-4-update-whats-new/" target="_blank">this update</a>, and <a href="https://blog.rstudio.com/2021/06/02/rstudio-workbench-vscode-sessions/" target="_blank">this post on VSCode</a>.</li><li>For advice on reference architectures, usage patterns and configuration, check out the <a href="https://solutions.rstudio.com/python/" target="_blank">Python section</a> of solutions.rstudio.com.</li></ul></div></description></item><item><title>Building Effective Data Science Teams: Answering Your Questions</title><link>https://www.rstudio.com/blog/building-effective-data-science-team-answering-your-questions/</link><pubDate>Thu, 10 Jun 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/building-effective-data-science-team-answering-your-questions/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@zlucerophoto?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Zach Lucero</a> on <a href="https://unsplash.com/s/photos/question?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup></p><p>As a follow-up to last week’s <a href="https://blog.rstudio.com/2021/06/03/building-effective-data-science-teams/" target="_blank">blog post</a> with panel questions from the Building Effective Data Science Teams <a href="https://www.rstudio.com/resources/webinars/building-effective-data-science-teams/" target="_blank">webinar</a>, we’d also like to highlight the great audience questions asked during the session:</p><ul><li><a href="#Tips"><strong>What tips would you provide for organizations where data science is not fully established?</strong></a></li><li><a 
href="#Evaluate"><strong>How do you evaluate data science candidates for roles?</strong></a></li><li><a href="#Structure"><strong>What&rsquo;s the best structure for a data science team? Should we have a centralized data science team or data science teams that are located within products?</strong></a></li><li><a href="#Leadership"><strong>I’m currently a data scientist and I am interested in moving into leadership. What do you think I should do?</strong></a></li></ul><p>Thank you again to our panelists for their insights and for opening up the conversation on building effective data science teams.</p><p>Our panelists for this webinar were:</p><ul><li><strong>Kobi Abayomi</strong>, Senior VP of Data Science at Warner Music Group</li><li><strong>Gregory Berg</strong>, VP of Data Science at Caliber Home Loans</li><li><strong>Elaine McVey</strong>, VP of Data Science at The Looma Project</li><li><strong>Jacqueline Nolis</strong>, Head of Data Science at Saturn Cloud</li><li><strong>Nasir Uddin</strong>, Director of Strategy &amp; Inspirational Analytics at T-Mobile</li><li>Moderated by <strong>Julia Silge</strong>, Software Engineer at RStudio, PBC</li></ul><h2 id="attendee-questions">Attendee Questions:</h2><p><strong><a name="Tips"></a>What tips would you provide for organizations where data science is not fully established?</strong></p><p><strong>Elaine:</strong> I think one of the answers, which I wish were not the answer, is that a lot depends on finding the right home in the organization. I don&rsquo;t think there&rsquo;s one clear answer to what that is. 
It depends a lot on your company and stakeholders.</p><p>In terms of scaling up, even if you have a lot of credibility, have produced a lot of great work and people are excited about the data science team - if you&rsquo;re not in a place in the organization that fits in terms of the business and how the company is organized, it&rsquo;s hard to grow the team.</p><p>There can be a lot of uncertainty about what it means if we have more data scientists. An executive who&rsquo;s not a data science person may not quite understand what we get from that. This is not an easy problem to solve because it can be a lot easier to add people to more classic business functions that they understand. For leaders, this is a really important thing to think about. Where in your organization can you find the best long term fit? Where do your highest level leaders understand the value of what you do?</p><p><strong>Nasir:</strong> At one of my previous employers, I was hired as the first data scientist - actually, the first person to explore whether AI/machine learning would be a value-add. They had a huge amount of data available within the organization. I took the challenge of generating confidence among the stakeholders with limited resources.</p><p>I defined some low-hanging-fruit problems and solved them by providing access to self-service tools. In that case, Shiny applications were tremendously helpful to me. I took the sample data and generated outcomes the way they wanted, interacted with them, and put them into the driver&rsquo;s seat. They were so happy. From this, I was able to get buy-in from most of the stakeholders so that I could grow the team. I then built their engineering team, data science team, and system administration team. It was all about generating confidence among the stakeholders and creating value for the business.</p><p><strong>Greg:</strong> I think that makes complete sense. 
To add a point to that, it seems like you have to shift from being a data scientist and put your advertising hat on. You need to advertise what you&rsquo;re doing and show the value. Then, shift to being an economist and say, &ldquo;here is the return on investment of adding more data scientists and this is what you can get.&rdquo; You need to have this broader perspective, rather than just wanting to build models. You need to be an advertiser and I think your example, Nasir, was great. You did that.</p><p><strong><a name="Evaluate"></a>How do you evaluate data science candidates for roles?</strong></p><p><strong>Greg:</strong> There are a couple of things. One is what their academic background is, depending on what I&rsquo;m after. I don&rsquo;t want to just hire people like me, as economists. I need a broader skill set.</p><p>The other is their skills and things that they&rsquo;ve worked on. You can get a sense of how technical they are in certain areas just based on that. When it comes to a phone interview or talking to them, I shift. I talk about some technical things, but I shift into behavioral-based interviewing. I want to know how they performed and what they did in specific situations, with the idea being that if they performed a certain way in a situation at one company, that could carry over. So, I have the dual frame of technical and behavioral-based interviewing.</p><p><strong>Kobi:</strong> I like what Greg just said. These days, I try to divine out what candidates actually did and what they created beyond the jargon on the resume. When I&rsquo;m looking at resumes, I look for that same narrative and description that illustrates an understanding of what was done. I don&rsquo;t fault candidates for this, but part of it is the nature of the way people are finding jobs and hiring managers are finding resumes.</p><p>There&rsquo;s a tendency to throw a lot of jargon at the NLP systems that screen resumes. 
You&rsquo;ll see resumes with a list of acronyms of the software they use and a list of techniques. I&rsquo;ve seen people break out clustering analysis and then list types of clustering analysis - machine learning - and then they&rsquo;ll list types of machine learning. I&rsquo;ve found, more often than not, candidates behind those resumes didn&rsquo;t have the level of sophisticated understanding that I&rsquo;m looking for: problem solving. Technology will change and techniques will change. I look for people who can think and are willing to. With this field these days, it&rsquo;s difficult to parse out, but I think what Greg said about having a scenario and that sort of discussion is helpful.</p><p><strong><a name="Structure"></a>What&rsquo;s the best structure for a data science team? Should we have a centralized data science team or data science teams that are located within products?</strong></p><p><strong>Jacqueline:</strong> I would like to talk about this because I used to take a very neutral stand. Years later, I think the right approach is to be distributed because of exactly the thing we&rsquo;ve been talking about for the last hour, that communication is key. When you have a distributed data science team, communication with the stakeholder is most important, not the communication between data scientists. Maybe some of the teams are using R and some are using Python, whatever. That&rsquo;s still better than all using the same language but not talking to the people who use your work.</p><p>The other point is that the question may not be &ldquo;what is the right approach?&rdquo; The question here is &ldquo;how do I convince people above me to switch to a better approach?&rdquo; I don&rsquo;t have a good answer for that except to say this is the job of a director of data science or someone in leadership. They should be thinking about that. They have the authority and title by their name to make those changes. 
If you do not have that, depending on your organization, it can be very difficult to get people at levels above you to listen to you as a senior data scientist. In this case, the best thing you can do if you want to try to make those changes from a lower level is to get the level just above you to buy in and have them move it up and up, rather than going straight to the CEO to say, &ldquo;Listen to my org distribution.&rdquo; It&rsquo;s about thinking it through - how do you work through an organization of people?</p><p><strong><a name="Leadership"></a>I&rsquo;m currently a data scientist and I am interested in moving into leadership. What do you think I should do?</strong></p><p><strong>Elaine:</strong> To hit more on the theme of communication, I would say ask for opportunities to communicate the work that your team is doing to more and more people at higher and higher levels, and get really good at that.</p><p><strong>Jacqueline:</strong> A lot of leadership is organization and communication, as said, so I find the things that have helped me in my career are running a project on my own. Find places where you can do the whole thing instead of asking your manager, &ldquo;well, now what do I do?&rdquo; Find places where you can be the person calling the shots, and eventually, you find yourself calling shots with other people and then you get titles and stuff like that.</p><p><strong>Nasir:</strong> I would recommend developing your business acumen. As a data scientist, you&rsquo;re technically very sound. You know your technical details. You need storytelling experience and to be able to speak in the language of the business, rather than the language of the technology. So rather than walking stakeholders through your heuristic curves, how can you explain the same result in the language of the business?</p><p>Also as Jacqueline suggested, be the leader of a project. 
Take your project, take ownership of that project, and see whether you can execute end to end.</p><p><strong>Greg:</strong> As Nasir said, putting on your business hat rather than your data science hat and expanding your horizons. Also, talk with your manager and have an individual development plan. A lot of companies have training for new managers or people who want to become managers. If your manager doesn&rsquo;t even know that you want to grow and become a manager, they may not push you in that direction. So definitely communicate that &ldquo;hey, I want to do this. How can we make this happen?&rdquo;</p><p><strong>Kobi:</strong> Apply for leadership jobs.</p><h2 id="building-effective-data-sciences-teams-summary">Building Effective Data Science Teams Summary</h2><p>Thank you once more to our panelists for opening up this important discussion on how we can build effective data science teams. Our <a href="https://www.rstudio.com/resources/webinars/building-effective-data-science-teams/" target="_blank">panel webinar</a> focused on three main themes that we think contribute to effective data science teams:</p><ul><li><strong>Building and maintaining credibility</strong></li><li><strong>Delivering real value</strong></li><li><strong>Collaboration across an organization</strong></li></ul><p>Our <a href="https://blog.rstudio.com/2021/06/03/building-effective-data-science-teams/" target="_blank">last blog post</a>, published on June 3rd, shared the panel Q&amp;A that addressed the three themes above.</p><p>While we, unfortunately, were unable to address every attendee question during the webinar, we would really love to keep this conversation going. 
There were so many great follow-up questions that touched on team design, project scoping, tool selection, and selling data science internally that we will dive deeper into through:</p><ul><li>Future blog posts <a href="https://www.rstudio.com/about/subscription-management/" target ="_blank">(Subscribe here)</a></li><li>Open meetup discussions with data science leaders (Join this <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup" target="_blank">meetup group</a> for an event on June 24th with John Thompson, Global Head of Advanced Analytics &amp; AI at CSL Behring)</li><li><a href="https://community.rstudio.com/c/industry/44" target="_blank">RStudio Community</a></li></ul><p>If you have other ideas or questions you’d like to share, you can use the RStudio Community link for each individual webinar question to share your thoughts on a specific topic as well.</p></description></item><item><title>RStudio v1.4 Update: What's New</title><link>https://www.rstudio.com/blog/rstudio-v1-4-update-whats-new/</link><pubDate>Wed, 09 Jun 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v1-4-update-whats-new/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@arizz?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">jurissa yanoria</a> on <a href="https://unsplash.com/s/photos/flower-bloom?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup></p><p>We&rsquo;ve just covered the recent <a href="https://www.rstudio.com/blog/announcing-rstudio-workbench/">name change from RStudio Server Pro to RStudio Workbench</a> we made in the last RStudio update, and the <a href="https://www.rstudio.com/blog/rstudio-workbench-vscode-sessions/">improvements we&rsquo;ve made to VS Code sessions</a>. 
However, this update (codenamed &ldquo;Juliet Rose&rdquo;) comes with a lot of other small features, too, available to open source users as well as those who use RStudio&rsquo;s professional suite of products. Today we&rsquo;re taking a look at some of these smaller features.</p><h3 id="r-41-support">R 4.1 Support</h3><p>First, we&rsquo;ve <strong>added support for R 4.1</strong>. This update to the core R language adds more features than usual, and previous versions of RStudio are not compatible with these changes, so this RStudio update is <strong>required</strong> if you plan to work with the new version of R.</p><h4 id="new-native-pipe-operator-">New Native Pipe Operator, |&gt;</h4><p>R 4.1 adds a <a href="https://developer.r-project.org/blosxom.cgi/R-devel/NEWS/2020/12/04">native pipe operator</a>, <code>|&gt;</code>. Many R users will be familiar with pipe operators already; they were popularized in R by the <a href="https://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html">magrittr package</a>&rsquo;s <code>%&gt;%</code> pipe and have become a <a href="https://style.tidyverse.org/pipes.html">fixture in the tidyverse</a>.</p><p>RStudio now supports this new native pipe operator. If you&rsquo;re working primarily in code that uses the new <code>|&gt;</code> operator, you&rsquo;ll want to change RStudio&rsquo;s <em>Insert Pipe</em> command (Cmd/Ctrl + Shift + M) so that it inserts native pipes instead of magrittr-style pipes. 
To do this, go to <em>Options -&gt; Code -&gt; Editing</em> and check <em>Use native pipe operator</em>.</p><p><img src="pipe-operator-option.png" alt="Screenshot of the RStudio Options dialog showing the Use Native Pipe Operator preference highlighted"></p><p>Also, if you&rsquo;re using a ligature font like <a href="https://github.com/tonsky/FiraCode">FiraCode</a> or <a href="https://www.jetbrains.com/lp/mono/">JetBrains Mono</a> with RStudio, you&rsquo;ll see a nice triangle glyph representing the new operator.</p><p><img src="native-pipe.png" alt="Screenshot of an RStudio code editor showing triangle-shaped ligatures for the native pipe operator"></p><h4 id="new-anonymous-function-syntax-x">New Anonymous Function Syntax, \(x)</h4><p>R 4.1 also adds new syntax for anonymous functions; you can write <code>\(x) ...</code> instead of the more cumbersome <code>function(x) ...</code>. RStudio now supports this syntax.</p><p>Read more about <a href="https://www.jumpingrivers.com/blog/new-features-r410-pipe-anonymous-functions/">the new pipe operator and anonymous functions in R 4.1</a>.</p><h4 id="new-graphics-engine">New Graphics Engine</h4><p>Finally, R 4.1 adds a <a href="https://developer.r-project.org/Blog/public/2020/07/15/new-features-in-the-r-graphics-engine/">new graphics engine</a>. This graphics engine isn&rsquo;t compatible with previous releases of RStudio (crashes will ensue when using ggplot2 or other grid-based graphics), which is the primary reason you need this RStudio update to work with R 4.1. 
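Both pieces of new syntax described above are easy to try at the console. A minimal sketch in base R (requires R 4.1 or later; the variable names are just for illustration):

```r
x <- c(1, 4, 9, 16)

# Native pipe: `lhs |> f()` is rewritten to `f(lhs)` at parse time,
# so these two expressions are equivalent.
m1 <- x |> sqrt() |> mean()
m2 <- mean(sqrt(x))

# `\(n) ...` is shorthand for `function(n) ...`
double <- \(n) n * 2

m1          # 2.5
m2          # 2.5
double(21)  # 42
```

Note that, unlike magrittr's `%&gt;%`, the native pipe in R 4.1 requires a function call on its right-hand side (`x |&gt; sqrt()`, not `x |&gt; sqrt`) and has no placeholder argument in this release.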
Here&rsquo;s RStudio demonstrating support for linear gradient fills, one of the new graphics engine features:</p><p><img src="grid-graphics.png" alt="Screenshot of RStudio demonstrating a linear gradient plot generated by the new graphics engine"></p><p>Note that if you want to use the new graphics device features, you&rsquo;ll need to use the Cairo graphics backend on most platforms; you can enable it in <em>Options -&gt; General -&gt; Graphics</em>.</p><h3 id="python-improvements">Python Improvements</h3><p>We&rsquo;re continuing to improve support for Python in RStudio. There are a bunch of improvements in this update, but chief among them is that we now show you the version of Python you&rsquo;re working with, right in the Console tab. Python environment configuration is <a href="https://xkcd.com/1987/">notoriously tricky</a>, and this makes it just a <em>little</em> easier to see what you&rsquo;re working with.</p><p><img src="python.png" alt="Screenshot of RStudio&rsquo;s improved python support, showing the version information and stop button"></p><p>We&rsquo;ve also made it possible to interrupt Python when it&rsquo;s running (in other words, the Stop button works now), improved detection of Python environments on the system, made sure that Python versions match up when you&rsquo;re knitting and publishing, put Python on <code>$PATH</code> on the Terminal, and lots more.</p><p>Note that to take full advantage of all of the Python improvements, you&rsquo;ll need the <a href="https://cran.r-project.org/web/packages/reticulate/index.html">latest version of the reticulate package</a> (1.20 or higher).</p><h3 id="apple-silicon-support">Apple Silicon Support</h3><p><img src="apple-m1.jpg" alt="Logo depicting Apple&rsquo;s M1 processor"></p><p>RStudio for macOS now works with the <a href="https://cran.r-project.org/bin/macosx/">native arm64 builds of R</a>, meaning you&rsquo;ll now experience the full benefits of the M1&rsquo;s <a 
href="https://www.cpubenchmark.net/cpu.php?cpu=Apple+M1+8+Core+3200+MHz&amp;id=4104">considerable processing power</a> when running your R code inside RStudio.</p><p>Note that, while the components of RStudio that interface with R are now fully native, the front end is still compiled only for Intel and runs under Rosetta2 due to upstream dependencies that don&rsquo;t yet have binaries available for the new architecture. We will produce a fully native version of RStudio for Apple Silicon in an upcoming release.</p><h3 id="visualizing-memory-usage">Visualizing Memory Usage</h3><p>RStudio&rsquo;s Environment pane now includes a small widget that shows both how much memory your R session is using and how much free memory is available on your system.</p><p><img src="memory-usage.png" alt="Screenshot of RStudio showing memory usage in the Environment Pane"></p><p>Clicking on this widget will generate a memory usage report that gives you more information about available memory. It&rsquo;s a helpful tool for understanding how much memory your data is taking up, and letting you know that you&rsquo;re approaching the limit if you&rsquo;re using RStudio in a memory-constrained environment. Read more about <a href="https://support.rstudio.com/hc/en-us/articles/1500005616261-Understanding-Memory-Usage-in-RStudio">understanding memory usage in RStudio</a>.</p><h3 id="document-context-menu">Document Context Menu</h3><p>Document tabs in RStudio now have a context menu, which makes it more convenient to take actions on the file/tab directly instead of going through the top-level menus. 
Right-click on the file&rsquo;s name in the tab to invoke the context menu.</p><p><img src="document-tab-menu.png" alt="Screenshot of RStudio&rsquo;s document tab context menu, showing actions that can be taken on the document"></p><h3 id="command-palette-upgrade">Command Palette Upgrade</h3><p>The <a href="https://blog.rstudio.com/2020/10/14/rstudio-v1-4-preview-command-palette/">Command Palette</a> was one of the most popular features we introduced in RStudio 1.4. With this update, we&rsquo;ve upgraded it with a new Most Recently Used (MRU) section at the top.</p><p><img src="command-palette-mru.png" alt="Screenshot of RStudio&rsquo;s command palette showing a list of recently used commands"></p><p>This puts your most recently used commands within easier reach. They also match first in searches, so just a letter or two is often enough to recall one. And &ndash; for the extremely <strike>lazy</strike> efficient &ndash; you can just hit Enter in the Palette to re-run the most recent command.</p><p>You can clear this list using the <em>Clear Recently Executed Command List</em> command, or turn off the feature if you&rsquo;d like the Palette to be pristine every time you use it by disabling the <em>Remember Recently Used Items in Command Palette</em> setting. This setting is only accessible via the Command Palette itself.</p><p>There&rsquo;s lots more in this release, and it&rsquo;s <a href="https://www.rstudio.com/products/rstudio/download/">available for download today</a>. 
You can read about all the features and bugfixes in the &ldquo;Juliet Rose&rdquo; update in the <a href="https://www.rstudio.com/products/rstudio/release-notes/">RStudio Release Notes</a>, and we&rsquo;d love to hear your feedback about the new release on our <a href="https://community.rstudio.com/c/rstudio-ide/9">community forum</a>.</p></description></item><item><title>Upcoming Webinar - Incorporating R into your Clinical Legacy Workflows</title><link>https://www.rstudio.com/blog/incorporating-r-into-your-clinical-legacy-workflows/</link><pubDate>Tue, 08 Jun 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/incorporating-r-into-your-clinical-legacy-workflows/</guid><description><p style="text-align: right"><small><i>Photo by <a href="https://unsplash.com/photos/IFpQtennlj8" target="_blank" rel="noopener noreferrer">@cdc</a> on Unsplash</i></small></p><p><i>This is a guest post from Mike Stackhouse, Chief Innovation Officer, at <a href="https://www.atorusresearch.com/" target="_blank" rel="noopener noreferrer">Atorus Research</a>, a Full Service RStudio Partner, about an upcoming joint webinar <a href="https://www.rstudio.com/registration/incorporating-r-into-your-clinical-legacy-workflows/" target="_blank" rel="noopener noreferrer">Incorporating R into your Clinical Legacy Workflows</a></i>.</p><p>Organizations with legacy clinical workflows have much to gain by automating routine functions and sharing critical information in richer and more interactive forms. When organizations look to adopt new programming languages, such as R and Python, to tackle these improvements, they run into a number of challenges.</p><p>There are many questions that have to be addressed such as: How do we migrate legacy workflows over to new programming languages? And with this migration, how do we upskill a workforce to use these new tools?</p><p>Throughout my career, I have seen these challenges first hand. 
I entered the world of data analytics as a statistical programming intern in clinical research, using the language SAS®. Like many others, I fell into programming as a career – and this was the first time I had ever even touched a programming language.</p><p>As I learned to program, I was also learning the ins and outs of clinical research itself. The pharmaceutical industry is highly regulated and organizations have strict policies in place governing the process for data analysis and the quality control/assurance of that analysis. Furthermore, there are many data standards within industry including CDISC’s SDTM, ADaM, and define.xml. These pieces together lead to fairly standardized programming processes and associated tools within an organization. Working at a clinical research organization (CRO), I had the opportunity to see several different implementations of largely the same processes.</p><p>At a certain point, my interest in programming drove me to explore other languages outside of SAS®. I found Python and R – and a new sense of freedom. I could automate routine functions I was doing manually, create rich reports to share information, and make web-based interfaces to dynamically let users visualize the data. I had new tools in my tool belt that could change not only the way I did my job, but the way I shared information – most importantly with non-technical users.</p><p>Unfortunately, adopting these tools is not as simple as a lift and shift. Legacy processes in clinical research have decades of precedent, with highly standardized workflows and utilities built to suit an organization’s needs. Furthermore, programming in clinical research tends to be monolithic; SAS® is the language available and thus SAS® is the language that’s used. 
In light of this, many programmers in industry have only used and only know how to use SAS® - and having to pick up a new language can be scary, especially when you have years of experience in the one you know.</p><p>On June 22nd, we’ll be teaming up with GSK’s Michael Rimler to talk about these topics. In the webinar, we’ll be discussing:</p><ul><li>Open-source R packages developed specifically for clinical research workflows</li><li>Open-source development within pharma and embracing a collaborative paradigm</li><li>Upskilling existing SAS® programmers in pharma to learn R and open-source languages, and how Atorus can help</li></ul><h2 id="register-for-the-webinar">Register for the webinar</h2><p>You can sign up for the June 22nd webinar here: <a href="https://www.rstudio.com/registration/incorporating-r-into-your-clinical-legacy-workflows/" target="_blank" rel="noopener noreferrer">Incorporating R into your Clinical Legacy Workflows</a></p><h3 id="about-atorus-research">About Atorus Research</h3><img src="atorus.png" alt="Atorus Research" style="float:left; padding: 20px" ALIGN="top"><p>Atorus Research helps organizations on their journey to R by guiding you down the proper paths to explore, embrace, and ultimately embed R and the power of open-source tools within your teams. We deliver the answers to your questions to how to safely migrate existing processes and adopt new technologies, while remaining compliant within the regulatory environment. 
</p></description></item><item><title>Building Effective Data Science Teams</title><link>https://www.rstudio.com/blog/building-effective-data-science-teams/</link><pubDate>Thu, 03 Jun 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/building-effective-data-science-teams/</guid><description><p><sup> &ldquo;Data and Goliath&rdquo;-gouache painting by Jacqueline Nolis (image from her <a href="https://www.etsy.com/shop/NolisMusings" target="_blank">Etsy store</a>)</sup></p><p><strong>So what does it take to build a successful data science team?</strong> Whether you are the first “data person” at your organization or leading a team of hundreds, we know success is not based on just technology; it requires people to create a productive, effective, and collaborative data science team.</p><p>Last month’s webinar featured data science leaders from Caliber Home Loans, The Looma Project, Saturn Cloud, T-Mobile, and Warner Music Group to start to answer this question.</p><p>Our panelists for this webinar were:</p><ul><li><strong>Kobi Abayomi</strong>, Senior VP of Data Science at Warner Music Group</li><li><strong>Gregory Berg</strong>, VP of Data Science at Caliber Home Loans</li><li><strong>Elaine McVey</strong>, VP of Data Science at The Looma Project</li><li><strong>Jacqueline Nolis</strong>, Head of Data Science at Saturn Cloud</li><li><strong>Nasir Uddin</strong>, Director of Strategy &amp; Inspirational Analytics at T-Mobile</li><li>Moderated by <strong>Julia Silge</strong>, Software Engineer at RStudio, PBC</li></ul><p><img src="blogpanelists.jpg" alt="Image of Webinar Panelists"></p><p>You can view the recording of the webinar at <a href="https://www.rstudio.com/resources/webinars/building-effective-data-science-teams/" target="_blank">Building Effective Data Science Teams</a>.</p><p>There were so many great follow-up questions that we’d like to keep this conversation going. 
We will dive deeper into specific topics through:</p><ul><li>Future blog posts <a href="https://www.rstudio.com/about/subscription-management/" target ="_blank">(Subscribe here)</a></li><li>Open meetup discussions with data science leaders - join this <a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup" target="_blank">meetup group</a> for an event on June 24th with John Thompson, Global Head of Advanced Analytics &amp; AI at CSL Behring</li><li><a href="https://community.rstudio.com/c/industry/44" target="_blank">RStudio Community</a></li></ul><p>We’ve also added links to an RStudio Community thread for each individual question if you’d like to continue the conversation there as well.</p><p>We will summarize the questions and answers brought up during the panel that focus on three main themes that we think contribute to effective data science teams:</p><ul><li><a href="#Building-and-Maintaining-Credibility"><strong>Building and maintaining credibility</strong></a></li><li><a href="#Delivering-Real-Value"><strong>Delivering real value</strong></a></li><li><a href="#Collaboration-Across-an-Organization"><strong>Collaboration across an organization</strong></a></li></ul><p>We have paraphrased and distilled portions of the responses for brevity and narrative quality.</p><h2 id="panel-questions">Panel Questions</h2><h3 id="a-namebuilding-and-maintaining-credibilityabuilding-and-maintaining-credibility"><a name="Building-and-Maintaining-Credibility"></a>Building and Maintaining Credibility</h3><p><strong>What is a symptom that you have observed, during your time in this field, of a team being low on credibility within an organization or with stakeholders?</strong></p><p><strong>Jacqueline</strong>: Maybe this isn&rsquo;t low credibility per se, but I would categorize this as an unhealthy relationship with stakeholders. 
Something I think I&rsquo;ve noticed happening in unhealthy relationships is that there is no partnership.</p><p>The business person should be able to say, “I trust if I have a business question, I can go to you, and you&rsquo;re going to come back with an answer.” And the data science people can trust that, “Hey, you&rsquo;re going to bring us the important questions and the context needed to answer them.”</p><p>One sign that the relationship isn&rsquo;t working well is when those boundaries aren&rsquo;t held. When you have a stakeholder who says “Well, I&rsquo;m not sure you should be using a logistic regression here. I just read a blog post about neural nets. Why aren&rsquo;t we using those?” But also if the data scientist says, “No, what we really need to be building is a churn model. We think the value&rsquo;s in the churn model and we don&rsquo;t care what you think.” I think keeping good boundaries is a really important sign of healthiness there.</p><p><strong>Nasir</strong>: In addition to what Jacqueline said, one symptom is that you feel like there is a lack of interest from the stakeholder side. You can overcome that problem by generating confidence among the stakeholders. By delivering outcomes in a more transparent way, you can empower the stakeholders and put them in the driver&rsquo;s seat. You can do this through a self-service tool, like a Shiny app. If you deliver results even from the beginning of the project, you can receive feedback from them and can iteratively develop and improve upon that. It&rsquo;s all about developing the self-service tool and delivering this in a meaningful way.</p><p><strong>Taking that further, how can we build credibility and maintain credibility once we have it?</strong></p><p><strong>Elaine</strong>: I think a lot of the things that come to mind are related to communication. One thing to build on what&rsquo;s been said is communicating the results of data science work in a way that is appropriate to the audience. 
One problem we often have is that people assume we&rsquo;re really smart and know all these amazing, magical things. But then we present in a way that they struggle to understand, and that really shows some lack of connection between the perspective of the business stakeholders and the team. One avenue of communication that you can demonstrate is in the way you present. Even if this means leaving out a lot of really interesting and cool details that you understand, what is the end game of your work and how will people need to consume it? That helps build credibility.</p><p>The other is around communicating what work is happening on the team and what work is coming up, in a way that people can understand what the value can be from that and have a productive conversation about what the priorities are. Especially starting data science in a company for the first time, there&rsquo;s a lot of things people imagine could be done and a whole range of things - anything related to data that can come your way. Building a process that allows people to help you and understand where the team will be able to contribute the most to prioritize those things can be really helpful.</p><p><strong>Kobi</strong>: These days, in the way that data science has taken on a life of its own, it is often divorced from a lot of the feature engineering and covariate-response modeling that many of us were previously familiar with as long-time statisticians. We can lose the importance of having models that have clearly explanatory effects in them. Business people aren&rsquo;t often interested in convergence rates and things like that. They&rsquo;re interested in, &ldquo;if I do this, this thing happens.&rdquo;</p><p>We try to be transparent with the models that we create so that they have an immediate utility to the things that people in business understand. That&rsquo;s not just a conversation. That&rsquo;s a first principles thing. 
Let&rsquo;s start off making things that are, on the face, transparent with features that match KPIs [Key Performance Indicators] that the business finds important. That dance between doing something that&rsquo;s precise, balanced with the utility of the explanation. That&rsquo;s something that we think about all the time.</p><h3 id="a-namedelivering-real-valueadelivering-real-value"><a name="Delivering-Real-Value"></a>Delivering Real Value</h3><p><strong>What are data teams like, that are able to be pretty effective, impactful, and really make a difference?</strong></p><p><strong>Greg</strong>: I have a couple ideas. First, the ability to communicate effectively with the stakeholders. I mean Jacqueline, Elaine, Nasir, and Kobi all mentioned this - communicating with stakeholders. I don&rsquo;t think that point can be hammered in enough - getting that relationship, communicating effectively, and keeping them in mind when developing things instead of developing things in isolation. It may seem like common sense, but it&rsquo;s not always done in practice. Also, continually involve them in decisions throughout the process instead of going off into your own world, creating something and then coming back. The &ldquo;here&rsquo;s what I did for you&rdquo; approach can come off negatively, instead of &ldquo;let&rsquo;s work on this together and solve this business problem.&rdquo;</p><p>Another thing, from the organizational standpoint, is a mindset of being open to change. Just because this is how things have always been done doesn&rsquo;t necessarily mean they always have to be done that way. Having that ability to change and being open to change is important.</p><p>The other characteristic - and I&rsquo;ve seen this in several areas - is a diverse set of backgrounds for the team members. 
For instance, if everyone was like me and had a PhD in economics, focusing on econometrics, we&rsquo;d communicate well and all be thinking of the same thing, but that doesn&rsquo;t help the business. That doesn&rsquo;t help my team. I’d want to include computer scientists, industrial engineers, etc. I want other disciplines that have a broader perspective, and that really opens up a rich source of ideas. It&rsquo;s not just the leader coming up with the ideas. If you have this diverse skill set, they&rsquo;ll come up with great ideas on their own, working with the business, and it creates an environment of growth and value for the business.</p><p><strong>Jacqueline</strong>: I agree with everything that was said, and I think a real truth that I&rsquo;ve noticed when I&rsquo;ve led data science teams is that the team rarely fails because, “Oh, we had one data scientist and two data engineers. We should have had two data scientists and one engineer.” What usually fails is that there&rsquo;s not a clear focus. A team fails when they&rsquo;re hired because “Hey, we have a lot of data, and, hey, you&rsquo;re data scientists. You figure out what to do with that.” A team will fail if there&rsquo;s not a clear focus.</p><p>To the point of flexibility, when the team recognizes that they aren&rsquo;t finding value in this project, how quickly can they pivot to something better? This is partly about having people who are flexible enough to do that, and partly about a business environment where the data science team can say, “It turns out that a churn model isn&rsquo;t actually effective here.” And people say, “All right, that&rsquo;s fine. Let&rsquo;s move on to something else.” Instead of, “Well, why don’t you spend another year on it and then come back?”</p><p><strong>Greg</strong>: To add on to that, I think it builds credibility if you can deliver negative results and explain that. 
Sometimes the business has an idea of what they want and they may be really excited, but it&rsquo;s not how it works in reality. Having the ability to say, “No, this is not how it works” lifts the credibility of the department being the impartial observer and giving results, rather than just giving people what they want to see.</p><p><strong>Julia</strong>: It&rsquo;s interesting that you bring that up, because having to say no or explain a negative result, that the thing you wanted to do is no better than the way things were before, is something I have actually personally experienced as being quite challenging.</p><p><strong>What is your perspective on team characteristics that allow resilience in the face of delivering negative results, or being able to handle that well?</strong></p><p><strong>Kobi</strong>: I&rsquo;m not going to say anything related to scientific characteristics. I think a baseline level of competence, maturity, and having experience of trying things, allows you to be honest with what you&rsquo;re doing and what output it generates. One of the gospels I preach at my job is the experimental design philosophy. We&rsquo;re not generating one answer. Answers aren&rsquo;t static things. Experiments have different results. I flip a coin once, I get heads. I flip another time, I get tails. Have a structure and apparatus in place that can generate an answer at the time that it&rsquo;s asked. Sometimes, you&rsquo;ll get an answer you like, sometimes you get an answer in a direction. All of that is information, and we try to treat what we offer to the business as something that&rsquo;s come from an experimental investigation.</p><p><strong>Data scientists sometimes have a reputation for valuing their autonomy or wanting to spend their time on fancy algorithms over delivering valuable results. Do you think that’s fair? 
How do you deal with this?</strong></p><p><strong>Nasir</strong>: Most data scientists would like to leverage the latest and greatest types of algorithms, like deep learning. In reality, you may be able to solve the business problem with simple regression techniques. As we’ve mentioned, it’s about building credibility and delivering transparent results. It&rsquo;s not about the latest and greatest technology; it&rsquo;s more about how you can deliver meaningful outcomes to the business and explain them. Whatever you deliver, you should be able to explain it well.</p><p>Model explainability is a huge field, and very complex, black-box models are still hard to explain. Data scientists should not go to complex techniques first; try simple techniques and see whether they can answer the business question.</p><p><strong>Jacqueline</strong>: I do think it&rsquo;s very fair. I think we, as a field, set this up. What&rsquo;s every big blog post? “Check out GPT-4! Look how much bigger and better - we make the exciting stuff, the new giant GPUs, whatever.” I think when junior data scientists come in they have a zest for trying the latest and greatest, which is okay, but you&rsquo;ve got to have a sit-down conversation and explain that the metric isn&rsquo;t whether the model is the most technically accurate. Is it easy to use? Is it robust to rerun?</p><p>As a junior data scientist, I learned the hard way when a bunch of my work had to be thrown out. If you have senior data scientists still making this mistake later in their career, now that&rsquo;s a problem. That&rsquo;s a toxic scenario where you have someone who has a lot of influence and power not helping out the business. 
I would say as the manager, it&rsquo;s your job to assess how much can I guide them towards the right path and how much do I have to say “this behavior is actually going to cause major problems and we need to have harsh talks and cut it down?” I do think that&rsquo;s a very real thing that happens all the time.</p><p><strong>Julia</strong>: I agree with both of you that this is real - this is a stereotype because it&rsquo;s real. It&rsquo;s something for us, as a field, to grapple with as we think about how we are going to be effective in organizations.</p><p><strong>Elaine</strong>: This builds on something Jacqueline said before, about companies hiring data scientists just because they have data, without a clear goal. I think we can set ourselves up to be more successful here, partly in the hiring process, by not lying to people and saying we&rsquo;re doing these cool, new, sophisticated things. I try to be honest with candidates about the places where we might get to do really cool new things. There&rsquo;s also a lot of getting the right data, providing averages, and doing some of that data democratization. If what you really want to do is do the latest and greatest all day long, you need to go work somewhere where there&rsquo;s real business value in that bleeding edge. At most places, that is not what we&rsquo;re doing the vast majority of the time, and being upfront about it can weed out some people who just don&rsquo;t understand what they&rsquo;re getting into.</p><h3 id="a-namecollaboration-across-an-organizationacollaboration-across-an-organization"><a name="Collaboration-Across-an-Organization"></a>Collaboration Across an Organization</h3><p><strong>What contributes to data science teams that collaborate well with non-data stakeholders?</strong></p><p><strong>Greg</strong>: I think part of that is just the willingness to communicate. 
Sometimes, people want to work in a box or in isolation, so we need to break that mindset and say no, we need to collaborate with each other. We&rsquo;re all on the same team. We want to provide value for the company. We want the company to be successful.</p><p>Shift the mindset to all being on the same team, working together, as opposed to “let me go do something and bring it back and see if it works.” The latter comes across as something that&rsquo;s being done to the business rather than something they&rsquo;re a part of, so adoption and use suffer. This goes into the value as well. If they&rsquo;re not using it, then there&rsquo;s obviously no value.</p><p>Again, it&rsquo;s also about being open to change. Just because we&rsquo;ve always done it this way doesn&rsquo;t mean we have to. So some of those same topics are relevant here as well.</p><p><strong>Kobi</strong>: I’ve been at media companies, not tech companies, for my last couple of jobs. Collaboration is a huge piece of being able to understand the way that the business measures success. I&rsquo;m listening to all the things that everybody else said on the panel. When you first asked that question, Greg said it exactly: communication.</p><p>Also, product management. When I first started, coming from academia, I was less of a devotee to product management, but I found it extremely useful to have someone whose job is to figure out how this thing looks and how people will interact with it, and push that back onto the data science team, all while having that role separate from the people who are going to use it. People will get frustrated if they don&rsquo;t feel that they have the same language that you do. More often than not, I&rsquo;ve found that people will just get quiet in frustration rather than push back so that you can refine and change it. 
Carving out a role on the team for somebody whose job is just to do that, the ingress and egress to the client side, is super important.</p><p><strong>Julia</strong>: That&rsquo;s an interesting point, because I think a lot of us have probably heard about self-service data and democratizing data in our organizations to make it so everyone can query the database and everyone can get their own stuff, but then have also experienced frustration, as you just said. Like, &ldquo;oh no, people are finding some kind of trend that I don&rsquo;t think is real.&rdquo; Maybe you have experienced frustrations with giving people mass access to data, with the democratization of data. I&rsquo;m wondering if part of it is a failure to apply the lessons of product management to those self-service data tools - not thinking like a product manager as you make these things available internally?</p><p><strong>Kobi</strong>: Let me give you an exact example without divulging any intellectual property. Optimization. The linear programmers and operations research people want to solve for the most optimal outcome. At one job, one of the things we were optimizing over was the number of views of something over a certain period of time, and arranging things to make that most optimal. Turns out in some settings, like ad sales, that&rsquo;s not actually what you want. You&rsquo;re dealing with inventory that has different costs, different rates of return, and it&rsquo;s a nuanced thing. There&rsquo;s a bit of an art to it. The first go-round, we developed a model that we knew was hitting this target at 100%, and it caused a bunch of frustration elsewhere in the organization because it made it more difficult for them to do other things that they needed to do.</p><p>It&rsquo;s those sorts of conversations that you need to have - again going back to communication, communication, and communication. 
This is a relatively new field, we do something which is relatively technical, and society in general is relatively innumerate. The lift here is important in being able to converse with what&rsquo;s going on and the people who are making the money. It&rsquo;s super important because the downside or the negative outcome is the silence that can happen between a data science team and the rest of the organization. I worry about that at any company, in any data science team, and as I think about data science as a field.</p><p>I remember when places were getting rid of their statistics departments, when you would see an econometrics department and a psychometrics department, but there wouldn’t be a stats department. I remember trying to think of different names, being at society meetings and thinking, &ldquo;Should we call it analytics? Should we call it data science?&rdquo; We enjoy a time of popularity in the zeitgeist right now and I think it&rsquo;s important for us, as we&rsquo;re instantiating ourselves in this business, to do it in the best way. Communication is super important and to have that language back and forth about the technical things we create and the things that are meaningful, especially at places that aren&rsquo;t tech companies.</p><p><strong>Jacqueline</strong>: I think one thing that came up was this idea that people are silent when they should be talking. This is something I&rsquo;ve noticed leading a data science team. I have tried to teach my team about going and telling people stuff, even if it&rsquo;s kind of scary.</p><p>To another point that was made, if someone says something you don&rsquo;t agree with even if they&rsquo;re more senior than you, you have the right to push back on it. If you don&rsquo;t feel comfortable pushing back, that is the job of your manager or the data science leader. I think it&rsquo;s certainly the case that communication is key. 
But for the people who are not yet managers or directors, communication can be really difficult. I think a lot of our job is to figure out how to actually get people to do the communication that you know the team needs, but that is not necessarily obvious. I really want to stress that mentoring on communication is so much of the job of a data science leader, and I feel very passionate about that.</p><p><strong>Elaine</strong>: To emphasize that, I remember someone on my team who had done this great work and it had kind of gotten brushed aside because people didn&rsquo;t really understand how it fit. I told him, if you spend three times as much time continuing to communicate this until people understand as you did doing the work, it will be worth it. I don&rsquo;t think he was excited to hear that, but we need to shift our thinking to making sure that people understand the work that we do. It really doesn&rsquo;t matter how great the work is; the communication is just as legitimate a part of the job. People need to think of that as a valuable way to spend their time, especially when they&rsquo;re coming out of school and new to a team.</p><h2 id="lets-keep-talking">Let&rsquo;s keep talking</h2><p>Thank you so much to our panelists for their insights and for opening up the conversation on building effective data science teams. As mentioned above, we’d love to keep this discussion going and dive deeper into specific topics through future blogs, RStudio Community, and open meetup discussions.</p><h3 id="future-blogs">Future Blogs</h3><p><a href="https://www.rstudio.com/about/subscription-management/" target="_blank">Subscribe to future blog posts</a> on Building Effective Data Science Teams. 
In the next blog post on this topic, we&rsquo;ll share the attendee questions and answers that were asked live during the session:</p><ul><li><strong>What tips would you provide for organizations where data science is not fully established?</strong></li><li><strong>How do you evaluate data science candidates for roles?</strong></li><li><strong>How can I convince senior leadership about where data scientists should be in different parts of the business?</strong></li><li><strong>I&rsquo;m currently a data scientist and I am interested in moving into leadership. What do you think I should do?</strong></li></ul><h3 id="rstudio-community">RStudio Community</h3><p><a href="https://community.rstudio.com/c/industry/44" target="_blank">Continue the conversation</a> through RStudio Community and use the link for each individual webinar question to share your thoughts or ask follow-up questions on a specific topic.</p><h3 id="open-meetup-discussion">Open Meetup Discussion</h3><p><a href="https://www.meetup.com/RStudio-Enterprise-Community-Meetup" target="_blank">Join the RStudio Enterprise Community Meetup Group</a> for this open discussion on June 24th with John Thompson, Global Head of Advanced Analytics &amp; AI at CSL Behring.</p><p>We&rsquo;ll kick off this conversation with a few unanswered attendee questions from the <a href="https://www.rstudio.com/resources/webinars/building-effective-data-science-teams/" target="_blank">Building Effective Data Science Teams</a> webinar:</p><ul><li><strong>What tips do you have for training a data science team of varied skillsets?</strong></li><li><strong>Are engineering and DevOps tightly integrated with the data science team and in either case, how do you create a common vocabulary across these roles? 
How do you build partnerships with other teams, such as security?</strong></li><li><strong>How do you establish coding standards across your team and secondly how do you decide what tools are going to be used - whether that’s certain software, internal packages, visualization tools, etc.?</strong></li></ul></description></item><item><title>Announcing RStudio Workbench</title><link>https://www.rstudio.com/blog/announcing-rstudio-workbench/</link><pubDate>Wed, 02 Jun 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rstudio-workbench/</guid><description><p>We have renamed RStudio Server Pro to RStudio Workbench. This change reflects the product’s growing support for a wide range of different development environments. RStudio Workbench enables R and Python data scientists to use their preferred IDE in a secure, scalable, and collaborative environment&ndash;whether that is the RStudio IDE, JupyterLab, Jupyter Notebooks, or VS Code. We want RStudio Workbench to be the best single platform to support open source, code-first data science, whether your team is using R or Python.</p><p>If you’d like to learn more about the reasons behind this name change, and what it might mean for you, please check out our <a href="https://support.rstudio.com/hc/en-us/articles/1500012472761" target="_blank" rel="noopener noreferrer">FAQ here</a>, or set up a conversation with your customer success representative.</p><p>We have also released new versions of the RStudio open source IDE, as well as RStudio Server Open Source and RStudio Desktop, and will share all the details in an upcoming post. Check out the <a href="https://www.rstudio.com/products/rstudio/release-notes/">release notes</a> if you&rsquo;d like to know more now. <strong>This release includes support for the latest version of R (4.1). 
If you wish to use R 4.1 with the RStudio IDE (whether open source or professional), you must upgrade to this release.</strong></p><h2 id="whats-new-in-rstudio-workbench">What&rsquo;s new in RStudio Workbench</h2><ul><li>VS Code as a fully supported development environment (learn more <a href="https://blog.rstudio.com/2021/06/02/rstudio-workbench-vscode-sessions/" target="_blank" rel="noopener noreferrer">here</a>)</li><li>Multiple Python-based improvements</li><li>Additional R and RStudio-based improvements</li></ul><p>In addition, stay tuned as we bring you more details on this release and support for R 4.1.</p><p>You can download RStudio Workbench <a href="https://www.rstudio.com/products/rstudio/download/">here</a> and read the documentation <a href="https://docs.rstudio.com/ide/server-pro/" target="_blank" rel="noopener noreferrer">here</a>.</p><p>To receive email notifications for RStudio professional product releases, patches, security information, and general product support updates, subscribe to the Product Information list by visiting the <a href="https://www.rstudio.com/about/subscription-management/">RStudio subscription management portal</a>.</p></description></item><item><title>RStudio Workbench: VS Code Sessions</title><link>https://www.rstudio.com/blog/rstudio-workbench-vscode-sessions/</link><pubDate>Wed, 02 Jun 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-workbench-vscode-sessions/</guid><description><p>The RStudio Workbench Juliet Rose release introduces support for VS Code sessions. 
Previously released as a <a href="https://blog.rstudio.com/2020/11/16/rstudio-1-4-preview-server-pro/" target="_blank" rel="noopener noreferrer">preview feature in RStudio Server Pro v1.4</a>, VS Code sessions are now fully supported.</p><p>Once VS Code sessions have been configured, users can launch them simply by changing the ‘Editor’ field in the New Session dialog.</p><p>With the addition of support for VS Code sessions alongside standard RStudio sessions, Jupyter Lab sessions, and Jupyter Notebook sessions, RStudio Workbench now allows data scientists to more easily analyze their data, whatever their preferred language and editor. Note that RStudio Workbench includes the launcher support that is required in order to use non-standard editors. To learn more, feel free to read our <a href="https://support.rstudio.com/hc/en-us/articles/1500012472761" target="_blank" rel="noopener noreferrer">FAQ here</a> on the recent release of RStudio Workbench, formerly known as RStudio Server Pro.</p><h2 id="installing-vs-code">Installing VS Code</h2><p>To get started with VS Code sessions, RStudio Workbench administrators will need to install code-server, an open source project that serves VS Code sessions from a remote machine. RStudio Workbench includes a new command to help you easily and quickly install and configure VS Code sessions:</p><pre><code class="language-bash" data-lang="bash">sudo rstudio-server install-vs-code /opt/code-server</code></pre><p>For more details about installing and configuring VS Code, please see our <a href="https://docs.rstudio.com/ide/server-pro/vs-code-sessions.html" target="_blank" rel="noopener noreferrer">Admin Guide</a>.</p><h2 id="using-vs-code-sessions">Using VS Code Sessions</h2><p>Once VS Code is installed and configured, users can launch VS Code sessions easily from the RStudio Workbench Homepage. 
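</p><p>As a rough sketch of what the <code>install-vs-code</code> command sets up behind the scenes (the file name and keys below are assumptions based on typical RStudio Workbench configuration conventions, not taken from this post - the Admin Guide is the authoritative reference), a minimal VS Code sessions configuration file might look like this:</p><pre><code class="language-bash"># /etc/rstudio/vscode.conf (illustrative only; verify key names in the Admin Guide)
# Enable VS Code sessions in RStudio Workbench
enabled=1
# Path to the code-server binary installed under /opt/code-server
exe=/opt/code-server/bin/code-server</code></pre><p>After editing configuration like this, an administrator would restart the service for it to take effect. 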
Just like in the desktop version of VS Code, users will be able to edit, test, debug their code, and use all the other features they are already familiar with.</p><h2 id="the-rstudio-workbench-vs-code-extension">The RStudio Workbench VS Code Extension</h2><p>RStudio Workbench VS Code sessions will have the RStudio Workbench VS Code extension automatically installed. This extension gives users an easy way to get back to the RStudio Workbench Homepage from within their VS Code session.</p><img src="home-button.png" alt="RStudio Workbench Home Button" class="center"><p>The RStudio Workbench VS Code extension will also help users access their development web servers. When developing a web server, like Shiny, Dash, or Streamlit, with a remote VS Code session through RStudio Workbench, the application will also be running remotely - on the same machine as the VS Code session. Because of that, users won’t be able to access their web servers over 127.0.0.1, like they would if they were developing locally. The ‘Proxied Servers’ view in the RStudio Workbench container of VS Code shows a list of the user’s running web servers. Users can click on an item in that list to open a link that will let them access their web server running on the remote host.</p><img src="proxied-servers-view.png" alt="Proxied Servers View" class="center"><p>To learn more on the recent release of RStudio Workbench you can read our <a href="https://blog.rstudio.com/2021/06/02/announcing-rstudio-workbench/" target="_blank" rel="noopener noreferrer">post here</a>.</p></description></item><item><title>(Re)Introducing the "Solutions" website</title><link>https://www.rstudio.com/blog/re-introducing-the-solutions-website/</link><pubDate>Thu, 27 May 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/re-introducing-the-solutions-website/</guid><description><p>At RStudio, we pride ourselves on the quality of our documentation. 
Our documentation website, <a href="https://docs.rstudio.com" target="_blank">docs.rstudio.com</a>, provides detailed information on the installation and configuration of our products.</p><p>The documentation for each product is owned by the relevant product team, but we all contribute where necessary. One of the teams that spends the most time in the documentation is our Solutions Engineering team. It’s our SEs’ job to help our customers get the most from the products they’ve purchased. Oftentimes this help needs to go beyond configuration options and authentication mechanisms, and for that we have the solutions website, <a href="https://solutions.rstudio.com" target="_blank">solutions.rstudio.com</a>.</p><div class="figure"><img src="solutions-homepage.png" alt="" /><p class="caption"><em>The Solutions site home page</em></p></div><p>The solutions site is home to things like <a href="https://solutions.rstudio.com/sys-admin/architectures/" target="_blank">reference architecture patterns</a>, information on other ways to run our software, such as <a href="https://solutions.rstudio.com/sys-admin/docker/docker_deployment/" target="_blank">in Docker containers</a>, and great tips on <a href="https://solutions.rstudio.com/data-science-admin/model-management/" target="_blank">model management</a>, as well as lots more information besides!</p><div class="figure"><img src="solutions-reference-architecture.png" alt="" /><p class="caption"><em>One of the many reference architecture diagrams from the site</em></p></div><p>The Solutions site has been around for a long time and has already been through several design revisions. Recently, though, we decided to bring it into line with our main documentation site, both visually and functionally.</p><p>That new design went live a few weeks ago and we’re already hearing great reports from colleagues and customers alike about how much easier it is to use. 
This new version of the site has much-improved search functionality and a simplified layout that makes finding the things that interest you much easier.</p><p>In the background, the site is also easier to maintain, which lightens the burden for the Solutions Engineering team and makes publishing new articles and updates much more straightforward. We hope you like the new design and that you find the information useful.</p><div id="solutions-site-highlights" class="level2"><h2>Solutions site highlights</h2><p>Here are some highlights of the content on <a href="https://solutions.rstudio.com" target="_blank">solutions.rstudio.com</a>:</p><ul><li><a href="https://solutions.rstudio.com/sys-admin/architectures/" target="_blank">Reference architectures</a> - Reference architectures fully diagrammed and explained, showing each product in all of its main configurations.</li><li><a href="https://solutions.rstudio.com/python/" target="_blank">Using Python with RStudio</a> - A breakdown of all the different ways you can use our products with Python as well as R.</li><li><a href="https://solutions.rstudio.com/data-science-admin/deploy/apis/" target="_blank">Programmatic deployment to Connect</a> - Using RStudio Connect’s APIs to publish content.</li><li><a href="https://solutions.rstudio.com/r/rest-apis/plumber-slack/" target="_blank">Building a slackbot with R and plumber</a> - Integrate a plumber API with your organisation’s Slack instance to harness the power of R from directly within Slack.</li><li><a href="https://solutions.rstudio.com/r/rest-apis/clients/" target="_blank">Calling plumber APIs from other languages</a> - Example code in a variety of languages for interacting with a plumber API.</li><li><a href="https://solutions.rstudio.com/data-science-admin/scheduling/" target="_blank">Scheduling data science tasks</a> - Advice and tips on scheduling using both Linux cron and RStudio Connect.</li><li><a 
href="https://solutions.rstudio.com/python/dash/" target="_blank">Running Dash applications on Connect</a> - Sample Dash content running on RStudio Connect as well as links to example code.</li></ul></div><div id="and-finally" class="level2"><h2>And finally…</h2><p>For those of you who spend far too much time staring at your computer, we’ve introduced a “dark mode” via the little toggle on the nav bar at the top.</p><div class="figure"><img src="solutions-dark-mode.png" alt="" /><p class="caption"><em>Solutions site with dark mode enabled</em></p></div></div></description></item><item><title>Centralizing your Analytics Infrastructure with eoda and Covestro</title><link>https://www.rstudio.com/blog/centralizing-your-analytics-infrastructure-with-eoda-and-covestro/</link><pubDate>Tue, 25 May 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/centralizing-your-analytics-infrastructure-with-eoda-and-covestro/</guid><description><p><b>Guten Tag!</b> On June 16th, we’ll be teaming up with <a href="https://www.eoda.de/" target="_blank" rel="noopener noreferrer">eoda</a> and <a href="https://www.covestro.com/en" target="_blank" rel="noopener noreferrer">Covestro</a> to discuss—in German—how they put their data science into production by developing a unified infrastructure. In addition to learning about Covestro’s success, participants will also receive hands-on guidance on deploying data products in highly structured environments.</p><img src="robots.png" alt="Reproducible robots" class="center"><p style="text-align: right"><small><i>Photo by <a href="https://unsplash.com/@ekrull" target="_blank" rel="noopener noreferrer">Ekrull</a> on Unsplash</i></small></p><p>The primary problem for Covestro, a leading manufacturing company in Germany, was that they lacked a centralized development environment. Data scientists that used R and Python did complex analyses on their laptops and used a variety of tools to analyze and share results. 
The team decided that they wanted to deliver a greater impact at faster speeds, and so, with the help of eoda and RStudio, they created a centralized, reproducible analytics infrastructure. Although R and Python represent the core of their environment, many other techniques and tools were integrated, such as H2O for machine learning, scaling with Kubernetes, CI/CD with GitLab, and version control with SVN. Not only did this framework increase collaboration among Covestro&rsquo;s data science teams, but it also made compliance guidelines easier to fulfill. A centralized infrastructure, incorporating statistics, IT support, machine learning, and artificial intelligence, enables decentralized teams to solve problems and achieve long-term success.</p><p>The second half of this 2-hour German webinar will be a hands-on workshop that participants can follow along with. The instructors will discuss how to use RStudio products, including native git integration and multiple approaches to deploying with <a href="https://www.rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a>. This workshop will provide teams with an excellent collaborative workflow for building, maintaining, and scaling projects in a growing data science environment.</p><h2 id="what-you-will-learn">What you will learn</h2><ul><li><p>Insights into the pain points and solutions for the data science initiatives of Covestro. What is a centralized analytics system? What tools and techniques are needed?</p></li><li><p>How to identify existing building blocks in your infrastructure and successfully address them. What is infrastructure as code? What is the business value for your company?</p></li><li><p>Exclusive insights into RStudio products. How to use the native git integration of RStudio products? 
How does deployment work with RStudio Connect?</p></li></ul><p>Please note: This live webinar will be presented in German, and English subtitles will be provided approximately 3 weeks following the recording.</p><p>Register here: <a href="https://www.eoda.de/webinar-data-science-in-production/">https://www.eoda.de/webinar-data-science-in-production/</a></p></description></item><item><title>ASA DataFest 2021</title><link>https://www.rstudio.com/blog/asa-datafest-2021/</link><pubDate>Thu, 20 May 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/asa-datafest-2021/</guid><description><p>The <a href="https://ww2.amstat.org/education/datafest" target="_blank">American Statistical Association (ASA) DataFest</a> is a celebration of data in which teams of undergraduates work around the clock to find and share meaning in a large, rich, and complex data set. The teams that impress the judges win prizes, and the event is a great opportunity for them to gain some data analysis experience. As part of our <a href="https://www.rstudio.com/about/what-makes-rstudio-different/" target="_blank">mission</a> to support open source data science, RStudio was proud to help sponsor this event, and provide <a href="https://rstudio.cloud/" target="_blank">RStudio Cloud</a> as a platform to help enable collaboration within the teams.</p><p>The ASA DataFest 2021 season just wrapped up, and like all things this year, DataFest was virtual. Despite the challenges of organising a virtual community event, we had 30 sites from six countries host an event this year, many with participation from multiple universities. Over 2,500 students participated in DataFest over an eight-week period between March and May 2021. You can find out about the participating institutions <a href="https://ww2.amstat.org/education/datafest/participants.cfm" target="_blank">here</a>.</p><h2 id="the-challenge">The challenge</h2><p>For this year&rsquo;s challenge, ASA DataFest partnered with <a href="https://www.rmpds.org/"
target="_blank">The Rocky Mountain Poison &amp; Drug Safety (RMPDS)</a>. RMPDS is a leader in public health protection, serving the public since 1956 with innovative research in toxicity, progressive solutions in case management, emergency services, and regulatory compliance. RMPDS also runs a large survey on drug misuse, and data from this survey conducted in the United States, the United Kingdom, Canada, and Germany formed the basis of this year&rsquo;s DataFest challenge. Teams were tasked with discovering and identifying patterns of drug use, with particular attention paid to identifying misuse of prescription drugs. These could include patterns that might describe demographic profiles within a given category of drug or combinations of drugs that frequently appear together.</p><p>The dataset was challenging to work with for a variety of reasons. First, many undergraduate curricula don&rsquo;t address working with surveys with weights, so students had to choose between the following options:</p><ul><li>Incorporating the weights into their analysis and being able to make generalizable conclusions.</li><li>Omitting that feature of the data due to lack of experience with working with survey data and being very careful about the scope of their inference.</li></ul><p>The weighted samples also made country comparisons challenging without the use of proper survey analysis techniques, so many teams opted for analyzing data from a single country (often the country where they were located). On the upside, this allowed them to bring in what they might know about drug use in their countries as an outside data source and do a more localized analysis.</p><h2 id="running-a-virtual-datafest">Running a virtual DataFest</h2><p>In 2020, COVID-19 lockdown restrictions came just at the beginning of the DataFest season, and many organizers either ran a modified virtual version of the event or had to abandon it altogether, since pivoting to virtual in such a short timespan with little preparation
felt daunting. This year, after a full year of teaching virtually, faculty had a much better sense of what works and what doesn&rsquo;t when it comes to running virtual events.</p><p>Just about all sites used some form of virtual communication tool like Slack, MS Teams, or Discord. Many sites made use of Zoom, especially for the kickoff event and the awards ceremony, though the idea of keeping folks on Zoom for the full 48-hour duration of the event wasn&rsquo;t appetizing to anyone. At The University of Edinburgh (for our joint event with Heriot-Watt) we used GatherTown for co-working and communication throughout the event, which worked better than we could have hoped for, and the students loved it!</p><p>Of course, good tech alone doesn&rsquo;t build a virtual community. The key players in the success of DataFest each year (whether it&rsquo;s in person or online) are the volunteer mentors &ndash; postgraduate students, faculty, industry data professionals &ndash; who devote their weekend to checking in with the students, helping them get over hurdles, and acting as a sounding board for their ideas. Recreating effective mentoring interactions online is no small feat, but spatial tools like GatherTown, coupled with the willingness of mentors to check in with students regularly, helped us achieve that goal.</p><h2 id="rstudio-cloud-at-asa-datafest">RStudio Cloud at ASA DataFest</h2><p>DataFest was founded at UCLA in 2011, and since then, just about every year RStudio has provided sponsorship for various DataFest events, including sending over the much coveted hex stickers to sites!</p><p>This year, since the event was virtual, hex sticker drops didn&rsquo;t quite fit, but a new challenge was making sure all participants had access to the computing resources needed for the event, which is where RStudio Cloud came in!</p><p>We set up an ASA DataFest organization on RStudio Cloud, and within the organization, created workspaces for each host site that requested one. This meant the
organizers for the host site got admin access to the workspace and could use it however they wanted. Many organisers used this to also distribute the data &ndash; they placed the data in a base project in the Cloud workspace, which meant any projects created in the workspace came with the data as well.</p><p>Since ASA DataFest is tool/language agnostic, not all computing was done in RStudio Cloud, but students who chose to use R for their analysis appreciated the easy access!</p><h2 id="asa-datafest-2021-in-action">ASA DataFest 2021 in action</h2><p>Below are links to a few of the DataFest events from this year, where you can find out more about some of the individual events and watch recordings of student presentations.</p><ul><li><a href="https://westveld-statsci.com/new-blog/2021/5/14/asa-datafest-2021-the-australian-national-university" target="_blank">Australian National University</a> - Canberra, Australia.</li><li><a href="https://web.colby.edu/datascience/datafest-2021/" target="_blank">Colby College</a> with participation also from Bates College, Bowdoin College, and College of the Atlantic - Waterville, ME.</li><li><a href="https://www2.stat.duke.edu/datafest/winners.html" target="_blank">Duke University</a>, with participation also from Appalachian State University, North Carolina State University, North Carolina A&amp;T University, University of North Carolina Greensboro, and University of North Carolina Chapel Hill - Durham, NC.</li><li><a href="https://datamine.purdue.edu/datafest.html" target="_blank">Purdue University</a> - West Lafayette, IN.</li><li><a href="http://dslab.stat.metu.edu.tr/datafest/" target="_blank">Orta Doğu Teknik Üniversitesi (Middle East Technical University)</a> - Ankara, Turkey.</li><li><a href="http://datafest.stat.ucla.edu/competition/2021-asa-datafesttm-results/" target="_blank">UCLA</a>, with participation also from Pomona College, University of California Riverside, and University of Southern California - Los Angeles,
CA.</li><li><a href="https://datafest-edi.github.io/web/df2021/" target="_blank">University of Edinburgh</a>, with participation also from Heriot-Watt - Edinburgh, UK.</li><li><a href="https://datafest.stat.missouri.edu/datafest2021/result.html" target="_blank">University of Missouri</a> - Columbia, MO.</li><li><a href="https://datafestnd.weebly.com/datafest-2021.html" target="_blank">University of Notre Dame</a> - Notre Dame, IN.</li></ul><h2 id="interested-in-datafest">Interested in DataFest?</h2><p>If you&rsquo;re a faculty member interested in hosting your own DataFest or participating in a nearby event in 2022, see <a href="https://ww2.amstat.org/education/datafest/contact.cfm" target="_blank">here</a> for instructions for signing up for our mailing list.</p><p>If you&rsquo;re an undergraduate student interested in participating in DataFest, and your school has not held an event before, reach out to a faculty member with the above information. Student interest would be a great motivator for taking on the task of organising an event or, at a minimum, reaching out to an organiser at a nearby institution to join forces.</p><h2 id="interested-in-teaching-your-own-class-or-workshop-using-r">Interested in teaching your own class or workshop using R?</h2><p>RStudio Cloud is a lightweight, cloud-based solution that allows anyone to do, share, teach, and learn data science online. There&rsquo;s nothing to configure and no dedicated hardware, installation, or annual purchase contract required.</p><p>We offer a free plan for casual, individual use, and we offer paid premium plans for professionals, instructors, researchers, and organizations. Learn more at <a href="https://www.rstudio.com/products/cloud/">https://www.rstudio.com/products/cloud/</a>.</p></description></item><item><title>Managing COVID Vaccine Distribution, With a Little Help From
Shiny</title><link>https://www.rstudio.com/blog/managing-covid-vaccine-distribution-with-a-little-help-from-shiny/</link><pubDate>Tue, 18 May 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/managing-covid-vaccine-distribution-with-a-little-help-from-shiny/</guid><description><p><sup><em>Photo by <a href="https://unsplash.com/photos/jWPNYZdGz78" target="_blank">Steven Cornfield</a> on <a href="https://unsplash.com/" target="_blank">Unsplash</a></em></sup></p><script src="index_files/header-attrs/header-attrs.js"></script><style type="text/css">.vidcontainer {text-align:center;}.vidcapcontainer {width: 560px;margin: auto;}</style><p>In the United States, <a href="https://www.nytimes.com/interactive/2020/us/covid-19-vaccine-doses.html" target="_blank">approximately 2.2 million doses of COVID vaccines are being delivered each day</a>, and how these doses go from the manufacturer to a shot in someone’s arm varies by state, often with mixed results. But early in the vaccine distribution process, one state led the pack in terms of using the majority of vaccine doses it had been allotted. That state? <a href="https://www.npr.org/sections/coronavirus-live-updates/2021/02/22/968829227/west-virginias-vaccination-rate-ranks-among-highest-in-world" target="_blank">West Virginia</a>.</p><p>Behind West Virginia’s success has been <a href="https://business.wvu.edu/research-outreach/data-driven-wv" target="_blank">Data Driven West Virginia’s</a> creation of an inventory management system using <a href="https://shiny.rstudio.com/" target="_blank">Shiny</a>, an open source framework for building interactive web applications.
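</p><p>To make the idea concrete, here is a purely hypothetical sketch of a minimal Shiny app that displays inventory data (this is not Data Driven West Virginia’s actual code, and the data frame is invented for illustration):</p><pre><code>library(shiny)

# Hypothetical inventory data; the real system draws on live supply-chain sources
inventory &lt;- data.frame(hub = c("North", "South"), doses = c(975, 600))

ui &lt;- fluidPage(
  titlePanel("Vaccine inventory by hub"),
  tableOutput("inventory_table")
)

server &lt;- function(input, output) {
  output$inventory_table &lt;- renderTable(inventory)
}

shinyApp(ui, server)</code></pre><p>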
Using Shiny has provided visibility into each component of the vaccine supply chain, leading to the creation of distribution plans that are able to quickly and efficiently match supply with demand, getting vaccines to the right people in the right location at the right time.</p><div class="vidcontainer"><p><iframe width="560" height="315" align="middle" src="https://www.youtube.com/embed/CYilc-rEgjg" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></p></div><div class="vidcapcontainer"><p><em>Watch the story of how Data Driven West Virginia collaborated with the West Virginia Army National Guard to build a COVID vaccine inventory management system using Shiny.</em></p></div><div id="covid-vaccine-lifecycle-in-west-virginia" class="level1"><h2>COVID vaccine lifecycle in West Virginia</h2><p>To understand just how hard it is to get vaccines to the population, it helps to understand where it can go wrong. This starts with <a href="https://www.usatoday.com/in-depth/graphics/2020/12/21/how-covid-19-vaccines-will-be-shipped-and-distributed-using-cold-chain-technologies/3941343001/" target="_blank">how vaccines are packed into containers</a>. To fill up a container, Pfizer places 195 vials into a tray, and up to 5 trays into a single container. Moderna puts 10 vials into a small box, and then combines a minimum of 10 small boxes into a single container.</p><p>In most states Pfizer and Moderna ship directly to the organization that will be administering the vaccine to the population. This could be a hospital, a pharmacy, or any place where trained professionals will be putting shots into arms. But what happens when a pharmacy receives a full container from Pfizer, 975 vials, but only needs 600?</p><p>West Virginia has removed this complication by shipping directly to five hubs strategically located throughout the state. 
Within each of these hubs, containers of vaccine vials are broken down into smaller components and then either picked up or shipped directly to the hospital, pharmacy, or organization that will be administering the vaccine.</p><p>These hubs are managed by the <a href="https://www.weirtondailytimes.com/news/local-news/2020/12/w-va-rehearses-for-vaccine-rollout/" target="_blank">Joint Interagency Task Force</a> (JIATF), a team of teams composed of public, private, and governmental organizations as well as the National Guard. The Joint Interagency Task Force is responsible for drawing up a weekly distribution plan for each hub, in alignment with CDC allocations, and matching vaccine supply with demand.</p><div class="vidcontainer"><iframe width="560" height="315" src="https://www.youtube.com/embed/T2DzDs0ksZY" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></div><div class="vidcapcontainer"><p><em>Watch Katherine Kopp, Director of Data Driven West Virginia, walk through one of the Shiny apps built to manage vaccine distribution.</em></p></div><p>By using a statewide system managed by a central organization, there’s a level of agility and fluidity that allows each hub to adjust to a variety of changes in order to maximize the number of vaccines that are being administered to the population each week.</p></div><div id="benefits-of-open-source-software" class="level1"><h2>Benefits of open source software</h2><p>Data Driven West Virginia and the West Virginia Army National Guard have given the state of West Virginia an invaluable gift – the gift of time. 
By reducing the time to create a distribution report from days to an hour, the National Guard can deal with unexpected weather emergencies while still managing vaccine delivery.</p><p>By using Shiny, the team at Data Driven West Virginia was able to leverage their existing R skills while avoiding lengthy procurement processes and instead focus on helping the citizens of West Virginia. What’s more, using a code-first approach allowed the creation of Shiny apps that could be iterated upon, thereby meeting user needs along with updates and changes as the project developed.</p><p>The team has continued to build out a series of interconnected apps to further assist with vaccine delivery in West Virginia, because in addition to matching up vaccine supply with demand, the Joint Interagency Task Force is also responsible for getting Ancillary Supply Kits – all of the related supplies needed to deliver vaccines, such as syringes, alcohol wipes, gauze pads – out to each organization responsible for getting shots in arms.</p><p>You can check out our videos on this story, as well as learn more about Shiny, by visiting our YouTube channel: <a href="http://www.youtube.com/rstudiopbc" target="_blank">www.youtube.com/rstudiopbc</a>. Be sure to subscribe to keep up to date with new stories, product updates, and releases!</p></div></description></item><item><title>Code-First Data Science for the Enterprise</title><link>https://www.rstudio.com/blog/code-first-data-science-for-the-enterprise2/</link><pubDate>Wed, 12 May 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/code-first-data-science-for-the-enterprise2/</guid><description><p>As a data scientist, or as a leader of a data science team, you know the power and flexibility that open source data science delivers. 
However, if your team works within a typical enterprise, you compete for budget and executive mindshare with a wide variety of other analytic tools, including self-service BI and point-and-click data science tools. Navigating this landscape, and convincing others in your organization of the value of open source data science, can be difficult. In this blog post, we draw on <a href="https://www.rstudio.com/resources/why-your-enterprise-needs-code-first-data-science/"> our recent webinar on this topic</a> to give you some talking points to use with your colleagues when tackling this challenge.</p><p>However, it is important to keep in mind that “code-first” does not mean “code only.” While code is often the right choice, most organizations need multiple tools, to ensure you have the right tool for the task at hand.</p><h2 id="the-pitfalls-of-bi-tools-and-codeless-data-science">The Pitfalls of BI Tools and Codeless Data Science</h2><p>There are multiple ways to approach any given analytic problem. At their core, various data science and BI tools share many aspects. They all provide a way of drawing on data from multiple data sources, and to explore, visualize and understand that data in open-ended ways. 
Many tools support some way of creating applications and dashboards that can be shared with others to improve their decision-making.</p><p>Since these very different approaches can end up delivering applications and dashboards that may (at first glance) appear very similar, the strengths and nuances of the different approaches can be obscured to decision makers, especially to executive budget holders—which leads to the potential competition between the groups.</p><p>However, when taking a codeless approach, it can be difficult to achieve some critical analytic best practices, and to answer some very common and important questions:</p><ul><li><strong>Difficulty tracking changes and auditing work</strong>: When modifications and additions are obscured in a series of point-and-click steps, it can be very challenging to answer questions like:<ul><li>Why did we make this decision in our analysis?</li><li>How long has this error gone unnoticed?</li><li>Who made this change?</li></ul></li><li><strong>No single source of truth</strong>: Without a centralized way of sharing and storing analyses and reports, different versions and spreadsheets can proliferate, leading to questions like:<ul><li>Is this the most recent [data, report, dashboard]?</li><li>Is the file labeled <code>sales-data 2020-12 final FINAL Apr 21 NR (4).xlsx</code> <strong>really</strong> the most recent version of the analysis?</li><li>Where do I find the [data, report, dashboard] I am looking for? 
And who do I have to email to get the right link?</li></ul></li><li><strong>Difficult to extend and reproduce your work</strong>: When you are depending on a proprietary platform for your analysis, with the details hidden behind the point-and-click interface, you might face questions like:<ul><li>What did our model say 6 months ago?</li><li>Can I apply this analysis to this new (slightly different) data/problem?</li><li>Are we actually meeting the relevant regulatory requirements?</li><li>Is our work truly portable? Will others be able to reproduce and confirm our results?</li></ul></li></ul><p>At best, wrestling with questions like these will distract an analytics team, burning precious time that could be spent on new, valuable analyses. At worst, stakeholders end up with inconsistent or even incorrect answers because the analysis is wrong, not the correct version, or not reproducible. This can fundamentally undermine the credibility of the analytics team. Either way, the potential impact of the team for supporting decision makers is greatly reduced.</p><h2 id="the-benefits-of-code-first-data-science">The benefits of code-first data science</h2><p><a href="https://www.rstudio.com/about/">RStudio’s mission</a> is to create free and open-source software for data science, because we fundamentally believe that this enhances the production and consumption of knowledge, and facilitates collaboration and reproducible research.</p><p>At the core of this mission is a focus on a code-first approach. Data scientists grapple every day with novel, complex, often vaguely-defined problems with potential value to their organization. Before the solution can be automated, someone needs to figure out how to solve it. These sorts of problems are most easily approached with code.</p><p><strong>With Code, the answer is always yes!</strong></p><p>Code is:</p><ul><li><strong>Flexible</strong>: With code, there are no black box constraints. 
You can access and combine all your data, and analyze and present it exactly as you need to.</li><li><strong>Iterative</strong>: With code, you can quickly make changes and updates in response to feedback, and then share those updates with your stakeholders.</li><li><strong>Reusable and extensible</strong>: With a code-first approach, you can tackle similar problems in the future by applying your existing code, and extend that to novel problems as circumstances change. This makes code a fundamental source of Intellectual Property in your organization.</li><li><strong>Inspectable</strong>: With code, coupled with version control systems like git, you can track what has changed, when, by whom, and why. This helps you discover when errors might have been introduced, and audit the analytic approach.</li><li><strong>Reproducible</strong>: When combined with environment and package management (such as the capabilities provided by <a href="https://www.rstudio.com/products/team/">RStudio Team</a>), you can ensure that you will be able to rerun and verify your analyses.
And since your data science is open source at its core, you can be confident that others will be able to rerun and reproduce your analysis, without being reliant on expensive proprietary tools.</li></ul><table><thead><tr><th class="problem"> Codeless Problem </th><th class="solution"> Code-First Solution </th></tr></thead><tr><td><p>Difficulty tracking changes and auditing work</p></td><td><p>Code, coupled with version control systems like git, to track what changed, when, by whom, and why.</p><p>Code can be logged when run for auditing and monitoring.</p></td></tr><tr><td><p>No single source of truth</p></td><td><p>Centralized tools to create a single source of truth for data, dashboards, and models.</p><p>Version control to track multiple versions of code separately without creating conflicts.</p></td></tr><tr><td><p>Difficult to extend and reproduce work</p></td><td><p>Code enables reproducibility by explicitly recording every step taken.</p><p>Open-source code can be deployed on many platforms, and is not dependent on proprietary tools.</p><p>Code can be copied, pasted, and modified to address novel problems as circumstances change.</p></td></tr><tr><td><p>Black box constraints on how you analyze your data and present your insights</p></td><td><p>Access and combine all your data, and analyze and present it exactly as you need to, in the form of tailored dashboards and reports.</p><p>Pull in new methods and build on other open source work without waiting for proprietary features to be added by vendors.</p></td></tr></table><center><i>A summary of how a code-first approach helps tackle codeless challenges</i></center><h2 id="objections-to-code-first-data-science">Objections to Code-First Data Science</h2><p>When discussing the benefits of a code-first approach within your organization, you may hear some common objections:</p><ul><li><strong>“Coding is too hard!”</strong>: In truth, it’s never been easier to learn data science with R. 
RStudio is dedicated to the proposition that code-first data science is uniquely powerful, and that everyone can learn to code. We support this through <a href="https://education.rstudio.com/" target="_blank">our education efforts</a>, <a href="https://community.rstudio.com/" target="_blank">our Community site</a>, and making R easier to learn and use through our open source projects such as the <a href="https://www.tidyverse.org/" target="_blank">tidyverse</a>.</li><li><strong>“Does code-first mean only code?”</strong>: Absolutely not. It’s about choosing the right tool for the job, which is why RStudio focuses on the idea of <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/">Interoperability</a> with the other analytic frameworks in your organization, <a href="https://blog.rstudio.com/2021/01/13/one-home-for-r-and-python/">supporting Python alongside R</a>, and <a href="https://blog.rstudio.com/2021/03/18/bi-and-data-science-the-tradeoffs/">working closely with BI tools</a> to reach the widest possible range of users.</li><li><strong>“But R doesn’t provide the enterprise features and infrastructure we need!”</strong>: Not true. RStudio’s professional product suite, <a href="https://www.rstudio.com/products/team/">RStudio Team</a>, provides security, scalability, package management and centralized administration of development and deployment environments, delivering the enterprise features many organizations require. 
Our hosted offerings, <a href="https://rstudio.cloud/" target="_blank">RStudio Cloud</a> and <a href="https://www.shinyapps.io/" target="_blank">Shinyapps.io</a>, enable data scientists to develop and deploy data products on the cloud, without managing their own infrastructure.</li></ul><h2 id="to-learn-more">To Learn More</h2><p>If you’d like to learn more about the advantages of code-first data science, and see some real examples in action, watch the free, on-demand webinar <a href="https://www.rstudio.com/resources/why-your-enterprise-needs-code-first-data-science/">Why Your Enterprise Needs Code-First Data Science</a>. Or, you can <a href="http://rstd.io/r_and_python" target="_blank">set up a meeting directly with our Customer Success team</a>, to get your questions answered and learn how RStudio can help you get the most out of your data science.</p><p><a class="btn btn-primary" href="http://rstd.io/r_and_python" target="_blank">Schedule a Conversation </a><a class="btn btn-info" href="https://www.rstudio.com/resources/why-your-enterprise-needs-code-first-data-science/" target="_blank">Watch the full code-first webinar </a></p></description></item><item><title>Low Friction Package Management in Three Parts</title><link>https://www.rstudio.com/blog/pkg-mgmt-admins/</link><pubDate>Thu, 06 May 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pkg-mgmt-admins/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@sxoxm?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Sven Mieke</a> on <a href="https://unsplash.com/s/photos/plan?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup></p><p><em>This is the final post in a short series of blogs on package management.</em></p><p><em>The <a href="https://blog.rstudio.com/2021/02/05/pkg-mgmt-prime-directive/" target = "_blank">first post</a> explained the role of repositories and libraries.</em></p><p><em>The <a 
href="https://blog.rstudio.com/2021/02/11/pkg-mgmt-pain/" target = "_blank">second post</a> explored how package management pain shows up in different organizations.</em></p><p>As a Solutions Engineer at RStudio, I spend a lot of time helping data science teams figure out their package management needs.</p><p>I often meet with IT/Admins frustrated with trying to provide data scientists with the packages they need while also maintaining stability and security. I also speak with data scientists discouraged and annoyed at how hard it is to get the open source R and Python packages they need.</p><p>The resulting cat-and-mouse games often end in <em>creative</em> detentes &ndash; idiosyncratic package management strategies that <em>kinda</em> work for everyone involved. On the other hand, organizations with secure, low-friction package management strategies seem to follow just a few patterns.</p><blockquote><p>&ldquo;Happy <del>families</del> package management processes are all alike; every unhappy <del>family</del> package management process is unhappy in its own way.&rdquo;</p><p>-Leo Tolstoy, Anna Karenina (sorta)</p></blockquote><p>In this post, I&rsquo;ll share the common components I see at organizations where IT/Admins and data scientists both contribute to a package environment that is secure, reproducible, and easy to use.</p><h2 id="divvying-up-responsibility">Divvying Up Responsibility</h2><p>While the details of package management differ widely from one organization to another, organizations with secure, low-friction package management processes usually exhibit a <strong>three-part framework</strong>, with clear ownership of each part.</p><p><img src="./pkg_mgmt_overview.svg" alt="Diagram of Package Management Flow Described Below"></p><p>One way this pattern can go awry is that admins, trying to be helpful, decide to take control of the package libraries themselves.
We <a href="https://blog.rstudio.com/2021/02/05/pkg-mgmt-prime-directive/" target="_blank">previously explored</a> why admins controlling <em>repositories</em> and data scientists controlling <em>libraries</em> tends to be a much lower-friction way to manage package environments.</p><h3 id="part-i-add-packages-to-repositories">Part I: Add packages to repositories</h3><p>In most organizations with good package management processes, admins decide whether a private package repository is needed and, if so, what packages are in the organization&rsquo;s shared package repositories.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> Many organizations decide that a public package repository like CRAN, PyPI, or <a href="https://packagemanager.rstudio.com" target="_blank">public RStudio Package Manager</a> is sufficient.</p><p>In other organizations, there may not be open access to the internet, packages might need to be validated before they can be used, or there might be heavy usage of internally-developed packages. In these cases, organizations configure an internal CRAN or PyPI mirror. <a href="https://www.rstudio.com/products/package-manager/" target="_blank">RStudio Package Manager</a> is RStudio&rsquo;s professional product for this purpose.</p><p>Data scientists and admins trying to choose the right configuration for their organization might want to consider the pain points explored in <a href="https://blog.rstudio.com/2021/02/11/pkg-mgmt-pain/" target="_blank">the previous post in this series</a> as well as the decision tree on the <a href="https://solutions.rstudio.com/data-science-admin/packages/" target="_blank">RStudio solutions site</a>.</p><h3 id="part-ii-set-defaults-so-things-just-work">Part II: Set defaults so things &ldquo;just work&rdquo;</h3><p>Once security concerns are satisfied, admins spend a lot of time making sure that data scientists can get to work as soon as they enter their data science environment. 
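</p><p>For example, an admin might point every R session at the organization&rsquo;s chosen repository in a site-wide startup file. This is only a sketch; the URL below is the public RStudio Package Manager CRAN binary mirror for Ubuntu 20.04 and would be replaced with your own repository:</p><pre><code># In R_HOME/etc/Rprofile.site, which runs at the start of every R session
options(repos = c(CRAN = "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"))</code></pre><p>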
Admins want to ensure data scientists have all the packages they need.</p><p>It often works well for admins to set default settings for users so package installs just work. Admins generally set appropriate default repositories and install required system libraries. Some admins additionally choose to install a &ldquo;starter&rdquo; package set for all users.</p><p>More details on how to do all those things are on the <a href="https://solutions.rstudio.com/data-science-admin/packages/#2-set-rstudio-server-pro-defaults" target="_blank">RStudio Solutions site</a>.</p><p>Many organizations choose to centralize all of their data scientists on <a href="https://www.rstudio.com/products/rstudio-server-pro/" target="_blank">RStudio Server Pro</a> to simplify the administration.</p><h3 id="part-iii-use-and-capture-reproducible-project-environments">Part III: Use and capture reproducible project environments</h3><p>The last step of the process is data scientists doing their work! If admins have successfully configured a repository and package defaults, this should be an extremely low-friction process for data scientists, even if they&rsquo;re inside an air-gapped or validated environment.</p><p><img src="./pkg_installs.svg" alt="Installs go from repositories to libraries"></p><p>In the best case, data scientists use <a href="https://solutions.rstudio.com/data-science-admin/packages/#3-manage-libraries" target="_blank">project-level isolation</a> of packages using tools like renv and virtualenv to ensure project package libraries are isolated, reproducible, and shareable.</p><h2 id="great-process-leads-to-great-outcomes">Great Process Leads to Great Outcomes</h2><p>A three-part package management plan allows admins to be confident that their network is secure and that data scientists aren&rsquo;t blocked trying to acquire the packages they need. 
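In practice, the project-level workflow from Part III might look like this with renv (a minimal sketch, assuming the renv package is installed and dplyr is just a stand-in for any package the project needs):</p><pre class="r"><code># Inside the project directory: create an isolated project library
renv::init()

# Install packages as usual; they land in the project library
install.packages(&quot;dplyr&quot;)

# Record the exact package versions in renv.lock
renv::snapshot()

# Later, a collaborator rebuilds the same library from the lockfile
renv::restore()</code></pre><p>The resulting <code>renv.lock</code> file travels with the project, so the same package environment can be restored anywhere.</p><p>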
Data scientists are also able to access and use the packages they need to do their work.</p><p>Within the three-part structure, organizations&rsquo; package needs are as varied as the organizations themselves, and <a href="https://blog.rstudio.com/2021/02/11/pkg-mgmt-pain/" target="_blank">an earlier blog post</a> explored why teams make different choices within this framework.</p><p>If you think your organization could benefit from more information on package management, contact our <a href="https://rstudio.chilipiper.com/book/rst-demo" target="_blank">sales team</a> to learn more about how RStudio Package Manager and RStudio Server Pro work together to make it easy for admins to create safe, low-friction environments for data scientists to be productive.</p><blockquote><p><em>For more on this topic, please see the recording of our free webinar on <a href="https://www.rstudio.com/resources/webinars/managing-packages-for-open-source-data-science/" target="_blank">Managing Packages for Open-Source Data Science</a>.</em></p></blockquote><section class="footnotes" role="doc-endnotes"><hr><ol><li id="fn:1" role="doc-endnote"><p>Sometimes this package admin is a member of the IT or DevOps organization, and sometimes they&rsquo;re a data scientist. <a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p></li></ol></section></description></item><item><title>RStudio and APIs</title><link>https://www.rstudio.com/blog/rstudio-and-apis/</link><pubDate>Tue, 04 May 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-and-apis/</guid><description><sup>Photo by <a href="https://unsplash.com/@samthewam24?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Samuel Sianipar</a> on <a href="https://unsplash.com/s/photos/pipes?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></sup><p>Data Scientists and analysts work to constantly deliver valuable insights from data. 
In many cases, these individuals practice a <a href="https://www.rstudio.com/resources/why-your-enterprise-needs-code-first-data-science/">Code First approach</a>, using a programming language like R or Python to explore and understand data. Once an analysis reaches a conclusion, it is important to carefully consider what happens next. Perhaps the analysis resulted in a complex machine learning model that can generate valuable predictions on new data. Or perhaps it resulted in some new business logic that can be implemented to improve efficiency. In any case, ensuring the longevity of analysis outcomes increases business value long after the original analysis concludes.</p><h2 id="increasing-the-impact-of-an-analysis">Increasing the impact of an analysis</h2><p>There are common, standard methods for distributing results and increasing the impact of a given analysis. Data scientists may choose one or more of several options:</p><ul><li><strong>Prepare and present a presentation</strong> to business stakeholders.</li><li><strong>Create a reproducible report</strong> that is widely shared and distributed.</li><li><strong>Develop and share an interactive dashboard or application</strong> to provide others with self-service access to the analysis results and findings at their convenience.</li><li>Or, as we’ll discuss in this post, <strong>share the analysis as an API</strong> that allows real time interactivity with other technologies.</li></ul><p>In each case, the goal is to increase the potential impact of the analysis. The greater the reproducibility, interactivity, and reach of the analysis, the greater the potential impact.</p><p><img src="analysis-impact.png" alt="Analysis Impact"></p><p>Let’s consider an example. Sofia works as a business analyst for a large SaaS company. She was recently assigned a project to analyze customer usage of the company platform to better identify customers at risk of churning. 
After careful analysis, Sofia determines the top 5 contributing factors to customer churn. At this point, she has learned something of value, and she can share those insights with business leaders via a presentation or email to help inform future decisions. However, this outcome has limited impact. What if Sofia created a reproducible report or a dashboard using Shiny or Dash to better understand the existing customer base and their risk factors? Now the analysis impact has dramatically increased.</p><p>As a result of Sofia’s analysis, the company wants to generate real time predictions for each customer. This will allow each customer interaction to be informed by the customer’s current level of risk. In order to achieve this, the work Sofia has done <strong>needs to be responsive and accessible in real time, from a variety of other tools and technologies.</strong> This is when creating an API can prove particularly useful.</p><h2 id="apis-provide-real-time-analysis-outcomes">APIs provide real time analysis outcomes</h2><p>APIs are, in their most basic form, a standardized way for computers to communicate with one another. Just as human communication is improved by a shared baseline or common language, APIs allow different digital platforms to communicate with one another. In the case of data analytics, APIs can empower real time interaction with statistical models and analysis outcomes. This enables other developers either inside or outside of an organization to integrate directly with and build upon work that’s already been done without the need for costly re-implementation.</p><p><img src="api-diagram.png" alt="API Diagram"></p><p>The impact of a given analysis can often be measured by how accessible its results are. Slide presentations, reports, and interactive applications all increase analysis impact by distributing results to a wider audience. APIs can further increase impact by allowing other tools within the organization to quickly make use of analysis results. 
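As a minimal sketch, an R analysis can be exposed as an API with the Plumber package &ndash; the endpoint and scoring logic below are hypothetical stand-ins for a real trained model:</p><pre class="r"><code># plumber.R -- hypothetical churn-risk endpoint
#* Return a churn-risk score for one customer
#* @param usage_hours Hours of platform usage last month
#* @get /churn-risk
function(usage_hours) {
  # In practice this would call a trained model, e.g. predict(model, newdata)
  x &lt;- as.numeric(usage_hours)
  list(churn_risk = 1 / (1 + exp(0.1 * x - 2)))
}</code></pre><p>Running <code>plumber::plumb(&quot;plumber.R&quot;)$run(port = 8000)</code> serves the endpoint over HTTP, so other tools in the organization can request a score in real time.</p><p>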
Unfortunately, many organizations stop short of creating these APIs and don’t realize the full impact potential of a given analysis.</p><p>This isn’t to say that every analysis should result in an API. When an analysis is expected to be short-lived or exists only as a proof of concept, an API may be unnecessary. However, when other tools want to build upon the work done in an existing analysis, an API becomes a useful tool that enables quick integration.</p><p><img src="rstudio-connect-diagram.png" alt="RStudio Connect Diagram"></p><h2 id="to-learn-more">To learn more</h2><p>RStudio Connect enables enterprise data science teams to quickly deliver analysis results to a wide variety of business stakeholders. In addition to supporting interactive applications and static reports, RStudio Connect can also be used to deploy and manage R and Python APIs, using the Plumber and Flask frameworks. To learn more:</p><ul><li>Visit the <a href="https://www.rstudio.com/products/connect/">Connect product page</a>, or <a href="http://rstd.io/r_and_python">set up a meeting</a> with our Customer Success team.</li><li>Watch the webinar, <a href="https://www.rstudio.com/resources/webinars/expanding-r-horizons-integrating-r-with-plumber-apis/">“Expanding R Horizons: Integrating R with Plumber APIs”</a></li><li>Check out these examples of <a href="https://solutions.rstudio.com/r/rest-apis/">Plumber</a> and <a href="https://solutions.rstudio.com/python/flask/">Flask</a> APIs.</li></ul></description></item><item><title>What's New on RStudio Cloud - May 2021</title><link>https://www.rstudio.com/blog/rstudio-cloud2/</link><pubDate>Mon, 03 May 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-cloud2/</guid><description><style type="text/css">div.howTo { display: inline-block; margin-top: 30px; padding: 3px 12px; font-size: 10px; letter-spacing: 1px; background-color: #e3eef8; border-radius: 3px; text-transform: uppercase; }p.howTo { margin-top: 2px;padding-top: 0px; 
}</style><p>Here are the new features and improvements we’ve rolled out on RStudio Cloud since our last post in <a href="https://www.rstudio.com/blog/rstudio-cloud1/">February, 2021</a>. Note that you can always see the latest significant new features as they are released on Cloud’s <a href="https://rstudio.cloud/learn/whats-new" target="_blank" rel="noopener noreferrer">What’s New</a> page.</p><h2 id="whats-new">What’s New</h2><ul><li>Upgraded Operating System</li><li>Export Project</li><li>Updated User Panel</li><li>Account Notifications</li><li>Additional Usage Data Access</li><li>Teaching with Cloud Guide</li><li>Compare Plans page</li></ul><h2 id="upgraded-operating-system">Upgraded Operating System</h2><p>Each RStudio Cloud project is deployed into its own container - we have upgraded the operating system in these containers to Ubuntu 20.04 (Focal) for all new projects. This will ensure that you can use the latest versions of R and Python packages with confidence that the underlying operating system has the features they require.</p><p>For existing projects, you can choose to upgrade to the updated operating system as well. At the end of May 2021, we will automatically upgrade any project still using the older operating system (Ubuntu 16.04) the next time it is opened.</p><div class="howTo">How To</div><p class="howTo">To change a project’s OS version, press the <img src="settings-button.png" width="30" class="my-0"> button to open the project settings pane, then click on the System tab.</p><div class="my-5"><img src="system-settings.png" width="400"></div><h2 id="export-project">Export Project</h2><p>You can export the contents of a project as a .zip file directly from any projects listing. You don’t need to open the project first in order to create and download the .zip file. 
This feature lets you easily work on a project locally if you have the RStudio IDE installed on your computer.</p><div class="howTo">How To</div><p class="howTo">Click the Export action next to the project that you wish to download. After a short while you will be provided with a link to download the project.</p><div class=""><img src="project-export.png"></div><h2 id="updated-user-panel">Updated User Panel</h2><p>The user panel has a new look, and now includes some at-a-glance information about your account.</p><div class="howTo">How To</div><p class="howTo">Click on your icon/name on the right side of the header to open the user panel.</p><div class=""><img src="user-panel.png" width="500"></div><h2 id="account-notifications">Account Notifications</h2><p>You will see a notification indicator on your icon/name in the header when we have important information to tell you about your account.</p><div class=""><img src="notification-indicator.png" width="150"></div><div class="howTo">How To</div><p class="howTo">Click your icon/name to view the full message in the user panel.</p><p>We will also send notification emails when you reach your project hours limit (Cloud Free plan only) or your included hours for the month (Cloud Plus, Premium or Instructor plans).</p><h2 id="additional-usage-data-access">Additional Usage Data Access</h2><p>In addition to viewing usage data by calendar month, you can now view usage data by account usage period. 
This option is available for all accounts/spaces you own.</p><p>You can now also see your usage data for Your Workspace.</p><div class="howTo">How To</div><p class="howTo">Navigate to Your Workspace and press the <span ><img src="usage-button.png" class="my-0" width="30"></span> in the header.</p><h2 id="teaching-with-cloud-guide">Teaching with Cloud Guide</h2><p>For those of you using Cloud to help with your teaching, we’ve added a <a href="https://rstudio.cloud/learn/guide#course-spaces" target="_blank" rel="noopener noreferrer">Teaching with Cloud</a> section to the <a href="https://rstudio.cloud/learn/guide" target="_blank" rel="noopener noreferrer">Guide</a> to help you get started. It lays out the typical steps to take to use Cloud with your students - and covers questions that are frequently asked by educators.</p><h2 id="compare-plans-page">Compare Plans Page</h2><p>To make it easier to understand how our plans measure up against one another, we’ve added a <a href="https://rstudio.cloud/plans/compare" target="_blank" rel="noopener noreferrer">Compare Plans</a> page that lists all our plans, their features and pricing options, in a single table.</p><h2 id="whats-next">What’s Next</h2><p>We don’t like to pre-announce features before they’re available, but the team is busy both improving our underlying systems and developing new features. 
If there is something you’d love to see improved or added to Cloud, please let us know in the <a href="https://community.rstudio.com/c/rstudio-cloud" target="_blank" rel="noopener noreferrer">RStudio Cloud section</a> of the RStudio Community site.</p><p>If you are new to RStudio Cloud and would like to learn more about the platform and various plans available, check out the <a href="https://www.rstudio.com/products/cloud/">RStudio Cloud product page</a>.</p><p>Thanks!</p></description></item><item><title>Using RStudio to Amplify Digital Marketing Results</title><link>https://www.rstudio.com/blog/using-rstudio-to-amplify-digital-marketing-results/</link><pubDate>Mon, 26 Apr 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/using-rstudio-to-amplify-digital-marketing-results/</guid><description><p><sup><i>Photo by <a href="https://unsplash.com/@mjessier" target="_blank" rel="noopener noreferrer">Myriam Jessier</a> on Unsplash</i></sup></p><p>We recently teamed up with <a href="https://www.extendo.company/en/" target="_blank" rel="noopener noreferrer">Extendo</a> and <a href="https://www.ixpantia.com/en/" target="_blank" rel="noopener noreferrer">ixpantia</a> to learn how they use RStudio to help streamline their digital analytics offerings for more credible and durable marketing insights.</p><p>Extendo, known as <a href="https://www.miweb.digital/" target="_blank" rel="noopener noreferrer">MiWeb</a> in the US, is a digital marketing firm that specializes in offering marketing solutions and strategies. They work with major end user behavior platforms such as Google, Facebook, and Amazon to make sense of the massive amount of data organizations can get from these platforms. 
ixpantia is an incredibly talented RStudio Full Service Partner in Costa Rica that helps organizations deliver data-driven products and services and develop the infrastructure to support it.</p><p>Listen in Spanish, or read in English or Spanish, how, with the help of ixpantia, Extendo was able to deliver beyond traditional descriptive analysis of marketing platform data, and offer predictive insights that were based on serious data science.</p><blockquote><p>&ldquo;Rather than creating and connecting each piece of the architecture, RStudio Connect standardizes this process and allows our team to focus on designing solutions rather than managing the architecture.&rdquo;</p><p>&mdash; Paul Fervoy, Co-Founder &amp; VP Business Development</p></blockquote><p>One of the biggest challenges in the world of digital marketing is unifying the plethora of data sources in a secure and robust way. With GDPR further governing how organizations are able to interact with user behavior data, many marketing firms are limited to only providing descriptive analytics based on Google Analytics, Amazon, and Facebook data. Extendo wanted to offer more predictive and reproducible insights, and they realized that they would need a secure and centralized data infrastructure to support that.</p><p>The Extendo team decided to partner with ixpantia to develop a data science workflow centered around RStudio Connect. This code-based infrastructure allowed their data science team to collaborate more efficiently, spend less time troubleshooting IT problems and DevOps hurdles, and ultimately focus on what really matters: delivering credible business insights.</p><img src="diagram.png" alt="Extendo's data science infrastructure" class="center"><p style="text-align: right"><small><i>Extendo's data science architecture diagram</i></small></p><p>This discussion is the first customer story that we’ve supported that is not in English, and we certainly hope it isn’t the last. 
We understand that RStudio is being used by folks all over the world, across all industries, and one of our goals as a company is to create products and resources that everyone can use, regardless of their means, or preferred language.</p><p>You can read more about their story <a href="https://www.rstudio.com/about/customer-stories/extendo-en/">here</a> in English or Spanish, or listen in on the full conversation in Spanish <a href="https://www.rstudio.com/resources/webinars/como-ixpantia-ayudo-a-extendo-a-poner-r-en-produccion/">here</a>.</p><p>We love hearing about how you are successful with RStudio products, so if you’d like to share your story with us, please <a href="https://www.rstudio.com/about/contact/">let us know</a>.</p></description></item><item><title>New in knitr: Improved Accessibility with Image Alt Text</title><link>https://www.rstudio.com/blog/knitr-fig-alt/</link><pubDate>Tue, 20 Apr 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/knitr-fig-alt/</guid><description><script src="https://www.rstudio.com/blog/knitr-fig-alt/index_files/header-attrs/header-attrs.js"></script><p>We are happy to share that <strong>knitr</strong> (<a href="https://yihui.org/knitr/" class="uri">https://yihui.org/knitr/</a>) version 1.32 is now on CRAN. 
<strong>knitr</strong> is a package that executes all code embedded within an <code>.Rmd</code> file, and prepares the code output to be displayed within the R Markdown output document.</p><table><thead><tr class="header"><th align="center">Latest release</th></tr></thead><tbody><tr class="odd"><td align="center"><img src="https://img.shields.io/badge/CRAN-1.32-brightgreen" alt="Latest knitr release 1.32 CRAN badge" /></td></tr></tbody></table><p>You can install the latest version from CRAN:</p><pre class="r"><code>install.packages(&quot;knitr&quot;)</code></pre><p>The latest version of the package includes an important new chunk option to add <a href="https://www.w3schools.com/tags/att_img_alt.asp">alternative text</a> to figures produced in code chunks. This improves the accessibility of your knitted HTML outputs, and in the rest of this post, we wanted to show users how to effectively use this new code chunk option.</p><p>First of all, what is alt text? Here is the definition of alt text from <a href="https://webaim.org/techniques/alttext/">Webaim</a>:</p><blockquote><p>It is read by screen readers in place of images allowing the content and function of the image to be accessible to those with visual or certain cognitive disabilities.</p><p>It is displayed in place of the image in browsers if the image file is not loaded or when the user has chosen not to view images.</p><p>It provides a semantic meaning and description to images which can be read by search engines or be used to later determine the content of the image from page context alone.</p></blockquote><p>Here is an example from the <a href="https://www.a11yproject.com/">a11y project website</a>:</p><p><img src="a11y.png" title="Screenshot of accessibility details for an image on a11yproject.com." alt="Screenshot of accessibility details for an image on a11yproject.com." 
width="90%" style="display: block; margin: auto;" /></p><p>You can see that the alt text for the image highlighted in purple says:</p><blockquote><p>“Two stacked copies of the book, Accessibility for Everyone.”</p></blockquote><p>This works well, but what about adding alt text for figures you produce with code? Previously, with code chunks that produced figures, knitr used the figure caption to create the alt text, and there was no way to create a caption and alt text for figures separately. This <a href="https://github.com/rstudio/rmarkdown/issues/1867">feature</a> was originally requested by <a href="https://mdogucu.ics.uci.edu/">Dr. Mine Dogucu</a>.</p><p>Why would you want to provide different caption and alt text for figures? As <a href="https://github.com/rstudio/rmarkdown/issues/1867#issuecomment-716200288">JooYoung Seo pointed out</a>, figure captions are used for relatively concise figure titles, whereas image alt text is intended to deliver more descriptive text-based information for assistive technologies like screen readers. Moreover, a screen reader will read both the caption and alt text, so using the same text for both can be frustrating to the user<a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a>.</p><div id="how-to-add-alt-text" class="section level2"><h2>How to add alt text</h2><p>You can now set the alt text using the new knitr code chunk option <code>fig.alt</code> for HTML-based R Markdown output (we explain below what happens with <a href="#limitations">other output formats</a>). 
We’ll use data from the <a href="https://github.com/allisonhorst/palmerpenguins">palmerpenguins package</a> to illustrate usage with a <a href="https://ggplot2.tidyverse.org/">ggplot2</a> plot.</p><pre class="r"><code># install packages to run locally
# install.packages(&quot;palmerpenguins&quot;)
# install.packages(&quot;ggplot2&quot;)
library(palmerpenguins)
library(ggplot2)</code></pre><p>Here is a scatterplot to start:</p><pre><code>```{r, fig.alt = &quot;Scatterplot of flipper length by bill length of 3 penguin species, where we show penguins with bigger flippers have bigger bills.&quot;}
ggplot(data = penguins,
       aes(x = flipper_length_mm,
           y = bill_length_mm,
           color = species)) +
  geom_point(aes(shape = species), alpha = 0.8) +
  scale_color_manual(values = c(&quot;darkorange&quot;,&quot;purple&quot;,&quot;cyan4&quot;))
```</code></pre><p><img src="figures/penguins-1.png" title="Scatterplot of flipper length by bill length of 3 penguin species, where we show penguins with bigger flippers have bigger bills." alt="Scatterplot of flipper length by bill length of 3 penguin species, where we show penguins with bigger flippers have bigger bills." width="672" /></p><p>Here is a visual check of the alt text that a screen reader could access, using “Inspect Accessibility Properties” in Firefox:</p><div class="figure"><img src="accessibility-props.png" alt="" /><p class="caption">Image: Using “inspect accessibility properties” in browser on a knitr-produced plot</p></div><p>You may use your browser inspector to check that the alt text is set properly. 
Browsers have specific Accessibility inspectors, for example:</p><ul><li><p>Firefox: <a href="https://developer.mozilla.org/en-US/docs/Tools/Accessibility_inspector" title="Firefox&#39;s Accessibility Inspector website">Accessibility Inspector</a></p></li><li><p>Chrome: <a href="https://developer.chrome.com/docs/devtools/accessibility/reference/" title="Chrome&#39;s Accessibility features reference">Accessibility features reference</a></p></li></ul><p>This chunk option can take either a single string or a vector of strings as input as well, if a code chunk produces more than one plot. For example:</p><pre><code>```{r, fig.alt = c(&quot;Informative alt text for plot 1&quot;, &quot;Informative alt text for plot 2&quot;)}
plot1
plot2
```</code></pre></div><div id="combining-figure-captions-and-alt-text" class="section level2"><h2>Combining figure captions and alt text</h2><p>By default, if you do not provide the <code>fig.alt</code> chunk option, the text in the figure caption provided by the <code>fig.cap</code> chunk option will be used as the alt text. You do not <em>have</em> to use <code>fig.cap</code> to use <code>fig.alt</code> &ndash; you may use each chunk option in isolation, but they will also work together.</p><pre><code>```{r fig.cap=&quot;Bigger flippers, bigger bills&quot;, fig.alt = &quot;Scatterplot of flipper length by bill length of 3 penguin species, where we show penguins with bigger flippers have bigger bills.&quot;}
ggplot(data = penguins,
       aes(x = flipper_length_mm,
           y = bill_length_mm,
           color = species)) +
  geom_point(aes(shape = species), alpha = 0.8) +
  scale_color_manual(values = c(&quot;darkorange&quot;,&quot;purple&quot;,&quot;cyan4&quot;))
```</code></pre><div class="figure"><span style="display:block;" id="fig:penguins"></span><img src="figures/penguins-1.png" alt="Scatterplot of flipper length by bill length of 3 penguin species, where we show penguins with bigger flippers have bigger bills." 
width="672" /><p class="caption">Figure 1: Bigger flippers, bigger bills</p></div></div><div id="reusing-alt-text-across-chunks" class="section level2"><h2>Reusing alt text across chunks</h2><p>Since <code>fig.alt</code> was introduced in an earlier knitr release (v1.31), it gave us the opportunity to get and respond to feedback from early adopters. One <a href="https://github.com/yihui/knitr/issues/1959">feature requested by Dr. Mine Çetinkaya-Rundel</a> was to make it possible to reuse alt text across code chunks. Many knitr users reuse code chunks using <code>ref.label</code> as a <a href="https://bookdown.org/yihui/rmarkdown-cookbook/reuse-chunks.html">chunk option</a>, and it would be nice for <code>fig.alt</code> (and other chunk options) to “come along for the ride” with the code.</p><p>Let’s look at an example. Do you remember the penguins plot above? We can add a name to the code chunk so that we can reuse it later:</p><pre><code>```{r penguins, fig.alt = &quot;Scatterplot of flipper length by bill length of 3 penguin species, where we show penguins with bigger flippers have bigger bills.&quot;, fig.cap = &quot;Bigger flippers, bigger bills&quot;}
# plotting code here
```</code></pre><p>Elsewhere in your document, if you’d like to show the same plot, you can use <code>ref.label=&#39;chunk_label&#39;</code> as a chunk option with an empty chunk. With <strong>knitr</strong> 1.32, if you’d like to show the same plot <em>with the same chunk options</em>, you can combine <code>ref.label=&#39;chunk_label&#39;</code> and <code>opts.label = TRUE</code> to carry over the chunk options when reusing the chunk. It does not matter if the code chunks referenced are before or after the code chunk that uses <code>ref.label</code>. 
An early code chunk can reference a chunk later in the same document.</p><p>For example, this empty chunk:</p><pre><code>```{r ref.label = &#39;penguins&#39;, opts.label = TRUE}
```</code></pre><p>Produces this plot:</p><p><img src="figures/penguins-1.png" title="Scatterplot of flipper length by bill length of 3 penguin species, where we show penguins with bigger flippers have bigger bills." alt="Scatterplot of flipper length by bill length of 3 penguin species, where we show penguins with bigger flippers have bigger bills." width="672" /></p><p>By setting <code>opts.label = TRUE</code>, the plot and all its chunk options were carried over, including the caption and alt text. Without it, only the code chunk would have been reused. You can also override any of the previously used chunk options by setting them again in this new code chunk. For example, we can change the <code>fig.cap</code>:</p><pre><code>```{r ref.label = &#39;penguins&#39;, opts.label = TRUE, fig.cap = &quot;Penguin plot, take 3&quot;}
```</code></pre><p>Produces this plot:</p><div class="figure"><span style="display:block;" id="fig:penguins"></span><img src="figures/penguins-1.png" alt="Scatterplot of flipper length by bill length of 3 penguin species, where we show penguins with bigger flippers have bigger bills." width="672" /><p class="caption">Figure 1: Penguin plot, take 3</p></div><p>By adding <code>fig.cap</code>, we have overridden the figure caption initially set in the original chunk.</p><p>To find out more, you may take a look at the <a href="https://github.com/yihui/knitr-examples/blob/master/121-ref-label.Rmd">new knitr example</a>. If you are not familiar with these options, you can find the documentation for knitr options at <a href="https://yihui.org/knitr/options/" class="uri">https://yihui.org/knitr/options/</a>. 
To learn more about saving and reusing sets of chunk options with knitr, you can read more about <code>opts.label</code> and options templates in the <a href="https://bookdown.org/yihui/rmarkdown-cookbook/opts-template.html">R Markdown cookbook</a>. The special value <code>opts.label = TRUE</code> shown above means <code>opts.label = ref.label</code>, i.e., to inherit chunk options from chunks referenced by the <code>ref.label</code> option.</p></div><div id="alt-text-for-static-images" class="section level2"><h2>Alt text for static images</h2><p>For static images, you can include a figure caption using Markdown syntax:</p><pre class="markdown"><code>![Crochet (not knitting!) needle with colorful yarn](thumbnail.jpg)</code></pre><p>By default, this activates the <a href="https://pandoc.org/MANUAL.html#images">implicit_figures</a> extension from Pandoc with output formats like <code>html_document</code>. This will lead to the same text being used for both captions and alt text. Setting <a href="https://bookdown.org/yihui/rmarkdown/r-code.html#figures"><code>fig_caption: FALSE</code></a> in the YAML of your <code>html_document</code> would prevent the caption if you only wanted to set the alt text. However, then you cannot have figure captions. Unfortunately, Pandoc does not yet offer a way to differentiate between figure captions and alt text. To work around this limitation, the <code>fig.alt</code> code chunk option can be used with <code>knitr::include_graphics</code>:</p><pre><code>```{r, fig.alt = &quot;Crochet (not knitting!) needle with colorful yarn&quot;, out.width=&quot;25%&quot;}
knitr::include_graphics(&quot;thumbnail.jpg&quot;)
```</code></pre><p><img src="thumbnail.jpg" title="Crochet (not knitting!) needle with colorful yarn" alt="Crochet (not knitting!) needle with colorful yarn" width="25%" /></p></div><div id="limitations" class="section level2"><h2>Limitations</h2><p>There is one major limitation to this feature, which is that it is currently limited to HTML-based output formats. 
We mentioned earlier that the default behavior is to use the figure caption provided by the <code>fig.cap</code> chunk option if you do not provide the <code>fig.alt</code> chunk option. This is true still for non-HTML based output formats like Word <code>.docx</code> documents: they will only respect <code>fig.cap</code>. You can follow this <a href="https://github.com/yihui/knitr/issues/1967">issue on GitHub</a> to see our progress toward addressing this limitation in a future release.</p></div><div id="alt-text-resources" class="section level2"><h2>Alt text resources</h2><p>You may learn more about how to write more informative alt text for data visualization in this <a href="https://medium.com/nightingale/writing-alt-text-for-data-visualization-2a218ef43f81">Nightingale article</a>.</p><p>Some additional resources:</p><ul><li><a href="https://www.a11yproject.com/posts/2013-01-14-alt-text/" title="Using alt text properly - a11yproject">Using alt text properly, from the a11yproject</a>.</li><li><a href="https://www.wgbh.org/foundation/ncam/guidelines/guidelines-for-describing-stem-images">WGBH Guide Guidelines for describing STEM images</a></li><li><a href="http://diagramcenter.org/making-images-accessible.html">Diagram Center Accessible Images</a></li><li><a href="http://diagramcenter.org/accessible-math-tools-tips-and-training.html">Diagram Center Accessible Math Tricks and Tips</a></li></ul><p>Sincere thanks to <a href="http://www.doggenetics.com/">Liz Hare</a> for recommending <a href="https://twitter.com/DogGeneticsLLC/status/1375267373586976769?s=20">these and other resources on Twitter</a>, and to <a href="https://silvia.rbind.io/">Silvia Canelón</a> for sharing them with us.</p></div><div id="acknowledgements" class="section level2"><h2>Acknowledgements</h2><p>This latest release introduces numerous new features and bug fixes as well. You can read the <a href="https://github.com/yihui/knitr/releases">release notes</a> to review all of the changes. 
A big thanks to the other 55 contributors who helped with the previous two knitr releases by discussing problems, proposing features, and contributing code in the <a href="https://github.com/yihui/knitr"><strong>knitr</strong> repo on Github</a>:</p><p><a href="https://github.com/abhsarma">@abhsarma</a>, <a href="https://github.com/alusiani">@alusiani</a>, <a href="https://github.com/andrew-fuller">@andrew-fuller</a>, <a href="https://github.com/apreshill">@apreshill</a>, <a href="https://github.com/arencambre">@arencambre</a>, <a href="https://github.com/aschersleben">@aschersleben</a>, <a href="https://github.com/atusy">@atusy</a>, <a href="https://github.com/beanumber">@beanumber</a>, <a href="https://github.com/Bisaloo">@Bisaloo</a>, <a href="https://github.com/bounlu">@bounlu</a>, <a href="https://github.com/cderv">@cderv</a>, <a href="https://github.com/cpsievert">@cpsievert</a>, <a href="https://github.com/cysouw">@cysouw</a>, <a href="https://github.com/davidwales">@davidwales</a>, <a href="https://github.com/deb-m">@deb-m</a>, <a href="https://github.com/dmenne">@dmenne</a>, <a href="https://github.com/dmurdoch">@dmurdoch</a>, <a href="https://github.com/egoipse">@egoipse</a>, <a href="https://github.com/ekatko1">@ekatko1</a>, <a href="https://github.com/elbersb">@elbersb</a>, <a href="https://github.com/englianhu">@englianhu</a>, <a href="https://github.com/GitHunter0">@GitHunter0</a>, <a href="https://github.com/gsrohde">@gsrohde</a>, <a href="https://github.com/hermandr">@hermandr</a>, <a href="https://github.com/iago-pssjd">@iago-pssjd</a>, <a href="https://github.com/iMarcello">@iMarcello</a>, <a href="https://github.com/jamarav">@jamarav</a>, <a href="https://github.com/jangorecki">@jangorecki</a>, <a href="https://github.com/jimhester">@jimhester</a>, <a href="https://github.com/jooyoungseo">@jooyoungseo</a>, <a href="https://github.com/julieinsan">@julieinsan</a>, <a href="https://github.com/karoliskoncevicius">@karoliskoncevicius</a>, <a 
href="https://github.com/kbvernon">@kbvernon</a>, <a href="https://github.com/kmcbest">@kmcbest</a>, <a href="https://github.com/knokknok">@knokknok</a>, <a href="https://github.com/krivit">@krivit</a>, <a href="https://github.com/ktrutmann">@ktrutmann</a>, <a href="https://github.com/LTLA">@LTLA</a>, <a href="https://github.com/matthewgson">@matthewgson</a>, <a href="https://github.com/mine-cetinkaya-rundel">@mine-cetinkaya-rundel</a>, <a href="https://github.com/MonteShaffer">@MonteShaffer</a>, <a href="https://github.com/msgoussi">@msgoussi</a>, <a href="https://github.com/muschellij2">@muschellij2</a>, <a href="https://github.com/NickCH-K">@NickCH-K</a>, <a href="https://github.com/phargarten2">@phargarten2</a>, <a href="https://github.com/rasyidstat">@rasyidstat</a>, <a href="https://github.com/rcst">@rcst</a>, <a href="https://github.com/rnorberg">@rnorberg</a>, <a href="https://github.com/rundel">@rundel</a>, <a href="https://github.com/StephenGerry">@StephenGerry</a>, <a href="https://github.com/thompsonsed">@thompsonsed</a>, <a href="https://github.com/tomschenkjr">@tomschenkjr</a>, <a href="https://github.com/TTT12-dumb-dumb">@TTT12-dumb-dumb</a>, <a href="https://github.com/XiangyunHuang">@XiangyunHuang</a>, and <a href="https://github.com/yihui">@yihui</a>.</p></div><div class="footnotes"><hr /><ol><li id="fn1"><p>See <a href="https://www.brandeis.edu/web-accessibility/understanding/images-alt-text.html" class="uri">https://www.brandeis.edu/web-accessibility/understanding/images-alt-text.html</a><a href="#fnref1" class="footnote-back">↩︎</a></p></li></ol></div></description></item><item><title>Latest news from the R Markdown family</title><link>https://www.rstudio.com/blog/2021-spring-rmd-news/</link><pubDate>Thu, 15 Apr 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2021-spring-rmd-news/</guid><description><p>Happy spring from the R Markdown family! 
We are excited to share a few package updates that have been keeping us busy so far this year.</p><div id="rmarkdown" class="level2"><h2>1. rmarkdown</h2><table><thead><tr class="header"><th align="center">Latest release</th></tr></thead><tbody><tr class="odd"><td align="center"><img src="https://img.shields.io/badge/CRAN-2.7-brightgreen" alt="Last rmarkdown release 2.7 cran badge" /></td></tr></tbody></table><p>We are proud to share that <strong>rmarkdown</strong> (<a href="https://pkgs.rstudio.com/rmarkdown/" class="uri">https://pkgs.rstudio.com/rmarkdown/</a>) version 2.7 is on CRAN. <strong>rmarkdown</strong> is a package that helps you create dynamic documents that combine code, rendered output (such as figures), and markdown-formatted text.</p><p>You can install <strong>rmarkdown</strong> from CRAN with:</p><pre class="r"><code>install.packages(&quot;rmarkdown&quot;)</code></pre><p>First, the rmarkdown package’s documentation site recently got a makeover and new place to live, now at: <a href="https://pkgs.rstudio.com/rmarkdown/" class="uri">https://pkgs.rstudio.com/rmarkdown/</a></p><p>You may also notice a new hex sticker design too — thanks to our artist <a href="https://www.allisonhorst.com/">Allison Horst</a> for reimagining our iconic quill!</p><div class="figure" style="text-align: center"><span id="fig:unnamed-chunk-1"></span><img src="https://pkgs.rstudio.com/rmarkdown/reference/figures/logo.png" alt="The new rmarkdown hex sticker design with a green quill, by Allison Horst" width="20%" /><p class="caption">Figure 1: rmarkdown hex by Allison Horst</p></div><p>Below, we share highlights from the latest release, but you might want to look at the <a href="https://pkgs.rstudio.com/rmarkdown/news/index.html#rmarkdown-2-7-2021-02-19">release notes</a> for the full details.</p><div id="sass-and-scss-support-for-html-based-output" class="level3"><h3>Sass and SCSS support for HTML-based output</h3><p>The biggest news is that it is now possible to use Sass 
(and, by extension, SCSS) to style your HTML-based outputs. This new functionality builds on the <a href="https://rstudio.github.io/sass/"><strong>sass</strong> package</a>, which provides bindings to <a href="https://github.com/sass/libsass">LibSass</a>, a fast <a href="https://sass-lang.com/">Sass</a> compiler written in C++. Sass is a mature and stable CSS extension language that makes styling modern websites less complex and more composable. If you want to learn more, this <a href="https://rstudio.github.io/sass/articles/sass.html"><strong>sass</strong> vignette</a> is a solid place to start.</p><p>How is Sass useful for R Markdown users?</p><blockquote><p>“Sass lets you use features that don’t exist in CSS yet like variables, nesting, mixins, inheritance and other nifty goodies that make writing CSS fun again.”</p><p><a href="https://sass-lang.com/guide" class="uri">https://sass-lang.com/guide</a></p></blockquote><p>Files with a <code>.sass</code> or <code>.scss</code> extension provided to <code>html_document</code>’s <code>css</code> parameter are now compiled to CSS using the <strong>sass</strong> package (thanks, <a href="https://github.com/cpsievert">@cpsievert</a>, <a href="https://github.com/rstudio/rmarkdown/pull/1706">#1706</a>).</p><p>In the rest of this post, we’ll walk through a simple way to start using Sass with R Markdown. Even if you never thought CSS was fun to write in the first place, we hope this will help you see the value of using Sass to simplify your styling of R Markdown HTML-based outputs. And, in case you missed it, Pandoc has added support that makes it easier to style your markdown text with CSS.
We’ll start by showing you those features with plain CSS, then add in the Sass layer on top.</p><div id="using-css-to-style-an-html_document" class="level4"><h4>Using CSS to style an <code>html_document</code></h4><p>Let’s start with a simple single document with this in the YAML:</p><pre class="yaml"><code>---
output:
  html_document:
    css: custom.css
---</code></pre><p>Inside that <code>.css</code> file, you could define <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/Using_CSS_custom_properties">CSS custom properties</a> for some colors, then use them inside a simple CSS rule like <code>.my-color</code>:</p><pre class="css"><code>:root {
  --green: #212D2C;
  --sky: #A9FDFF;
}

.my-color {
  background-color: var(--green);
  color: var(--sky);
  padding: 1em;
}</code></pre><style type="text/css">:root {--green: #212D2C;--sky: #A9FDFF;}.my-color {background-color: var(--green);color: var(--sky);padding: 1em;}</style><p>To apply your CSS rules in the body of your document, you could write raw HTML:</p><pre><code>This is a &lt;span class=&quot;my-color&quot;&gt;color&lt;/span&gt; word.</code></pre><p>would produce…</p><p>This is a <span class="my-color">color</span> word.</p><p>This works, but can clutter up your <code>.Rmd</code> with HTML.
Instead, you can achieve the same result by writing with Pandoc’s <a href="https://pandoc.org/MANUAL.html#extension-bracketed_spans">bracketed spans</a>, which let you keep your markdown text cleaner:</p><pre><code>This is a [color]{.my-color} word.</code></pre><p>would produce…</p><p>This is a <span class="my-color">color</span> word.</p><p>To apply CSS to a full sentence, or even longer text, you can create divs using Pandoc’s <a href="https://pandoc.org/MANUAL.html#extension-fenced_divs">fenced div blocks</a>:</p><pre><code>::: {.my-color}
All of these words are colored.
:::</code></pre><p>would produce…</p><div class="my-color"><p>All of these words are colored.</p></div><p>Now, so far this is still a plain <code>.css</code> file: nothing too fancy yet! Let’s make it a <code>.scss</code> file instead, and add some fancier logic.</p></div><div id="using-sass-to-style-an-html_document" class="level4"><h4>Using Sass to style an <code>html_document</code></h4><p>We’ll refactor our previous CSS from above to sass-ify it. We’ll use the SCSS (Sassy CSS) syntax. Sassy CSS builds on the existing syntax of CSS, and is a good way to get started using Sass because any valid CSS is also valid SCSS. The first thing we’ll do is save our file with the <code>.scss</code> file extension:</p><pre class="yaml"><code>---
output:
  html_document:
    css: custom.scss # update this!
---</code></pre><p>Just like CSS, SCSS uses semi-colons and curly braces.
The main difference is that we’ll use the <code>$</code> symbol to make something a variable:</p><pre class="scss"><code>$green: #212D2C;
$sky: #A9FDFF;

.my-color {
  background-color: $green;
  color: $sky;
  padding: 1em;
}</code></pre><style type="text/css">.my-color{background-color:#212D2C;color:#A9FDFF;padding:1em}</style><p>We apply the style in the exact same way as before:</p><pre><code>This is a [color]{.my-color} word.</code></pre><p>would produce…</p><p>This is a <span class="my-color">color</span> word.</p><p>But there is still a lot more that we can do. Among other things, Sass allows you to nest your CSS selectors in a way that follows the visual hierarchy of your HTML. This can make your CSS rules easier to read and write. Let’s add a simple nested rule to change the link text color to a new variable, <code>$cyan</code>:</p><pre class="scss"><code>$green: #212D2C;
$sky: #A9FDFF;
$cyan: #36C9B4;

.my-color {
  background-color: $green;
  color: $sky;
  padding: 1em;

  a {
    color: $cyan;
  }
}</code></pre><style type="text/css">.my-color{background-color:#212D2C;color:#A9FDFF;padding:1em}.my-color a{color:#36C9B4}</style><p>Let’s switch to using Pandoc divs in the body of our document to apply this style:</p><pre><code>::: {.my-color}
This is a link that will be [green](https://pkgs.rstudio.com/rmarkdown/).
:::</code></pre><p>would produce…</p><div class="my-color"><p>This is a link that will be <a href="https://pkgs.rstudio.com/rmarkdown/">green</a>.</p></div><p>Sass also provides additional processing abilities like <a href="http://www.sass-lang.com/documentation/at-rules/extend"><code>@extend</code> rules</a>, which you’d use when one class should have all the styles of another class, as well as its own specific styles.</p><p>Let’s now separate <code>my-color</code> from a new class called <code>my-link</code>.
This new class will do some clever, sassy things:</p><ol style="list-style-type: decimal"><li><p>Build on the <code>my-color</code> class with the <a href="http://www.sass-lang.com/documentation/at-rules/extend"><code>@extend</code> rule</a>,</p></li><li><p>Add a unique <code>$cyan</code> for link text,</p></li><li><p>Add <code>hover</code> effects for links using the <a href="https://sass-lang.com/documentation/style-rules/parent-selector">Sass parent selector</a>, <code>&amp;</code>, and</p></li><li><p>Use the Sass <code>rgba()</code> color function to apply an alpha level to our existing <code>$green</code> color variable (note that it begins life as a hex color!).</p></li></ol><pre class="scss"><code>$green: #212D2C;
$sky: #A9FDFF;
$cyan: #36C9B4;

.my-color {
  background-color: $green;
  color: $sky;
  padding: 1em;
}

.my-link a {
  @extend .my-color;
  color: $cyan;

  &amp;:hover {
    background-color: rgba($green, .5);
    color: white;
  }
}</code></pre><style type="text/css">.my-color,.my-link a{background-color:#212D2C;color:#A9FDFF;padding:1em}.my-link a{color:#36C9B4}.my-link a:hover{background-color:rgba(33,45,44,0.5);color:white}</style><pre><code>::: {.my-link}
This is a link that will be [cyan](https://pkgs.rstudio.com/rmarkdown/).
:::</code></pre><p>would produce…</p><div class="my-link"><p>This is a link that will be <a href="https://pkgs.rstudio.com/rmarkdown/">cyan</a>.</p></div><p>Now we’ve made it so links will have a lighter background color upon hover, and the text turns white. One more little thing: Pandoc also works with <code>ID</code> attributes. Let’s add a special ID selector for links with the cookbook <code>ID</code>.
We’ll start the rule with the hash (<code>#</code>) character (instead of the <code>.</code> we used for classes), followed by the id of the element.</p><pre class="scss"><code>#cookbook a {
  text-decoration-style: wavy;
}</code></pre><style type="text/css">#cookbook a{text-decoration-style:wavy}</style><pre><code>::: {#cookbook}
This is a link that will be [wavy](https://bookdown.org/yihui/rmarkdown-cookbook/).
:::</code></pre><p>would produce…</p><div id="cookbook"><p>This is a link that will be <a href="https://bookdown.org/yihui/rmarkdown-cookbook/">wavy</a>.</p></div><p>You can combine <code>ID</code>s, classes, and even nest bracketed spans inside fenced divs. Let’s do it all!</p><pre><code>::: {.my-link}
This is a link that will be [cyan](https://pkgs.rstudio.com/rmarkdown/),
but you can read more in [[The Cookbook](https://bookdown.org/yihui/rmarkdown-cookbook/)]{#cookbook}.
:::</code></pre><p>would produce…</p><div class="my-link"><p>This is a link that will be <a href="https://pkgs.rstudio.com/rmarkdown/">cyan</a>, but you can read more in <span id="cookbook"><a href="https://bookdown.org/yihui/rmarkdown-cookbook/">The Cookbook</a></span>.</p></div><p>We hope this short walk-through inspires you to test out this new feature, and consider how using Sass might help you customize your HTML-based outputs with R Markdown. To learn more, check out the <strong>sass</strong> <a href="https://rstudio.github.io/sass/articles/sass.html">R package vignette</a>. Look out for our future blog posts and resources that will showcase how to use these features with R Markdown.</p><p>For shorter styling rules, you could instead use knitr’s <code>sass</code> or <code>scss</code> engines (also powered by the <strong>sass</strong> package) to provide those rules inline in a code chunk (without lugging around a separate external style file in your project).
Thanks to <a href="https://github.com/emilyriederer">Emily Riederer</a> for this <a href="https://github.com/yihui/knitr/pull/1666">contribution</a>!</p><p>The <code>sass</code>/<code>scss</code> code chunks are compiled through the <code>sass::sass()</code> function. This means you can write Sass code directly into a <code>sass</code> code chunk and the resulting CSS will be included in the output document. For example, this kind of code chunk can exist anywhere in your <code>.Rmd</code> to add styles:</p><pre><code>```{scss}
$green: #212D2C;
$sky: #A9FDFF;
$cyan: #36C9B4;

.my-color {
  background-color: $green;
  color: $sky;
  padding: 1em;
}
```</code></pre><p>When you knit, the style will be applied to the whole document, so you can put these chunks at either the top or bottom of your <code>.Rmd</code> and they will work the same way (i.e., styles will be applied to all content, not just content below the <code>sass</code>/<code>scss</code> chunk). You can read more about these knitr engines <a href="https://bookdown.org/yihui/rmarkdown-cookbook/eng-sass.html">in the R Markdown Cookbook</a>.</p></div></div><div id="simplified-custom-blocks-support-for-latex" class="level3"><h3>Simplified custom blocks support for LaTeX</h3><p>The Pandoc syntax for fenced divs is a powerful feature (as shown above), but one limitation is that Pandoc currently only processes fenced divs for HTML output; fenced divs are silently ignored when rendering to other output formats. For R Markdown users, the <strong>rmarkdown</strong> package extends the fenced div syntax to better support LaTeX/PDF outputs.
Since <strong>rmarkdown</strong> 1.16, it has been possible to opt in and use fenced divs to produce LaTeX environments by adding a special attribute.</p><pre class="markdown"><code>::: {.verbatim data-latex=&quot;&quot;}
We show some _verbatim_ text here.
:::</code></pre><p>Its LaTeX output will then be:</p><pre class="latex"><code>\begin{verbatim}
We show some \emph{verbatim} text here.
\end{verbatim}</code></pre><p><strong>rmarkdown</strong> 2.7 simplifies opting in to this feature by using a shorter name for the attribute: <code>latex</code>. This example would lead to the same result as above:</p><pre class="markdown"><code>::: {.verbatim latex=true}
We show some _verbatim_ text here.
:::</code></pre><p>In addition, <code>latex=1</code> can be used as an alias for <code>latex=true</code>. This attribute accepts a string that will be appended to the opening line of the LaTeX environment:</p><pre class="markdown"><code>::: {.name latex=&quot;[options]&quot;}
content that can be Markdown syntax
:::</code></pre><p>would produce</p><pre class="latex"><code>\begin{name}[options]
content that can be Markdown syntax
\end{name}</code></pre><p>In the R Markdown ecosystem, we call these <a href="https://bookdown.org/yihui/rmarkdown-cookbook/custom-blocks.html#custom-blocks">“Custom Blocks”</a>. For the <strong>bookdown</strong> package, custom blocks replace the special <code>block</code> and <code>block2</code> <strong>knitr</strong> engines (see: <a href="https://bookdown.org/yihui/bookdown/custom-blocks.html" class="uri">https://bookdown.org/yihui/bookdown/custom-blocks.html</a>).</p></div></div><div id="pagedown" class="level2"><h2>2.
pagedown</h2><table><thead><tr class="header"><th align="center">Latest release</th></tr></thead><tbody><tr class="odd"><td align="center"><img src="https://img.shields.io/badge/CRAN-0.14-brightgreen" alt="Last pagedown release 0.14 cran badge" /></td></tr></tbody></table><p>We are proud to announce that <strong>pagedown</strong> version 0.14 is now on CRAN. <strong>pagedown</strong> is a package to help paginate HTML output with CSS print and <strong>paged.js</strong>.</p><p>You can install <strong>pagedown</strong> from CRAN with:</p><pre class="r"><code>install.packages(&quot;pagedown&quot;)</code></pre><p>Below we share important highlights from the latest release, but you might want to look at the <a href="https://github.com/rstudio/pagedown/releases">release notes</a> for the full details.</p><div id="paged.js-upgrade" class="level3"><h3>Paged.js upgrade</h3><p><strong>pagedown</strong> is powered by the awesome <strong>paged.js</strong> (<a href="https://www.pagedjs.org/" class="uri">https://www.pagedjs.org/</a>) library and thanks to the help of <a href="https://github.com/RLesur">Romain Lesur</a>, it has been updated from <em>0.1.32</em> (03-2019) to <em>0.1.43</em> (10-2020). This is an important upgrade as it offers speed improvement and fixes several bugs.</p><p>All features of <strong>paged.js</strong> <em>v0.1.43</em> can now be used with <strong>pagedown</strong>. 
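<p>If you have not tried <strong>pagedown</strong> yet, a minimal paged document only needs to declare the output format in its YAML header (the title below is just a placeholder):</p><pre class="yaml"><code>---
title: &quot;My paged report&quot;
output: pagedown::html_paged
---</code></pre><p>Knitting this produces an HTML document that paginates itself in the browser via paged.js.</p>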
To read up on these features, see the <a href="https://gitlab.pagedmedia.org/tools/pagedjs/-/tags">release notes</a> and a series of posts:</p><ul><li><a href="https://www.pagedjs.org/posts/2020-02-25-weekly/" class="uri">https://www.pagedjs.org/posts/2020-02-25-weekly/</a></li><li><a href="https://www.pagedjs.org/posts/2020-03-03-update-pagedjs-0-1-39/" class="uri">https://www.pagedjs.org/posts/2020-03-03-update-pagedjs-0-1-39/</a></li><li><a href="https://www.pagedjs.org/posts/2020-04-01-pagedjs-0-1-40/" class="uri">https://www.pagedjs.org/posts/2020-04-01-pagedjs-0-1-40/</a></li><li><a href="https://www.pagedjs.org/posts/2020-06-22-pagedjs-0-1-42/" class="uri">https://www.pagedjs.org/posts/2020-06-22-pagedjs-0-1-42/</a></li></ul><p>On the R package side, this upgrade has fixed an issue with the <code>counter-reset</code> <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/counter-reset">CSS property</a>. We have updated our templates accordingly, and we advise template authors to do the same. The default value of <code>counter-reset</code> is now correctly set to 0 instead of 1.
This means that to reset the <code>page</code> CSS counter to 1, you now need to use <code>counter-reset: page 1;</code> instead of <code>counter-reset: page;</code>.</p><pre class="css"><code>/* reset page numbering for main content */
.main .level1:first-child h1 {
  counter-reset: page 1;
}</code></pre></div><div id="and-more-little-things" class="level3"><h3>And more little things…</h3><p>We have also made some smaller but important changes to the <strong>pagedown</strong> formats:</p><ul><li>New <code>lot-unlisted</code> and <code>lof-unlisted</code> arguments to remove the list of tables and the list of figures in <code>html_paged()</code>.</li><li>Support for <code>fig_caption = FALSE</code> in <code>html_resume()</code>.</li><li>Better <a href="https://davidgohel.github.io/flextable/"><strong>flextable</strong></a> support in <code>html_paged()</code>.</li></ul></div></div><div id="last-but-not-least" class="level2"><h2>3. Last but not least!</h2><p>Let’s not forget other packages in the R Markdown family that have been updated so far in 2021:</p><ul><li><p><a href="https://github.com/yihui/knitr/"><strong>knitr</strong></a> had 2 releases (<a href="https://github.com/yihui/knitr/releases/tag/v1.31">v1.31</a> and <a href="https://github.com/yihui/knitr/releases/tag/v1.32">v1.32</a>), which we’ll detail in a separate post on this blog—stay tuned!
<small>See <a href="https://github.com/yihui/knitr/releases/">release notes</a> for details in the meantime</small></p></li><li><p><a href="https://pkgs.rstudio.com/blogdown/"><strong>blogdown</strong></a> had 3 releases (<a href="https://pkgs.rstudio.com/blogdown/news/index.html#changes-in-blogdown-version-1-1-2021-01-19">v1.1</a>, <a href="https://pkgs.rstudio.com/blogdown/news/index.html#changes-in-blogdown-version-1-2-2021-03-04">v1.2</a>, <a href="https://pkgs.rstudio.com/blogdown/news/index.html#changes-in-blogdown-version-1-3-unreleased">v1.3</a>), following the <a href="https://www.rstudio.com/2021/01/18/blogdown-v1.0/">v1.0 release announcement</a> in January 2021. <small>See <a href="https://pkgs.rstudio.com/blogdown/news/index.html">release notes</a> for details</small></p></li><li><p><a href="https://rstudio.github.io/DT/"><strong>DT</strong></a> is now at version 0.18. Fixes and new features keep getting added regularly thanks to the help of <a href="https://github.com/shrektan">Xianying Tan</a>. <small>See <a href="https://github.com/rstudio/DT/releases">release notes</a> for details</small></p></li><li><p><a href="https://pkgs.rstudio.com/distill/"><strong>distill</strong></a> is at version 1.2 with several important fixes (like encoding issues) and new features (like support for alternate R Markdown formats in Distill websites). <small>See <a href="https://pkgs.rstudio.com/distill/news/index.html#distill-v1-2-cran-">release notes</a> for details</small></p></li><li><p><a href="https://github.com/rstudio/rticles"><strong>rticles</strong></a> had 2 releases (v0.18, v0.19) with a few bug fixes and improvements, as well as 2 new community-contributed formats: <em>Papers in Historical Phonology</em> (<code>pihph_article()</code> - thanks <a href="https://github.com/stefanocoretta">@stefanocoretta</a>) and <em>Institute of Mathematical Statistics Journals</em> (<code>ims_article()</code> - thanks <a href="https://github.com/auzaheta">@auzaheta</a>).
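<p>As a quick sketch, you can scaffold an article from one of these formats with <code>rmarkdown::draft()</code> (the template id shown below is illustrative; check the <strong>rticles</strong> documentation for the exact name):</p><pre class="r"><code># creates a new .Rmd pre-filled from the chosen rticles template
rmarkdown::draft(&quot;my-paper.Rmd&quot;, template = &quot;ims&quot;, package = &quot;rticles&quot;)</code></pre>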
<small>See <a href="https://github.com/rstudio/rticles/releases">release notes</a> for details</small></p></li></ul><p>We hope this round-up gives you some new ideas for your data science projects that use R Markdown. A big thank you to all the contributors who helped with these releases by discussing problems, proposing features, and contributing code. Happy spring!</p></div></description></item><item><title>Impressions from New Zealand’s R Exchange</title><link>https://www.rstudio.com/blog/impressions-from-new-zealand-s-r-exchange/</link><pubDate>Wed, 14 Apr 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/impressions-from-new-zealand-s-r-exchange/</guid><description><p><i>This is a guest post from Uli Muellner, Director of IT &amp; Learning at <a href="https://www.epi-interactive.com/" target="_blank" rel="noopener noreferrer">Epi-Interactive</a>, a Full Service RStudio Partner.</i></p><img src="header.png" alt="The New Zealand R-Exchange" class="center"><p><b>Kia ora koutou</b>,</p><p>In March we were fortunate to run a face-to-face event in Wellington, New Zealand, bringing people together to discuss how they use R in their organisation to analyse and share data. It felt special to be able to do that, so we wanted to share some of the highlights with you all. Thanks to our presenters from Statistics NZ, the New Zealand Ministry of Health, the Cancer Control Agency, and RStudio for supporting this event. See the full agenda <a href="https://www.epi-interactive.com/events/r-exchange/" target="_blank" rel="noopener noreferrer">here</a>.</p><p>Here is what we learned:</p><h2>1. We are all in the same boat</h2><p>While the stories we heard from different organisations highlighted the diversity of how R is being used, there seem to be common challenges that reach across the board.
Often organisations start their R journey with employees innovating in their respective fields, or there are obvious pain points to tackle, like tedious manual reporting or the limitations of proprietary tools like Excel. However, once R grows within organisations, isolated solutions often hit a wall when IT gets involved and a more structured approach is required to address systemic issues such as data security.</p><p>One strategy to overcome those obstacles successfully is to get the IT department on board early and build up an infrastructure where code development and publishing can be centralised and better controlled. As Josiah Parry from RStudio mentioned in his talk, the start-up mantra “fail fast, fail early, fail often” doesn’t always apply, for example, to government agencies who can’t fail on issues such as data security and privacy standards. This is where RStudio professional products can help as they facilitate the establishment of R in a controlled environment, for example by</p><ul><li>Managing a curated set of R packages with version control (<i><a href="https://www.rstudio.com/products/package-manager/" target="_blank" rel="noopener noreferrer">RStudio Package Manager</a></i>)</li><li>Providing a dedicated server-based development environment for analysts and data scientists (<i><a href="https://www.rstudio.com/products/rstudio-server-pro/" target="_blank" rel="noopener noreferrer">RStudio Server Pro</a></i>) or </li><li>Offering a publishing platform that provides access to outputs for a wider internal and external audience, e.g.
via R Shiny dashboards, Markdown documents, scheduled reports or APIs (<i><a href="https://www.rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a></i>).</li></ul><img src="diagram.png" alt="The RStudio professional product ecosystem" class="center"><p style="text-align: right"><small><i>Diagram courtesy of RStudio</i></small></p><p>While a centralised controlled environment is considered a key success factor, users were keen to be able to continue to try out novel approaches, e.g. by using a sandbox or experimental environment. This would allow for a proof-of-concept or prototyping, where some of the approaches may or may not be transitioned into production.</p><h2>2. Change comes from within</h2><p>Innovation was there in the room with us. Hearing battle stories from our colleagues really showcased the impact people can have by customising R to their specific needs. For example, while it is ideal to maintain a consistent appearance and behaviour across several different Shiny apps within an organisation, this is not always easy to achieve, as different people might be involved in app development and code is commonly duplicated or difficult to maintain and update. We heard from one of our speakers how creating an R package containing styles, reference data, and documentation can serve as a single point of reference for any number of R Shiny apps and allows a team to easily update and maintain key facets of style and function. From a user’s point of view, a consistent look and feel can establish trust and make dashboards easier to understand and navigate. Think about the “Don’t make me think” principle established by Steve Krug about web usability, where people should be able to accomplish tasks as easily and directly as possible.
A consistent interface across all apps within an organisation can greatly contribute to this goal.</p><br><br><img src="health.png" alt="Screenshot from New Zealand Ministry of Health survey results as a Shiny app" class="center"><p style="text-align: right"><small><i>Screenshot: Ministry of Health. 2020. Annual Data Explorer 2019/20: New Zealand Health Survey [Data File]. URL: https://minhealthnz.shinyapps.io/nz-health-survey-2019-20-annual-data-explorer/ (Accessed 07/04/2021)</i></small></p><h2>3. The opportunities are huge</h2><p>While saying this to the audience of this blog feels like preaching to the choir, this event showcased yet again how analysis, data science, and interactive dashboards that are code-based can be used to inform and engage stakeholders and connect data with real-world decision making. It was inspiring to see this in action during the talk on reporting of performance indicators to improve health service delivery and how R-driven reporting can make a difference in getting the message out there. It was easy to observe that R supports transparency and reproducibility, as it is open source, and features such as package creation and literate programming are readily available. These characteristics are becoming increasingly important for organisations that rely on the trust of their stakeholders, who might be both producers and consumers of the data. It was truly exciting to hear our speakers’ and participants’ vision on what the future here holds.</p><img src="screenshot2.png" alt="Screenshot of University of Minnesota's AIS Explorer, 2021" class="center"><p style="text-align: right"><small><i>Screenshot: University of Minnesota. 2021. AIS Explorer 2021 [Data File]. URL: https://www.aisexplorer.umn.edu/ (Accessed 08/04/2021)</i></small></p><hr><h3>About Epi-Interactive</h3>While we are based at the bottom of the world in New Zealand’s windswept and creative capital (Lord of the Rings, anyone?), we work across the globe.
Here is how you can connect.<h4>Learn from our Shiny Ninjas</h4>Keen to build and extend your R Shiny skills in a fun and interactive online environment? Then the 2021 Edition of our R Shiny Masterclass Series might be for you. Delivered online in May and June 2021, we offer an Introduction and an Advanced Class (8 sessions each).<p><a href="https://www.epi-interactive.com/events/r-shiny-masterclass-series/" target="_blank" rel="noopener noreferrer">Find out more and register on our website</a></p><img src="social2.jpg" alt="Learn from Shiny experts" class="center"><h4>Stay in touch</h4><ul><li>Contact us at <a href="mailto:info@epi-interactive.com" target="_blank" rel="noopener noreferrer">info@epi-interactive.com</a> if you would like to discuss your project idea with us.</li><li>Follow us on <a href="https://epi-interactive.us9.list-manage.com/track/click?u=36f9ab1413fd9ff644e6cffc7&amp;id=cfbe387558&amp;e=8357c5dc58" target="_blank" rel="noopener noreferrer">LinkedIn</a> for news and updates.</li><li>Learn more about <a href="https://www.epi-interactive.com/work/" target="_blank" rel="noopener noreferrer">our work</a>.</li><li>Subscribe to our <a href="http://eepurl.com/hvjVpb" target="_blank" rel="noopener noreferrer">mailing list</a>.</li></ul><img src="footer.jpg" alt="Epi-Interactive" class="center"></description></item><item><title>Model Monitoring with R Markdown, pins, and RStudio Connect</title><link>https://www.rstudio.com/blog/model-monitoring-with-r-markdown/</link><pubDate>Thu, 08 Apr 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/model-monitoring-with-r-markdown/</guid><description><p>ModelOps or MLOps (for &ldquo;model/machine learning operations&rdquo;) focuses on the real-world processes involved in building, deploying, and maintaining a model within an organization&rsquo;s data infrastructure.
Developing a model that meets your organization&rsquo;s needs and goals is a big accomplishment, but whether that model&rsquo;s purpose is <a href="https://www.tmwr.org/software-modeling.html#types-of-models" target="_blank" rel="noopener noreferrer">largely predictive, inferential, or descriptive</a>, the &ldquo;care and feeding&rdquo; of your model often doesn&rsquo;t end when you are done developing it. How is the model going to be deployed? Should you retrain the model on a schedule, or based on changes in model performance? When should you kick off retraining the same kind of model with fresh data versus going back to the drawing board for a full round of model development again? These are the kinds of questions that ModelOps deals with.</p><p><strong>Model monitoring</strong> is a key component of ModelOps, and is typically used to answer questions about how a model is performing over time, when to retrain a model, or what kinds of observations are not being predicted well. There are a lot of solutions out there to address the need for model monitoring, but the R ecosystem offers options that are <strong>code-first, flexible, and already in wide use</strong>.
When we use this approach to model monitoring, we gain all the benefits of handling our data science logic via reusable, extensible code (as opposed to clicks), as well as the enormous open source community surrounding R Markdown and related tools.</p><p>In this post, I&rsquo;ll walk through one option for this approach:</p><ul><li>Deploy a model as a RESTful API using Plumber</li><li>Create an R Markdown document to regularly assess model performance by:<ul><li>Sending the deployed model new observations via httr</li><li>Evaluating how the model performed with these new predictions using model metrics from yardstick</li><li>Versioning the model metrics using the pins package</li><li>Summarizing and visualizing the results using flexdashboard</li></ul></li><li>Schedule the R Markdown dashboard to regularly evaluate the model and notify us of the results</li></ul><h2 id="predicting-injuries-from-traffic-data">Predicting injuries from traffic data</h2><p>I recently developed a model to <a href="https://juliasilge.com/blog/chicago-traffic-model/" target="_blank" rel="noopener noreferrer">predict injuries for traffic crashes in Chicago</a>.
<a href="https://data.cityofchicago.org/Transportation/Traffic-Crashes-Crashes/85ca-t3if" target="_blank" rel="noopener noreferrer">The data set covers traffic crashes</a> on city streets within Chicago city limits under the jurisdiction of the Chicago Police Department, and the model predicts the probability of a crash involving an injury.</p><div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;"><iframe src="https://www.youtube.com/embed/J5gTzoRU9tc" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" allowfullscreen title="YouTube Video"></iframe></div><p>I work on the <a href="https://www.tidymodels.org/" target="_blank" rel="noopener noreferrer">tidymodels</a> team developing open source tools for modeling and machine learning, but you can use the R ecosystem for monitoring any kind of model, even one trained in Python. I used <a href="https://www.rplumber.io/" target="_blank" rel="noopener noreferrer">Plumber</a> to <a href="https://colorado.rstudio.com/rsc/traffic-crashes/" target="_blank" rel="noopener noreferrer">deploy my model on RStudio Connect</a>, but depending on your own organization&rsquo;s infrastructure, you might consider <a href="https://docs.rstudio.com/connect/user/flask/" target="_blank" rel="noopener noreferrer">deploying a Flask API</a> or another appropriate format.</p><h2 id="monitor-model-performance">Monitor model performance</h2><p>There are new crashes every day, so I would like to measure how my model performs over time.
I built a <a href="https://rmarkdown.rstudio.com/flexdashboard/" target="_blank" rel="noopener noreferrer">flexdashboard</a> for model monitoring; this dashboard does <em>not</em> use Shiny but it&rsquo;s published on <a href="https://www.rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a> as a <a href="https://docs.rstudio.com/connect/user/scheduling/" target="_blank" rel="noopener noreferrer">scheduled report</a> that re-executes automatically once a week. I get an email in my inbox with the new results every time!</p><p><a href="https://colorado.rstudio.com/rsc/traffic-crash-monitor/monitor.html" target="_blank" rel="noopener noreferrer"><img src="traffic_flexdashboard.png" width="100%" alt="Model monitoring flexdashboard"></a></p><p>The <a href="https://colorado.rstudio.com/rsc/traffic-crash-monitor/monitor.html" target="_blank" rel="noopener noreferrer">monitoring dashboard</a> uses <a href="https://httr.r-lib.org/" target="_blank" rel="noopener noreferrer">httr</a> to call two APIs:</p><ul><li>the city of Chicago&rsquo;s API for the traffic data to get the latest crashes</li><li>the model API to make predictions on those new crashes</li></ul><p>The dashboard also makes use of <a href="https://pins.rstudio.com/" target="_blank" rel="noopener noreferrer">pins</a> to <strong>publish</strong> and <strong>version</strong> model metrics each time the dashboard updates. I am a huge fan of the pins package in the context of ModelOps; you can even use it to publish and version models themselves!</p><p><a href="https://colorado.rstudio.com/rsc/traffic-crash-monitor/monitor.html" target="_blank" rel="noopener noreferrer"><img src="traffic_monitor.gif" width="100%" alt="Model monitoring flexdashboard"></a></p><p>Basic model monitoring should cover at least the model metrics of interest, but in the real world, most data practitioners need to track something specific to their domain or use case. 
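</p><p>Condensed to its essentials, each scheduled run scores new data, computes metrics, and pins them. A minimal sketch of that loop (toy predictions stand in for live API responses, and a temporary pins board stands in for the Connect board used in the post):</p>

```r
library(yardstick)
library(pins)

# Toy stand-in for predictions returned by the deployed model API
preds <- data.frame(
  injuries  = factor(c("yes", "no", "no", "yes"), levels = c("yes", "no")),
  .pred_yes = c(0.8, 0.3, 0.6, 0.7)
)

# Evaluate this batch of predictions with a yardstick metric
metrics <- roc_auc(preds, injuries, .pred_yes)
metrics$date <- Sys.Date()

# Publish and version the metrics; board_temp() is a local stand-in
# for the RStudio Connect board
board <- board_temp()
pin_write(board, metrics, "crash-model-metrics", versioned = TRUE)
```

<p>Because each run writes a new pin version, the dashboard can read the full history back and plot performance over time. Stock metrics alone rarely tell the whole story, though. 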
This is why inflexible ModelOps tooling is often frustrating to work with. Using flexible tools like R Markdown, on the other hand, let me build a model monitoring dashboard with a table of crashes that were misclassified (so I can explore them) and an interactive map of where they are around the city of Chicago.</p><h2 id="to-learn-more">To learn more</h2><p>All the code for this demo <a href="https://github.com/juliasilge/modelops-playground" target="_blank" rel="noopener noreferrer">is available on GitHub</a>, and future posts will address how to use R for other ModelOps endeavors. If you&rsquo;d like to learn more about how RStudio products like Connect can be used for tasks from serving model APIs to model monitoring and more, <a href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio" target="_blank" rel="noopener noreferrer">set up a meeting with our Customer Success team</a>.</p></description></item><item><title>RStudio 1.4 - A quick tour</title><link>https://www.rstudio.com/blog/rstudio-1-4-a-quick-tour/</link><pubDate>Tue, 06 Apr 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-4-a-quick-tour/</guid><description><p>In case you missed the details of our recent release or would just like a quick tour, here&rsquo;s a set of videos highlighting our recent work on RStudio 1.4.</p><h4>Watch the full video <a href="https://www.youtube.com/watch?v=SdMPh5uphO0" target="_blank" rel="noopener noreferrer">here</a> or see some highlights below.</h4><h2 id="visual-rmarkdown-editing">Visual RMarkdown Editing</h2><script src="https://fast.wistia.com/embed/medias/zuepig0pyb.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_zuepig0pyb videoFoam=true" 
style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/zuepig0pyb/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>A <a href="https://blog.rstudio.com/2020/09/30/rstudio-v1-4-preview-visual-markdown-editing/" target="_blank" rel="noopener noreferrer">visual markdown editor</a> that provides improved productivity for composing longer-form articles and analyses with R Markdown.<br><br></p><h2 id="python-integrations">Python Integrations</h2><script src="https://fast.wistia.com/embed/medias/gbgej8p99s.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_gbgej8p99s videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/gbgej8p99s/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>New <a href="https://blog.rstudio.com/2020/10/07/rstudio-v1-4-preview-python-support/" target="_blank" rel="noopener noreferrer">Python capabilities</a>, including display of Python objects in the Environment pane, viewing of Python data frames, and tools for configuring Python versions and conda/virtual environments.<br><br></p><h2 id="command-palette--shortcuts">Command Palette &amp; Shortcuts</h2><script 
src="https://fast.wistia.com/embed/medias/sg197m4je0.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_sg197m4je0 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/sg197m4je0/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>A new <a href="https://blog.rstudio.com/2020/10/14/rstudio-v1-4-preview-command-palette/" target="_blank" rel="noopener noreferrer">command palette</a> (accessible via Ctrl+Shift+P) that provides easy keyboard access to all RStudio commands, add-ins, and options.<br><br></p><h2 id="saml-support">SAML Support</h2><script src="https://fast.wistia.com/embed/medias/om8q7xegn9.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_om8q7xegn9 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/om8q7xegn9/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>Integration with <a 
href="https://blog.rstudio.com/2020/11/16/rstudio-1-4-preview-server-pro/" target="_blank" rel="noopener noreferrer">a host of new RStudio Server Pro features</a> including project sharing with Launcher, Microsoft Visual Studio Code support (currently in beta), Secure Assertion Markup Language (SAML) authentication, and local Launcher load balancing.</p><p>For more detail on RStudio 1.4, feel free to read our <a href="https://blog.rstudio.com/2021/01/19/announcing-rstudio-1-4/" target="_blank" rel="noopener noreferrer">previous summary post</a> or watch the <a href="https://www.youtube.com/watch?v=SdMPh5uphO0" target="_blank" rel="noopener noreferrer">full video</a>.</p></description></item><item><title>BI and Data Science: Deliver Insights Through Embedded Analytics</title><link>https://www.rstudio.com/blog/bi-and-data-science-deliver-insights-through-embedded-analytics/</link><pubDate>Thu, 01 Apr 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/bi-and-data-science-deliver-insights-through-embedded-analytics/</guid><description><sup>Photo by <a href="https://unsplash.com/@omidarmin?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Omid Armin</a> on <a href="https://unsplash.com/" target="_blank" rel="noopener noreferrer">Unsplash</a></sup><p>Data Scientists are trailblazers. They look for value inside of data and seek to ask the right questions, disseminating insights to their stakeholders. In the world of business intelligence, those “on the ground” need more than just static reports. They need access to clear reproducible insights for exploration, feedback, and action, all in the right place at the right time. This may seem daunting but fortunately, BI and analytics have been tackling these challenges for some time. 
<b>Embedded analytics</b> integrates data analysis inside workflows, applications, and processes that people use every day, helping move the point of discovery to the point of decision.</p><p>In this post, we are going to dive into how the data scientist can integrate insights, increase adoption, and effectively empower end-users to make better decisions. With a code-first approach, data science is perfectly suited to rapidly integrate organizational insights with everyday systems. Our last post covered practical ways that BI and Data Science <a href="https://blog.rstudio.com/2021/03/25/bi-and-data-science-the-handoff/" target="_blank" rel="noopener noreferrer">collaborate with data handoffs</a>. Now let’s look further at how analysts, decision-makers, and end-users can benefit from “tightly tying the rope” between embedded analytics and data science in a secure, scalable, and flexible way.</p><blockquote><p><em>With a code-first approach, data science is perfectly suited to rapidly integrate organizational insights with everyday systems.</em></p></blockquote><h2 id="security-and-authentication-overcoming-the-first-roadblock">Security and authentication: Overcoming the first roadblock</h2><p>For an enterprise, data security is regularly a top concern across the entire organization. Security must be front and center as you plan your path forward and coordinate sharing across stakeholders. Not everyone will likely require (or should have) access to the same data. This is where having a system in place that customizes security and permissions for various predetermined roles, often at the data and row-level, will be critical. You need to define which of your stakeholders can view and collaborate on various data products. For example, will only internal users have access, or will outside stakeholders and/or customers also be consuming information as a service? 
Will you need to integrate with existing services?</p><p>No matter the answer to these questions, considerable work will be involved to ensure that proper security is enforced and organizational standards are met. This is one of the major reasons teams consider <a href="https://www.rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a>: it simplifies the deployment of data products for multiple users, integrating directly with existing security protocols like LDAP/Active Directory, OAuth, PAM, SAML, and more.</p><h2 id="scalability-and-communication-expanding-the-horizon">Scalability and communication: Expanding the horizon</h2><p>As your user base grows, effective communication of results often requires access to the right tools for scheduling and alerts. Your team will likely need automated systems for updates and emails at critical times. No one wants to constantly monitor dashboards or receive non-relevant alerts. Having a system that helps you administer alerts and scheduling will not only make your life easier but will make working and communicating across multiple teams and stakeholders more effective over the long run. Learn about how RStudio Connect makes this easy in our <a href="https://www.rstudio.com/resources/webinars/avoid-dashboard-fatigue/" target="_blank" rel="noopener noreferrer">“Avoid Dashboard Fatigue” webinar</a>.</p><p>Embedded analytics runs on scalable platforms, particularly with software as a service (SaaS), to manage cost and capacity over time. As a data scientist, you can plug into these, allowing end-users to utilize models and increasing adoption.
The <a href="https://www.rplumber.io/index.html" target="_blank" rel="noopener noreferrer">Plumber API</a> (R-based) and <a href="https://flask.palletsprojects.com/en/1.1.x/" target="_blank" rel="noopener noreferrer">Flask API</a> (Python-based) both work alongside each other with RStudio Connect to provide the perfect combination of organizational access and integration. This in turn provides access without requiring R or Python knowledge for users. In addition, you can integrate work from both these languages, giving data science teams a clear point of collaboration. This is perfect for data science as a service (DSaaS), where models may need to be deployed and reused by multiple customers and different data sets. You can learn more about how the Plumber API allows data science to be used by a wide range of tools and technologies in <a href="https://www.rstudio.com/resources/webinars/expanding-r-horizons-integrating-r-with-plumber-apis/" target="_blank" rel="noopener noreferrer">this webinar</a>.</p><h2 id="flexibility-and-custom-visualization-a-lasting-path">Flexibility and custom visualization: A lasting path</h2><p>Insights from data science have huge potential, but they’re only as good as the runway given for exploration, visualization, and changes on the fly. Analysts and BI teams need access to fast, flexible, and performant updates, relevant to the question at hand.</p><blockquote><p><em>Even when reporting was a part of the equation, it almost always boiled down to one major requirement: customizable self-service analytics that could be reproduced and deployed quickly.</em></p></blockquote><p>As a product manager for embedded analytics, I’ve spoken to thousands of customers and carefully analyzed their top needs. Even when reporting was a part of the equation, it almost always boiled down to one major requirement: customizable self-service analytics that could be reproduced and deployed quickly.
On one end of the spectrum, this may mean simply providing diversity and access to input controls and the ability to import and/or export data to ensure the flexibility required. On the other end, it may mean going the extra mile to connect data science results to BI systems (with APIs like Plumber) to ensure that end-users have full access to results directly inside a pre-built BI tool for full-service data analytics. This means putting the keys to the kingdom (with the right access) into the hands of managers, decision-makers, and final users, communicating results that are meant to be explored based on changes that are happening in real-time.</p><p>Also important to consider when scaling out new systems is the time that will be required from concept to release. How often do new data and insights need to be put into different applications, visualizations, or sets of controls? What type of performance is expected and how many users will ultimately need to be supported? How much customization will be required with the final visualization of results?</p><p>Luckily data scientists are already in a place where they are accustomed to working with code and have full access to a range of open-source packages tailored for building out interactive applications, input controls, and custom visualizations. <a href="https://shiny.rstudio.com/" target="_blank" rel="noopener noreferrer">Shiny</a> is one such package for R, which combines computational power with interactivity for the modern web. 
<a href="https://docs.bokeh.org/en/latest/docs/gallery.html" target="_blank" rel="noopener noreferrer">Bokeh</a> and <a href="https://plotly.com/dash/" target="_blank" rel="noopener noreferrer">Dash</a> are similar packages for Python, which in addition to Shiny are fully supported for easy deployment inside <a href="https://www.rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a>.</p><h2 id="conclusion">Conclusion</h2><p>Organizations and their stakeholders depend on data scientists to forge a path forward through data. Like trailblazers, they are carving a path and overcoming unique challenges and obstacles that others can then follow with BI tools. Many organizations are struck by the sheer speed at which insights can be deployed using tools and languages that are native to their data science teams today. RStudio supports the direct creation and integration of open-source data science and stands ready to help companies and organizations expand with <a href="https://www.rstudio.com/products/team/" target="_blank" rel="noopener noreferrer">enterprise-level tooling and deployment</a>.</p><p>Curious to learn more about our approach?
Check out this <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">previous post</a> where we explore not only the importance of agility and durability to Serious Data Science but the key aspect of <a href="https://blog.rstudio.com/2020/06/02/is-your-data-science-credible-enough/" target="_blank" rel="noopener noreferrer">credibility</a> and how having the correct access and tools to find insights that are relevant builds trust and a path forward to understanding.</p></description></item><item><title>plumber 1.1.0</title><link>https://www.rstudio.com/blog/plumber-v1-1-0/</link><pubDate>Mon, 29 Mar 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/plumber-v1-1-0/</guid><description><p>I am happy to announce that <code>{plumber}</code> v1.1.0 is now on CRAN! Install it with:</p><pre><code>install.packages(&quot;plumber&quot;)</code></pre><p>By the way, <code>{plumber}</code> v1.0.0 was released about 6 months ago, but we didn&rsquo;t make a full announcement, so here we&rsquo;ll highlight features and improvements in both v1.0.0 and v1.1.0. 
At a high level, this includes:</p><ul><li><a href="#parallel-exec">parallel endpoint execution</a>,</li><li>a <a href="#tidy-interface"><em>tidy</em> interface</a> for programmatic development,</li><li><a href="#body-parsing">request body parsing</a>,</li><li>redirect requests to <a href="#trailing-slash">include trailing slash</a>, and</li><li>the ability to <a href="#custom">extend plumber</a> by adding additional request body parsers, response serializers, and visual documentation representations.</li></ul><p>In addition to the new features, <code>{plumber}</code> now has an official <a href="https://rstudio.com/resources/cheatsheets">RStudio cheat sheet</a>, a new <a href="https://swag.rstudio.com/product/plumber-sticker/15">hex logo</a>, and uses <a href="https://pkgdown.r-lib.org/"><code>{pkgdown}</code></a> to construct its website <a href="https://www.rplumber.io/">https://www.rplumber.io/</a>.</p><p><a href="https://github.com/rstudio/cheatsheets/blob/master/plumber.pdf"><img src="https://raw.githubusercontent.com/rstudio/cheatsheets/master/pngs/thumbnails/plumber-cheatsheet-thumbs.png" width="100%" /></a></p><h1 id="new-features">New features and improvements</h1><h2 id="parallel-exec">Parallel execution</h2><p><code>{plumber}</code> now has the ability to execute endpoints asynchronously via the <code>{promises}</code> and <code>{future}</code> packages. By wrapping slow endpoint code in <code>promises::future_promise()</code>, the main R session is able to execute multiple concurrent requests much more efficiently (compared to regular execution). 
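</p><p>One prerequisite worth noting: <code>promises::future_promise()</code> hands work to <code>{future}</code> workers, so the API needs a <code>{future}</code> plan with background workers registered before it starts; with the default sequential plan, requests will still be processed one at a time. A minimal setup, assuming two workers, might be:</p>

```r
library(future)

# Register two background R processes for future_promise() work;
# do this once at API startup, before pr_run()
plan(multisession, workers = 2)
```

<p>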
For example, suppose we have the plumber API with endpoints:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#* @get /slow/&lt;k&gt;</span>
<span style="color:#06287e">function</span>() {promises<span style="color:#666">::</span><span style="color:#06287e">future_promise</span>({<span style="color:#06287e">slow_calc</span>()})}

<span style="color:#60a0b0;font-style:italic">#* @get /fast/&lt;k&gt;</span>
<span style="color:#06287e">function</span>() {<span style="color:#06287e">fast_calc</span>()}</code></pre></div><p>Now let&rsquo;s imagine a scenario where 6 <code>/slow/&lt;k&gt;</code> requests are received before a <code>/fast/&lt;k&gt;</code> request. Since <code>slow_calc()</code> has been wrapped in <code>promises::future_promise()</code>, <code>fast_calc()</code> is able to execute immediately, even when limited <code>{future}</code> workers are available. The figure below depicts a timeline of what happens in this scenario when 2 <code>{future}</code> workers are available. Note that without async execution, the <code>/fast/&lt;k&gt;</code> request would take 60 seconds to complete, but with <code>promises::future_promise()</code> it completes almost immediately!
🎉</p><p><a href="https://rstudio.github.io/promises/articles/future_promise.html"><img src="future_promise.png" alt="Using future_promise() allows the main R session to be free while waiting for a future worker to become available"></a></p><p>See the article on <a href="https://rstudio.github.io/promises/articles/future_promise.html"><code>promises::future_promise()</code></a> to learn more.</p><h2 id="tidy-interface">Tidy interface</h2><p>A brand new <a href="https://www.rplumber.io/reference/index.html?q=pr_">tidy interface</a> to create plumber APIs with a more natural, pipe-able, functional programming approach:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># Initialize</span>
<span style="color:#06287e">pr</span>() <span style="color:#666">%&gt;%</span>
  <span style="color:#60a0b0;font-style:italic"># Add a route</span>
  <span style="color:#06287e">pr_get</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">/echo&#34;</span>, <span style="color:#06287e">function</span>(msg <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">(not provided)&#34;</span>) {<span style="color:#06287e">list</span>(msg <span style="color:#666">=</span> msg)}) <span style="color:#666">%&gt;%</span>
  <span style="color:#60a0b0;font-style:italic"># Run the API</span>
  <span style="color:#06287e">pr_run</span>(port <span style="color:#666">=</span> <span style="color:#40a070">8000</span>)</code></pre></div><h2 id="-plumber-tag"><code>#* @plumber</code> tag</h2><p>When <code>plumb()</code>ing a file, there is only a limited set of tags that <code>{plumber}</code> knows how to handle.
To avoid having to create an <code>./endpoint.R</code> file, you can access your <code>{plumber}</code> API when <code>plumb()</code>ing your file by using the <code>@plumber</code> tag.</p><p>The <code>@plumber</code> tag will immediately execute the function definition right after the tag.</p><p>In the example below, we show how you can mount another API using the <code>@plumber</code> tag in addition to defining a regular <code>GET</code> route to <code>/echo</code>. Mounting a router (and many other API alterations) is not possible when <code>plumb()</code>ing a file unless you use the <code>#* @plumber</code> tag.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#* @get /echo</span>
<span style="color:#06287e">function</span>(msg <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">(not provided)&#34;</span>) {<span style="color:#06287e">list</span>(msg <span style="color:#666">=</span> msg)}

<span style="color:#60a0b0;font-style:italic">#* @plumber</span>
<span style="color:#06287e">function</span>(pr) {
  mnt <span style="color:#666">&lt;-</span> <span style="color:#06287e">plumb</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">plumber_mount.R&#34;</span>)
  pr <span style="color:#666">%&gt;%</span>
    <span style="color:#06287e">pr_mount</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">/mount_path/&#34;</span>, mnt)
}</code></pre></div><details><summary> Tidy API </summary><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">pr</span>() <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">pr_get</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">/echo&#34;</span>, <span style="color:#06287e">function</span>(msg <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">(not provided)&#34;</span>) {<span style="color:#06287e">list</span>(msg <span style="color:#666">=</span> msg)}) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">pr_mount</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">/mount_path/&#34;</span>, <span style="color:#06287e">plumb</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">plumber_mount.R&#34;</span>))</code></pre></div></details><h2 id="body-parsing">Request body parsing</h2><p>Prior to <code>{plumber}</code> v1.0.0, <code>{plumber}</code> had a very limited set of body parsers (JSON and form), but we&rsquo;ve added numerous parsers, including text, octet-stream, multipart forms, CSV, TSV, RDS, YAML, Feather, and <a href="https://www.rplumber.io/reference/parsers.html">more</a>.</p><p>No additional effort is required to use the JSON, form, text, octet-stream, and multipart form body parsers; however, if you&rsquo;d like to use any of the other parsers, you&rsquo;ll want to know about the new <code>#* @parser</code> tag. Be aware that when adding this tag to an endpoint, it&rsquo;ll overwrite the default set of body parsers.
So, for example, if you <em>only</em> want support for parsing TSV information, then do:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#* @parser tsv</span>
<span style="color:#60a0b0;font-style:italic">#* @post /tsv_to_json</span>
<span style="color:#06287e">function</span>(req, res) {req<span style="color:#666">$</span>body}</code></pre></div><details><summary> Tidy API </summary><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">pr</span>() <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">pr_post</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">/tsv_to_json&#34;</span>,
    <span style="color:#06287e">function</span>(req, res) { req<span style="color:#666">$</span>body },
    parsers <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tsv&#34;</span>)</code></pre></div></details><h2 id="new-response-serializers">New response serializers</h2><p>Like the request body parsers, a whole new set of <a href="https://www.rplumber.io/reference/serializers.html">response serializers</a> has been added. These include CSV, TSV, RDS, Feather, YAML, <code>format()</code> output, <code>print()</code> output, and <code>cat()</code> output.
To change the default serializer from JSON, add a <em>single</em> <code>#* @serializer</code> tag to your route definition; unlike body parsers, an endpoint can have only one serializer.</p><p>In the example below, the CSV serializer is used, and the extra arguments are passed along using <code>list(na = &quot;&quot;)</code>.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#* @get /mtcars.csv</span>
<span style="color:#60a0b0;font-style:italic">#* @serializer csv list(na = &#34;&#34;)</span>
<span style="color:#06287e">function</span>() {mtcars}</code></pre></div><details><summary> Tidy API </summary><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">pr</span>() <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">pr_get</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">/data&#34;</span>,
    <span style="color:#06287e">function</span>() {<span style="color:#06287e">as_attachment</span>(mtcars, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">custom.csv&#34;</span>)},
    serializer <span style="color:#666">=</span> plumber<span style="color:#666">::</span><span style="color:#06287e">serializer_csv</span>(na <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">&#34;</span>))</code></pre></div></details><h2 id="respond-with-an-attachment">Respond with an attachment</h2><p>If a user visited the endpoint in the previous section (<code>/mtcars.csv</code>) using their web browser, their browser would download <code>mtcars.csv</code>. In order to customize the downloaded filename, use the new <code>as_attachment()</code>. 
This allows you to decouple the name of the endpoint (e.g., <code>/data</code>) from the downloaded filename (<code>custom.csv</code>).</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># Download the response as a file named `custom.csv`</span>
<span style="color:#60a0b0;font-style:italic">#&#39; @get /data</span>
<span style="color:#60a0b0;font-style:italic">#&#39; @serializer csv</span>
<span style="color:#06287e">function</span>() {<span style="color:#06287e">as_attachment</span>(mtcars, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">custom.csv&#34;</span>)}</code></pre></div><details><summary> Tidy API </summary><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">pr</span>() <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">pr_get</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">/data&#34;</span>,
    <span style="color:#06287e">function</span>() {<span style="color:#06287e">as_attachment</span>(mtcars, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">custom.csv&#34;</span>)},
    serializer <span style="color:#666">=</span> plumber<span style="color:#666">::</span><span style="color:#06287e">serializer_csv</span>())</code></pre></div></details><h2 id="openapi-v3">OpenAPI v3</h2><p>With <code>{plumber}</code> v1.0.0, we upgraded both the API specification and the accompanying visual documentation to follow OpenAPI v3. Before v1.0.0, <code>{plumber}</code> used the Swagger 2.0 specification. 
Since then, the <a href="https://www.openapis.org/faq">Swagger 2.0 spec has been rebranded as OpenAPI Specification v2 and upgraded to v3</a>.</p><p>When running a <code>{plumber}</code> API interactively, you will see visual documentation similar to the screenshot below:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">plumb_api</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">plumber&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">04-mean-sum&#34;</span>) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">pr_run</span>(port <span style="color:#666">=</span> <span style="color:#40a070">8000</span>)</code></pre></div><p><img src="swagger_ui.png" alt="Swagger UI"></p><h3 id="trailing-slash">Redirect requests to include trailing slash</h3><p>We&rsquo;ve implemented a highly requested behavior: a request that does not end in a slash is redirected to the corresponding route with a trailing slash.</p><p>For example, let&rsquo;s pretend that the route <code>GET</code> <code>/example</code> does not exist on our API, but <code>GET</code> <code>/example/</code> does exist. If the API receives any request for <code>GET</code> <code>/example?a=1</code>, <code>{plumber}</code> will respond with a redirect to <code>GET</code> <code>/example/?a=1</code>.</p><p>The implementation details may change in a later release (such as internally redirecting to avoid a second request), but the intent of eventually executing the <em>slashed</em> route will remain.</p><p>To opt into this behavior, set the option <code>options_plumber(trailingSlash = TRUE)</code>. The current default behavior is <code>options_plumber(trailingSlash = FALSE)</code>. 
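</p><p>For instance, opting in at the top of an API definition might look like the following sketch (the <code>/status/</code> route and port number are made up for illustration; it assumes <code>options_plumber()</code> from <code>{plumber}</code> v1.0.0):</p>

```r
library(plumber)

# Opt in to the trailing-slash redirect behavior (the current default is FALSE).
options_plumber(trailingSlash = TRUE)

pr() %>%
  # Only the slashed route is defined; a request for GET /status?a=1
  # will be redirected to GET /status/?a=1.
  pr_get("/status/", function() list(ok = TRUE)) %>%
  pr_run(port = 8000)
```

<p>Because <code>pr_run()</code> blocks the R session while the API is serving, this sketch is meant to be run interactively or deployed, not sourced inside another script.</p><p>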
This default behavior will most likely change to <code>TRUE</code> with the next major release of <code>{plumber}</code>.</p><h2 id="custom">Advanced customization</h2><p>For advanced <code>{plumber}</code> developers, <code>{plumber}</code> provides tools to register your own request body parser (<a href="https://www.rplumber.io/reference/register_parser.html"><code>register_parser()</code></a>), response serializer (<a href="https://www.rplumber.io/reference/register_serializer.html"><code>register_serializer()</code></a>), and custom visual documentation of the <code>{plumber}</code> API. Visual documentation can be customized in two ways: <a href="https://www.rplumber.io/reference/register_docs.html"><code>register_docs()</code></a>, which allows you to easily get different UI styling (via packages such as <a href="https://cran.r-project.org/package=rapidoc">{rapidoc}</a>), and <a href="https://www.rplumber.io/reference/pr_set_api_spec.html"><code>pr_set_api_spec()</code></a>, which allows customization of the OpenAPI specification. To do the latter, provide either YAML or JSON (that conforms to the <a href="http://spec.openapis.org/oas/v3.0.3">OAS</a>) to <code>pr_set_api_spec()</code>.</p><h1 id="community-questions">Community Questions</h1><p>If you ever want to pose a general question or have a question about your <code>{plumber}</code> setup, post a question on <a href="https://community.rstudio.com/tag/plumber"> <img alt="RStudio Community `{plumber}` tag" src="https://img.shields.io/badge/community-plumber-blue?style=social&logo=rstudio&logoColor=75AADB" style="margin-bottom:-5px"> </a> using the <code>{plumber}</code> tag.</p><p>Many new features of <code>{plumber}</code> have come from community questions. 
Please keep them coming!</p><h1 id="learn-more">Learn more</h1><p>For more details on <code>{plumber}</code>'s recent releases (including bug fixes and other enhancements), please see the full <a href="https://www.rplumber.io/news/index.html">Changelog</a>.</p><p>Happy <code>plumb()</code>ing!</p></description></item><item><title>BI and Data Science: Collaboration Using Data Handoffs</title><link>https://www.rstudio.com/blog/bi-and-data-science-the-handoff/</link><pubDate>Thu, 25 Mar 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/bi-and-data-science-the-handoff/</guid><description><p>In <a href="https://www.rstudio.com/tags/bi-tools/" target="_blank" rel="noopener noreferrer">recent posts</a> we have explored how organizations can make better decisions by focusing on data science and BI collaboration. In this post, we will look at one type of collaboration accomplished through <em>data handoffs</em>, which we define as</p><blockquote><p>Datasets stored in databases that are created by data scientists and shared with BI analysts</p></blockquote><p>In future posts we will explore additional collaboration techniques that enable real-time interactions between BI tools and data science work.</p><h3 id="why-data-handoffs">Why Data Handoffs?</h3><p>Often data science teams are created to answer particularly hard questions. They work with large messy data, often from unstructured or novel sources, and then apply advanced analytical methods and statistical rigor. As part of this work, data science teams create visualizations, dashboards, and interactive applications to influence decisions. 
While data scientists can usually accomplish these tasks most effectively using reproducible code, they are typically resource-constrained and discover they can&rsquo;t:</p><ul><li>Address every question posed by the data.</li><li>Adapt results to each audience in the organization.</li><li>Satisfy audiences who wish to quickly explore the data themselves.</li></ul><p>In these scenarios, data scientists can use data handoffs to address these issues and leverage existing BI capabilities. By exporting the novel data sources, predictions, and calculated features they have created, data scientists can:</p><ul><li>Collaborate with larger BI teams.</li><li>Increase the visibility and re-use of their work.</li><li>Broaden and democratize access to advanced data.</li></ul><p>These benefits create a virtuous cycle. Data scientists can share novel data, and then BI teams can explore that data, identify new problems, and propose solutions that require further validation from the data science team.</p><h3 id="how-data-handoffs-work">How Data Handoffs Work</h3><p>Data scientists typically begin by building analytic notebooks that do the hard work of cleaning unstructured data, generating calculated features, applying model predictions, or pre-processing big data. These notebooks can conclude by writing final <a href="https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html" target="_blank" rel="noopener noreferrer">tidy datasets</a> to an analytics data warehouse. The notebooks are then deployed into a production environment where they can be monitored and scheduled. An illustrative workflow using data handoffs for the Washington, D.C. bikeshare program is shown below in Figure 1. 
The <a href="https://github.com/sol-eng/bike_predict" target="_blank" rel="noopener noreferrer">bike-predict GitHub repository</a> contains a more detailed description and all the code behind this workflow.</p><style type="text/css">img.screenshot { border: 0.5px solid #888; padding: 5px; background-color: #eee;}</style><figure><a href="./handoff3.jpeg" target="_blank" rel="noopener noreferrer"><img class="screenshot" src="handoff3.jpeg"></a><figcaption>Figure 1: Illustration of the use of handoff data for forecasting Washington, D.C. bikeshare</figcaption></figure><p>The R Markdown notebook in this example uses <a href="https://dbplyr.tidyverse.org/" target="_blank" rel="noopener noreferrer">dbplyr</a> to query data from a database, and then uses a trained xgboost model to create a forecast. The resulting forecast is written back to the database. This document is then deployed and scheduled on <a href="http://rstudio.com/connect" target="_blank" rel="noopener noreferrer">RStudio Connect</a>, which also supports scheduling Jupyter Notebooks. While this example focuses on creating batch model predictions, other common tasks could include <a href="https://spark.rstudio.com" target="_blank" rel="noopener noreferrer">data wrangling in Spark</a>, accessing novel data sources such as <a href="https://www.tidyverse.org/blog/2021/03/rvest-1-0-0/" target="_blank" rel="noopener noreferrer">web scraping</a>, <a href="https://www.tidymodels.org/tags/recipes/" target="_blank" rel="noopener noreferrer">feature generation</a>, or <a href="https://rich-iannone.github.io/pointblank/" target="_blank" rel="noopener noreferrer">advanced data verification</a>.</p><p>Data scientists are able to create analytic pipelines that generate rich and tidy data. This activity often complements the existing work of data engineering teams responsible for wrangling data across the entire organization.</p><p>Once the data is written to the database it becomes easily accessible to the BI team. 
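</p><p>As a minimal sketch of such a handoff (the connection details, table names, <code>city</code> column, and <code>trained_model</code> object are all hypothetical; it assumes the <code>DBI</code>, <code>odbc</code>, <code>dplyr</code>, and <code>dbplyr</code> packages):</p>

```r
library(DBI)
library(dplyr)
# dbplyr is assumed to be installed; dplyr uses it behind the scenes
# to translate verbs applied to tbl() objects into SQL.

# Connect to the analytics warehouse (placeholder DSN).
con <- dbConnect(odbc::odbc(), dsn = "analytics_warehouse")

# Query the raw rides lazily; the filter runs in the database and
# only the result is pulled into R by collect().
rides <- tbl(con, "bike_rides") %>%
  filter(city == "washington_dc") %>%
  collect()

# Score the rides with a previously trained model (not shown here).
rides$predicted_demand <- predict(trained_model, newdata = rides)

# The handoff itself: write the tidy, scored dataset back to the
# warehouse, where BI tools such as Tableau or Power BI can query it.
dbWriteTable(con, "bike_demand_predictions", rides, overwrite = TRUE)

dbDisconnect(con)
```

<p>Scheduling a notebook that ends this way keeps the handoff table refreshed without any manual steps.</p><p>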
BI tools such as Tableau or Power BI have robust support for querying relational databases that contain tidy data. BI analysts can access the tidy data and conduct exploratory analysis or generate visualizations and dashboards that can be broadly consumed by stakeholders. BI admins can set up data extracts to occur on a scheduled basis, ensuring that newly processed data from the data science team is automatically available to BI users. Dashboards and workbooks that are tied to these data extracts can register updates automatically and take advantage of the latest BI features such as <a href="https://www.tableau.com/about/blog/2020/2/introducing-dynamic-parameters-viz-animations-buffer-calcs" target="_blank" rel="noopener noreferrer">dynamic parameters</a>.</p><figure><a href="./tableau-extract3.png" target="_blank" rel="noopener noreferrer"><img class="screenshot" src="./tableau-extract3.png"></a><figcaption>Figure 2: A scheduled Tableau data extraction</figcaption></figure><h3 id="the-pros-and-cons-of-data-handoffs">The Pros and Cons of Data Handoffs</h3><p>The above example demonstrates the flexibility of data handoffs. 
However, like any technology, data handoffs have their advantages and disadvantages, as shown in Table 1 below.</p><style type="text/css">th.Approach { width: 24%; }th.Pros { width: 38%; vertical-align: middle; }th.Cons { width: 38%; vertical-align: middle; }table thead th {border-bottom: 1px solid #ddd;}th {font-size: 90%;background-color: #4D8DC9;color: #fff;vertical-align: middle}td {font-size: 80%;background-color: #F6F6FF;vertical-align: top;line-height: 16px;}caption {padding: 0 0 16px 0;}table {width: 100%;}th.problem {width: 15%;}th.solution {width: 15%;}th.proscons {width: 35%;}th.options {width: 35%;}div.action {padding: 0 0 16px 0;}div.procon {padding: 0 0 0 0;}td.ul {padding: 0 0 0 0;margin-block-start: 0em;}table {border-top-style: hidden;border-bottom-style: hidden;border-collapse: separate;text-indent: initial;border-spacing: 2px;}table>thead>tr>th, .table>thead>tr>th {font-size: 0.7em !important;}table>tbody>tr>td {line-height: inherit;vertical-align: baseline;}table tbody td, td.approach {font-size: 14px;}</style><figure><table><thead><tr><th class="Approach"></th><th class="Pros"> Pros </th><th class="Cons"> Cons </th></tr></thead><tr><td class="approach"><strong>Data Handoff Approach</strong></td><td><ul><li>The data hand-off technique allows ready access from tools in both the data science and BI stack. R, Python, Tableau, and Power BI all support reading and writing to databases, which means setup is easy and ongoing maintenance is limited.</li><li>Data handoffs cleanly isolate interactions between tools, allowing developers to collaborate quickly while still making it easy to troubleshoot errors.</li></ul></td><td><ul><li>Data handoffs rely on batch schedules, which aren't ideal for data workflows that require near real-time data updates.</li><li>The flow of data in this approach is uni-directional, starting with the data science team and flowing into the BI tool. 
Bi-directional data flows would require additional techniques.</li></ul></td></tr></table><figcaption style = "font-size: 90%; caption-side:bottom; text-align:left">Table 1: Summary of some of the pros and cons of data handoffs.</figcaption></figure><h2 id="to-learn-more">To Learn More</h2><p>Future posts will address how data scientists and BI teams can overcome the limitations of the data handoff technique, and <a href="https://blog.rstudio.com/2021/03/04/bi-and-ds-part1/" target="_blank" rel="noopener noreferrer">prior posts</a> explore the relationship between BI and data science in more detail. If you&rsquo;d like to learn more about how RStudio products can help augment and complement your BI approaches, you can <a href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio" target="_blank" rel="noopener noreferrer">set up a meeting with our Customer Success team</a>.</p></description></item><item><title>BI and Data Science: Matching Approaches to Applications</title><link>https://www.rstudio.com/blog/bi-and-data-science-the-tradeoffs/</link><pubDate>Thu, 18 Mar 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/bi-and-data-science-the-tradeoffs/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@jamie452?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Jamie Street</a> on <a href="https://www.rstudio.com/s/photos/match?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup></p><p>In the previous posts in our series on Data Science and Business Intelligence, we first discussed how <a href="https://blog.rstudio.com/2021/03/04/bi-and-ds-part1/" target="_blank" rel="noopener noreferrer">data science can either complement or augment self-service BI tools</a> to deliver more combined value. 
We then explored the <a href="https://blog.rstudio.com/2021/03/11/bi-and-ds2-strengths-challenges/" target="_blank" rel="noopener noreferrer">strengths and challenges</a> of the two approaches, both of which aim to help an organization get more insights from their data and to make better decisions.</p><p>In this post, we&rsquo;ll provide insights from organizations who have used both types of tools and give some guidance about which you should use when. We&rsquo;ll also set the stage for future blog posts where we will explore specific integration points for BI and Data Science tools.</p><h2 id="dont-get-trapped-into-a-false-choice">Don&rsquo;t Get Trapped into a False Choice</h2><p>In our prior post, we explored the <a href="https://blog.rstudio.com/2021/03/11/bi-and-ds2-strengths-challenges/" target="_blank" rel="noopener noreferrer">strengths and challenges</a> of both BI tools and open source data science. We won&rsquo;t repeat those arguments here. Instead, we&rsquo;ll hear from users who seem to understand that both approaches have their place.</p><p>BI tools are often an easier place for an organization to start when approaching an analytic problem. They provide a lower barrier to entry for the typical business user, who may not be comfortable coding in R or Python. The built-in features make it easy to visualize, explore, and analyze data using a point-and-click approach and then to share that analysis with others.</p><p>For example, this user prefers Power BI for creating quick and easy visualizations, but switches to R and Shiny for their highly interactive user interfaces.</p><blockquote><p><em>&ldquo;Power BI is an easy to build visualization tool widely used in our organization to make data accessible to non-data people. This is a really great tool when we want to create a dashboard for trends and track some metrics. But it becomes very difficult when we want to enable high levels of user interactivity with the dashboard. 
That&rsquo;s where R Shiny helped us to build intuitive and highly interactive user interfaces.&quot;</em></p><p> &ndash; <a href="https://www.trustradius.com/reviews/rstudio-2020-11-19-19-05-28" target="_blank" rel="noopener noreferrer">A marketer at a large telecommunications firm</a></p></blockquote><p>Meanwhile, this biotech firm views Spotfire and Tableau as fine products so long as you are satisfied with their built-in capabilities, but sees R as more flexible.</p><blockquote><p>&ldquo;<em>RStudio is code based, so in the beginning tools like Spotfire and Tableau have [their] advantages since many things are already built in, but in terms of flexibility RStudio will win over the longer term.</em>&rdquo;</p><p> &ndash; <a href="https://www.trustradius.com/reviews/rstudio-2020-12-01-23-31-23" target="_blank" rel="noopener noreferrer">A team lead in a biotech company</a></p></blockquote><p>The individuals below describe how they apply this flexibility and power from two different industry perspectives. The first is from a financial industry leader.</p><blockquote><p><em>&ldquo;Most of the work the data scientists did used the R language. They did a great job satisfying management&rsquo;s constant barrage of questions because iterative analysis is so easy with tools like R, and the powerful visualization tools made communication of results easy for sales people to grasp. As the CEO, I was gratified at how clear the presentations were and at how quickly presenters answered my difficult questions, in some cases on the fly during the presentations.</em></p><p><em>As an R user myself, I know its code-based workflow lends itself to rapid iteration while, at the same time, documenting the process used. 
It was easy to unroll the tape to see every step that led to any conclusion.&quot;</em></p><p> <a href="https://blog.rstudio.com/2020/10/13/open-source-data-science-in-investment-management/" target="_blank" rel="noopener noreferrer">&ndash; Art Steinmetz</a>, former Chairman and CEO of Oppenheimer Funds</p></blockquote><p>The second individual describes how he uses R in the beverages industry:</p><blockquote><p><em>&ldquo;The R ecosystem has vast power to quickly solve problems. With R, I can incorporate nearly any AI/ML model into a dashboard or Shiny app, without being reliant on proprietary data science tools. Executives can be confident I am using the best analytic approach for a given problem, and I can rapidly apply new approaches as they become available.&quot;</em></p><p> &ndash; Paul Ditterline, Director of Data Science at Heaven Hill Brands</p></blockquote><p>While these may be only anecdotal evidence, they do show awareness of both approaches to data analysis and provide some color into why companies opt for each solution. They illustrate that as the questions get more complex, requiring greater analytic depth to answer, and more customization in how the analysis is done and presented, BI tools may struggle. Users will encounter a relatively low ceiling to the complexity of questions they can answer.</p><p>On the other hand, code-friendly data science tools represent a relatively high barrier to entry. They require those who create the analyses to have some understanding of coding in R and Python, and familiarity with applying and interpreting advanced analytic methods to get the most out of the tools. 
However, the flexibility and analytic breadth of code-friendly data science combines to provide a very high ceiling for answering difficult, valuable questions for an organization.</p><p>This just leaves open the question, &ldquo;How should I select my approach?&rdquo;</p><h2 id="match-your-data-science-approach-to-application-needs">Match Your Data Science Approach to Application Needs</h2><p>We expect firms to continue struggling with this tradeoff between BI tools and open source data science for years to come. As we argued in our first post on the topic, this isn&rsquo;t about choosing between the two approaches, but how to exploit the strengths of each while mitigating their challenges.</p><p>In the table below, the <em>Use When You&hellip;</em> column augments the table we presented last week. While this guide won&rsquo;t be correct for every case, it at least provides a guideline for those times a data science leader needs a quick answer to an urgent project.</p><style type="text/css">p { padding: 0 0 8px 0; }th { font-size: 90%; background-color: #4D8DC9; color: #fff; vertical-align: middle; }td { font-size: 80%; background-color: #F6F6FF; vertical-align: top; line-height: 16px; }td.approach { font-size: 90%; background-color: #4D8DC9; color: #fff; vertical-align: middle; }caption { padding: 0 0 0 0; }table { width: 100%; padding: 0 0 16px 0; }th.approach { width: 16%; }th.strengths { width: 28%;; vertical-align: middle; }th.challenges { width: 28%; vertical-align: middle; }th.use { width: 28%; vertical-align: middle; }table { border-top-style: hidden; border-bottom-style: hidden;}</style><table><tr><th class="approach"></th><th class="strengths"><strong>Strengths</strong></th><th class="challenges"><strong>Challenges</strong></th><th class="use"><strong>Use When You...</strong></th></tr><tr><td class="approach"><strong>Self-service BI Tools</strong></td><td><ul><li>Explore and visualize data without coding skills</li><li>Share analyses and interactive 
dashboards</li><li>Do self-service reporting and scheduling</li><li>Support data-driven organizations</li></ul></td><td><ul><li>Are difficult to adapt and inspect</li><li>Are limited by their black-box nature</li><li>Struggle with enriched or wide data</li><li>Create uncertain conclusions</li><li>Include limited data science and machine learning capabilities</li><li>Require skills that aren't easily transferred</li></ul></td><td><ul><li>Must support analysis and sharing with people without coding skills</li><li>Want to produce descriptive analytics and general reporting</li><li>Know that your use is covered by your BI tool's feature set</li></ul></td></tr><tr><td class="approach"><strong>Open Source Data Science</strong></td><td><ul><li>Provide a wide range of open source capabilities</li><li>Unlock the benefits of code</li><li>Allow fully customizable data products</li><li>Have broad interoperability</li><li>Create transferable skills and analyses</li><li>Tap a wider pool of potential talent</li></ul></td><td><ul><li>Necessitate coding in R or Python</li><li>May require package and environment management</li><li>Provide limited native deployment capabilities</li><li>Don't include enterprise security, scalability, and cloud features</li></ul></td><td><ul><li>Need flexibility to tackle novel problems</li><li>Expect the analysis to be reused and will need to be reproducible without the code creator</li><li>Need to solve harder questions, which require data science and ML on complex data</li><li>Must support complex decision-making with deep interactivity</li></ul></td></tr></table><p><em>Table 1: Guidelines for when you should apply BI tools or open source data science.</em></p><h2 id="summary">Summary</h2><p>RStudio is <a href="https://rstudio.com/about/what-makes-rstudio-different/" target="_blank" rel="noopener noreferrer">dedicated to the proposition</a> that code-friendly data science is uniquely powerful, and that everyone can learn to code. 
We support this through <a href="https://education.rstudio.com/" target="_blank" rel="noopener noreferrer">our education efforts</a>, our <a href="https://community.rstudio.com/" target="_blank" rel="noopener noreferrer">Community site</a>, and making R easier to use through our open source projects such as the <a href="https://www.tidyverse.org/" target="_blank" rel="noopener noreferrer">tidyverse</a>. Our software is already used by millions of people to analyze data every day.</p><p>However, code-friendly data science does present a higher barrier to entry compared to BI tools, which are very valuable for the wider community of analysts and business users in an organization. Because of this, it is critical to leverage both, and use data science to augment and complement your BI tools.</p><p>In our next posts, we will explore specific points of integration between these tools. We&rsquo;re happy to help you explore these topics, so if you&rsquo;d like to learn more about how RStudio products can help augment and complement your BI approaches, you can <a href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio" target="_blank" rel="noopener noreferrer">set up a meeting with our Customer Success team</a>.</p><h2 id="to-learn-more">To Learn More</h2><ul><li>See the <a href="https://blog.rstudio.com/2021/03/11/bi-and-ds2-strengths-challenges/" target="_blank" rel="noopener noreferrer">second blog post in our BI series</a> for more information on how RStudio tackles the challenges of open source data science listed in the table above. 
<a href="https://rstudio.com/products/team/" target="_blank" rel="noopener noreferrer">RStudio Team</a> provides security, scalability, package management and the centralized management of development and deployment environments, delivering the enterprise features many organizations require.</li><li>Read this recent interview for more information on <a href="https://blog.rstudio.com/2020/11/17/an-interview-with-lou-bajuk/" target="_blank" rel="noopener noreferrer">Why RStudio focuses on code-friendly data science</a>.</li><li>For more information on what RStudio is doing to make deep learning and AI available in the R ecosystem, see the <a href="https://blogs.rstudio.com/ai/" target="_blank" rel="noopener noreferrer">RStudio AI blog</a>.</li><li>Explore the enterprise value of an open source, code-friendly approach in our blog post series, <a href="https://blog.rstudio.com/2020/06/24/delivering-durable-value/" target="_blank" rel="noopener noreferrer">importance and benefits of Serious Data Science</a>.</li></ul></description></item><item><title>R in Pharma with ProCogia X-Session Recordings are Now Available</title><link>https://www.rstudio.com/blog/r-in-pharma-with-procogia-x-session-recordings-are-now-available/</link><pubDate>Tue, 16 Mar 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-in-pharma-with-procogia-x-session-recordings-are-now-available/</guid><description><p>We’re excited to announce that all rstudio::global(2021) X-Session recordings are now available for you to watch.</p><p>During the week of rstudio::global(2021), one of our Full Service Partners, <a href="https://www.procogia.com/" target="_blank" rel="noopener noreferrer">ProCogia</a>, teamed up with the <a href="https://www.pharmar.org/" target="_blank" rel="noopener noreferrer">R/Pharma</a> organization to offer three hours of sponsored, hands-on material covering all aspects of pharmaceutical data science processes. 
With guest speakers from Novartis, Janssen, Biogen, and more, topics included everything from clinical trial processes to the R Package Validation Framework.</p><h2 id="rstudio-investments-in-pharma---sean-lopp">RStudio Investments in Pharma - Sean Lopp</h2><script src="https://fast.wistia.com/embed/medias/sgcsdmq69t.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_sgcsdmq69t videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/sgcsdmq69t/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>Sean Lopp discusses RStudio updates for the Pharma community including: new validation guidance for RStudio software and packages, Bioconductor support in RStudio Package Manager, and a demo of the <code>gt</code> package for creating beautiful and precise tables in R.</p><p><a href="https://rstudio.com/resources/rstudioglobal-2021/rstudio-investments-in-pharma/" target="_blank" rel="noopener noreferrer">Learn more.</a></p><h2 id="the-rpharma-organization---harvey-lieberman">The R/Pharma Organization - Harvey Lieberman</h2><script src="https://fast.wistia.com/embed/medias/hdblqcyc78.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed 
wistia_async_hdblqcyc78 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/hdblqcyc78/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p><a href="https://www.pharmar.org/" target="_blank" rel="noopener noreferrer">R/Pharma</a> is an organization of R enthusiasts who work in the pharma and biotech industries. This presentation summarizes the group and presents some goals for 2021.</p><p><a href="https://rstudio.com/resources/rstudioglobal-2021/r-pharma/" target="_blank" rel="noopener noreferrer">Learn more.</a></p><h2 id="risk-assessment-tools-r-validation-hub-initiatives---marly-gotti">Risk Assessment Tools: R Validation Hub Initiatives - Marly Gotti</h2><script src="https://fast.wistia.com/embed/medias/gq0zkcet3p.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_gq0zkcet3p videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/gq0zkcet3p/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>In this talk, Marly presents some of the resources and tools the R Validation Hub has been working on to aid the biopharmaceutical industry in the process of using R in a 
regulatory setting.</p><p><a href="https://rstudio.com/resources/rstudioglobal-2021/risk-assessment-tools-r-validation-hub-initiatives/" target="_blank" rel="noopener noreferrer">Learn more.</a></p><h2 id="r-in-janssen-drug-discovery-statistics---volha-tryputsen">R in Janssen Drug Discovery Statistics - Volha Tryputsen</h2><script src="https://fast.wistia.com/embed/medias/i9dbsvzi8r.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_i9dbsvzi8r videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/i9dbsvzi8r/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>This talk discusses how R is utilized in the Janssen drug discovery statistics workflow.</p><p><a href="https://rstudio.com/resources/rstudioglobal-2021/r-in-janssen-drug-discovery-statistics/" target="_blank" rel="noopener noreferrer">Learn more.</a></p><h2 id="r-package-validation-framework---ellis-hughes">R Package Validation Framework - Ellis Hughes</h2><script src="https://fast.wistia.com/embed/medias/gytgay4xjh.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_gytgay4xjh videoFoam=true" style="height:100%;position:relative;width:100%"><div 
class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/gytgay4xjh/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>By using tools native to the R package building infrastructure, validation can become an integrated part of your package development, improving the quality of both the package and validation.</p><p><a href="https://rstudio.com/resources/rstudioglobal-2021/r-package-validation-framework/" target="_blank" rel="noopener noreferrer">Learn more.</a></p><h2 id="r-in-pharma-intro-to-shiny---mike-garcia">R in Pharma: Intro to Shiny - Mike Garcia</h2><script src="https://fast.wistia.com/embed/medias/2t1w03on9j.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_2t1w03on9j videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/2t1w03on9j/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>In this introduction to Shiny app development, Mike begins with a quick review of visualization with <code>ggplot2</code> and then covers core concepts in app structure and reactive programming.</p><p><a href="https://rstudio.com/resources/rstudioglobal-2021/r-in-pharma-intro-to-shiny/" target="_blank" rel="noopener noreferrer">Learn 
more.</a></p></description></item><item><title>Mastering Shiny with Appsilon X-Session Recordings are Now Available</title><link>https://www.rstudio.com/blog/mastering-shiny-with-appsilon-x-session-recordings-are-now-available/</link><pubDate>Mon, 15 Mar 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/mastering-shiny-with-appsilon-x-session-recordings-are-now-available/</guid><description><p>We’re excited to announce that all rstudio::global(2021) X-Session recordings are now available for you to watch.</p><p>Leading up to rstudio::global, we teamed up with one of our Full Service Partners, <a href="https://appsilon.com/" target="_blank" rel="noopener noreferrer">Appsilon</a>, to offer a sponsored, hands-on experience covering everything from styling Shiny applications to scaling them out to thousands of users.</p><p><a href="https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/" target="_blank" rel="noopener noreferrer">X-Sessions</a> are 3-hour themed events made up of tool shares, case studies, and live hands-on sessions. 
The material in each session is organized by industry, with the goal of creating smaller, unique experiences tailored to specific use cases.</p><h2 id="theming-shiny--rmarkdown-with-thematic-and-bslib---tom-mock--shannon-hagerty">Theming Shiny &amp; RMarkdown with <code>thematic</code> and <code>bslib</code> - Tom Mock &amp; Shannon Hagerty</h2><script src="https://fast.wistia.com/embed/medias/j0stmrbq6j.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_j0stmrbq6j videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/j0stmrbq6j/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>This presentation covers the basics of how the <code>bslib</code> and <code>thematic</code> packages can be used to consistently style all the components of a Shiny app at once.</p><p><a href="https://rstudio.com/resources/rstudioglobal-2021/theming-shiny-and-rmarkdown-with-thematic-and-bslib/" target="_blank" rel="noopener noreferrer">Learn more.</a></p><h2 id="styling-shiny-with-css--sass-and-speeding-up-shiny-apps---pedro-silva">Styling Shiny with CSS &amp; SASS and Speeding Up Shiny Apps - Pedro Silva</h2><script src="https://fast.wistia.com/embed/medias/0renaho46n.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" 
style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_0renaho46n videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/0renaho46n/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>In this talk, Pedro discusses how to use Cascading Style Sheets (CSS) to give your application a fresh and unique look, while keeping your codebase clean and organized with Syntactically Awesome Style Sheets (SASS). He then discusses how to use Shiny <code>update</code> functions, proxy objects, and JavaScript messages to speed up your dashboards.</p><p><a href="https://rstudio.com/resources/rstudioglobal-2021/styling-shiny-with-css-and-sass-and-speeding-up-shiny-apps/" target="_blank" rel="noopener noreferrer">Learn more.</a></p><h2 id="appsilons-guide-to-working-with-open-source-shiny---dominik-krzemiński">Appsilon’s Guide to Working with Open Source Shiny - Dominik Krzemiński</h2><script src="https://fast.wistia.com/embed/medias/k5sgpyzs0a.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_k5sgpyzs0a videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/k5sgpyzs0a/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" 
aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>This presentation discusses some best-practices for contributing to Shiny along with some helpful Shiny extensions.</p><p><a href="https://rstudio.com/resources/rstudioglobal-2021/guide-to-working-in-open-source-shiny/" target="_blank" rel="noopener noreferrer">Learn more.</a></p><h2 id="best-practices-for-developing-shiny-apps---olga-mierzwa-sulima">Best Practices for Developing Shiny Apps - Olga Mierzwa-Sulima</h2><script src="https://fast.wistia.com/embed/medias/atrqx4owu1.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_atrqx4owu1 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/atrqx4owu1/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>In this presentation, Olga Mierzwa-Sulima presents some of the best practices for developing Shiny apps. 
These practices include organizing an application&rsquo;s code with modules and R6 classes, setting up a development environment, and testing the resulting Shiny app.</p><p><a href="https://rstudio.com/resources/rstudioglobal-2021/best-practices-for-developing-shiny-apps/" target="_blank" rel="noopener noreferrer">Learn more.</a></p><h2 id="scaling-shiny-to-thousands-of-users---damian-rodziewicz">Scaling Shiny to Thousands of Users - Damian Rodziewicz</h2><script src="https://fast.wistia.com/embed/medias/446h3e5nlt.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_446h3e5nlt videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/446h3e5nlt/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>You made a Shiny app, and now everyone wants to use it. How can you scale your app so that hundreds or thousands of users can have a seamless experience? 
Damian Rodziewicz shows that you must approach Shiny apps differently to ensure they scale.</p><p><a href="https://rstudio.com/resources/rstudioglobal-2021/scaling-shiny-to-thousands-of-users/" target="_blank" rel="noopener noreferrer">Learn more.</a></p><h2 id="empowering-data-scientists-to-build-spectacular-shiny-apps---filip-stachura--marek-rogala">Empowering Data Scientists to Build Spectacular Shiny Apps - Filip Stachura &amp; Marek Rogala</h2><script src="https://fast.wistia.com/embed/medias/0r9wqpai1c.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_0r9wqpai1c videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/0r9wqpai1c/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>In this talk, <a href="https://appsilon.com/" target="_blank" rel="noopener noreferrer">Appsilon</a>'s CEO and CTO discuss the challenges facing Shiny app developers and the crucial steps they must take to achieve success. 
They explain three Appsilon initiatives to empower data scientists to build great Shiny apps and how those developers can use the <code>shiny.fluent</code> package to speed that process.</p><p><a href="https://rstudio.com/resources/rstudioglobal-2021/empowering-data-scientists-to-build-spectacular-shiny-apps/" target="_blank" rel="noopener noreferrer">Learn more.</a></p></description></item><item><title>BI and Open Source Data Science: Strengths and Challenges</title><link>https://www.rstudio.com/blog/bi-and-ds2-strengths-challenges/</link><pubDate>Thu, 11 Mar 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/bi-and-ds2-strengths-challenges/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@romankraft?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Roman Kraft</a> on <a href="https://unsplash.com/s/photos/balance?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Unsplash</a></sup></p><p>In <a href="https://blog.rstudio.com/2021/03/04/bi-and-ds-part1/" target="_blank" rel="noopener noreferrer">our first post in this series</a>, we started examining a critical aspect of <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/" target="_blank" rel="noopener noreferrer">interoperability</a>: the intersection between Business Intelligence (BI) and data science platforms. The two approaches share a common goal: delivering rich interactive applications and dashboards that can be shared with others to improve their decision-making. 
However, this common purpose often leads to the tools (and the teams that support and use them) being seen as competitors for software budgets and executive mindshare in a large organization.</p><p>In the previous post, we reviewed two high-level approaches for combining these tools to deliver increased value to an organization: using data science to either <strong>complement</strong> or <strong>augment</strong> self-service BI. That is, using the tools either side by side to tackle different use cases, or together to tackle a single analytic problem.</p><p>In this post, we&rsquo;ll focus on the strengths and challenges of the two approaches, to help you identify which tool to use in different situations.</p><h2 id="strengths-and-challenges-of-self-service-bi"><strong>Strengths and Challenges of Self-Service BI</strong></h2><p>Self-Service BI tools, such as Tableau, Power BI, or Spotfire, are widely used because they allow business analysts to:</p><ul><li><strong>Explore and visualize data without coding skills or being dependent on data scientists:</strong> While these business users may not understand R, Python, or advanced modeling methods, they are typically very familiar with the data and the business problems they are trying to solve. These BI tools enable them to apply that knowledge.</li><li><strong>Share analyses and interactive dashboards:</strong> Another key strength of BI tools is that users can easily share these analyses with others, typically without relying on IT for deployment. More advanced users with specialized skills can create interactive dashboards and applications, enabling other users to apply the same analytic approach in the future.</li><li><strong>Do self-service reporting and scheduling:</strong> Many organizations require consistent, visually appealing reports on a regular schedule, for both internal stakeholders and clients. 
BI tools usually provide a way to schedule report updates and then notify stakeholders of those updates.</li><li><strong>Support data-driven organizations:</strong> When these tools are adopted as a corporate standard and widely deployed, they provide a common platform for sharing insights and supporting decision-making. That common platform helps support a data-driven culture in an organization.</li></ul><p>Despite the strengths of BI tools, they also present challenges that may not be obvious at first glance. BI tools:</p><ul><li><strong>Are difficult to adapt and inspect:</strong> Analyses and visualizations are typically heavily tied to the specific data schema they are built on, making them difficult to adapt when the underlying data changes substantially. Data transformations are often obscured in a series of point-and-click actions, which reinforces this challenge. This makes extending an analysis to new data difficult, errors difficult to find, and processes challenging to audit.</li><li><strong>Are limited by their black box nature:</strong> While modern BI tools have a wide range of visualizations and some basic statistical tools, they are largely constrained by the proprietary capabilities their vendors implement. Going beyond these standard options often requires heroic effort, such as embedding custom JavaScript visualizations or custom extension development in C++. Similarly, these tools have limited workflow and application development capabilities, so automating a series of steps for data retrieval and transformation can be difficult. Tasks such as integrating specialized data sources (e.g., web scraping) can be impossible with standard functionality.</li><li><strong>Struggle with enriched or wide data:</strong> While data access is a major focus, these tools typically provide limited ways to interact with &ldquo;enriched&rdquo; or unstructured data. BI tools may also have challenges when dealing with many variables. 
Interactive visualization of wide data sets can overwhelm users, who may be uncertain which variables are most relevant. This typically requires the application of advanced analytic methods, such as eliminating correlated columns or principal component analysis, to reduce the data.</li><li><strong>Create uncertain conclusions:</strong> Humans are hardwired to see patterns and create explanations for them&ndash;even if they are not real. It can be very difficult for a BI user to know if an apparent pattern can be relied upon for future decision making. In some cases, it can be difficult to draw any conclusions at all without the application of more advanced analytic methods.</li><li><strong>Include limited data science and machine learning capabilities:</strong> BI platforms typically have native support only for very basic predictive/ML models. It can also be very difficult to embed the work done by a data science team in the BI product, since that often requires data scientists to work with unfamiliar development environments and limited integration points. This slows the process, hampers iteration and reduces productivity.</li><li><strong>Require skills that aren&rsquo;t easily transferred:</strong> Getting the most out of these BI tools, especially creating reusable analyses, requires specialized skills developed over time. If the analyst moves to another organization, or their organization decides not to renew expensive commercial software, these platform-specific skills are wasted. Similarly, if you wish to share your analyses with colleagues who may not have access to the tool, it will be difficult for them to run the analysis themselves. 
Or if they can, the analyses may be difficult to reuse, adapt and inspect, as described above.</li></ul><h2 id="strengths-and-challenges-of-code-friendly-data-science"><strong>Strengths and Challenges of Code-Friendly Data Science</strong></h2><p>When compared with self-service BI tools, open source data science tools using R and Python provide:</p><ul><li><strong>A wide range of open source capabilities:</strong> Users can draw on a broad spectrum of capabilities, ranging from classical models to cutting-edge deep learning techniques. These expansive libraries ensure that data scientists will always have the right tool for the analytic problem at hand.</li><li><strong>The benefits of code:</strong> Code-based approaches are inherently reusable, extensible and inspectable. Changes are easy to track over time using version control. These aspects of code actually reduce complexity compared to point-and-click solutions. Reproducible code becomes core intellectual property for your organization, making it easier to solve new problems in the future and increase the aggregate value of your data science work.</li><li><strong>Fully customizable data products:</strong> Code-based solutions allow you to fully customize reports, dashboards, visualizations, and applications, allowing you to tailor them to the needs of your decision-makers. In addition, these data products are built using the same languages that data scientists already know, instead of requiring them to learn a new framework.</li><li><strong>Broad interoperability</strong>: As discussed <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/" target="_blank" rel="noopener noreferrer">in this blog post</a>, teams need interoperable tools that give a data scientist direct access to different platforms and tools. This access is critical because it keeps data scientists more productive and helps ensure better utilization of IT and data resources. 
The ability to use data in many different formats, including unstructured and non-traditional data, makes it far easier to enrich analyses and reports.</li><li><strong>Transferable skills and analyses:</strong> When you use open source as the core of your data science, you are not constrained by commercial platforms. This means that you can use your hard-won skills at any organization, regardless of what software they purchase, and you can share your analyses and insights with anyone, regardless of what software they can afford. This premise is the heart of <a href="https://rstudio.com/about/what-makes-rstudio-different/" target="_blank" rel="noopener noreferrer">RStudio&rsquo;s mission to support open source data science</a>.</li><li><strong>A wider pool of potential talent:</strong> R and Python are widely adopted and taught almost universally in colleges and universities. This allows organizations to draw on a much wider pool of knowledgeable data scientists when hiring. They can be confident that these new hires will already be familiar with the languages and become productive members of the organization much more quickly.</li></ul><p>Despite these strengths, teams that adopt open source, code-friendly data science do encounter a number of challenges. Open source data science tools:</p><ul><li><strong>Necessitate coding skills</strong>: Business users who are used to Excel often find code-based approaches foreign and inaccessible and therefore hesitate to deploy them. While Shiny applications and other data products can easily be used by stakeholders unfamiliar with R or Python, these data products require someone familiar with the languages to develop them.</li><li><strong>May require package and environment management:</strong> A key strength of the R and Python ecosystems is the broad universe of packages. 
However, unless data science teams make special efforts, this rapidly evolving ecosystem can make it difficult to maintain stable, reproducible applications as those packages change over time.</li><li><strong>Provide limited native deployment capabilities:</strong> Open source data science teams often must create their own ways to deploy and share applications and dashboards to their community of users. These homegrown solutions can be difficult to develop and maintain and may run into objections from IT groups.</li><li><strong>Don&rsquo;t include enterprise security, scalability and cloud features:</strong> Similarly, R and Python do not provide many enterprise-required features as part of the open source ecosystem. Organizations frequently struggle to support large teams, whether for development or deployment, on premise or in the cloud.</li></ul><style type="text/css">p { padding: 0 0 8px 0; }th { font-size: 90%; background-color: #4D8DC9; color: #fff; vertical-align: middle; }td { font-size: 80%; background-color: #F6F6FF; vertical-align: top; line-height: 16px; }td.approach { font-size: 90%; background-color: #4D8DC9; color: #fff; vertical-align: middle; }caption { padding: 0 0 0 0; }table { width: 100%; padding: 0 0 16px 0; }th.approach { width: 24%; }th.strengths { width: 38%; vertical-align: middle; }th.challenges { width: 38%; vertical-align: middle; }table thead th {border-bottom: 1px solid #ddd;}th {font-size: 90%;background-color: #4D8DC9;color: #fff;vertical-align: center}td {font-size: 80%;background-color: #F6F6FF;vertical-align: top;line-height: 16px;}caption {padding: 0 0 16px 0;}table {width: 100%;}th.problem {width: 15%;}th.solution {width: 15%;}th.proscons {width: 35%;}th.options {width: 35%;}div.action {padding: 0 0 16px 0;}div.procon {padding: 0 0 0 0;}td.ul {padding: 0 0 0 0;margin-block-start: 0em;}table {border-top-style: hidden;border-bottom-style: hidden;border-collapse: separate;text-indent: initial;border-spacing: 
2px;}table>thead>tr>th, .table>thead>tr>th {font-size: 0.7em !important;}table>tbody>tr>td {line-height: inherit;vertical-align: baseline;}table tbody td, td.approach {font-size: 14px;}</style><div class="text-center mt-5"><caption><b>Table 1: Summary of the strengths and challenges of using Self-Service BI and open source data science tools.</b></caption></div><table><thead><tr><th></th><th class="strengths"> Strengths </th><th class="challenges"> Challenges </th></tr></thead><tr><td class="approach"><strong>Self-service BI Tools</strong></td><td><ul><li>Explore and visualize data without coding skills</li><li>Share analyses and interactive dashboards</li><li>Do self-service reporting and scheduling</li><li>Support data-driven organizations</li></ul></td><td><ul><li>Are difficult to adapt and inspect</li><li>Are limited by their black box nature</li><li>Struggle with enriched or wide data</li><li>Create uncertain conclusions</li><li>Include limited data science and machine learning capabilities</li><li>Require skills that aren't easily transferred</li></ul></td></tr><tr><td class="approach"><strong>Open Source Data Science Tools</strong></td><td><ul><li>Provide a wide range of open source capabilities</li><li>Unlock the benefits of code</li><li>Allow fully customizable data products</li><li>Have broad interoperability</li><li>Create transferable skills and analyses</li><li>Tap a wider pool of potential talent</li></ul></td><td><ul><li>Necessitate coding in R or Python</li><li>May require package and environment management</li><li>Provide limited native deployment capabilities</li><li>Don't include enterprise security, scalability and cloud features</li></ul></td></tr></table><h2 id="rstudio-tackles-the-open-source-challenges"><strong>RStudio Tackles the Open Source Challenges</strong></h2><p>The challenges for open source data science summarized above are significant&ndash;and are the specific challenges that RStudio addresses.</p><ul><li>We are <a 
href="https://rstudio.com/about/what-makes-rstudio-different/" target="_blank" rel="noopener noreferrer">dedicated to the proposition</a> that code-friendly data science is uniquely powerful, and that everyone can learn to code. We support this through <a href="https://education.rstudio.com/" target="_blank" rel="noopener noreferrer">our education efforts</a>, our <a href="https://community.rstudio.com/" target="_blank" rel="noopener noreferrer">Community site</a>, and making R easier to use through our open source projects such as the <a href="https://www.tidyverse.org/" target="_blank" rel="noopener noreferrer">tidyverse</a>.</li><li><a href="https://rstudio.com/products/team/" target="_blank" rel="noopener noreferrer">RStudio Team</a> provides security, scalability, package management and the centralized management of development and deployment environments, delivering the enterprise features many organizations require.</li><li><a href="https://rstudio.com/products/cloud/" target="_blank" rel="noopener noreferrer">RStudio Cloud</a> and <a href="https://rstudio.com/products/shinyapps/" target="_blank" rel="noopener noreferrer">Shinyapps.io</a> enable data scientists to develop and deploy data products on the cloud.</li></ul><h2 id="complement-and-augment-your-bi-tools"><strong>Complement and Augment your BI Tools</strong></h2><p>Code-friendly data science with R and Python is powerful, and can be even more valuable when used in conjunction with self-service BI tools (as discussed <a href="https://blog.rstudio.com/2021/03/04/bi-and-ds-part1/" target="_blank" rel="noopener noreferrer">in our first post</a>).</p><p>The strengths and challenges above show that:</p><ul><li>BI tools are powerful and have a lower barrier to entry for most users, but have limits to their flexibility and analytic depth. This limits the complexity of the questions they can answer.</li><li>Open source data science has a higher barrier to entry, requiring coding skills for development. 
But its flexibility and analytic power are nearly limitless. This allows organizations to answer the most complex questions they have.</li></ul><p>Organizations must consider this balance between the barrier to entry and the complexity of the questions that need to be answered when choosing an approach. In future blog posts, we will dive more deeply into this topic, explore specific integration points for BI and Data Science tools, and provide concrete recommendations.</p><p>We&rsquo;re happy to help you explore these topics, so if you&rsquo;d like to learn more about how RStudio products can help augment and complement your BI approaches, you can <a href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio" target="_blank" rel="noopener noreferrer">set up a meeting with our Customer Success team</a>.</p><h2 id="to-learn-more"><strong>To Learn More</strong></h2><ul><li>For an example of how a data science team used RStudio products to create and share applications without needing to learn new languages, see this <a href="https://rstudio.com/about/customer-stories/brown-forman/" target="_blank" rel="noopener noreferrer">customer story from Brown-Forman</a>.</li><li>Read about <a href="https://rstudio.com/about/what-makes-rstudio-different/" target="_blank" rel="noopener noreferrer">RStudio&rsquo;s mission to support open source data science</a> and why we&rsquo;ve <a href="https://blog.rstudio.com/2020/01/29/rstudio-pbc/" target="_blank" rel="noopener noreferrer">dedicated ourselves to that mission as a Public Benefit Corporation</a>.</li></ul></description></item><item><title>Time to get your Shiny on, Shiny Contest 2021 is here!</title><link>https://www.rstudio.com/blog/time-to-shiny/</link><pubDate>Thu, 11 Mar 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/time-to-shiny/</guid><description><p>We’re excited to announce Shiny Contest 2021! 
This marks the third year of the Shiny contest, and over the past two years we have been in awe of all of your submissions. We are very much looking forward to seeing what the Shiny community comes up with this year.</p><p>Shiny Contest 2021 officially kicks off today, and the deadline for submissions is May 14, 2021, at 5pm ET. You can submit your entry for the contest by filling out the form at <a href="https://rstd.io/shiny-contest-2021">rstd.io/shiny-contest-2021</a>. We strongly recommend getting your submission in well before the deadline so that you have ample time to resolve any last-minute technical hurdles.</p><p>You are welcome to submit an existing Shiny app of yours or create a new one over the next two months. There is no limit on the number of entries one participant can submit. Please submit as many as you wish!</p><h2 id="requirements-for-the-contest-are-same-as-before">Requirements for the contest are the same as before:</h2><ul><li>Data and code used in the app should be publicly available and/or openly licensed.</li><li>Your app should be deployed on <a href="https://www.shinyapps.io/">shinyapps.io</a>.</li><li>Your app should be in a public <a href="https://rstudio.cloud/">RStudio Cloud project</a>. (Be sure to set access to everyone.)<ul><li>If you’re new to RStudio Cloud and shinyapps.io, you can create an account for free. Additionally, you can find <a href="https://docs.google.com/document/d/1p-5Ls2kEU9TUoUTQfBNqwEMPoAL0eNHceKRXDZ1koXc/">instructions specific to this contest here</a> and the <a href="https://rstudio.cloud/learn/guide">general RStudio Cloud guide here</a>.</li></ul></li></ul><h2 id="criteria-for-evaluation">Criteria for evaluation:</h2><p>Just like the last two years, apps will be judged based on technical merit and/or on artistic achievement (e.g., UI design). We recognize that some apps may excel in one of these categories, some in the other, and some in both. 
Evaluation will be done with this in mind, and it will also take into account the narrative in the contest submission post. We recommend crafting your submission post accordingly.</p><h2 id="awards">Awards:</h2><p>Last year we announced prizes specifically for novice Shiny developers, and we were thrilled that over 30% of the submissions were developed by those with less than one year of experience with Shiny. We would love to see just as many, if not more, submissions from new Shiny developers, and to encourage that, we will again be giving out awards at both novice and experienced developer levels.</p><p>The award categories and the associated prizes are as follows:</p><ul><li><strong>Honorable Mention</strong>:<ul><li>One year of shinyapps.io Basic plan or RStudio Cloud Premium.</li><li>A bunch of hex stickers of RStudio packages</li><li>A spot on the Shiny User Showcase</li></ul></li><li><strong>Runner Up</strong>:<ul><li>All prizes listed above, plus</li><li>Any number of RStudio t-shirts, books, and mugs (worth up to $200)</li></ul></li><li><strong>Grand Prizes</strong>:<ul><li>All prizes listed above, plus</li><li>Special &amp; persistent recognition by RStudio in the form of a winners page, and a badge that’ll be publicly visible on your RStudio Community profile</li><li>A half-hour one-on-one with a representative from the RStudio Shiny team for Q&amp;A and feedback</li></ul></li></ul><p><code>*</code> <em>Please note that we may not be able to send t-shirts, books, or other items larger than stickers to non-US addresses.</em></p><p>The names and work of all winners will be highlighted in the <a href="https://shiny.rstudio.com/gallery/#user-showcase">Shiny User Showcase</a> and we will announce them on RStudio’s social platforms, including <a href="https://community.rstudio.com">RStudio Community</a> (unless the winner prefers not to be mentioned).</p><p>We will announce the winners and their submissions on the RStudio blog, RStudio Community, and also on 
Twitter.</p><h2 id="need-inspiration">Need inspiration?</h2><ul><li>Review the winning apps and honorable mentions of previous years’ contests: <a href="https://blog.rstudio.com/2020/07/13/winners-of-the-2nd-shiny-contest/">Shiny Contest 2020</a>, <a href="https://blog.rstudio.com/2019/04/05/first-shiny-contest-winners/">Shiny Contest 2019</a>.</li><li>Browse the <a href="https://shiny.rstudio.com/gallery/#user-showcase">Shiny User Showcase</a>.</li><li>Peruse all <a href="https://community.rstudio.com/c/shiny/shiny-contest/30">Shiny Contest submissions on RStudio Community</a>.</li></ul><p>We really appreciate the time and effort each contestant puts into building their submissions and can’t wait to see what you produce!</p></description></item><item><title>RStudio Professional Drivers 1.7.0</title><link>https://www.rstudio.com/blog/pro-drivers-1-7-0-release/</link><pubDate>Wed, 10 Mar 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pro-drivers-1-7-0-release/</guid><description><p>Data security is a strategic imperative in most organizations. In a world where data is under constant threat of malicious attacks, the security of your data pipeline is critical. One of the easiest ways to keep your data secure is by applying the latest software updates to your systems. This release of the <a href="https://rstudio.com/products/drivers/">RStudio Professional Drivers</a> contains important updates that will help keep your data connections secure and easy to manage. Updating the drivers takes just minutes and can help prevent future security and administrative issues. <em>We strongly encourage all customers to upgrade to the 1.7.0 release of the RStudio Professional Drivers</em>.</p><p>RStudio offers ODBC database drivers to all current customers using our professional products at no additional charge, so that data scientists and organizations can take full advantage of their data. 
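</p><p>As a purely illustrative sketch (not part of the original announcement), connecting through one of these ODBC drivers from R typically uses the <code>DBI</code> and <code>odbc</code> packages; the driver name, server, and database values below are assumptions that depend on how the drivers are configured on your own system:</p><pre><code class="language-r">library(DBI)

# Connect via an installed ODBC driver. "Snowflake" must match a
# driver name registered in odbcinst.ini; the server and database
# values are placeholders for your own deployment.
con = dbConnect(
  odbc::odbc(),
  driver   = "Snowflake",
  server   = "myaccount.snowflakecomputing.com",
  database = "ANALYTICS",
  uid      = Sys.getenv("DB_USER"),
  pwd      = Sys.getenv("DB_PASS")
)

dbListTables(con)  # confirm the connection works
dbDisconnect(con)
</code></pre><p>Keeping credentials in environment variables, as above, avoids committing secrets to code that may be shared or published.</p><p>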
The drivers are an important part of our effort to promote <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/">interoperability</a> between systems and data science languages like R and Python. All our drivers are commercially licensed and covered by our support program. For a full list of changes in this release, refer to the <a href="https://docs.rstudio.com/drivers/1.7.0/release-notes/">release notes</a>.</p><h2 id="introducing-the-snowflake-driver">Introducing the Snowflake driver</h2><p>This release of the drivers includes a preview of the Snowflake driver. Snowflake is a popular &ldquo;data warehouse-as-a-service&rdquo; that runs in the cloud. This preview release provides full ODBC support for Snowflake, but offers limited capabilities with certain packages like <code>dbplyr</code>. We will make another announcement when the Snowflake driver is ready for all types of workloads.</p><h2 id="mongodb-security-updates">MongoDB security updates</h2><p>For those using MongoDB, we strongly recommend you upgrade your driver. A security vulnerability was found in the <a href="https://www.simba.com/products/MongoDB/doc/v2/SchemaEditor_UserGuide/content/schemaeditor/3.0/intro.htm">Schema Editor</a>, which enables users to create and modify schema definitions for NoSQL data stores. As a result, we will no longer ship the Schema Editor with the MongoDB driver. We encourage all customers to upgrade this driver even if they are not using the Schema Editor. For those who are not able to upgrade, we recommend <a href="https://support.rstudio.com/hc/en-us/articles/360063916613">uninstalling the MongoDB driver</a>.</p><h2 id="updating-oracle-instant-client">Updating Oracle Instant Client</h2><p>The latest Oracle driver depends on Oracle Instant Client version 19.x, whereas the previous Oracle driver depends on version 12.x. If you are connecting to Oracle, you <strong>must</strong> upgrade. 
Please follow the installation instructions in our <a href="https://docs.rstudio.com/pro-drivers/installation/">docs</a>. Note that the Oracle Instant Client is a dependency you license and install directly from Oracle.</p><h2 id="an-early-notice-on-driver-deprecations">An early notice on driver deprecations</h2><p>When vendors end support for databases, RStudio also ends support for those databases. You can receive email notifications for upcoming deprecations by subscribing to <em>Product Information</em> in the <a href="https://rstudio.com/about/subscription-management/">RStudio subscription management</a> portal. Be aware that the following databases will no longer be supported in upcoming years: Oracle 12.2 (March 2022) and Netezza (April 2023).</p></description></item><item><title>BI and Data Science: The Best of Both Worlds</title><link>https://www.rstudio.com/blog/bi-and-ds-part1/</link><pubDate>Thu, 04 Mar 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/bi-and-ds-part1/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@stillnes_in_motion?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Stillness InMotion</a> on <a href="https://unsplash.com/s/photos/robot-fencing?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Unsplash</a></sup></p><p>In previous posts, we&rsquo;ve talked about the critical importance of <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/" target="_blank" rel="noopener noreferrer">interoperability</a>, and how it helps organizations and data science teams get the most out of their analytic investments. 
We&rsquo;ve focused recently on the ways that R and Python can be used together, and how RStudio&rsquo;s products <a href="https://blog.rstudio.com/2021/01/13/one-home-for-r-and-python/" target="_blank" rel="noopener noreferrer">provide a single home for R and Python</a>. For the next few posts, we will turn our attention to a different aspect of interoperability: the intersection between Business Intelligence (BI) and data science platforms.</p><h2 id="bi-and-data-science-organizational-rivals">BI and Data Science: Organizational Rivals?</h2><p>Organizations, large and small, have taken various paths on the quest for better, more data-driven decision making. Historically, many large organizations were dependent on centralized IT-driven projects to develop reports and dashboards. As pressure has increased to become more agile in creating and delivering insights to improve how decisions are made, organizations typically adopt these approaches:</p><ul><li><strong>Self-service BI tools:</strong> These tools, such as Tableau, PowerBI and Spotfire, are typically used by those who understand data, but may not be comfortable coding in languages such as R or Python. Often, these users are looking for the next level of analytic and visualization depth beyond spreadsheets. These tools typically include a way of sharing these analyses with others.</li><li><strong>Open source data science frameworks:</strong> These applications, using tools such as Shiny, R Markdown, Dash, Streamlit and Bokeh, are typically created by data scientists coding in R or Python and draw on the full analytic and visualization richness of these ecosystems. 
These applications can be shared in various ways, both through homegrown solutions and professional products such as <a href="https://rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a>.</li></ul><p>Both approaches allow the analytically-minded to draw on data from multiple data sources and to explore, visualize and understand that data in flexible and powerful ways. They also allow users to create rich interactive applications and dashboards that can be shared with others to improve their decision-making.</p><p>These common purposes and capabilities, ironically, often trap the teams that use and maintain these tools as organizational competitors for software budgets and executive mindshare. These very different approaches can end up delivering applications and dashboards that may (at first glance) appear very similar. The strengths, weaknesses and nuances of the two approaches can be obscured to decision makers, especially to executive budget holders.</p><p>However, this confusion obscures the distinct opportunities each type of tool provides and how using the tools together can deliver even more value to the organization.</p><h2 id="data-science-should-complement-and-augment-your-bi-tools">Data Science Should Complement and Augment your BI Tools</h2><p>In our next post, we will do a deeper dive into the strengths and challenges of self-service BI tools and code-oriented, open source data science — and what to consider when choosing an approach.</p><p>In talking with many different analytic teams at organizations that have successfully combined BI and Data Science, their strategies have typically fallen into two categories: Using data science to either <strong>complement</strong> or <strong>augment</strong> self-service BI.</p><p>In the <strong>complement</strong> approach, organizations use BI and data science tools side by side for:</p><ul><li><strong>Widespread reporting:</strong> Where standard visualizations and 
simpler analytic approaches are sufficient, these organizations use BI tools such as Tableau or PowerBI to empower a wide range of analysts to create dashboards and reports and share them across the organization.</li><li><strong>Specialized applications:</strong> For use cases that require custom visualizations, deeper predictive or machine learning capabilities, or simply a higher level of customization in the data analysis and presentation, these organizations develop applications using code-based frameworks. These frameworks include Shiny, R Markdown, Dash, Streamlit and Bokeh, and the R and Python languages they are built on. In some cases, these tools are used to create highly tailored applications, which can be rapidly iterated and redeployed in response to feedback and questions, to support critical decision making by executive teams.</li><li><strong>A broader spectrum of data products:</strong> Frequently this mixed approach utilizes <a href="https://rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a>, which also supports the delivery of predictive models, APIs for automated decision making, Jupyter Notebooks and automated data pipelines. This broader selection of outputs greatly enhances the ways that data-driven insights can be applied in an organization.</li></ul><p>A great example of this approach comes from Dr. Behrooz Hassani-Mahmooei, Principal Specialist and Director at the Strategic Intelligence and Insights Unit at Monash University:</p><blockquote><p>&ldquo;Tools such as PowerBI are very useful when you want to start from data and generate information. But when you have a specific decision that you are expected to inform, especially a strategic decision, you need tools that enable you to start from that decision and reverse engineer back to the data. 
That is where R and RStudio helped us as a competitive advantage, in what we call strategic analytics, where you need maximum flexibility and reproducibility as well as clarity for communication and translation.&rdquo;</p></blockquote><p>In the <strong>augment</strong> approach, organizations use BI and Data Science tools together to:</p><ul><li><strong>Validate insights:</strong> When the user of a BI tool finds a potentially interesting pattern, data scientists might be asked to verify the insight. Those data scientists can apply more rigorous analytic approaches to confirm whether those results are significant enough to base future decision making on. The data scientists validating the results may communicate that finding back to the BI team or deploy the insight on a low-cost open source platform.</li><li><strong>Enrich data for BI reporting:</strong> Data science tools can augment BI tools by enriching the underlying data using more advanced analytic techniques, such as eliminating highly correlated columns. This can help the BI users focus on the most important aspects of the data. Data science teams can also incorporate non-traditional data sources, add model scores, and provide calculated columns that might be difficult to create in the BI tools. These additions can be delivered using shared data sources or API calls that take advantage of robust, customizable data and machine learning pipelines.</li><li><strong>Expand the audience for data science insights:</strong> Data science teams who deliver their insights through an organization&rsquo;s corporate-standard BI tool can reach a much broader audience. 
This can boost the visibility and perceived value of the data science team as well as increase the impact of generated features and model scores.</li><li><strong>Overcome BI limitations:</strong> While modern BI tools have a wide range of visualizations and basic statistical tools, they are largely constrained by the proprietary capabilities their vendors implement. Code-oriented data science can extend these capabilities with customizable visualizations and provide access to the rich and evolving ecosystem of open source R and Python.</li></ul><p>An example of the augment approach was highlighted in <a href="https://www.trustradius.com/reviews/rstudio-2020-12-18-15-36-31" target="_blank" rel="noopener noreferrer">a recent TrustRadius review</a>, where an IT Analyst at a real estate company shared:</p><blockquote><p>&ldquo;If you are an R user and you have models or reports that you work with regularly, RStudio is a great solution. I also find it handy for building quick apps using <code>Flask-Admin</code> for user config tables that support Tableau or Power BI reports for budget tables, KPI targets, and metric targets.&rdquo;</p></blockquote><h2 id="data-science-and-bi-in-your-organization">Data Science and BI in Your Organization</h2><p>In our next posts, we will dive more deeply into the strengths and challenges of self-service BI tools and recommend specific approaches and points of integration to consider. 
For now, if you are part of a BI or Data Science team in an organization, I encourage you to reach out to your counterparts and explore how you can better tackle your common purpose of improving decision making in your organization.</p><p>We&rsquo;re happy to help, so if you&rsquo;d like to learn more about how RStudio products can help augment and complement your BI approaches, you can <a href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio" target="_blank" rel="noopener noreferrer">set up a meeting with our Customer Success team</a>, or start a conversation <a href="https://www.linkedin.com/in/loubajuk/" target="_blank" rel="noopener noreferrer">with me on LinkedIn</a>.</p><h2 id="to-learn-more">To Learn More</h2><ul><li>Read <a href="https://rstudio.com/about/customer-stories/brown-forman/" target="_blank" rel="noopener noreferrer">this customer spotlight</a> to learn how Brown-Forman used RStudio products to help their data science team &ldquo;turn into application developers and data engineers without learning any new languages or computer science skills.&rdquo;</li><li>Read more about the importance of <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/" target="_blank" rel="noopener noreferrer">interoperability</a> and how <a href="https://blog.rstudio.com/2021/01/13/one-home-for-r-and-python/" target="_blank" rel="noopener noreferrer">RStudio provides a single home for R and Python</a></li><li>If you&rsquo;d like to know <a href="https://blog.rstudio.com/2020/11/17/an-interview-with-lou-bajuk/" target="_blank" rel="noopener noreferrer">why RStudio focuses on code-friendly data science</a>, read this recap of a recent podcast.</li><li>At rstudio::conf 2020, George Kastrinakis from the Financial Times <a href="https://rstudio.com/resources/rstudioconf-2020/building-a-new-data-science-pipeline-for-the-ft-with-rstudio-connect/" target="_blank" rel="noopener noreferrer">presented a case study</a> on 
building a new data science pipeline using R and RStudio Connect.</li></ul></description></item><item><title>Summer Internships 2021</title><link>https://www.rstudio.com/blog/summer-internship-2021/</link><pubDate>Tue, 02 Mar 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/summer-internship-2021/</guid><description><p><sup>Photo by <a href="https://twitter.com/CMastication">JD Long</a></sup></p><p>We are excited to announce the fourth formal summer internship program at RStudio. The goal of our internship program is to enable RStudio employees to collaborate with current students to do impactful work that not only helps RStudio users and the broader community, but also ensures that the community of R developers is just as diverse as its community of users. Over the course of the internship, you will work with experienced data scientists, software developers, and educators to create and share new tools and ideas.</p><p>The internship pays up to $12,000 USD (paid hourly) and will last 10-12 weeks. The start date is between May 24th and June 7th, depending on your availability (applications are open now, and this year there is no application deadline). To qualify, you must currently be a student (broadly construed - if you think you’re a student, you probably qualify) and have some experience writing code in R and using Git and GitHub. To demonstrate these skills, your application needs to include a link to a package, Shiny app, or data analysis repository on GitHub. It’s OK if you create something specifically for this application: we just need to know that you’re already familiar with the mechanics of collaborative development in R.</p><p>RStudio is a geographically distributed team, which means you can be based anywhere in the United States (we hope to expand the program to other countries in the future). 
This year you will be working 100% remotely and you will meet with your mentor regularly online.</p><p>We are recruiting interns for the following projects:</p><h2 id="tidyverse">Tidyverse</h2><p><strong>shinymodels</strong> - The tidymodels framework is a collection of packages for modeling and machine learning using tidyverse principles. The goal of this internship is to create a package that, given a tidymodels object, will launch a Shiny application.<br><em>Mentor: Max Kuhn</em></p><p><strong>Polishing cpp11</strong> - Improve the <a href="https://cpp11.r-lib.org/"><code>cpp11</code> package</a>. The <code>cpp11</code> package provides C++ bindings to R code. This intern will work to improve the package by adding functionality, fixing bugs, and writing documentation, including introductory tutorials, how-to guides and in-depth explanatory vignettes.<br><em>Supervisor: Jim Hester</em></p><h2 id="education">Education</h2><p><strong>Cheat Sheets</strong> - Enhance <a href="https://rstudio.com/resources/cheatsheets/">RStudio&rsquo;s cheat sheet gallery</a>. Primary tasks will be to review and update existing cheat sheets to reflect new package features, to create new cheat sheets, and to streamline the intake of community contributed cheat sheets. <br><em>Mentors: Mine Çetinkaya-Rundel and Garrett Grolemund.</em></p><p><strong>Exercise Content</strong> - Develop practice content for R users. Primary tasks will be to write exercises with the <a href="https://rstudio.github.io/learnr/"><code>learnr</code></a> and <a href="https://rmarkdown.rstudio.com/">R Markdown</a> formats, to write grading checks with <a href="https://rstudio-education.github.io/gradethis/"><code>gradethis</code></a>, to organize exercises in a git repository, and to identify and clean example datasets for R learners. 
<br><em>Mentor: Garrett Grolemund.</em></p><p><strong>Automate grading of R Markdown assignments</strong> - The intern will work on extending <code>gradethis</code> with a suite of tools for educators to orchestrate automated feedback of student assignments written in R Markdown documents. <br><em>Mentor: Garrick Aden-Buie.</em></p><h2 id="r-markdown">R Markdown</h2><p>This intern will work with the R Markdown team on our ecosystem of R packages for data science communication built on Pandoc. You&rsquo;ll contribute actively to our software development process as we work to improve our toolchain for academic, scientific, and technical publishing.<br><em>Mentor: Alison Hill</em></p><h2 id="marketing">Marketing</h2><p>Help share data science customer stories: Customer stories and conversations are a critical source of information, both for the data science community and for product teams, and RStudio has many such conversations with our customers. <br><em>Mentor: Lou Bajuk</em></p><p><strong>Apply now at <a href="https://rstudio.com/about/careers/">rstudio.com/about/careers/</a></strong></p><p>RStudio is committed to being a diverse and inclusive workplace. We encourage applicants of different backgrounds, cultures, genders, experiences, abilities and perspectives to apply. All qualified applicants will receive equal consideration without regard to race, color, national origin, religion, sexual orientation, gender, gender identity, age, or physical disability. 
However, applicants must be legally able to work in the United States.</p></description></item><item><title>Introducing the RStudio Launcher Plugin SDK</title><link>https://www.rstudio.com/blog/rstudio-sdk1/</link><pubDate>Tue, 23 Feb 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-sdk1/</guid><description><h2 id="improving-interoperability-through-the-rstudio-job-launcher">Improving Interoperability through the RStudio Job Launcher</h2><p>In a <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/" target="_blank" rel="noopener noreferrer">previous blog post</a>, we discussed how improving interoperability between the multiple environments required by data scientists can improve productivity and ROI on IT investments. With the help of the <a href="https://rstudio.com/resources/rstudioconf-2019/rstudio-job-launcher-changing-where-we-run-r-stuff/" target="_blank" rel="noopener noreferrer">RStudio Job Launcher</a>, <a href="https://rstudio.com/products/team/" target="_blank" rel="noopener noreferrer">RStudio Team</a> products can make use of IT-managed computing resources in Kubernetes or Slurm clusters. Data scientists can launch their RStudio, Jupyter, or VS Code sessions directly through RStudio Server Pro. They can also launch remote background jobs to leverage the hardware that is available in their organization&rsquo;s computing cluster.</p><p>However, there are many more job management systems available besides Kubernetes and Slurm. Data scientists may want or need access to the resources available in their organization&rsquo;s preferred system to run complex analyses. Data science leaders may want to increase productivity in their teams by making use of powerful pre-existing systems. 
IT departments may want to increase the utilization of job management systems which are expensive to configure and maintain.</p><h2 id="the-rstudio-job-launcher-plugin-system">The RStudio Job Launcher Plugin System</h2><p>The <a href="https://rstudio.com/resources/rstudioconf-2019/rstudio-job-launcher-changing-where-we-run-r-stuff/" target="_blank" rel="noopener noreferrer">RStudio Job Launcher</a> allows RStudio Team products to integrate with multiple types of job management systems through a Plugin-based system. Each Plugin should allow the Job Launcher to communicate with one type of job management system. The Job Launcher currently has Plugins for integrating with Kubernetes and Slurm, as well as a Plugin which allows jobs to be launched directly on the Job Launcher host.</p><p>To support additional job management systems, a new Plugin needs to be developed for each job management system. The <a href="https://rstudio.com/products/launcher-plugin-sdk/" target="_blank" rel="noopener noreferrer">RStudio Launcher Plugin SDK</a> (Software Development Kit) facilitates rapid development of these Plugins in C/C++.</p><h3 id="the-quickstart-plugin--guide">The QuickStart Plugin &amp; Guide</h3><p>A developer can follow along with the <a href="https://docs.rstudio.com/rlps/quickstart/" target="_blank" rel="noopener noreferrer">QuickStart Guide</a> to transform the RStudio QuickStart Plugin into a functioning Plugin. 
The <a href="https://docs.rstudio.com/rlps/quickstart/" target="_blank" rel="noopener noreferrer">QuickStart Guide</a> includes 16 steps, or &lsquo;<code>TODO</code> items&rsquo;, that correspond to different features that need to be implemented in the QuickStart Plugin.</p><ul><li><code>TODO</code>s #1 - #4 help the developer get the Plugin renamed and rebranded as desired.</li><li><code>TODO</code> #5 shows how configuration options can be added to the Plugin, although it may become more obvious what options will be needed as development continues.</li><li><code>TODO</code>s #6 - #16 take the developer through the bulk of the work to create a functioning RStudio Job Launcher Plugin.</li></ul><p>Provided with the SDK is a utility called &ldquo;Smoke Test&rdquo;. The Smoke Test tool can be used during development to trigger many of the major code paths in a Plugin. Debugging a Plugin that is in use by an RStudio Team product can be difficult because the developer is not in control of which API calls are made and when. The Smoke Test makes debugging a Plugin much easier by giving the developer that control.</p><p>While following the RStudio Launcher Plugin SDK QuickStart Guide, most <code>TODO</code>s will follow a similar development process. 
For example, the development process for <code>TODO</code> #7 might look like this:</p><ol><li>Implement <a href="https://docs.rstudio.com/rlps/quickstart/todos.html#todo-7" target="_blank" rel="noopener noreferrer"><code>TODO</code> #7: Define cluster configuration</a></li><li>Compile the plugin and the smoke-test tool</li><li><a href="https://docs.rstudio.com/rlps/devguide/smoke-test.html#smoke-test-start" target="_blank" rel="noopener noreferrer">Launch the plugin through the smoke-test tool</a></li><li>Trigger the Cluster Info API call using the <a href="https://docs.rstudio.com/rlps/devguide/smoke-test.html#st-menu-1" target="_blank" rel="noopener noreferrer">first option in the smoke test tool</a></li><li>Validate that all of the desired configuration information is returned</li><li>Repeat the previous steps until satisfied, attaching a debugger to the plugin if necessary</li></ol><figure><img src="code-example.png" alt="A comparison of the same method in the QuickStart and Local Plugins"><figcaption>Figure 1: A side-by-side comparison of the QuickStart template method for <code>TODO</code> #7 and the sample Local plugin implementation of <code>TODO</code> #7.</figcaption></figure><p>Once the developer gets the desired results from the Smoke Test tool, they may wish to test their Plugin against an RStudio Team product, since the Smoke Test tool covers only the basic pathways of the Plugin.</p><h3 id="the-developers-guide">The Developer&rsquo;s Guide</h3><p>Some Plugin developers may find it necessary to do something more complex than what is presented in the <a href="https://docs.rstudio.com/rlps/quickstart/" target="_blank" rel="noopener noreferrer">QuickStart Guide</a>. For example, they may wish to allow administrators to set resource limits on a per-user or per-group basis. In that case, the Plugin developer can turn to the <a href="https://docs.rstudio.com/rlps/devguide" target="_blank" rel="noopener noreferrer">Developer&rsquo;s Guide</a>. 
The <a href="https://docs.rstudio.com/rlps/devguide/advanced-features.html" target="_blank" rel="noopener noreferrer">Advanced Features section of the Developer&rsquo;s Guide</a> covers optional advanced features that may be added to a Plugin as needed.</p><p>The <a href="https://docs.rstudio.com/rlps/devguide" target="_blank" rel="noopener noreferrer">Developer&rsquo;s Guide</a> also covers the high-level architecture of the SDK, how some RStudio Team products integrate with the RStudio Job Launcher, and a detailed description of all the Smoke Test Utility options. Additionally, the API between the RStudio Job Launcher and a Plugin is described in full: the <a href="https://docs.rstudio.com/rlps/devguide/pluginapi.html" target="_blank" rel="noopener noreferrer">Launcher Plugin API</a> section describes the communication mechanism between the RStudio Job Launcher and a Plugin, so developers who prefer to work in a language other than C or C++ can use it to develop a Plugin from scratch.</p><h3 id="the-api-reference">The API Reference</h3><p>The RStudio Launcher Plugin SDK also includes a complete <a href="https://docs.rstudio.com/rlps/apiref/annotated.html" target="_blank" rel="noopener noreferrer">API Reference</a> for all of its C/C++ code. The <a href="https://docs.rstudio.com/rlps/apiref/annotated.html" target="_blank" rel="noopener noreferrer">API Reference</a> may be useful if the developer wishes to see detailed class hierarchy information or reference Doxygen comments outside of the codebase.</p><h3 id="our-github">Our GitHub</h3><p>The RStudio Launcher Plugin SDK is open source. 
If you find any bugs or wish to request enhancements, please file an issue on the <a href="https://github.com/rstudio/rstudio-launcher-plugin-sdk" target="_blank" rel="noopener noreferrer">RStudio Launcher Plugin SDK GitHub Repository</a>. Pull requests for improvements or bug fixes are also welcome!</p></description></item><item><title>Introducing Shiny App Stories</title><link>https://www.rstudio.com/blog/shiny-app-stories/</link><pubDate>Fri, 12 Feb 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-app-stories/</guid><description><p>Today we&rsquo;re introducing <a href="https://shiny.rstudio.com/app-stories/">App Stories</a> to the <a href="https://shiny.rstudio.com/">Shiny website</a>. If you&rsquo;ve spent any time learning about Shiny, there&rsquo;s a good chance you&rsquo;ve already seen our <a href="https://shiny.rstudio.com/gallery/#user-showcase">Shiny User Showcase</a>. These are applications that Shiny users around the world have allowed us to share, and it&rsquo;s an excellent place to get ideas about what you can do with Shiny.</p><p>App Stories are a bit different from the User Showcase: an App Story will center around a Shiny application, but the application will be designed specifically to show off specific features, and it will also include explanations of how to use those features.</p><p>We&rsquo;re kicking off App Stories with an application for <a href="https://connect.rstudioservices.com/explore_your_weather/">exploring weather patterns</a> in US cities. 
This story shows off some of Shiny 1.6.0&rsquo;s new features, and it has two parts: <a href="https://shiny.rstudio.com/app-stories/weather-lookup-about.html">About the app</a>, describing the application&rsquo;s functionality and motivation, and <a href="https://shiny.rstudio.com/app-stories/weather-lookup-caching.html">Using <code>bindCache()</code> to speed up an app</a>, which shows how Shiny&rsquo;s new <code>bindCache()</code> function can be used to easily speed up your apps with very little code.</p><div style = "max-width: 550px; margin: 0 auto;"><img src = "full_app.png" alt = "Screenshot of the weather explorer app"/></div><p>Both posts for the weather explorer go deeper into motivations and real-use-case scenarios than traditional documentation, and they provide insight into the development process of a nice-looking and high-performance Shiny app.</p><div style = "max-width: 550px; margin: 0 auto;"><img src = "caching_article_sections.png" alt = "Screenshot of sections in caching article"/><em>Articles are written to explain why, in addition to how, to use new features in Shiny</em></div><p>Going forward we will continue to add new applications along with posts about those applications. We&rsquo;re experimenting with this kind of documentation, so we welcome feedback about what topics you would like explored or what could be improved. Feel free to tweet at us (Winston: <a href="https://twitter.com/winston_chang">@winston_chang</a>, Nick: <a href="https://twitter.com/NicholasStrayer">@NicholasStrayer</a>) with your thoughts and ideas. 
Happy app making!</p></description></item><item><title>Painful Package Management</title><link>https://www.rstudio.com/blog/pkg-mgmt-pain/</link><pubDate>Thu, 11 Feb 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pkg-mgmt-pain/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@brandablebox?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Brandable Box</a> on <a href="https://unsplash.com/?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></sup></p><blockquote><p><em>This is the second of three blogs on package management.</em></p><p><em>Registration for our webinar on <a href="https://rstudio.com/registration/managing-packages-for-open-source-data-science/">Managing Packages for Open-Source Data Science</a> on February 17 is now open.</em></p></blockquote><p>If you&rsquo;re a data scientist, you&rsquo;ve been hired to generate insights and create assets &ndash; not manage R and Python package environments. But &ldquo;I spend my day managing packages,&rdquo; or even worse, &ldquo;I spend my day fighting IT for the packages I need,&rdquo; is an all-too-common refrain.</p><p>It doesn&rsquo;t have to be this way. With a little forethought and planning, your organization can adopt a <a href="https://environments.rstudio.com/reproduce.html">package management strategy</a> that will drastically reduce the amount of hassle data scientists have to endure managing packages.</p><p>In this blog post, we&rsquo;ll explore the frustration your data scientists probably feel if your package management plan doesn&rsquo;t provide both flexibility to get work done and structure to ensure reproducibility. 
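</p><p>To make the reproducibility piece concrete: the <a href="https://rstudio.github.io/renv/index.html">renv</a> package, one of the tools discussed later in this series, lets a data scientist capture and later restore a project-specific library. A minimal sketch of that workflow (illustrative, not prescriptive; it assumes renv is already installed):</p>

```r
# Sketch of a per-project library workflow with renv
renv::init()                 # give this project its own isolated package library
install.packages("dplyr")    # install packages into the project library as usual
renv::snapshot()             # record the exact package versions in renv.lock
# ...months later, or on a colleague's machine:
renv::restore()              # reinstall the versions recorded in renv.lock
```

<p>With the lockfile checked into version control alongside the code, restoring a previous working environment becomes a single command rather than an archaeology project. 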
Then we&rsquo;ll dig into the first step to make it better: determining your organization&rsquo;s package management requirements.</p><h2 id="when-package-management-is-pain">When Package Management is Pain</h2><p>When package management isn&rsquo;t going well, data scientists or engineers are usually the first ones to feel the sting. Here are some of the ways data scientists experience bad package management plans:</p><ul><li><p><strong>It&rsquo;s hard or slow to install packages.</strong> Data scientists often can&rsquo;t find the packages they need from public repositories or aren&rsquo;t empowered to share private packages. Even when it&rsquo;s easy to find packages, they may be slow to install or require system libraries that don&rsquo;t exist.</p></li><li><p><strong>Data science and IT/Ops feel at odds.</strong> If you&rsquo;re a data scientist, you probably want every package you need <strong>now</strong> without waiting for someone to approve installation. This can put you on a collision course with IT Admins who are concerned about platform security and stability.</p></li><li><p><strong>Sharing projects and deploying to production are ordeals.</strong> When you share or deploy, you may face a maze of package dependencies and conflicts where reaching success feels more like a roll of the dice than a smooth process.</p></li><li><p><strong>Reproducing your results is fragile or elusive.</strong> R and Python packages get constant updates, and unless you&rsquo;ve planned ahead, new package versions can break old code and create unexpected pitfalls when adding new capabilities.</p></li></ul><p>As we discussed in the <a href="https://blog.rstudio.com/2021/02/05/pkg-mgmt-prime-directive/">first blog in this series</a>, successful package management requires attention from both IT/Admins and data scientists as the process spans both the shared repository and the private library.</p><p>That means that <strong>there&rsquo;s no single solution to package 
management</strong>.</p><p>But, these issues <strong>are</strong> solvable by developing a package management plan for your organization. The first step is to clearly identify how packages are managed in your environment and who&rsquo;s responsible.</p><h2 id="discovering-package-management-requirements">Discovering Package Management Requirements</h2><p>Your organization&rsquo;s package management requirements depend on your organization&rsquo;s size and complexity. In some organizations, package management involves stakeholders from the data science, IT/Ops, security, and other teams.</p><p>Virtually all environments share a few requirements. To successfully manage open source packages for data science, <strong>your organization needs</strong>:</p><ul><li><p><strong>A simple way to create and save package sets.</strong> Organizations need a standard way to ensure that data scientists can capture the dependencies for their particular code and save them for later.</p></li><li><p><strong>The ability to quickly and easily add packages to libraries.</strong> Package management is much smoother when data scientists are confident they can quickly restore a previous working environment when and where they need to.</p></li></ul><p>And depending on your organization, <strong>you might need</strong> the ability to:</p><ul><li><p><strong>Share private (internally-developed) packages.</strong> If your organization is developing and using internal packages, you&rsquo;ll need a way to access and share them in addition to approved packages from public repositories.</p></li><li><p><strong>Limit the set of packages available in the environment.</strong> Organizations that allow open access to packages from public repositories face very different requirements than those that only allow validated packages.</p></li><li><p><strong>Do all of the above in an offline or air-gapped context.</strong> If your organization&rsquo;s security policy requires limited or no internet, you&rsquo;ll 
need to pay special attention to getting data scientists the packages they need.</p></li></ul><p>It&rsquo;s worth taking a minute to think about how your organization currently manages packages and whether you have a way to meet the requirements you face.</p><p>In the (forthcoming) final blog in this series, we&rsquo;ll dive into how to take the requirements you&rsquo;ve identified and create your organization&rsquo;s package management plan, including divvying up responsibility for package management between IT Admins and data scientists, and how to use tools like <a href="https://rstudio.github.io/renv/index.html">renv</a>, Python virtual environments, and <a href="https://packagemanager.rstudio.com">public</a> and <a href="https://rstudio.com/products/package-manager/">private RStudio Package Manager</a> to execute your plan.</p><blockquote><p><em>Please sign up for our <a href="https://rstudio.com/registration/managing-packages-for-open-source-data-science/">free webinar</a> on February 17 to learn more about managing open source packages for R and Python.</em></p></blockquote></description></item><item><title>The Package Management Prime Directive</title><link>https://www.rstudio.com/blog/pkg-mgmt-prime-directive/</link><pubDate>Fri, 05 Feb 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pkg-mgmt-prime-directive/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@nate_dumlao?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Nathan Dumlao</a> on <a href="https://unsplash.com/?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></sup></p><blockquote><p><em>This is the first in a short series of blogs on package management.</em></p><p><em>Registration for our webinar on <a href="https://rstudio.com/registration/managing-packages-for-open-source-data-science/">Managing Packages for Open-Source Data Science</a> on February 17 is now 
open.</em></p></blockquote><p>Absolutely essential, easy to forget if they&rsquo;re there when you need them, and utterly debilitating when they&rsquo;re not there, open source packages in R and Python are the pantry items in the metaphorical kitchen turning out <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/">serious data science</a>. These bundles of code and instructions, hosted in public repositories like CRAN and PyPI, are key ingredients in data science work.</p><p>Just as in cooking, serious data science depends on having suitable ingredients available where and when you need them. Not having the right packages available for your analysis is a common cause of frustration and time lost for data scientists and engineers.</p><p>You can make package management a solved problem for your organization, but it takes a little planning. Following the Package Management Prime Directive, which I&rsquo;ll share below, can help you avoid problems before they start. But before that, you&rsquo;ll need a little background.</p><h2 id="grocery-store-and-the-pantry">Grocery Store and the Pantry</h2><p>Packages live in one of two places: <strong>repositories</strong> and <strong>libraries</strong>. Understanding each of these environments, especially their differences, is the first step to avoiding package management headaches.</p><blockquote><p>Think of your data science workbench as a kitchen:</p><ul><li>The <strong>repository</strong> is the grocery store, a central place where everyone gets their packages.</li><li>The <strong>library</strong> is the pantry, where you keep your own private set of packages.</li><li><strong>Installation</strong> is the shopping trip to stock your library with packages from the repository.</li></ul></blockquote><img src="zz_pkg-envs.png" alt="Illustration of package repository and library." 
width="350"/><p><sup>Photos by <a href="https://unsplash.com/@neonbrand?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">NeONBRAND</a> and <a href="https://unsplash.com/@luisabrimble?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Luisa Brimble</a> on <a href="https://unsplash.com/?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a>.</sup></p><p>Like the grocery store, the package repository includes more packages &ndash; inert and boxed up &ndash; than any one person needs. Common repositories include CRAN, BioConductor, PyPI, <a href="https://rstudio.com/products/package-manager/">private RStudio Package Manager</a>, and <a href="https://packagemanager.rstudio.com/client/#/">public RStudio Package Manager</a>.</p><p>And like the pantry in your home, which holds shelf-stable ingredients until they&rsquo;re needed in the kitchen, the library holds packages specific to your work, ready for you to combine with other raw ingredients for your analysis.</p><p>Libraries are needed wherever there&rsquo;s a running R or Python process, like within the <a href="https://rstudio.com/products/rstudio/">RStudio IDE</a> and Jupyter Notebooks, or alongside apps and reports running on platforms like <a href="https://rstudio.com/products/connect/">RStudio Connect</a>.</p><p>Unlike your ability to choose among various colors or flavors of the same item in your pantry, R and Python find packages in the library by name alone, so the library that corresponds to your code must include:</p><ul><li>exactly one version of each package needed and</li><li>only package versions that are consistent, so package interdependencies work.</li></ul><p>Now that you have a general idea of what repositories and libraries are, it&rsquo;s time for a general rule for how to think about managing packages.</p><h2 id="the-package-management-prime-directive">The Package Management Prime Directive</h2><p>A grocery store 
or repository aims to meet the needs of as many people as possible.</p><p>Your pantry or library, on the other hand, requires all the ingredients for your creations, but no more. These constraints lead directly to the Package Management Prime Directive:</p><blockquote><p>Repositories should be as broad as possible.</p><p>Libraries should be as specific as possible.</p></blockquote><p>Most organizations administer only one or a few central repositories to keep management simple. Many organizations decide to just use a public repository and skip repository management altogether.</p><p>In contrast, most organizations empower data scientists and engineers to manage their own libraries. Increasingly, many are using libraries that correspond to individual projects to make them even more specific. Luckily, this is easier than it sounds, as there&rsquo;s <a href="https://environments.rstudio.com">great tooling for library management</a> in both R and Python, which we&rsquo;ll explore in future posts in this series.</p><p>Now, armed with the Package Management Prime Directive and an understanding of why it&rsquo;s important, you&rsquo;ve got the conceptual grounding to solve most package management issues.</p><p>In future posts, we&rsquo;ll cover frequent sources of package management pain for both data scientists and platform administrators, and how they can work together to create a Package Management Plan to prevent even the most pernicious of package problems.</p></description></item><item><title>What's New on RStudio Cloud - February 2021</title><link>https://www.rstudio.com/blog/rstudio-cloud1/</link><pubDate>Thu, 04 Feb 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-cloud1/</guid><description><p>We roll out new features and improvements regularly on RStudio Cloud, typically every week or two. 
You can always view the list of significant new features on Cloud’s <a href="https://rstudio.cloud/learn/whats-new" target="_blank" rel="noopener noreferrer">What’s New</a> page, which you can access via the left-hand navigator in Cloud.</p><div class = "screenshot"><img align="center" style="padding: 35px;" src="menu.png"></div><p>But to go into a bit more detail, we have decided to begin periodic postings here on the RStudio Blog to highlight significant new developments. We hope you find them useful.</p><p>Here are the new features we’ve released over the past month or so:</p><ul><li>Cloud Plus Plan</li><li>RStudio 1.4</li><li>Python 3.8</li><li>Project Memory Gauge</li></ul><h3 id="cloud-plus-plan">Cloud Plus Plan</h3><p>We’ve had lots of users (many of them students) request a plan on Cloud that just lets them use more hours on the service. Many don’t need all the premium features of our paid plans, but love the convenience of accessing their work from any computer without needing to install any software and want to be able to use Cloud more than the hours included with the Free plan.</p><div class = "screenshot"><img align="center" style="padding: 35px;" src="cloud-plus.png"></div><p>We’re happy to announce that a Cloud Plus plan is now available for $5/month. Cloud Plus includes all the features of our Cloud Free plan, plus 50 project hours per month. Any additional project hours are billed at 20¢ per hour (as is the case with all our self-service, paid plans).</p><p>If the Cloud Free features meet your needs, but you use more than 15 project hours a month, this is likely the right plan for you. Please visit the Cloud <a href="https://rstudio.cloud/plans/plus" target="_blank" rel="noopener noreferrer">Plans &amp; Pricing</a> page for more details.</p><h3 id="rstudio-14">RStudio 1.4</h3><p>The latest release of our world-class IDE is now available on Cloud. 
All your new and existing projects on RStudio Cloud will automatically be updated to use this new release.</p><p>To learn more about all the great new features and capabilities of RStudio 1.4, please take a look at our blog post <a href="https://blog.rstudio.com/2021/01/19/announcing-rstudio-1-4/" target="_blank" rel="noopener noreferrer">Announcing RStudio 1.4</a>.</p><h3 id="python-38">Python 3.8</h3><p>If you use Python on Cloud (for example, via the reticulate package), Cloud has been updated to use Python 3.8. This should make it easier to use the latest versions of many Python packages on Cloud — and stay tuned for more exciting Python developments this year…</p><h3 id="project-memory-gauge">Project Memory Gauge</h3><div class = "screenshot"><img align="center" style="padding: 35px;" src="ram-144.png"></div><p>Every project on RStudio Cloud runs inside its own container and by default is allocated 1GB of RAM. Depending on your subscription, you can increase a project’s allocation up to 8GB.</p><p>You can now see how much of your project&rsquo;s allocated memory is currently in use: when you open a project you will find a project memory gauge in the header. This gauge updates roughly every ten seconds.</p><p>Note that reclaiming unused memory is controlled by R and the operating system. You may see “uncanny” fluctuations in the gauge as the system manages memory.</p><h2 id="whats-next">What’s Next?</h2><p>We don’t like to pre-announce features before they’re available, but the team is busy both improving our underlying systems and developing new features. 
If there is something you’d love to see improved or added to Cloud, please let us know in the <a href="https://community.rstudio.com/c/rstudio-cloud" target="_blank" rel="noopener noreferrer">RStudio Cloud section</a> of the RStudio Community site.</p><p>If you are new to RStudio Cloud and would like to learn more about the platform and various plans available, check out the <a href="https://rstudio.com/products/cloud/" target="_blank" rel="noopener noreferrer">RStudio Cloud product page</a>.</p><p>Thanks!</p></description></item><item><title>Enjoy More rstudio::global(2021)</title><link>https://www.rstudio.com/blog/enjoy-more-rstudio-global-2021/</link><pubDate>Mon, 01 Feb 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/enjoy-more-rstudio-global-2021/</guid><description><p>It&rsquo;s been just over a week since we wrapped up the final session of rstudio::global(2021). This first virtual RStudio conference broke new ground with:</p><ul><li><strong>A 24-hour conference:</strong> RStudio::global ran around the clock with 3 keynotes from renowned data scientists, 3 industry and professionally-focused X-Sessions, 30 technical talks with live Q&amp;A, 20 rapid-fire lightning talks, and 30+ birds-of-a-feather sessions. No matter what time zone attendees logged in from, rstudio::global had streaming conference sessions available to view. And best of all, the conference was free to attend!</li><li><strong>More than 17,000 attendees from 148 countries:</strong> This was the largest conference RStudio has ever hosted, and it had the most global reach as well. 
While only about half of attendees reported their locations, you can interactively explore the wide variety of attendee locations by visiting our <a href="https://connect.rstudioservices.com/global2021-registrations/" target="_blank" rel="noopener noreferrer">interactive rstudio::global registrant map</a>.</li><li><strong>74 Diversity Scholars from 32 countries.</strong> Diversity Scholars represent groups who are typically underrepresented at in-person rstudio::conf() events. These groups include people of color, those with disabilities, elders and older adults, LGBTQ folks, and women/minority genders. In past years, we have had to limit our scholarships geographically due to visa issues and international travel restrictions, so we were happy to have no such limitations for our virtual conference.</li><li><strong>New modes of interaction to create the feel of an in-person conference.</strong> As an experiment, we hosted spatial gathering rooms using the <a href="https://spatial.chat" target="_blank" rel="noopener noreferrer">spatial.chat</a> platform where conference attendees could gather and interact in ways similar to an in-person conference. While we wait to hear what attendees thought of this experience (see the survey link below to add your thoughts), most visitors to the spatial chat rooms seemed intrigued to try it.</li></ul><p>If you are now kicking yourself for not attending rstudio::global(2021), fear not: All of the prerecorded talks from the conference are now available on the RStudio.com site at <a href="https://rstudio.com/resources/rstudioglobal-2021/">https://rstudio.com/resources/rstudioglobal-2021/</a>, complete with closed captions in English, Spanish, and Mandarin Chinese. 
The live question-and-answer sessions will be available as well, as soon as they&rsquo;ve finished post-processing.</p><p>If you attended rstudio::global(2021) and haven&rsquo;t yet provided feedback to us, please take 5 minutes and fill out our survey at <a href="https://rstd.io/global-survey" target="_blank" rel="noopener noreferrer">rstd.io/global-survey</a>. Your feedback will help us plan future RStudio events. In the meantime, thank you to everyone who participated in rstudio::global(2021), and we look forward to seeing you at a future RStudio event!</p></description></item><item><title>Shiny 1.6</title><link>https://www.rstudio.com/blog/shiny-1-6-0/</link><pubDate>Mon, 01 Feb 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-1-6-0/</guid><description><p>We are thrilled to announce that Shiny 1.6.0 is now on CRAN! Install it now with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">shiny&#34;</span>)</code></pre></div><p>A lot of hard work went into this release to vastly improve four main areas: <a href="#theming">theming</a>, <a href="#caching">caching</a>, <a href="#accessibility">accessibility</a>, and <a href="#devmode">developer experience</a>.</p><h2 id="theming">Improved theming (and Bootstrap 4) support</h2><p>This version of Shiny makes it much easier to customize the appearance of your applications. Shiny now integrates with the <code>{bslib}</code> package, which provides Bootstrap 4 and <a href="https://bootswatch.com/">Bootswatch</a> themes, and also makes it much easier to modify colors, fonts, and more. 
It also provides an interactive theming widget (<a href="https://rstudio.github.io/bslib/reference/run_with_themer.html"><code>bslib::bs_themer()</code></a>) which can be used inside any Shiny app (as well as any <code>rmarkdown::html_document()</code> with <code>runtime: shiny</code>) to more quickly preview different theme variations. Here&rsquo;s a screen recording of that interactive theming tool in action (also <a href="https://testing-apps.shinyapps.io/themer-demo">see here</a> for a hosted version):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">bslib<span style="color:#666">::</span><span style="color:#06287e">bs_theme_preview</span>()</code></pre></div><div><video class="w-100" src="images/real-time-theming.mp4" controls=""><a href="images/real-time-theming.mp4" alt="A screen recording of interaction applying different colors and fonts to a Shiny application."></a></video></div><p>To use <code>{bslib}</code> in your own Shiny app, provide a <a href="https://rstudio.github.io/bslib/reference/bs_theme.html"><code>bslib::bs_theme()</code></a> to the <code>theme</code> argument of <code>fluidPage()</code>, <code>navbarPage()</code>, and <code>bootstrapPage()</code> (for usage with R Markdown, <a href="https://rstudio.github.io/bslib/#r-markdown-usage">see here</a>). 
Inside <code>bs_theme()</code>, you can specify a version of <a href="https://getbootstrap.com/docs/4.6/getting-started/introduction/">Bootstrap</a> (4 and 3 currently supported), as well as any <a href="https://bootswatch.com/">Bootswatch</a> theme, including new ones like <a href="https://bootswatch.com/minty/">minty</a>!</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">ui <span style="color:#666">&lt;-</span> <span style="color:#06287e">fluidPage</span>(theme <span style="color:#666">=</span> bslib<span style="color:#666">::</span><span style="color:#06287e">bs_theme</span>(version <span style="color:#666">=</span> <span style="color:#40a070">4</span>, bootswatch <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">minty&#34;</span>),<span style="color:#007020;font-weight:bold">...</span>)</code></pre></div><p>For years, Shiny has supported Bootswatch 3 themes via the <code>{shinythemes}</code> package, but any further customization of the theme required writing complex CSS rules by hand. Now, thanks to <code>{bslib}</code>, it&rsquo;s way easier to control <a href="https://rstudio.github.io/bslib/articles/theming.html#main-colors">main colors &amp; fonts</a> and/or any of the <a href="https://rstudio.github.io/bslib/articles/bs4-variables.html">100s of more specific theming options</a>, directly from R. 
When it comes to custom font(s) that may not be available on the end user&rsquo;s machine, make sure to leverage <code>{bslib}</code>'s helper functions like <a href="https://rstudio.github.io/bslib/reference/font_face.html"><code>font_google()</code></a>, <a href="https://rstudio.github.io/bslib/reference/font_face.html"><code>font_link()</code></a>, and <a href="https://rstudio.github.io/bslib/reference/font_face.html"><code>font_face()</code></a>, which assist in including font file(s) in a convenient, efficient, and responsible way.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(bslib)
theme <span style="color:#666">&lt;-</span> <span style="color:#06287e">bs_theme</span>(bg <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">#0b3d91&#34;</span>, fg <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">white&#34;</span>, primary <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">#FCC780&#34;</span>,
  base_font <span style="color:#666">=</span> <span style="color:#06287e">font_google</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Space Mono&#34;</span>),
  code_font <span style="color:#666">=</span> <span style="color:#06287e">font_google</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Space Mono&#34;</span>))
<span style="color:#06287e">bs_theme_preview</span>(theme)</code></pre></div><img src="images/custom-theme.png" alt="A Shiny app with a Material dark mode look." 
width="100%" style='border: 1px solid #ddd; box-shadow:5px 5px 5px #eee;'/><p>One main reason why <code>{bslib}</code> makes it so much easier to implement custom themes is that <code>bs_theme()</code> leverages <a href="https://rstudio.github.io/bslib/articles/theming.html#theming-variables">Bootstrap Sass variables</a>, allowing you to change only a few color(s) and font(s) to impact potentially hundreds of Bootstrap&rsquo;s CSS rules. Also, thanks to Bootstrap 4&rsquo;s <a href="https://rstudio.github.io/bslib/articles/theming.html#utility-classes">Utility Classes</a>, you can now more easily tackle complicated UI issues that Sass variables alone won&rsquo;t solve, such as adjustments to spacing, alignment, borders, background colors, and more.</p><p>To accommodate this new level of customization, a significant portion of Shiny&rsquo;s UI has also been revamped so that default styles now properly inherit from the <code>theme</code> setting (i.e., notice how <code>sliderInput()</code>, <code>selectInput()</code>, and <code>dateInput()</code> properly reflect the main colors and fonts). We hope that Shiny and <code>{htmlwidgets}</code> developers also find <code>{bslib}</code>'s <a href="https://rstudio.github.io/bslib/articles/theming.html#themeable-components">tools for theming custom components</a> useful for implementing components that also &ldquo;just work&rdquo; with custom themes.</p><p>While a lot of custom theming can be done via <code>bs_theme()</code> (i.e., CSS), it fundamentally can&rsquo;t affect things like <code>renderPlot()</code>, because the image is rendered by R, not by the web browser. 
To help solve this problem, we&rsquo;ve also created the <a href="https://rstudio.github.io/thematic/"><code>{thematic}</code> package</a>, which can effectively translate CSS to new R plotting defaults by just calling <a href="https://rstudio.github.io/thematic/reference/thematic_on.html"><code>thematic::thematic_shiny()</code></a> before running an app.</p><div align="center"><img src="images/thematic-before.png" alt="A ggplot2 plot with default R styling" width="80%" style='border: 1px solid #ddd; box-shadow:5px 5px 5px #eee;'/><img src="images/thematic-after.png" alt="A ggplot2 plot with styling defaults informed by CSS" width="80%" style='border: 1px solid #ddd; box-shadow:5px 5px 5px #eee;'/></div><p>This &lsquo;auto theming&rsquo; behavior that <code>{thematic}</code> provides works great in Shiny with any CSS framework (not just <code>{bslib}</code>). Also, more generally, <code>{thematic}</code> can help simplify plot theming inside any R environment, using any graphics device, and also makes it super easy to use <a href="https://fonts.google.com/">Google Fonts</a> inside your R plots.</p><p>To learn more about what <code>{bslib}</code> and <code>{thematic}</code> are able to do, see <a href="https://rstudio.github.io/bslib/">https://rstudio.github.io/bslib/</a> and <a href="https://rstudio.github.io/thematic/">https://rstudio.github.io/thematic/</a>.</p><p>By the way, Shiny 1.6 also adds two methods to <a href="https://shiny.rstudio.com/reference/shiny/latest/session.html">the <code>session</code> object</a>, namely <code>setCurrentTheme()</code> and <code>getCurrentTheme()</code>, to dynamically update (or obtain) the page&rsquo;s <code>theme</code> after initial load. 
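</p><p>For example, here&rsquo;s a minimal sketch of a dark mode switch built on <code>session$setCurrentTheme()</code> (the <code>dark_mode</code> checkbox and the two theme objects are illustrative, not part of the release itself):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(shiny)
# Two bs_theme() objects to toggle between
light &lt;- bslib::bs_theme(bg = &#34;white&#34;, fg = &#34;black&#34;)
dark &lt;- bslib::bs_theme(bg = &#34;black&#34;, fg = &#34;white&#34;)
ui &lt;- fluidPage(
  theme = light,
  checkboxInput(&#34;dark_mode&#34;, &#34;Dark mode&#34;)
)
server &lt;- function(input, output, session) {
  # Swap the page theme whenever the checkbox changes
  observe(session$setCurrentTheme(
    if (isTRUE(input$dark_mode)) dark else light
  ))
}
shinyApp(ui, server)</code></pre></div><p>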
These methods power <code>{bslib}</code>&rsquo;s interactive theming tool (<a href="https://rstudio.github.io/bslib/reference/run_with_themer.html"><code>bslib::bs_themer()</code></a>), but can also be used to implement dynamic theming widgets <a href="https://rstudio.github.io/bslib/articles/theming.html#dynamic-theming-in-shiny">like a dark mode switch</a>.</p><h2 id="caching">Improved caching</h2><p>In many Shiny applications, the performance bottlenecks are pieces of code that perform the exact same computation over and over again. For example, you might have a dashboard that displays the same information for many users, and so for those users, it does the exact same data processing and plotting steps.</p><p>Instead of repeating these computations, you can now have your app <em>cache</em> the steps, with the new <code>bindCache()</code> function. Here&rsquo;s how to use it: simply pass your <code>reactive()</code> or <code>renderPlot()</code> (or other <code>render</code> function) to <code>bindCache()</code>, and tell it what to use for the <em>cache key</em>.</p><p>Suppose this is our reactive expression, which does a slow operation; in this case, it&rsquo;s calling <code>fetchData()</code>, which retrieves data from a web API:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">weatherData <span style="color:#666">&lt;-</span> <span style="color:#06287e">reactive</span>({
  <span style="color:#06287e">fetchData</span>(input<span style="color:#666">$</span>city)
})</code></pre></div><p>To cache the values, we&rsquo;ll pass it to <code>bindCache()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">weatherData <span style="color:#666">&lt;-</span> <span style="color:#06287e">reactive</span>({
  <span style="color:#06287e">fetchData</span>(input<span style="color:#666">$</span>city)
}) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">bindCache</span>(input<span style="color:#666">$</span>city)</code></pre></div><p>The call to <code>bindCache(input$city)</code> tells it to use <code>input$city</code> as the <em>cache key</em> (you can use more than one item for the cache key, if needed). The first time it sees a value for <code>input$city</code> (for example, <code>&quot;Boston&quot;</code>), it will execute the reactive expression and save the value in the cache. In the future, if it sees the same value of <code>input$city</code> again, instead of executing the reactive expression, it will simply retrieve the value from the cache.</p><p>In addition to <code>reactive()</code>, <code>bindCache()</code> works with most <code>render</code> functions. For example:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">output<span style="color:#666">$</span>plot <span style="color:#666">&lt;-</span> <span style="color:#06287e">renderPlot</span>({ <span style="color:#007020;font-weight:bold">...</span> }) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">bindCache</span>(input<span style="color:#666">$</span>city)
output<span style="color:#666">$</span>text <span style="color:#666">&lt;-</span> <span style="color:#06287e">renderText</span>({ <span style="color:#007020;font-weight:bold">...</span> }) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">bindCache</span>(input<span style="color:#666">$</span>city)
output<span style="color:#666">$</span>plot1 <span style="color:#666">&lt;-</span> plotly<span style="color:#666">::</span><span style="color:#06287e">renderPlotly</span>({ <span style="color:#007020;font-weight:bold">...</span> }) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">bindCache</span>(input<span style="color:#666">$</span>city)</code></pre></div><p>To learn more about using caching in 
Shiny, see <a href="https://shiny.rstudio.com/articles/caching.html">this article</a>.</p><p>In addition to <code>bindCache()</code>, we&rsquo;ve added a companion function, <code>bindEvent()</code>, which makes it easy to have reactive code run only when specified reactive values are invalidated. For example, if you have a plot that you want to redraw only when a button is clicked, you could do this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">output<span style="color:#666">$</span>plot <span style="color:#666">&lt;-</span> <span style="color:#06287e">renderPlot</span>({
  <span style="color:#06287e">plot</span>(cars<span style="color:#06287e">[seq_len</span>(input<span style="color:#666">$</span>nrows), ])
}) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">bindEvent</span>(input<span style="color:#666">$</span>button)</code></pre></div><p>And what&rsquo;s more, <code>bindCache()</code> and <code>bindEvent()</code> can be used together:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">output<span style="color:#666">$</span>plot <span style="color:#666">&lt;-</span> <span style="color:#06287e">renderPlot</span>({
  <span style="color:#06287e">plot</span>(cars<span style="color:#06287e">[seq_len</span>(input<span style="color:#666">$</span>nrows), ])
}) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">bindCache</span>(input<span style="color:#666">$</span>nrows) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">bindEvent</span>(input<span style="color:#666">$</span>button)</code></pre></div><p>This would cache the plot based on the value of <code>input$nrows</code>, and also make it so the plot redraws only when <code>input$button</code> is clicked.</p><p>You may be familiar with the existing <code>eventReactive()</code> and <code>observeEvent()</code> functions. <code>bindEvent()</code> can be used with <code>reactive()</code> and <code>observe()</code> to do the same thing (and in fact, the older functions are now implemented using <code>bindEvent()</code>):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># These are equivalent:</span>
<span style="color:#06287e">eventReactive</span>(input<span style="color:#666">$</span>button, { <span style="color:#007020;font-weight:bold">...</span> })
<span style="color:#06287e">reactive</span>({ <span style="color:#007020;font-weight:bold">...</span> }) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">bindEvent</span>(input<span style="color:#666">$</span>button)

<span style="color:#60a0b0;font-style:italic"># These are equivalent:</span>
<span style="color:#06287e">observeEvent</span>(input<span style="color:#666">$</span>button, { <span style="color:#007020;font-weight:bold">...</span> })
<span style="color:#06287e">observe</span>({ <span style="color:#007020;font-weight:bold">...</span> }) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">bindEvent</span>(input<span style="color:#666">$</span>button)</code></pre></div><h2 id="accessibility">Improved accessibility</h2><p>Shiny 1.6 also includes many accessibility improvements to Shiny UI. Most of these improvements apply automatically, making existing Shiny apps more accessible, but some features, such as <a href="#alt-text">alternative text for plots</a>, require some effort to fully implement. 
And while this release is a big step forward for Shiny&rsquo;s overall accessibility, we&rsquo;re still in the process of learning more about this area, so expect more improvements in future releases!</p><h3 id="hello-aria-attributes">Hello ARIA attributes</h3><p>Nearly all of Shiny&rsquo;s (previously inaccessible) UI (e.g., <code>selectizeInput()</code>, <code>dateInput()</code>, <code>icon()</code>, etc.) now automatically includes suitable <a href="https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA">ARIA attributes</a>, which helps make content more discoverable via keyboard and also helps <a href="https://en.wikipedia.org/wiki/Screen_reader">screen readers</a> make proper announcements when focus has shifted.</p><p>More specifically, a black line is now shown around content that is brought into focus via keyboard. And, thanks to <code>{bslib}</code>, customizing the focus <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/outline">outline</a>&rsquo;s style is fairly straightforward:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">ui <span style="color:#666">&lt;-</span> <span style="color:#06287e">navbarPage</span>(theme <span style="color:#666">=</span> bslib<span style="color:#666">::</span><span style="color:#06287e">bs_theme</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">keyboard-outline-style&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">dotted&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">keyboard-outline-color&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">hotpink&#34;</span>, version <span style="color:#666">=</span> <span style="color:#40a070">3</span>),
  title <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Keyboard focus&#34;</span>, inverse <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>,
  <span style="color:#06287e">tabPanel</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">A&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>),
  <span style="color:#06287e">tabPanel</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">B&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>))
<span style="color:#06287e">shinyApp</span>(ui, <span style="color:#06287e">function</span>(input, output) {})</code></pre></div><img src="images/keyboard-outline.png" alt="Customizing the keyboard focus outline to be a dotted, hot-pink outline." width="50%" style='border: 1px solid #ddd; box-shadow:5px 5px 5px #eee;'/><p>By default, an outline isn&rsquo;t shown for a mouse-based focus, but one may be added by setting <code>mouse-outline-style</code> to something other than <code>none</code> (its default value). 
This helps make the app more accessible to visually impaired users who use a mouse.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">ui <span style="color:#666">&lt;-</span> <span style="color:#06287e">navbarPage</span>(theme <span style="color:#666">=</span> bslib<span style="color:#666">::</span><span style="color:#06287e">bs_theme</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mouse-outline-style&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">auto&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mouse-outline-color&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">hotpink&#34;</span>, version <span style="color:#666">=</span> <span style="color:#40a070">3</span>),
  title <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Mouse focus&#34;</span>, inverse <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>,
  <span style="color:#06287e">tabPanel</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">A&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>),
  <span style="color:#06287e">tabPanel</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">B&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>))
<span style="color:#06287e">shinyApp</span>(ui, <span style="color:#06287e">function</span>(input, output) {})</code></pre></div><img src="images/mouse-outline.png" alt="Customizing the focus outline on mouse interactions to be a solid, hot-pink outline." 
width="50%" style='border: 1px solid #ddd; box-shadow:5px 5px 5px #eee;'/><h3 id="language-attribute">Language attribute</h3><p>A <code>lang</code> argument has been added to all <code>*Page()</code> functions (e.g., <code>fluidPage()</code>, <code>bootstrapPage()</code>) for control over the document-level language used by screen readers and search-engine parsers. By default, it is set to an empty string, which is commonly recognized as a browser&rsquo;s default locale, so in most situations, this value won&rsquo;t need to be changed. By the way, the same argument has been added to the <code>save_html()</code> function from the <code>{htmltools}</code> package, so your static HTML pages may now reap the same benefits.</p><h3 id="alt-text">Alternate text for <code>renderPlot()</code></h3><p>It&rsquo;s now easy to include an <a href="https://webaim.org/techniques/alttext/">alternative text description</a> of static plots generated with <code>renderPlot()</code>. Most good descriptions will be a function of reactive values, so keep in mind that you can pass a <code>reactive()</code> expression directly to the new <code>alt</code> argument of <code>renderPlot()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">ui <span style="color:#666">&lt;-</span> <span style="color:#06287e">fluidPage</span>(
  <span style="color:#06287e">sidebarPanel</span>(
    <span style="color:#06287e">sliderInput</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">obs&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Number of observations:&#34;</span>, min <span style="color:#666">=</span> <span style="color:#40a070">1</span>, max <span style="color:#666">=</span> <span style="color:#40a070">1000</span>, value <span style="color:#666">=</span> <span style="color:#40a070">500</span>)
  ),
  <span style="color:#06287e">mainPanel</span>(<span style="color:#06287e">plotOutput</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">plot&#34;</span>))
)
server <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>(input, output, session) {
  vals <span style="color:#666">&lt;-</span> <span style="color:#06287e">reactive</span>(<span style="color:#06287e">rnorm</span>(input<span style="color:#666">$</span>obs))
  <span style="color:#60a0b0;font-style:italic"># A textual description of the histogram of values. Also check out the BrailleR </span>
  <span style="color:#60a0b0;font-style:italic"># package to easily generate description(s) of common statistical objects</span>
  <span style="color:#60a0b0;font-style:italic"># https://github.com/ajrgodfrey/BrailleR</span>
  alt_text <span style="color:#666">&lt;-</span> <span style="color:#06287e">reactive</span>({
    bins <span style="color:#666">&lt;-</span> <span style="color:#06287e">hist</span>(<span style="color:#06287e">vals</span>(), plot <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>)
    bin_info <span style="color:#666">&lt;-</span> glue<span style="color:#666">::</span><span style="color:#06287e">glue_data</span>(bins, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">{round(100*density, 1)}% falling around {mids}&#34;</span>)
    <span style="color:#06287e">paste</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">A histogram of&#34;</span>, input<span style="color:#666">$</span>obs, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">random values with &#34;</span>,
      <span style="color:#06287e">paste</span>(bin_info, collapse <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">; &#34;</span>))
  })
  output<span style="color:#666">$</span>plot <span style="color:#666">&lt;-</span> <span style="color:#06287e">renderPlot</span>({
    <span style="color:#06287e">hist</span>(<span style="color:#06287e">vals</span>())
  }, alt <span style="color:#666">=</span> alt_text)
}
<span style="color:#06287e">shinyApp</span>(ui, server)</code></pre></div><h2 id="devmode">Shiny Developer Mode</h2><p>Shiny 1.6 also adds a <a href="https://shiny.rstudio.com/reference/shiny/latest/devmode.html">Developer Mode</a>, which is enabled by calling <code>devmode()</code> and disabled by calling <code>devmode(FALSE)</code>.</p><p>With Developer Mode enabled, defaults for numerous <code>options()</code> are altered to enhance the developer experience (though these defaults may not be ideal for published applications). These include:</p><ul><li>Defaulting <code>shiny.autoreload</code> to <code>TRUE</code>, which reloads the app when a sourced R file changes.</li><li>Defaulting <code>shiny.minified</code> to <code>FALSE</code>, which ensures that JavaScript files are not minified (e.g., <code>shiny.js</code>), making it easier to debug.</li><li>Defaulting <code>shiny.fullstacktrace</code> to <code>TRUE</code>, which displays the full stack trace when errors occur during execution.</li><li>Defaulting <code>sass.cache</code> to <code>FALSE</code>, which prevents any possible caching of Sass -&gt; CSS compilation (done via <code>sass::sass()</code>). This is relevant whenever <code>{bslib}</code> is used with Shiny or R Markdown. Note that Sass -&gt; CSS compilation of Bootstrap is costly in production, but caching during development may provide misleading false-positive results.</li><li>To learn more about implementing your own option, see the reference for <a href="https://shiny.rstudio.com/reference/shiny/latest/devmode.html"><code>devmode()</code></a>.</li></ul><p>If you have manually set any of these options, your provided value will take precedence. 
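</p><p>As a quick sketch, turning Developer Mode on at the start of a session and overriding one of its defaults looks like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(shiny)
# Enable Developer Mode (disable it again later with devmode(FALSE))
devmode()
# An explicitly set option still wins over the Developer Mode default
options(shiny.autoreload = FALSE)</code></pre></div><p>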
When a default Developer Mode option is used, a notification will be displayed every eight hours to remind you of the altered behavior.</p><p>Developer Mode also includes notifications about best practices within Shiny (such as avoiding <code>shinyServer()</code>).</p><h2 id="learn-more">Learn more</h2><p>For more details on Shiny&rsquo;s 1.6 release, see the <a href="https://github.com/rstudio/shiny/blob/master/NEWS.md">NEWS file</a>. There you&rsquo;ll find more information on accessibility improvements and the many bug fixes that also come with this release. Thanks for reading, and happy developing!</p></description></item><item><title>rstudio::global(2021) Starts Tomorrow!</title><link>https://www.rstudio.com/blog/rstudio-global-tomorrow/</link><pubDate>Wed, 20 Jan 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-global-tomorrow/</guid><description><p>The day you&rsquo;ve all been waiting for is almost here: rstudio::global(2021) starts tomorrow! This 24-hour virtual event will focus on all things R and RStudio, featuring 50+ speakers from around the world, and it begins at 1600 GMT (11 am EST, 8 am PST) on January 21, 2021.</p><p>If you want to attend rstudio::global, you don&rsquo;t need a gold-embossed invitation, but you do still need to register, which you can do at <a href="https://www.rstudio.com/conference">rstudio.com/conference</a>. You&rsquo;ll be joining more than 12,000 attendees from 136 countries who are already registered and planning to attend.</p><p>But once you are registered, you are not done! After you&rsquo;ve registered, you should <a href="https://global.rstudio.com/student/all_events" target="_blank" rel="noopener noreferrer">plan your conference and enroll for the sessions you want to attend</a>. Every talk will be given twice, so choose the talks that are most convenient for your local time. 
You&rsquo;ll have three tracks of talks and more than 50 sessions to choose from, so planning in advance will ensure you don&rsquo;t miss any of your favorite speakers. Should you need help converting the times shown into your local time, I recommend using the <a href="https://www.timeanddate.com/worldclock/converter.html?iso=20210121T160000&amp;p1=136&amp;p2=43" target="_blank" rel="noopener noreferrer">time zone converter at timeanddate.com</a>. And just as at rstudio::conf events in years past, stay tuned for social opportunities that will be announced once rstudio::global begins.</p><p>So don&rsquo;t wait; start compiling your conference schedule now. The 24 hours of rstudio::global(2021) are less than a day away!</p></description></item><item><title>2020 at RStudio: A Year in Review</title><link>https://www.rstudio.com/blog/2020-a-year-in-review/</link><pubDate>Tue, 19 Jan 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2020-a-year-in-review/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@kellysikkema?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Kelly Sikkema</a> on <a href="https://unsplash.com/s/photos/year-in-review?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Unsplash</a></sup></p><p>We at RStudio are excited to host our first fully virtual conference this week, <a href="https://www.rstudio.com/conference/">rstudio::global</a>. We are so pleased to have so many of you joining us this week, and while we wish we could see you all in person again, we are happy to have this opportunity to come together with the open source data science community. We will share a recap of the conference after it concludes this Friday.</p><p>Before we dive back into our projects, I thought it would be a good time to look back at what kept us busy in 2020. 
While the year presented many challenges for everyone, we were pleased to continue to support and deliver value to the R and Python data science community. Below, I list some of the many highlights of the past year. No doubt I have missed a few, but these are some of the things I am particularly proud we were able to accomplish last year.</p><h3 id="rstudio-the-company">RStudio the Company</h3><p>Our company grew significantly this year, despite the many challenges posed by COVID-19. As part of that growth, we:</p><ul><li>Started out 2020 with rstudio::conf, with thousands of attendees from around the world, both in person and virtually. You can watch all the <a href="https://www.rstudio.com/resources/rstudioconf-2020/">talks from the conference here</a>.</li><li>Announced that RStudio is now a Public Benefit Corporation, with our open source mission codified into our corporate charter. Check out <a href="https://blog.rstudio.com/2020/01/29/rstudio-pbc/" target="_blank" rel="noopener noreferrer">JJ Allaire&rsquo;s rstudio::conf keynote</a> for the full story. 
We also wrote about <a href="https://www.rstudio.com/about/what-makes-rstudio-different/">What Makes RStudio Different</a>.</li><li>Were honored to be named a <a href="https://blog.rstudio.com/2020/09/25/forrester-wave/" target="_blank" rel="noopener noreferrer">Strong Performer</a> in the Forrester Wave™: Notebook-Based Predictive Analytics and Machine Learning, Q3 2020.</li><li>Wrote and spoke about the importance of <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">Serious Data Science</a>, <a href="https://blog.rstudio.com/2020/11/17/an-interview-with-lou-bajuk/" target="_blank" rel="noopener noreferrer">why we focus on a code-based approach</a> to data science, how <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/" target="_blank" rel="noopener noreferrer">Interoperability</a> helps you leverage your entire analytic ecosystem, and how we provide a single home for <a href="https://www.rstudio.com/solutions/r-and-python/">R and Python Data Science</a>.</li><li>Delivered <a href="https://www.rstudio.com/resources/webinars/">several webinars</a>, many featuring our customers and partners, and were privileged to share <a href="https://www.rstudio.com/about/customer-stories/">several customer stories</a>, featuring the great work our customers are doing. We were also thrilled and humbled by the <a href="https://www.trustradius.com/products/rstudio/reviews" target="_blank" rel="noopener noreferrer">many great reviews</a> our customers provided on TrustRadius.</li></ul><h3 id="rstudio-products">RStudio Products</h3><p>On the product side, we significantly enhanced the capabilities of both our commercial and open source products. 
Specifically, we:</p><ul><li>Created and delivered new releases of the RStudio IDE, making it <a href="https://blog.rstudio.com/2020/05/27/rstudio-1-3-release/" target="_blank" rel="noopener noreferrer">more accessible</a>, as well as delivering <a href="https://blog.rstudio.com/2020/09/30/rstudio-v1-4-preview-visual-markdown-editing/" target="_blank" rel="noopener noreferrer">many other enhancements</a>, including the surprisingly popular <a href="https://blog.rstudio.com/2020/11/04/rstudio-1-4-preview-rainbow-parentheses/" target="_blank" rel="noopener noreferrer">rainbow parentheses</a>. We also greatly expanded capabilities for <a href="https://blog.rstudio.com/2020/10/07/rstudio-v1-4-preview-python-support/" target="_blank" rel="noopener noreferrer">native Python coding in the RStudio IDE</a>, including a Python environment and object explorer.</li><li>Expanded the capabilities of <a href="https://www.rstudio.com/products/connect/">RStudio Connect</a>, our centralized platform for sharing the work data science teams create in R and Python, including support for a full suite of interactive Python applications based on Dash, Bokeh, and Streamlit. (See the announcements for <a href="https://blog.rstudio.com/2020/07/14/rstudio-connect-1-8-4/" target="_blank" rel="noopener noreferrer">Connect 1.8.4</a> and <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6-python-update/" target="_blank" rel="noopener noreferrer">1.8.6</a>.) We also added the ability to share Python APIs (via Flask) in <a href="https://blog.rstudio.com/2020/04/02/rstudio-connect-1-8-2/" target="_blank" rel="noopener noreferrer">Connect 1.8.2</a>.</li><li>Updated <a href="https://www.rstudio.com/products/package-manager/">RStudio Package Manager</a>, introducing support for Windows binaries, Bioconductor, and beta support for PyPI packages. 
We also <a href="https://blog.rstudio.com/2020/07/01/announcing-public-package-manager/" target="_blank" rel="noopener noreferrer">introduced Public Package Manager</a> as a free service.</li><li>Officially <a href="https://blog.rstudio.com/2020/08/05/rstudio-cloud-announcement/" target="_blank" rel="noopener noreferrer">launched RStudio Cloud</a>, our cloud-based platform for doing, teaching, and learning data science using only a browser &ndash; and promised we will always offer a free plan for casual users. We were gratified to <a href="https://blog.rstudio.com/2020/09/17/rstudio-cloud-a-student-perspective/" target="_blank" rel="noopener noreferrer">hear great responses</a> from the many people using RStudio Cloud to teach and learn data science.</li></ul><h3 id="r-and-python-packages">R and Python Packages</h3><p>RStudio also expanded its wealth of free and open-source packages available to the larger data science community in 2020. Some of the significant developments included:</p><ul><li><a href="https://blog.rstudio.com/2020/01/29/sparklyr-1-1/" target="_blank" rel="noopener noreferrer">Announcing in January</a> that sparklyr is available on CRAN, enabling R users to scale datasets across computing clusters running Apache Spark. We later <a href="https://blog.rstudio.com/2020/07/16/sparklyr-1-3/" target="_blank" rel="noopener noreferrer">announced support for Apache Avro</a> in sparklyr.</li><li><a href="https://blog.rstudio.com/2020/09/29/torch/" target="_blank" rel="noopener noreferrer">Providing native access to Torch</a>, making one of the most widely used deep learning frameworks available to R users.</li><li><a href="https://blog.rstudio.com/2020/04/08/great-looking-tables-gt-0-2/" target="_blank" rel="noopener noreferrer">Introducing the gt package</a> (short for &ldquo;grammar of tables&rdquo;), to help R users reliably and efficiently create beautiful customized display tables. 
We also had a great response to our <a href="https://blog.rstudio.com/2020/12/23/winners-of-the-2020-rstudio-table-contest/" target="_blank" rel="noopener noreferrer">RStudio Table contest</a>.</li></ul><p>And of course, the tidyverse team was as productive as always this year, releasing (among other things) upgrades to:</p><ul><li><a href="https://www.tidyverse.org/blog/2020/03/forcats-0-5-0/" target="_blank" rel="noopener noreferrer">forcats</a>,</li><li><a href="https://www.tidyverse.org/blog/2020/03/ggplot2-3-3-0/" target="_blank" rel="noopener noreferrer">ggplot2</a>,</li><li><a href="https://www.tidyverse.org/blog/2020/04/tibble-3-0-0/" target="_blank" rel="noopener noreferrer">tibble</a>, and</li><li><a href="https://www.tidyverse.org/blog/2020/06/dplyr-1-0-0/" target="_blank" rel="noopener noreferrer">dplyr</a>, which received a massive update as part of its official 1.0 release.</li></ul><p>There were also significant updates to:</p><ul><li><a href="https://www.tidyverse.org/blog/2020/09/pkgdown-1-6-0/" target="_blank" rel="noopener noreferrer">pkgdown</a>,</li><li><a href="https://www.tidyverse.org/blog/2020/10/readr-1-4-0/" target="_blank" rel="noopener noreferrer">readr</a>,</li><li><a href="https://www.tidyverse.org/blog/2020/10/testthat-3-0-0/" target="_blank" rel="noopener noreferrer">testthat</a>,</li><li><a href="https://www.tidyverse.org/blog/2020/11/dbplyr-2-0-0/" target="_blank" rel="noopener noreferrer">dbplyr</a>,</li><li><a href="https://www.tidyverse.org/blog/2020/12/usethis-2-0-0/" target="_blank" rel="noopener noreferrer">usethis</a>, and</li><li><a href="https://www.tidyverse.org/blog/2020/11/magrittr-2-0-is-here/" target="_blank" rel="noopener noreferrer">magrittr</a>.</li></ul><p>The team also <a href="https://www.tidyverse.org/blog/2020/04/tidymodels-org/" target="_blank" rel="noopener noreferrer">launched tidymodels.org</a>, a central location for learning and using the <code>tidymodels</code> packages.</p><p>Finally, in support of 
online scientific and technical communication, we <a href="https://blog.rstudio.com/2020/12/07/distill/" target="_blank" rel="noopener noreferrer">introduced the 1.0 version of the distill package</a>, as well as real-time <a href="https://blog.rstudio.com/2020/09/30/rstudio-v1-4-preview-visual-markdown-editing/" target="_blank" rel="noopener noreferrer">visual editing of R Markdown documents</a>. We also introduced <a href="https://blog.rstudio.com/2020/12/21/rmd-news/" target="_blank" rel="noopener noreferrer">many other updates and enhancements</a> to R Markdown.</p><h2 id="to-learn-more">To Learn More</h2><p>2020 was a busy year, and I am sure there are still a dozen things I missed. I know it&rsquo;s difficult to keep up with everything RStudio is doing, but hopefully the links I&rsquo;ve included above will help. If you&rsquo;d like to learn more about any of the professional products, please drop a line to <a href="mailto:sales@rstudio.com">sales@rstudio.com</a>, or book a time to talk with us <a href="https://rstudio.chilipiper.com/book/rst-demo" target="_blank" rel="noopener noreferrer">using this link</a>.</p><p>If you&rsquo;d particularly like to learn more about the many ways that RStudio provides a single home for data science teams using R and Python, we encourage you to <a href="https://pages.rstudio.net/RStudio_R_Python.html" target="_blank" rel="noopener noreferrer">register for our upcoming webinar</a> on February 3rd.</p><p>Here&rsquo;s to a happy, healthy, and productive 2021 for all of us!</p></description></item><item><title>Announcing RStudio 1.4</title><link>https://www.rstudio.com/blog/announcing-rstudio-1-4/</link><pubDate>Tue, 19 Jan 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rstudio-1-4/</guid><description><p>RStudio is excited to announce that we released RStudio 1.4 today! The many features of RStudio 1.4 will already be familiar to regular readers of this blog. 
These include:</p><ul><li>A <a href="https://blog.rstudio.com/2020/09/30/rstudio-v1-4-preview-visual-markdown-editing/" target="_blank" rel="noopener noreferrer">visual markdown editor</a> that provides improved productivity for composing longer-form articles and analyses with R Markdown.</li><li>New <a href="https://blog.rstudio.com/2020/10/07/rstudio-v1-4-preview-python-support/" target="_blank" rel="noopener noreferrer">Python capabilities</a>, including display of Python objects in the Environment pane, viewing of Python data frames, and tools for configuring Python versions and conda/virtual environments.</li><li>The ability to add <a href="https://blog.rstudio.com/2020/10/21/rstudio-1-4-preview-multiple-source-columns/" target="_blank" rel="noopener noreferrer">source columns</a> to the IDE workspace for side-by-side text editing.</li><li>A new <a href="https://blog.rstudio.com/2020/10/14/rstudio-v1-4-preview-command-palette/" target="_blank" rel="noopener noreferrer">command palette</a> (accessible via Ctrl+Shift+P) that provides easy keyboard access to all RStudio commands, add-ins, and options.</li><li>Support for <a href="https://blog.rstudio.com/2020/11/04/rstudio-1-4-preview-rainbow-parentheses/" target="_blank" rel="noopener noreferrer">rainbow parentheses</a> in the source editor (enabled via <strong>Options -&gt; Code -&gt; Display</strong>).</li><li>New <a href="https://blog.rstudio.com/2020/11/09/rstudio-1-4-preview-citations/" target="_blank" rel="noopener noreferrer">citation support</a> that allows you to include citations from your document bibliography, personal or group libraries, and several other sources.</li><li>Integration with <a href="https://blog.rstudio.com/2020/11/16/rstudio-1-4-preview-server-pro/" target="_blank" rel="noopener noreferrer">a host of new RStudio Server Pro features</a> including project sharing when using Launcher, Microsoft Visual Studio Code support (currently in beta), SAML authentication, and local 
launcher load-balancing.</li></ul><p>You can read the complete set of new features and bug fixes in the <a href="https://rstudio.com/products/rstudio/release-notes/" target="_blank" rel="noopener noreferrer">RStudio 1.4 release notes</a>.</p><h2 id="rstudio-14-delivers-essential-tooling-for-serious-data-science">RStudio 1.4 Delivers Essential Tooling for Serious Data Science</h2><p>We&rsquo;ve written before about <a href="https://www.rstudio.com/blog/2020-07-09-why-you-need-a-world-class-ide-to-do-serious-data-science/" target="_blank" rel="noopener noreferrer">how a world class IDE is a requirement for any team that wants to do serious data science</a>. However, whenever a new release of an installed product comes out, it always raises the question, &ldquo;What value will my team get from upgrading?&rdquo; For data science teams committed to <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">serious data science</a>, we believe RStudio 1.4 is a must-have upgrade because it:</p><ul><li><strong>Expands code credibility through transparency:</strong> Improved support for Python and what-you-see-is-what-you-mean R Markdown means that more data science content can share a common development workflow for all team members. This lowering of barriers between developers, leaders, and business people increases the credibility of the data science.</li><li><strong>Increases agility in the development process:</strong> Many of the new features in RStudio 1.4 are about speeding the development of code, regardless of the data scientist&rsquo;s preferred language or development environment. 
When developers don&rsquo;t have to switch environments to use a new tool or apply a new test, data science gets done faster.</li><li><strong>Enhances the reach and durability of the data science platform:</strong> Almost all of the new features noted above are available in both the open-source and professional versions of the RStudio platform. Data science leaders don&rsquo;t have to worry about whether they can afford this upgrade; the new version of RStudio 1.4 is available to all users, including those who don&rsquo;t pay us a penny.</li></ul><p>We believe RStudio 1.4&rsquo;s support for a single development environment spanning multiple languages will help address many data science challenges that teams will face in 2021. We expect that many organizations will find it to be <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/" target="_blank" rel="noopener noreferrer">the key to unlocking new value from their data science investments</a>.</p><h2 id="where-to-get-rstudio-14">Where to Get RStudio 1.4</h2><p>Open source and commercial versions of RStudio Desktop and RStudio Server Pro 1.4 are available for download today from the <a href="https://rstudio.com/products/rstudio/" target="_blank" rel="noopener noreferrer">RStudio products page</a>.</p><p>If you are just starting out with RStudio and want to learn more about the features available in the RStudio IDE, we invite you to browse <a href="https://rstudio.com/products/rstudio/features/" target="_blank" rel="noopener noreferrer">the most popular RStudio IDE features</a>.</p><p>To receive email notifications for RStudio professional product releases, patches, security information, and general product support updates, subscribe to the Product Information list by visiting the <a href="https://rstudio.com/about/subscription-management/" target="_blank" rel="noopener noreferrer">RStudio subscription management 
portal</a>.</p></description></item><item><title>Announcing blogdown v1.0</title><link>https://www.rstudio.com/blog/blogdown-v1.0/</link><pubDate>Mon, 18 Jan 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/blogdown-v1.0/</guid><description><p>The R Markdown team is happy to share that <strong>blogdown</strong> version 1.0 is now available on CRAN. <strong>blogdown</strong> was <a href="https://www.rstudio.com/blog/announcing-blogdown/">originally released</a> in the fall of 2017. The latest version of the package includes some significant improvements to the user experience, and some under-the-hood improvements that you&rsquo;ll benefit from without even knowing!</p><table><thead><tr><th align="center">Latest release</th></tr></thead><tbody><tr><td align="center"><img src="https://img.shields.io/badge/CRAN-1.0-brightgreen" alt="Latest blogdown release 1.0 CRAN badge"></td></tr></tbody></table><p>You can install the latest version from CRAN:</p><pre><code>install.packages(&quot;blogdown&quot;)</code></pre><p>In this post, we&rsquo;ll share some highlights from the latest release, but you might want to look at the <a href="https://github.com/rstudio/blogdown/blob/master/NEWS.md#changes-in-blogdown-version-10">release notes</a> for the full details.</p><h2 id="workflows">Workflows</h2><p>If you are already a <strong>blogdown</strong> user, you will notice some important changes in how the package works. Previously, <strong>blogdown</strong> did two things automatically for you if you had started serving your site as you worked:</p><ol><li>Knitting R Markdown files upon save.</li><li>Re-knitting already knitted R Markdown files, based on a timestamp filter.</li></ol><p>While both of these behaviors were sometimes helpful, we found that more often than not they were problematic for users. Based on user feedback, we have provided options to disable them. The second behavior is disabled by default now (more on this later). 
For the first behavior, you can set <a href="https://bookdown.org/yihui/blogdown/global-options.html">the global option</a> <code>options(blogdown.knit.on_save = FALSE)</code> to disable it. After that, you must knit an R Markdown post yourself for your edits to take effect (note that plain Markdown files with the extension <code>.md</code> do not need to be knitted).</p><p>The really great news is that the &ldquo;Knit&rdquo; button now <em>works</em> for <strong>blogdown</strong> content in the RStudio IDE! Please feel free to retrain your fingers to knit.</p><p><img src="https://media.giphy.com/media/1463o17ejELYqs/giphy.gif" alt="Cary Grant knitting"></p><p>If you have not yet served your site (after using the &ldquo;Serve Site&rdquo; addin or <code>blogdown::serve_site()</code>), clicking on the &ldquo;Knit&rdquo; button will start the server automatically and produce the site preview for you.</p><p>Also, the <code>public/</code> directory will no longer be generated when you serve the site. Where did it go? We are now using Hugo&rsquo;s server, which means the website is not rendered to disk by default, but instead served directly from memory. The Hugo server is much faster, and also supports navigating to the output web page of the source document you are currently editing.</p><p>If you miss the <code>public/</code> folder and want it back, you&rsquo;ll need to build the site explicitly via <code>blogdown::build_site()</code>, or if you use RStudio, press <code>Ctrl + Shift + B</code> or <code>Cmd + Shift + B</code> to build the website. This function no longer recompiles R Markdown files by default, because it may be expensive and often undesirable to compile <code>.Rmd</code> files that have been compiled before.</p><p>If you do want to do that anyway, <code>build_site(build_rmd = TRUE)</code> will recompile <em>everything</em> (look out!). 
For easier control, this <code>build_rmd</code> argument can also take a function to filter the files to re-render when building the site. We have introduced three helper functions (and you can use your own), each with an alias for ease of use:</p><ul><li><code>build_site(build_rmd = 'timestamp')</code> uses <code>filter_timestamp()</code> and will compare the timestamps of input and output files to decide which to render.</li><li><code>build_site(build_rmd = 'newfile')</code> uses <code>filter_newfile()</code> and will render only files that have no output file yet.</li><li><code>build_site(build_rmd = 'md5sum')</code> uses <code>filter_md5sum()</code> and will compare MD5 checksums of files to decide which to render.</li></ul><p>See the help page <code>?blogdown::build_site</code> for more information.</p><h2 id="checking-functions">Checking functions</h2><p><strong>blogdown</strong> 1.0 comes with new <em>check</em> functions to help you diagnose and prevent build issues with your site. Checks will help you identify known issues and provide opinionated recommendations to guide you into the pit of success.</p><p>There are five specific <code>check_*</code> functions:</p><ul><li><code>check_config()</code> checks the configuration file (<code>config.yaml</code> or <code>config.toml</code>).</li><li><code>check_gitignore()</code> checks if necessary files are incorrectly ignored in Git.</li><li><code>check_hugo()</code> checks possible problems with the Hugo installation and version.</li><li><code>check_netlify()</code> checks some important Netlify configuration in <code>netlify.toml</code>.</li><li><code>check_content()</code> checks for possible problems in the content files, such as the validity of YAML metadata, posts with future dates, draft posts, and R Markdown posts that have not been rendered.</li></ul><p>A final function, <code>check_site()</code>, will run all of the above <code>check_*()</code> functions at once. 
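</p><p>Run individually, the checks described above look like this (all function names come straight from this release):</p><pre><code>blogdown::check_config()
blogdown::check_gitignore()
blogdown::check_hugo()
blogdown::check_netlify()
blogdown::check_content()
# or run everything at once:
blogdown::check_site()</code></pre><p>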
If you are a blogdown educator, you may go step-by-step with the checking functions to help students gain a mental model of all the moving pieces needed to build and deploy a site. For people familiar with GitHub, Netlify, and Hugo, you may want to just check everything with <code>blogdown::check_site()</code>.</p><p>These functions will show you what is checked, why, and will assign you some <code>[TODO]</code> items that need your action.</p><pre><code>------------------------------------------------------------
○ A successful check looks like this.
● [TODO] A check that needs your attention looks like this.
| Let's check out your blogdown site!
------------------------------------------------------------</code></pre><p>We hope you&rsquo;ll find the checks as helpful as <a href="https://github.com/rstudio/blogdown/issues/548">several</a> other <a href="https://github.com/rstudio/blogdown/issues/510">users</a> have. We&rsquo;ll continue to incorporate more checks into these functions in the future. When in doubt, try <code>remotes::install_github('rstudio/blogdown')</code> and see if <code>blogdown::check_site()</code> uncovers any new problems.</p><p>As an extra bonus, as we were working on better messaging for these <code>check_*()</code> functions, we also improved the new site experience when running <code>blogdown::new_site()</code>. <strong>blogdown</strong> will now output more user-friendly messages on what is going on during your new site setup. 
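</p><p>For example, scaffolding a fresh site is a single call:</p><pre><code># create a new site in the current (empty) directory
blogdown::new_site()</code></pre><p>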
To follow a complete &ldquo;code-through&rdquo; of setting up a new site with the new <strong>blogdown</strong>, go look at <a href="https://alison.rbind.io/post/new-year-new-blogdown/" title="Up &amp; running with blogdown in 2021">Up &amp; running with blogdown in 2021</a> written by <a href="https://alison.rbind.io">Alison Hill</a>.</p><h2 id="hugo-versioning-system">Hugo versioning system</h2><p>Although <strong>blogdown</strong> also supports Jekyll and Hexo, most users power their websites with the Hugo static site generator. Hugo has a lot of pluses, one of which is that it gives you fast site builds. However, one minus we&rsquo;ve found as experienced users over the past 3 years is that Hugo also changes fast&mdash;new functions are added and deprecated, and it can be difficult to keep track of versions if you have more than one Hugo site. This can lead to frustration trying to debug why a site that you could build last month will not build now.</p><p><strong>blogdown</strong> now gives you a way to pin your website project to a specific Hugo version. Both <code>blogdown::install_hugo()</code> and <code>blogdown::check_site()</code> will tell you how. You may also use the following to find all your locally installed Hugo versions:</p><pre><code>blogdown::find_hugo('all')</code></pre><p>You&rsquo;ll see the versions that you have available like this:</p><pre><code>[1] &quot;/Users/alison/Library/Application Support/Hugo/0.54.0/hugo&quot;
[2] &quot;/Users/alison/Library/Application Support/Hugo/0.71.1/hugo&quot;
[3] &quot;/Users/alison/Library/Application Support/Hugo/0.78.2/hugo&quot;
[4] &quot;/Users/alison/Library/Application Support/Hugo/0.79.0/hugo&quot;
[5] &quot;/usr/local/bin/hugo&quot;</code></pre><p>From these available Hugo versions, if you&rsquo;d like to pin a specific one to a particular project, you&rsquo;ll use a project-level <code>.Rprofile</code> file. 
You may call this new helper function to create and fill the <code>.Rprofile</code> with recommended <strong>blogdown</strong> options:</p><pre><code>blogdown::config_Rprofile()</code></pre><p>Inside that file, to pin Hugo to the version, say, 0.79.0, you may set:</p><pre><code>options(blogdown.hugo.version = &quot;0.79.0&quot;)</code></pre><p>Note that you must restart your R session for changes in your <code>.Rprofile</code> file to take effect. How could <code>check_site()</code> or <code>check_hugo()</code> help you do all this? Let&rsquo;s check it out:</p><pre><code>blogdown::check_hugo()</code></pre><pre><code>― Checking Hugo
| Checking Hugo version...
○ Found 4 versions of Hugo. You are using Hugo 0.79.0.
| Checking .Rprofile for Hugo version used by blogdown...
| Hugo version not set in .Rprofile.
● [TODO] Set options(blogdown.hugo.version = &quot;0.79.0&quot;) in .Rprofile.
● [TODO] Also run blogdown::check_netlify() to check for possible problems with Hugo and Netlify.
― Check complete: Hugo</code></pre><p>Now, as we hint above in a <code>[TODO]</code> item, after you&rsquo;ve pinned a project-level Hugo version, you&rsquo;ll want to ensure that the Hugo version used by Netlify to build your site also matches your local version. Again, the checking function <code>check_netlify()</code> can help you here, but you may also use:</p><pre><code>blogdown::config_netlify()</code></pre><p>to open and edit that file with your updated Hugo version number. After doing that, if we checked this file, we&rsquo;d see:</p><pre><code>blogdown::check_netlify()</code></pre><pre><code>― Checking netlify.toml...
○ Found HUGO_VERSION = 0.79.0 in [build] context of netlify.toml.
| Checking that Netlify &amp; local Hugo versions match...
○ It's a match! 
Blogdown and Netlify are using the same Hugo version (0.79.0).
| Checking that Netlify &amp; local Hugo publish directories match...
○ Good to go - blogdown and Netlify are using the same publish directory: public
― Check complete: netlify.toml</code></pre><p>You may also want to periodically clean up your older Hugo versions that are no longer in use. To do this, use:</p><pre><code>blogdown::remove_hugo()</code></pre><p>In your console, you&rsquo;ll see an interactive menu that allows you to choose which versions to remove, like this:</p><pre><code>--------------------------------------------------------------------------------
5 Hugo versions found and listed below (#1 on the list is currently used).
Which version(s) would you like to remove?
--------------------------------------------------------------------------------
1: /Users/alison/Library/Application Support/Hugo/0.54.0/hugo
2: /Users/alison/Library/Application Support/Hugo/0.71.1/hugo
3: /Users/alison/Library/Application Support/Hugo/0.78.2/hugo
4: /Users/alison/Library/Application Support/Hugo/0.79.0/hugo
5: /usr/local/bin/hugo
Enter one or more numbers separated by spaces, or an empty line to cancel</code></pre><p>If you want to update Hugo, you&rsquo;ll now need to install the new version explicitly with <code>install_hugo()</code>. By default, it will install the latest available version. Consequently, the previous <code>update_hugo()</code> function has been deprecated.</p><h2 id="page-bundles">Page bundles</h2><p>Hugo <a href="https://gohugo.io/news/0.32-relnotes/">version 0.32</a> introduced a new feature called &ldquo;<a href="https://gohugo.io/content-management/page-bundles/">Page Bundles</a>,&rdquo; as a more natural way to organize your content files. The main benefit of using page bundles instead of normal pages is that you can put resource files associated with the post (such as images and data files) inside the same directory as the post itself. 
This means you no longer have to put them under the <code>static/</code> directory, which has been quite confusing to Hugo users. Here is an example of two page bundles, both inside the <code>content/post</code> section:</p><pre><code>.
└── content
    ├── post
    │   ├── raindrops-on-roses
    │   │   ├── index.md   // That's your post content and Front Matter
    │   │   └── assets
    │   │       ├── rain.jpg
    │   │       ├── roses.jpg
    │   │       └── thorns.csv
    │   └── whiskers-on-kittens
    │       ├── index.md   // That's your post content and Front Matter
    │       └── assets
    │           └── kittens.jpg
    └── songs</code></pre><p><strong>blogdown</strong> now works better with page bundles, like <code>raindrops-on-roses</code> and <code>whiskers-on-kittens</code> in the above example. The &ldquo;Insert image&rdquo; and &ldquo;New post&rdquo; add-ins work, and when you knit your posts or other R Markdown-based content, all figures and any other dependencies (like the <code>index_files/</code> and <code>index_cache/</code> folders) will be output to your page bundle instead of to a folder nested in the <code>static/</code> directory. Consequently, you should not ignore <code>&quot;_files$&quot;</code> in the <code>ignoreFiles</code> field in your site configuration file. 
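</p><p>As a hedged sketch (the exact entries vary by theme and <strong>blogdown</strong> version), the <code>ignoreFiles</code> field in a <code>config.toml</code> typically looks something like this, with no <code>&quot;_files$&quot;</code> entry:</p><pre><code>ignoreFiles = [&quot;\\.Rmd$&quot;, &quot;\\.Rmarkdown$&quot;, &quot;_cache$&quot;]</code></pre><p>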
This will also get flagged for you if you run <code>blogdown::check_site()</code>.</p><p>If you prefer no bundles, you may set <code>options(blogdown.new_bundle = FALSE)</code> in your <code>.Rprofile</code> to get the old behavior back.</p><p>Finally, if your pages have not been bundled up yet, we have provided a new helper function <code>bundle_site()</code> to help you convert normal pages to bundles:</p><pre><code># make sure your project is either backed up or under version control
blogdown::bundle_site(&quot;.&quot;, output = &quot;.&quot;)</code></pre><h2 id="better-support-for-markdown-format">Better support for Markdown format</h2><p><strong>blogdown</strong> lets you work with three formats for your website content, each of which is processed and rendered slightly differently:</p><table><thead><tr><th>File format</th><th></th><th>Processed by</th><th></th><th>Output format</th><th></th><th>Additional processing by</th></tr></thead><tbody><tr><td><code>.Rmd</code></td><td>→</td><td>Pandoc</td><td>→</td><td><code>.html</code></td><td></td><td></td></tr><tr><td><code>.Rmarkdown</code></td><td>→</td><td>Pandoc</td><td>→</td><td><code>.markdown</code></td><td>→</td><td>Hugo / Goldmark</td></tr><tr><td><code>.md</code></td><td>→</td><td></td><td>→</td><td><code>.md</code></td><td>→</td><td>Hugo / Goldmark</td></tr></tbody></table><p>To learn more about these formats, you may read this <a href="https://bookdown.org/yihui/blogdown/output-format.html">blogdown book chapter</a>.</p><p>Of course, you may always use <code>.md</code> for content that does not include any R code&mdash;this content will only be processed by Hugo&rsquo;s Markdown renderer (the default now is Goldmark). Content written in <code>.Rmd</code> files will be rendered with Pandoc straight to <code>.html</code>, bypassing Hugo&rsquo;s markdown renderer completely. 
However, some Hugo themes depend on Markdown files as input (not <code>.html</code>)&mdash;that is why the <code>.Rmarkdown</code> file extension has existed. <code>.Rmarkdown</code> files are rendered to <code>.markdown</code> using <strong>knitr</strong> and Pandoc, which will then be processed by Hugo&rsquo;s Markdown renderer. This special file extension (unique to <strong>blogdown</strong>) means that the file will be processed by R before Hugo, allowing you to use R code, which a plain <code>.md</code> will not allow.</p><p>With <strong>blogdown</strong> 1.0, the <code>.Rmarkdown</code> file format has become more fully featured, to help users take better advantage of some Hugo theme features and configuration options. <code>.Rmarkdown</code> files now support bibliographies and HTML widgets like a standard <code>.Rmd</code> document. This makes <code>.Rmarkdown</code> an interesting format to be used alone in <strong>blogdown</strong> projects. However, some users voiced a need to be able to simply keep the <code>.md</code> version of their <code>.Rmd</code> content, without needing to use the special <code>.Rmarkdown</code> file extension.</p><p><strong>blogdown</strong> now offers a build method to render <code>.Rmd</code> files to <code>.md</code> instead of <code>.html</code>. This special <em>full markdown mode</em> can be activated by setting <code>options(blogdown.method = &quot;markdown&quot;)</code> in your <code>.Rprofile</code>.</p><p>We recommend this <code>&quot;markdown&quot;</code> mode to advanced users who have a high comfort level with Hugo, and want to use the full power of Goldmark (and understand the trade-offs of not using Pandoc for rendering here, e.g., not all of Pandoc&rsquo;s Markdown features are supported by Goldmark).</p><h2 id="final-notes">Final notes</h2><p>This version is a big milestone for <strong>blogdown</strong>, with a lot of changes and improvements. Some improvements may not even be noticeable, yet they are important. 
For example, Hugo requires you to install Git and Go if you use themes that contain &ldquo;Hugo modules,&rdquo; but we don&rsquo;t wish to turn <strong>blogdown</strong> users into Go developers, so we tried hard to get rid of the dependency on Git and Go in this case. Similarly, multilingual sites are better supported under the hood now.</p><p>Updates to the <strong>blogdown</strong> book (<a href="https://bookdown.org/yihui/blogdown/">https://bookdown.org/yihui/blogdown/</a>) are also under way that will reflect the changes in Hugo, Hugo themes, and to <strong>blogdown</strong> itself since the initial package release. A friendly banner is now in place on every page in the online book to let you know that we are aware that the content there is currently out of date and will be updated shortly.</p><p>We hope this new release will improve the quality of life for <strong>blogdown</strong> users, and make the waters seem a little friendlier, so that hesitant future <strong>blogdown</strong> users feel braver about wading in. The community of <strong>blogdown</strong> users is always supportive and helpful, so please do not hesitate to ask questions, offer help, or propose new ideas in <a href="https://community.rstudio.com/tag/blogdown">https://community.rstudio.com/tag/blogdown</a>. 
We&rsquo;d like to take this opportunity to express our heartfelt thanks to <a href="https://drmowinckels.io/blog/2020-05-25-changing-you-blogdown-workflow/">Athanasia Mowinckel</a>, <a href="https://masalmon.eu/2020/02/29/hugo-maintenance/">Maëlle Salmon</a>, <a href="https://clauswilke.com/blog/2020/09/08/a-blogdown-post-for-the-ages/">Claus Wilke</a>, <a href="https://yutani.rbind.io/post/2017-10-25-blogdown-custom/">Hiroaki Yutani</a>, <a href="https://sharleenw.rbind.io/2020/09/02/how-to-remake-a-blogdown-blog-from-scratch/">Sharleen Weatherley</a>, and many others who have shared their <strong>blogdown</strong> experience publicly (we may be slow to respond but we have been all ears). <strong>blogdown</strong> v1.0 wouldn&rsquo;t be possible without this honest feedback.</p><p>Finally, a big thanks to the other 75 contributors who helped with this release by discussing problems, proposing features, and contributing code in the <a href="https://github.com/rstudio/blogdown"><strong>blogdown</strong> repo on Github</a>:</p><p><a href="https://github.com/amssljc">@amssljc</a>, <a href="https://github.com/andremrsantos">@andremrsantos</a>, <a href="https://github.com/andrewdarmond">@andrewdarmond</a>, <a href="https://github.com/andrewheiss">@andrewheiss</a>, <a href="https://github.com/anhhd">@anhhd</a>, <a href="https://github.com/anna-doizy">@anna-doizy</a>, <a href="https://github.com/asimumba">@asimumba</a>, <a href="https://github.com/atusy">@atusy</a>, <a href="https://github.com/b4D8">@b4D8</a>, <a href="https://github.com/bensoltoff">@bensoltoff</a>, <a href="https://github.com/Bijaelo">@Bijaelo</a>, <a href="https://github.com/brettkobo">@brettkobo</a>, <a href="https://github.com/bscott97">@bscott97</a>, <a href="https://github.com/c1au6i0">@c1au6i0</a>, <a href="https://github.com/chrisjake">@chrisjake</a>, <a href="https://github.com/danmrc">@danmrc</a>, <a href="https://github.com/dayabin">@dayabin</a>, <a 
href="https://github.com/defuneste">@defuneste</a>, <a href="https://github.com/DominiqueMakowski">@DominiqueMakowski</a>, <a href="https://github.com/dwiwad">@dwiwad</a>, <a href="https://github.com/ErickChacon">@ErickChacon</a>, <a href="https://github.com/erikriverson">@erikriverson</a>, <a href="https://github.com/eteitelbaum">@eteitelbaum</a>, <a href="https://github.com/f0nzie">@f0nzie</a>, <a href="https://github.com/frodriguezsmartclip">@frodriguezsmartclip</a>, <a href="https://github.com/giabaio">@giabaio</a>, <a href="https://github.com/gustavo-etal">@gustavo-etal</a>, <a href="https://github.com/irtools">@irtools</a>, <a href="https://github.com/jaggaroshu">@jaggaroshu</a>, <a href="https://github.com/Jansonboss">@Jansonboss</a>, <a href="https://github.com/jhuntergit">@jhuntergit</a>, <a href="https://github.com/jimmyday12">@jimmyday12</a>, <a href="https://github.com/jimrothstein">@jimrothstein</a>, <a href="https://github.com/jimvine">@jimvine</a>, <a href="https://github.com/joftius">@joftius</a>, <a href="https://github.com/jooyoungseo">@jooyoungseo</a>, <a href="https://github.com/JuneKay92">@JuneKay92</a>, <a href="https://github.com/kevinushey">@kevinushey</a>, <a href="https://github.com/lazappi">@lazappi</a>, <a href="https://github.com/Lion666">@Lion666</a>, <a href="https://github.com/llrs">@llrs</a>, <a href="https://github.com/luisotavio88">@luisotavio88</a>, <a href="https://github.com/meersel">@meersel</a>, <a href="https://github.com/melvidoni">@melvidoni</a>, <a href="https://github.com/mpaulacaldas">@mpaulacaldas</a>, <a href="https://github.com/mrkaye97">@mrkaye97</a>, <a href="https://github.com/nanxstats">@nanxstats</a>, <a href="https://github.com/nbwosm">@nbwosm</a>, <a href="https://github.com/nickcotter">@nickcotter</a>, <a href="https://github.com/nitingupta2">@nitingupta2</a>, <a href="https://github.com/pablobernabeu">@pablobernabeu</a>, <a href="https://github.com/pedrohbraga">@pedrohbraga</a>, <a 
href="https://github.com/petrbouchal">@petrbouchal</a>, <a href="https://github.com/RaymondBalise">@RaymondBalise</a>, <a href="https://github.com/rjfranssen">@rjfranssen</a>, <a href="https://github.com/Robinlovelace">@Robinlovelace</a>, <a href="https://github.com/rrachael">@rrachael</a>, <a href="https://github.com/ryanstraight">@ryanstraight</a>, <a href="https://github.com/setgree">@setgree</a>, <a href="https://github.com/ShirinG">@ShirinG</a>, <a href="https://github.com/ShixiangWang">@ShixiangWang</a>, <a href="https://github.com/solarchemist">@solarchemist</a>, <a href="https://github.com/SoniaNikiema">@SoniaNikiema</a>, <a href="https://github.com/tcwilkinson">@tcwilkinson</a>, <a href="https://github.com/Temurgugu">@Temurgugu</a>, <a href="https://github.com/thedivtagguy">@thedivtagguy</a>, <a href="https://github.com/TianyiShi2001">@TianyiShi2001</a>, <a href="https://github.com/TrungLeVn">@TrungLeVn</a>, <a href="https://github.com/werkstattcodes">@werkstattcodes</a>, <a href="https://github.com/wudustan">@wudustan</a>, <a href="https://github.com/xiaoa6435">@xiaoa6435</a>, <a href="https://github.com/yangepi">@yangepi</a>, <a href="https://github.com/yimingli">@yimingli</a>, and <a href="https://github.com/yogat3ch">@yogat3ch</a>.</p></description></item><item><title>RStudio: A Single Home for R and Python Data Science</title><link>https://www.rstudio.com/blog/one-home-for-r-and-python/</link><pubDate>Wed, 13 Jan 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/one-home-for-r-and-python/</guid><description><h2 id="why-r-and-python">Why R AND Python?</h2><p>From the very beginning, two key ideas have driven the work we do at RStudio:</p><ul><li><strong>It&rsquo;s better for everyone if the tools used for data science are free and open.</strong> This enhances the production and consumption of knowledge and facilitates collaboration and reproducible research in science, education and industry.</li><li><strong>Coding is the most powerful 
and efficient path to tackle complex, real-world data science challenges.</strong> It gives data scientists superpowers to tackle the hardest problems because code is flexible, reusable, inspectable, and reproducible.</li></ul><p>Some data scientists, and even some organizations, believe they have to pick between R or Python. However, this turns out to be a false choice. In talking to our many customers and others in the data science field, as well as in <a href="https://blog.rstudio.com/2020/10/30/why-rstudio-supports-python/" target="_blank" rel="noopener noreferrer">the surveys we&rsquo;ve done of the data science community</a>, we&rsquo;ve seen that many data science teams today are bilingual, leveraging both R and Python in their work. And while both languages have unique strengths, these teams frequently struggle to use them together.</p><h2 id="common-objections-to-using-r-and-python-together">Common Objections to using R and Python Together</h2><p>We&rsquo;ve heard three common criticisms from data science teams about using R and Python together:</p><ol><li>Data science leaders are often concerned that multilingual teams will have a harder time collaborating and sharing work than a team standardized on one language.</li><li>Individual data scientists may worry that using two languages together will incur a higher cost of project organization and maintenance.</li><li>IT organizations are often concerned that enabling two languages will mean doubling their effort, requiring they maintain, manage, and scale separate environments for R and Python.</li></ol><p>Contrary to these concerns, in talking with many data science teams, we&rsquo;ve found that:</p><ul><li>Modern tooling allows R and Python programmers to seamlessly share and build off of one another. 
Additionally, data science team leads find it easier to hire and recruit talent when they are able to reach into both R and Python communities.</li><li>Many data scientists find that combining R and Python allows them to use each language for their best strengths, and improvements in data science tools like RStudio eliminate additional overhead.</li><li>IT organizations find that common infrastructure and best practices can support both languages, enabling all the benefits without additional cost. One example of this common infrastructure is <a href="https://rstudio.com/products/team/" target="_blank" rel="noopener noreferrer">RStudio Team</a>, a single centralized infrastructure for bilingual teams using R and Python.</li></ul><p>As you can see, many of the potential concerns of using two languages are addressed through better tooling. In line with our ongoing mission to support the open source data science ecosystem, we&rsquo;ve invested heavily in creating the best platform for data science using both R AND Python. This effort includes many features in the products that comprise RStudio Team. 
We have also made significant investments in our open source offerings to make it easier than ever to combine R and Python in a single data science project.</p><h2 id="new-python-features-in-rstudio-products">New Python Features in RStudio products</h2><p>In our open source products, we improved and invested in a number of different features over the past year, including:</p><ul><li><a href="https://rstudio.github.io/reticulate/" target="_blank" rel="noopener noreferrer">Continuing to invest in the reticulate package</a> to make it easy for R users to access Python capabilities.</li><li><a href="https://blog.rstudio.com/2020/09/29/torch/" target="_blank" rel="noopener noreferrer">Providing native access from R to <code>torch</code></a>, one of the most widely used deep learning frameworks.</li><li><a href="https://ursalabs.org/" target="_blank" rel="noopener noreferrer">Investing in Ursa Labs</a> for the development of cross language capabilities.</li><li><a href="https://blog.rstudio.com/2020/10/07/rstudio-v1-4-preview-python-support/" target="_blank" rel="noopener noreferrer">Expanding capabilities for native Python coding in the RStudio IDE</a>, including a Python environment and object explorer.</li></ul><p>In <a href="https://rstudio.com/products/rstudio-server-pro/" target="_blank" rel="noopener noreferrer">RStudio Server Pro</a>, which provides collaboration, centralized management, and security for data science teams developing in R and Python, we&rsquo;ve <a href="https://blog.rstudio.com/2020/11/16/rstudio-1-4-preview-server-pro/" target="_blank" rel="noopener noreferrer">added beta support for the VSCode IDE</a>. This work is in addition to our existing support for Jupyter Notebooks and JupyterLab. 
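</p><p>As a quick illustration of the <code>reticulate</code> interoperability mentioned above, the short sketch below imports a Python module and calls it from R; it assumes a working Python installation that reticulate can discover:</p><pre class="r"><code>library(reticulate)

np &lt;- import(&quot;numpy&quot;)     # load a Python module from R
x &lt;- np$linspace(0, 1, 5)    # call its functions with R syntax

py_run_string(&quot;squares = [i ** 2 for i in range(5)]&quot;)
py$squares                   # objects created in Python are visible from R</code></pre><p>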
These enhancements make RStudio Server Pro a true workbench for open source data science.</p><p><a href="https://rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a> provides a centralized platform where data science teams can operationalize the work they create in R and Python. We&rsquo;ve solved the same challenges for Python users that have made Connect so popular with R users, including:</p><ul><li><a href="https://blog.rstudio.com/2020/01/22/rstudio-connect-1-8-0/#python-support" target="_blank" rel="noopener noreferrer">Publishing enhancements</a> in Connect 1.8.0 that make it easier to share Jupyter Notebooks and mixed R and Python content.</li><li>Support for Dash, Bokeh, and Streamlit, allowing users to share a full suite of Python applications. See the announcements for <a href="https://blog.rstudio.com/2020/07/14/rstudio-connect-1-8-4/" target="_blank" rel="noopener noreferrer">Connect 1.8.4</a> and <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6-python-update/" target="_blank" rel="noopener noreferrer">1.8.6</a> for more details.</li><li>The ability to use Flask to share Python APIs in <a href="https://blog.rstudio.com/2020/04/02/rstudio-connect-1-8-2/" target="_blank" rel="noopener noreferrer">Connect 1.8.2</a>.</li></ul><p>Finally, in <a href="https://rstudio.com/products/package-manager/" target="_blank" rel="noopener noreferrer">RStudio Package Manager</a>, which helps organize, manage, and centralize packages across a team or an entire organization, we recently <a href="https://blog.rstudio.com/2020/12/07/package-manager-1-2-0/" target="_blank" rel="noopener noreferrer">added beta support for PyPI</a>, giving users access to full documentation, automatic syncs, and historic snapshots of Python packages.</p><h2 id="to-learn-more">To Learn More</h2><p>If you&rsquo;d like to learn more about the many ways that RStudio provides a single home for teams using both R and Python, we encourage you to
<a href="https://pages.rstudio.net/RStudio_R_Python.html" target="_blank" rel="noopener noreferrer">register for our upcoming webinar</a> on February 3rd and explore the information at <a href="https://rstudio.com/solutions/r-and-python/" target="_blank" rel="noopener noreferrer">R &amp; Python: A Love Story.</a></p><p>We&rsquo;ve also discussed R &amp; Python in several previous blog posts, including:</p><ul><li><a href="https://blog.rstudio.com/2020/10/30/why-rstudio-supports-python/" target="_blank" rel="noopener noreferrer">Why RStudio supports Python</a>, which reviewed survey data from the data science community about the use of R and Python for data science.</li><li><a href="https://blog.rstudio.com/2020/09/10/dispelling-r-and-python-myths-qanda/" target="_blank" rel="noopener noreferrer">Debunking R and Python Myths</a>, which answered questions from a recent joint webinar with our partner, Lander Analytics.</li><li><a href="https://blog.rstudio.com/2020/08/13/how-to-deliver-maximum-value-using-r-python/" target="_blank" rel="noopener noreferrer">Delivering Maximum value using R and Python</a>, which provided multilingual best practices from Dan Chen of Lander Analytics.</li><li><a href="https://blog.rstudio.com/2020/07/28/practical-interoperability/" target="_blank" rel="noopener noreferrer">Wild-caught R and Python applications</a>, which highlighted several bilingual applications suggested by the data science community.</li><li><a href="https://blog.rstudio.com/2020/11/17/an-interview-with-lou-bajuk/" target="_blank" rel="noopener noreferrer">Why RStudio focuses on code-based data science</a>, which recapped a recent podcast featuring RStudio&rsquo;s Lou Bajuk and the Outcast&rsquo;s Michael Lippis.</li></ul></description></item><item><title>Shiny Server 1.5.16 Update</title><link>https://www.rstudio.com/blog/shiny-server-1-5-16-update/</link><pubDate>Wed, 13 Jan 2021 00:00:00 
+0000</pubDate><guid>https://www.rstudio.com/blog/shiny-server-1-5-16-update/</guid><description><h2 id="important-security-notice">Important Security Notice</h2><p>A vulnerability was discovered in Shiny Server that could allow the download of published application source code directly from the server. This issue affects both Shiny Server Pro and the open source Shiny Server product.</p><p><strong>We recommend upgrading to the new version immediately.</strong> If this is not possible, please contact <a href="mailto:support@rstudio.com">support@rstudio.com</a> who will supply an interim fix that can be applied to the configuration.</p><h3 id="release-notes">Release Notes</h3><p>In addition to the important security patch described above, the following items have been addressed in this release:</p><ul><li>Fixed an issue where a failure in a certain phase of R process launching would result in a broken process being treated as a normal process, and repeatedly used to (unsuccessfully) serve new clients.</li><li>In accordance with the RStudio Platform Support strategy, this release drops support for RedHat/CentOS 6.</li><li>Upgrades Node.js to 12.20.0.</li></ul><p>Review the full <a href="https://support.rstudio.com/hc/en-us/articles/215642837-Shiny-Server-Pro-Release-History">Shiny Server Pro Release Notes</a>.</p><h2 id="upgrade-instructions">Upgrade Instructions</h2><h3 id="shiny-server-pro">Shiny Server Pro</h3><p>To perform an upgrade, download the newer package and install it using your package manager. Existing configuration settings are respected. 
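</p><p>As a rough sketch (the file names below are placeholders for the release you download, not exact versions), the upgrade is typically a single package-manager command:</p><pre><code>$ # Ubuntu
$ sudo gdebi shiny-server-&lt;version&gt;-amd64.deb

$ # RedHat/CentOS
$ sudo yum install --nogpgcheck shiny-server-&lt;version&gt;-x86_64.rpm</code></pre><p>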
Instructions are available for the following operating systems:</p><ul><li><a href="https://rstudio.com/products/shiny/download-commercial/redhat-centos/">RedHat/CentOS</a></li><li><a href="https://rstudio.com/products/shiny/download-commercial/ubuntu/">Ubuntu</a></li><li><a href="https://rstudio.com/products/shiny/download-commercial/suse/">SLES/openSUSE</a></li></ul><p>Please contact our <a href="mailto:support@rstudio.com">Support Team</a> if you encounter any issues with the upgrade process.</p><h3 id="shiny-server-open-source">Shiny Server Open Source</h3><p>To upgrade open source Shiny Server, download the newer package and install it using your package manager. Existing configuration settings are respected. Instructions are available for the following operating systems:</p><ul><li><a href="https://rstudio.com/products/shiny/download-server/redhat-centos/">RedHat/CentOS</a></li><li><a href="https://rstudio.com/products/shiny/download-server/ubuntu/">Ubuntu</a></li><li><a href="https://rstudio.com/products/shiny/download-server/suse/">SLES/openSUSE</a></li></ul></description></item><item><title>X-Sessions at rstudio::global</title><link>https://www.rstudio.com/blog/x-sessions-at-rstudio-global/</link><pubDate>Mon, 11 Jan 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/x-sessions-at-rstudio-global/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@jefferyho" target="_blank" rel="noopener noreferrer">Jeffery Ho</a> on <a href="https://unsplash.com/photos/iR94JY-a14U" target="_blank" rel="noopener noreferrer">Unsplash</a></sup></p><p>When you registered for <a href="https://rstudio.com/conference/" target="_blank" rel="noopener noreferrer">rstudio::global(2021)</a>, you might have noticed a question asking if you were interested in participating in any hands-on sponsored sessions prior to the conference. 
To kick off rstudio::global, some of our sponsors are hosting industry- and profession-focused sessions in the days leading up to our 24-hour marathon on January 21st.</p><p>X-Sessions are 3-hour themed events made up of pre-recorded tool shares, case studies, and live hands-on sessions. The material in each session is organized by industry, and it is a great chance for sponsors to show their unique offerings with RStudio professional products.</p><p>The goals of these sessions are to create smaller, unique experiences tailored to specific use cases where learners can test drive different professional products and learn alongside their peers.</p><p>The three X-Sessions we’re offering this year are:</p><ol><li>Mastering Shiny with Appsilon: from Development to Deployment</li><li>Powering Enterprise Analytics at Scale Using Teradata Vantage &amp; RStudio</li><li>R in Pharma with ProCogia</li></ol><h2 id="mastering-shiny-with-appsilon-from-development-to-deployment">Mastering Shiny with Appsilon: from Development to Deployment</h2><p><strong>Wednesday, January 20th at 9am ET</strong></p><p><a href="https://appsilon.com/" target="_blank" rel="noopener noreferrer">Appsilon</a> has developed some of the world&rsquo;s most advanced Shiny dashboards. That&rsquo;s why Fortune 500 companies routinely approach them to create enterprise Shiny apps. In this hands-on session, Appsilon will cover:</p><ul><li>Best practices for developing Shiny apps</li><li>Styling Shiny with CSS and SASS</li><li>Speeding up Shiny apps with <code>updateInput</code> and JavaScript</li><li>Deploying Shiny apps with RStudio Connect</li><li>Scaling Shiny to hundreds (or thousands) of users</li><li>Doing more with Shiny: <code>shiny.react</code>, <code>shiny.fluent</code>, and beyond</li></ul><p>This Masterclass is intended for all levels.
Whether you are just getting started with Shiny or you&rsquo;re a Shiny wizard, there&rsquo;s something here for you to learn.</p><h2 id="powering-enterprise-analytics-at-scale-using-teradata-vantage--rstudio">Powering Enterprise Analytics at Scale Using Teradata Vantage &amp; RStudio</h2><p><strong>Tuesday, January 19th &amp; Wednesday, January 20th at 4pm ET</strong></p><p>Immerse yourself in a hands-on virtual data science workshop. Find out how <a href="https://www.teradata.com/" target="_blank" rel="noopener noreferrer">Teradata</a> Vantage™ and RStudio let you use your favorite analytic tools, languages, and algorithms to get answers, solve problems, and accelerate outcomes.</p><p>Teradata&rsquo;s analytic and data science experts will guide you through advanced analytic techniques on the Vantage analytics platform. You’ll gain hands-on experience following a clear data science process that addresses a real-world use case covering these objectives:</p><ul><li>Implementing Machine Learning functions in predictive R and Python models</li><li>Deploying predictive models on server clusters using RStudio&rsquo;s Launcher</li><li>Creating native applications using Vantage Analyst and AppCenter to deliver interactive visualizations</li><li>Understanding Teradata&rsquo;s native analytic functions, open-source language support, and ecosystem tools and diversity</li><li>Gaining a fundamental understanding of Vantage implementations to address diverse analytic use cases</li></ul><h2 id="r-in-pharma-with-procogia">R in Pharma with ProCogia</h2><p><strong>Tuesday, January 19th at 12pm ET</strong></p><p>Hosted by <a href="https://www.procogia.com/" target="_blank" rel="noopener noreferrer">ProCogia</a> and members of <a href="https://rinpharma.com/" target="_blank" rel="noopener noreferrer">R/Pharma</a>, this session will be made up of talks from industry experts along with a live hands-on workshop. 
This is a great opportunity to learn about and be inspired by new capabilities for creating compelling analyses with applications in drug development using open source languages. Industry leaders from pharmaceutical organizations will share their experiences and best practices for advancing drug development and clinical trials.</p><p>This short course will provide a hands-on introduction to flexible and powerful tools for statistical analysis, reproducible research, and interactive visualizations. The workshop will include an overview of the Tidyverse for clinical data wrangling, how to build Shiny apps and R Markdown documents, as well as visualizations using HTML Widgets for R.</p><h2 id="how-do-i-sign-up">How do I sign up?</h2><p>Registration for the X-Sessions is available in the sign-up menu when you <a href="https://global.rstudio.com/student/authentication/register" target="_blank" rel="noopener noreferrer">register for rstudio::global</a>. If you’ve already registered for global and would like to add an X-Session to your agenda, you can do so by updating your profile and selecting the session you’re interested in. Participants can join at any point during the 3-hour session; however, we recommend registering in advance so that the team can provide you with the required materials to follow along.</p><p>Unable to participate?
All of the material during the X-Sessions will be recorded and available to view in the Sponsor Gallery during rstudio::global.</p></description></item><item><title>Last Call for the 2020 R Community Survey</title><link>https://www.rstudio.com/blog/last-call-for-the-2020-r-community-survey/</link><pubDate>Thu, 07 Jan 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/last-call-for-the-2020-r-community-survey/</guid><description><p>On December 11, RStudio launched our third annual R Community Survey (formerly known as the Learning R Survey) to better understand how and why people learn and use the R language and associated tools. That survey closes TOMORROW, January 8, 2021. We encourage anyone who is interested in R to respond. The survey should only require 5 to 10 minutes to complete, depending on how little or how much information you choose to share with us. You can find the survey here:</p><ul><li>English version: <a href="https://rstd.io/r-survey-en" target="_blank" rel="noopener noreferrer">https://rstd.io/r-survey-en</a></li><li>Spanish version: <a href="https://rstd.io/r-survey-es" target="_blank" rel="noopener noreferrer">https://rstd.io/r-survey-es</a></li></ul><p>If you don&rsquo;t know R yet or use Python or Julia more than R, that&rsquo;s fine too! The survey has specific questions for you, and your responses will help us better understand how we can be more encouraging to you and others like you.</p><p>Data and analysis from the 2018 and 2019 community surveys can be found on GitHub at <a href="https://github.com/rstudio/r-community-survey" target="_blank" rel="noopener noreferrer">https://github.com/rstudio/r-community-survey</a> in the 2018/ and 2019/ folders.
Results from the 2020 survey will also be posted as free and open source data in that GitHub repo in February 2021.</p><p>Please ask your students, Twitter followers, Ultimate Frisbee team, and anyone else you think may be interested to complete the survey. Your efforts will help RStudio, educators, and users understand and grow our data science community.</p><p>You will find a full disclosure of what information will be collected and how it will be used on the first page of the survey. The survey does not collect personally identifiable information or email addresses, but it does have optional demographic questions.</p><p>Thank you in advance for your consideration and time. We look forward to sharing the results with you next month!</p></description></item><item><title>Custom Google Analytics Dashboards with R: Building The Dashboard</title><link>https://www.rstudio.com/blog/google-analytics-part2/</link><pubDate>Wed, 06 Jan 2021 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/google-analytics-part2/</guid><description><script src="index_files/header-attrs/header-attrs.js"></script><script src="index_files/core-js/shim.min.js"></script><script src="index_files/react/react.min.js"></script><script src="index_files/react/react-dom.min.js"></script><script src="index_files/reactwidget/react-tools.js"></script><script src="index_files/htmlwidgets/htmlwidgets.js"></script><script src="index_files/reactable-binding/reactable.js"></script><sup><p>Photo by <a href="https://unsplash.com/@abiglow?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Alan Biglow</a> on <a href="https://unsplash.com/s/photos/tesla-dashboard?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></p></sup><style type="text/css">img.screenshot { border: 0.5px solid #888; padding: 5px; background-color: #eee;}</style><p>Back in November, I took readers step by step through the somewhat long process of <a 
href="https://blog.rstudio.com/2020/11/27/google-analytics-part1/" target="_blank" rel="noopener noreferrer">authenticating and downloading Google Analytics web site data into R</a>. This post will be much simpler; I’m going to walk you through creating a dashboard showing blog post popularity using the <code>flexdashboard</code> package.</p><p>Before we go there, however, I want to re-emphasize a correction that we made to the original credentials post. Mark Edmondson, the author of the terrific <code>googleAnalyticsR</code> package, has created a new version of his package that eliminates the need for OAuth credentials when running on a server. Once that update is available on CRAN, I’ll update this post to document the simpler process of only submitting service account credentials. In the meantime, though, we’ll continue using both OAuth and service account credentials.</p><div id="where-to-find-the-code-and-data" class="level2"><h3>Where to Find The Code and Data</h3><p>All the code and data presented in this post are in a GitHub repository at <a href="https://github.com/rstudio/a-flexdashboard-for-google-analytics/" target="_blank" rel="noopener noreferrer">https://github.com/rstudio/a-flexdashboard-for-google-analytics</a> in the <em>Part2</em> folder. The code from Part 1 of this blog series is also available in the <em>Part1</em> folder; however, users should be aware that they’ll need to provide their own authentication secrets for that code to work. My previous article, <a href="https://blog.rstudio.com/2020/11/27/google-analytics-part1/" target="_blank" rel="noopener noreferrer">Custom Google Analytics Dashboards with R: Downloading Data</a>, provides detailed instructions for how to obtain those credentials.</p><p>To make it easy for readers to reproduce this dashboard, I’ve constructed a synthetic set of Google Analytics data named <em>clickbait_GA_data.csv</em> for a hypothetical blog at the address <em>clickbait.com</em>.
At the time of this writing, that domain was currently for sale and therefore shouldn’t be confused with any real blog. While the synthetic traffic comes from the Google Analytics log from an actual blog, the titles and URLs of all the articles are made up (although I wish I could find out the <em>3 Ways That Birds Are Confused About Bacon</em>). The dataset contains more than 32,000 visits and 105,000 page views conducted over one month.</p></div><div id="creating-our-dashboard" class="level1"><h2>Creating Our Dashboard</h2><p>So let’s begin building our dashboard. To do this, we’re going to open a new <code>flexdashboard</code> R file. We do that by selecting File &gt; New File &gt; R Markdown…. as shown below.</p><p><img class="screenshot" style="width: 477px;" src="01-menu-rmarkdown.jpg"></p><p>We next select From Template &gt; Flex Dashboard.</p><p><img class="screenshot" src="02-from-template.jpg" style="width: 477px;"></p><p>That selection yields a new file which looks like this:</p><p><img class="screenshot" src="03-flexdashboard-template.jpg" style="width: 298px;"></p><p>If you <em>knit</em> that file, you end up with this output in your Preview window.</p><p><img class="screenshot" src="04-flexdashboard-template-result.jpg" style="width: 468px;"></p><p>The preconfigured template has provided us with window panes in which to put our Google Analytics graphs and tables. 
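</p><p>Behind those screenshots, the generated template is an ordinary R Markdown file; its source looks roughly like this (abbreviated):</p><pre><code>---
title: &quot;Untitled&quot;
output: flexdashboard::flex_dashboard
---

```{r setup, include=FALSE}
library(flexdashboard)
```

Column {data-width=650}
-----------------------------------------------------------------------

### Chart A

```{r}
```

Column {data-width=350}
-----------------------------------------------------------------------

### Chart B

```{r}
```

### Chart C

```{r}
```</code></pre><p>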
We simply have to fill them in!</p><p>Our process for building our Google Analytics (GA) dashboard will go like this:</p><ol style="list-style-type: decimal"><li>Read in the Google Analytics data in the setup chunk of our document.</li><li>Use <code>dplyr</code> and <code>ggplot2</code> to create a graph of pageviews by day for Chart A.</li><li>Build a table of the top 10 most popular titles in Chart B using the <code>reactable</code> package.</li><li>Delete the R Markdown code for Chart C.</li></ol><p>So let’s build this dashboard.</p><div id="reading-in-the-data" class="level2"><h3>Reading in the Data</h3><p>We begin our dashboard by reading in the data from Google Analytics. <a href="https://blog.rstudio.com/2020/11/27/google-analytics-part1/" target="_blank" rel="noopener noreferrer">In our last post</a>, we built code to authenticate and read in the GA data using the Google Analytics API. In a production dashboard, we would put that code in the setup here.</p><p>However, because we have our synthetic data in a .csv file, reading in the data will be a much simpler process. We will simply load the libraries we intend to use, apply the <em>read_csv</em> function from the <code>readr</code> package to our dataset, and put all of this in the <em>setup</em> chunk of our R Markdown file as shown below. 
I’ve shown the first few lines of the output to provide a sense of what that content looks like.</p><pre class="r"><code>library(flexdashboard)
library(readr)
library(ggplot2)
library(dplyr)
library(reactable)

gadata &lt;- read_csv(&quot;./data/clickbait_GA_data.csv&quot;)
show(gadata %&gt;% head(7))</code></pre><pre><code>## # A tibble: 7 x 5
##   date       pageviews users pageTitle            landingPagePath
##   &lt;date&gt;         &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt;                &lt;chr&gt;
## 1 2020-12-01         2     2 3 Ways That Turtles… www.clickbait.com/2011/02/28/…
## 2 2020-12-01         2     1 3 Ways That Turtles… www.clickbait.com/2011/02/28/…
## 3 2020-12-01         3     3 Shocking Finding: W… www.clickbait.com/2012/06/04/…
## 4 2020-12-01         1     1 Unexpected Research… www.clickbait.com/2012/11/29/…
## 5 2020-12-01        11    10 Unexpected Research… www.clickbait.com/2013/06/10/…
## 6 2020-12-01         1     1 3 Ways That Europea… www.clickbait.com/2013/10/22/…
## 7 2020-12-01         2     2 Why Monkeys Deal wi… www.clickbait.com/2014/01/17/…</code></pre></div><div id="plotting-blog-traffic-by-day" class="level2"><h3>Plotting Blog Traffic by Day</h3><p>With the GA data in a tibble, we can use <code>dplyr</code> to group and sum the page views by day and then plot the data over time with <code>ggplot2</code>. This code will go in the R chunk under the heading <em>Chart A</em>.</p><pre class="r"><code>theme_set(theme_minimal())

gadata_by_day &lt;- gadata %&gt;%
  group_by(date) %&gt;%
  summarize(pagesums = sum(pageviews))

g &lt;- ggplot(gadata_by_day, aes(x = date, y = pagesums)) +
  geom_point(color = &quot;blue&quot;) +
  geom_line(color = &quot;blue&quot;) +
  scale_x_date() +
  labs(x = &quot;&quot;, y = &quot;&quot;, title = &quot;&quot;)
show(g)</code></pre><p><img src="unnamed-chunk-1-1.png" width="672" /></p></div><div id="building-a-table-of-the-most-popular-results" class="level2"><h3>Building a Table of the Most Popular Results</h3><p>We’d also like to present a table of the most popular blog posts on our blog.
We could do this with a variety of packages such as <code>kable</code> or <code>DT</code>, but for this example, we’ll use the <code>reactable</code> package. <code>Reactable</code> gives users interactive features such as the ability to search and sort the table. All this is done using client-side JavaScript, which makes the table interactive without requiring server involvement.</p><p>We can compute and display the most popular blog posts by inserting this code into the chunk under <em>Chart B</em>. We added arguments to change the column names, specify the widths of the columns, and permit scrolling, searching, and striping just to make it prettier. Those could have been omitted if we weren’t fussy about the formatting.</p><pre class="r"><code>gadata_most_popular &lt;- gadata %&gt;%
  count(pageTitle, wt = pageviews, sort = TRUE) %&gt;%
  head(10)

## For those who aren&#39;t as comfortable with the options in count, the following
## code would also work
# gadata_most_popular &lt;- gadata %&gt;%
#   group_by(pageTitle) %&gt;%
#   summarize(n = sum(pageviews)) %&gt;%
#   arrange(desc(n))

reactable(gadata_most_popular,
          columns = list(
            pageTitle = colDef(name = &quot;Title&quot;, align = &quot;left&quot;, maxWidth = 250),
            n = colDef(name = &quot;Page Views&quot;, maxWidth = 100)),
          pagination = FALSE,
          searchable = TRUE,
          striped = TRUE)</code></pre><div id="htmlwidget-1" class="reactable html-widget" style="width:auto;height:auto;"></div><script type="application/json" data-for="htmlwidget-1">{"x":{"tag":{"name":"Reactable","attribs":{"data":{"pageTitle":["Amazing Ways That Elephants Embrace Squirrels","Unexpected Research Results: Pandas Don't Comprehend Puppies","3 Ways That Birds Avoid Friends","New Discovery: Americans Experience Friends","3 Ways That Birds Are Confused About Bacon","Unexpected Research Results: Birds Like Birthdays","New Discovery: Monkeys Observe Their Past","13 Ways That Koalas Observe Carbs","Discover How Dogs Can't Get Enough of Carbs","Learn How Cats Embrace 
Kittens"],"n":[16412,8888,6015,4858,3751,2823,2741,2452,2270,2220]},"columns":[{"accessor":"pageTitle","name":"Title","type":"character","maxWidth":250,"align":"left"},{"accessor":"n","name":"Page Views","type":"numeric","maxWidth":100}],"searchable":true,"defaultPageSize":10,"paginationType":"numbers","showPageInfo":true,"minRows":1,"striped":true,"dataKey":"e21fde16a6509bc62084d4fb648b3f06","key":"e21fde16a6509bc62084d4fb648b3f06"},"children":[]},"class":"reactR_markup"},"evals":[],"jsHooks":[]}</script></div><div id="the-final-result" class="level2"><h3>The Final Result</h3><p>Finally, we change the heading of our R Markdown code to have a meaningful title, rename the headings from Chart A and Chart B to something more reasonable, delete the heading and chunk for Chart C, and add some explanatory text about what our dashboard is about. Our finished dashboard R Markdown code should look like the code in <a href="https://github.com/rstudio/a-flexdashboard-for-google-analytics/blob/main/Part2/dashboard1.Rmd" target="_blank" rel="noopener noreferrer">dashboard1.Rmd</a></p><p>When we knit the results, we see this:</p><p><img class="screenshot" src="05-dashboard-final.jpg" style="width: 600px"></p><p>If we have access to an RStudio Connect server, we can publish this dashboard to that server by clicking the <em>Publish</em> button at the top right of the Viewer window. On the RStudio Connect server, we can schedule the dashboard to regularly download and analyze the Google Analytics data and allow others to interact with it. 
We can literally go from a desktop R Markdown document to a dashboard running in production for others to see in just a few clicks.</p></div></div><div id="conclusions" class="level1"><h2>Conclusions</h2><p>This post shows how:</p><ol style="list-style-type: decimal"><li><strong>A little R Markdown code can create a Google Analytics dashboard.</strong> Overall, the process of creating this dashboard is not really any more difficult than creating a report in R Markdown. The <code>flexdashboard</code> framework uses the same headings and code chunk structure as a regular R Markdown document. This means that we don’t have to learn a new language to build our dashboard.</li><li><strong>Flexdashboard allows us to exploit other tools we already know.</strong> The R Markdown template for <code>flexdashboard</code> provides visual containers into which we can drop code that uses other packages that we know such as <code>ggplot2</code>, <code>dplyr</code>, and <code>reactable</code>. Again, we don’t have to learn new and unfamiliar tools to create our dashboard.</li><li><strong>We can publish our dashboard and add new features incrementally.</strong> For organizations with an RStudio Connect server, we can put our dashboard into scheduled production with only a few clicks. Any time we wish to add another insight or plot to our dashboard, we simply change the R Markdown document on our desktop and republish the result.</li></ol><p>However, while we’ve successfully created a simple Google Analytics dashboard, we haven’t tackled the question that kicked off this series of blog posts, namely:</p><blockquote><p>Which of your blog articles received the most views in the first 15 days they were posted?</p></blockquote><p>That’s the question we’ll tackle in part 3 of this series, where we’ll derive the dates of publication for our blog posts and create a dashboard that ranks blog posts on the basis of a 15-day window of visitors. 
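</p><p>As a rough preview of that approach (part 3 may differ in the details), the 15-day window can be sketched with <code>dplyr</code> against the synthetic <code>gadata</code> tibble used above, treating each page&rsquo;s first observed visit as a stand-in for its publication date:</p><pre class="r"><code>first_15_days &lt;- gadata %&gt;%
  group_by(landingPagePath) %&gt;%
  mutate(publish_date = min(date)) %&gt;%    # first observed visit as a proxy
  filter(date &lt;= publish_date + 14) %&gt;%   # keep only the first 15 days
  summarize(views_15d = sum(pageviews)) %&gt;%
  arrange(desc(views_15d))</code></pre><p>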
This approach will ensure that we don’t favor older blog posts that have just had more time to gather views.</p><div id="for-more-information" class="level2"><h3>For More Information</h3><p>If you would like to learn more about some of the packages and products we’ve used, we recommend:</p><ul><li><a href="https://rmarkdown.rstudio.com/flexdashboard/" target="_blank" rel="noopener noreferrer">flexdashboard: Easy interactive dashboards for R</a>, a web site that gives a broad overview of the many capabilities of the <code>flexdashboard</code> package.</li><li><a href="https://rmarkdown.rstudio.com" target="_blank" rel="noopener noreferrer">R Markdown</a>, RStudio’s web site that describes the many ways you can use R Markdown to create reports, slides, web sites, and more.</li><li><a href="https://rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a>, RStudio’s publishing platform for R and Python, which provides push-button publishing from the RStudio IDE, scheduled execution of reports, and a host of other production capabilities.</li></ul></div></div></description></item><item><title>Exploring US COVID-19 Cases and Deaths</title><link>https://www.rstudio.com/blog/exploring-us-covid-19-cases/</link><pubDate>Wed, 23 Dec 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/exploring-us-covid-19-cases/</guid><description><script src="https://www.rstudio.com/rmarkdown-libs/header-attrs/header-attrs.js"></script><style type="text/css">h2 { margin-inline-start: 0; padding: 36px 0 0 0; line-height: 30px; }h3 { margin-inline-start: 0; padding: 24px 0 0 0; }</style><div class="lt-gray-box"><p>In this post, RStudio is pleased to once again feature Arthur Steinmetz, former Chairman, CEO, and President of OppenheimerFunds. Art is an avid amateur data scientist and is active in the R statistical programming language community.</p><p>We’ve reformatted and lightly edited Art’s post for clarity here. 
You can read Art’s original post and other articles he’s written at <a href="https://outsiderdata.netlify.app/post/covid-cases-vs-deaths/" target="_blank" rel="noopener noreferrer">Outsider Data Science</a>. Art has also published his code and data on his <a href="https://github.com/apsteinmetz/covid_cases_vs_deaths" target="_blank" rel="noopener noreferrer">Covid Cases Versus Deaths</a> GitHub repository.</p></div><div id="introduction" class="level1"><h2>Introduction</h2><p>I have a macabre fascination with tracking the course of the COVID-19 pandemic. I suspect there are two reasons for this. First, by delving into the numbers I imagine I have some control over this thing. Second, it feels like lighting a candle to show that science can reveal truth at a time when the darkness of anti-science is creeping across the land.</p><p>The purpose of this project is, as usual, twofold. First, to explore an interesting data science question and, second, to explore some techniques and packages in the R universe. We will be looking at the relationship of COVID-19 cases to mortality. What is the lag between a positive case and a death? How does that vary among states? How has it varied as the pandemic has progressed? This is an interesting project because it combines elements of time series forecasting and dependent variable prediction.</p><p>I have been thinking about how to measure mortality lags for a while now. What prompted me to do a write-up was discovering a new function in Matt Dancho’s <code>timetk</code> package, <code>tk_augment_lags</code>, which makes short work of building multiple lags. Not too long ago, managing models for multiple lags and multiple states would have been a bit messy. The emerging <code>tidymodels</code> framework from RStudio using <em>list columns</em> is immensely powerful for this sort of thing. It’s great to reduce so much analysis into so few lines of code.</p><p>This was an exciting project because I got some validation of my approach.
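</p><p>To make the role of <code>tk_augment_lags</code> concrete, here is a minimal, self-contained sketch on toy data (the data frame and lag choices are for illustration only):</p><pre class="r"><code>library(dplyr)
library(timetk)

# toy daily series standing in for a smoothed case count
toy &lt;- tibble(
  date = seq(as.Date(&quot;2020-03-15&quot;), by = &quot;day&quot;, length.out = 60),
  cases_7day = as.numeric(1:60)
)

# add 20-, 30-, and 40-day lag columns in a single call;
# timetk names them per its {col}_lag{n} convention, e.g. cases_7day_lag20
toy_lagged &lt;- toy %&gt;%
  tk_augment_lags(cases_7day, .lags = c(20, 30, 40))</code></pre><p>Each lag column can then feed its own regression of deaths on lagged cases, the model-per-lag workflow described above.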
I am NOT an epidemiologist or a professional data scientist. None of the results I show here should be considered authoritative. Still, while I was working on this project I saw <a href="https://www.wsj.com/livecoverage/covid-2020-12-02/card/kdSg0ILBfalJHzXvX0bF" target="_blank" rel="noopener noreferrer">this article</a> in the <em>Wall Street Journal</em> which referenced the work by <a href="https://epi.washington.edu/faculty/bedford-trevor" target="_blank" rel="noopener noreferrer">Dr. Trevor Bedford</a>, an epidemiologist at the University of Washington. He took the same approach I did and got about the same result.</p><p>I’ve broken down this analysis into three major parts:</p><ol style="list-style-type: decimal"><li>Understanding the data</li><li>Modeling cases versus deaths</li><li>Validating the model against individual case data</li></ol></div><div id="understanding-the-data" class="level1"><h2>1. Understanding the Data</h2><p>There is no shortage of data to work with. Here we will use the NY Times COVID tracking data set which is updated daily. 
The package <code>covid19nytimes</code> lets us refresh the data on demand.</p><pre class="r"><code>knitr::opts_chunk$set(echo = TRUE, eval = TRUE)

# correlate deaths and cases by state
library(tidyverse)
library(covid19nytimes)
library(timetk)
library(lubridate)
library(broom)
library(knitr)
library(gt)

table_style &lt;- list(cell_text(font = &quot;Arial&quot;, size = &quot;small&quot;))

# source https://github.com/nytimes/covid-19-data.git
us_states_long &lt;- covid19nytimes::refresh_covid19nytimes_states()

# The following filter is to restrict the data to that originally posted at
# https://outsiderdata.netlify.app/post/covid-cases-vs-deaths/
# Should you wish to update the models with the latest data, remove the
# following statement.
us_states_long &lt;- us_states_long %&gt;% filter(date &lt; ymd(&quot;2020-12-06&quot;))

# if link is broken
# load(&quot;../data/us_states_long.rdata&quot;)

# use data from November 15 to stay consistent with text narrative
cutoff_start &lt;- as.Date(&quot;2020-03-15&quot;) # not widespread enough until then
cutoff_end &lt;- max(us_states_long$date) - 7 # discard last week since there are reporting lags

us_states_long &lt;- us_states_long %&gt;% filter(date &gt;= cutoff_start)
us_states_long &lt;- us_states_long %&gt;% filter(date &lt;= cutoff_end)

# Remove tiny territories
territories &lt;- c(&quot;Guam&quot;, &quot;Northern Mariana Islands&quot;)
us_states_long &lt;- us_states_long %&gt;% filter(!(location %in% territories))

save(us_states_long, file = &quot;us_states_long.rdata&quot;)

us_states_long %&gt;%
  head() %&gt;%
  gt() %&gt;%
  tab_options(table.width = &quot;100%&quot;) %&gt;%
  tab_style(style = table_style, locations = cells_body()) %&gt;%
  opt_all_caps()</code></pre><style>html {font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;}#jydtshcvtq .gt_table {display: table;border-collapse: collapse;margin-left: auto;margin-right: auto;color:
#333333;font-size: 16px;font-weight: normal;font-style: normal;background-color: #FFFFFF;width: 100%;border-top-style: solid;border-top-width: 2px;border-top-color: #A8A8A8;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #A8A8A8;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;}#jydtshcvtq .gt_heading {background-color: #FFFFFF;text-align: center;border-bottom-color: #FFFFFF;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#jydtshcvtq .gt_title {color: #333333;font-size: 125%;font-weight: initial;padding-top: 4px;padding-bottom: 4px;border-bottom-color: #FFFFFF;border-bottom-width: 0;}#jydtshcvtq .gt_subtitle {color: #333333;font-size: 85%;font-weight: initial;padding-top: 0;padding-bottom: 4px;border-top-color: #FFFFFF;border-top-width: 0;}#jydtshcvtq .gt_bottom_border {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#jydtshcvtq .gt_col_headings {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#jydtshcvtq .gt_col_heading {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;overflow-x: hidden;}#jydtshcvtq .gt_column_spanner_outer {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;padding-top: 0;padding-bottom: 
0;padding-left: 4px;padding-right: 4px;}#jydtshcvtq .gt_column_spanner_outer:first-child {padding-left: 0;}#jydtshcvtq .gt_column_spanner_outer:last-child {padding-right: 0;}#jydtshcvtq .gt_column_spanner {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;overflow-x: hidden;display: inline-block;width: 100%;}#jydtshcvtq .gt_group_heading {padding: 8px;color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;}#jydtshcvtq .gt_empty_group_heading {padding: 0.5px;color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: middle;}#jydtshcvtq .gt_from_md > :first-child {margin-top: 0;}#jydtshcvtq .gt_from_md > :last-child {margin-bottom: 0;}#jydtshcvtq .gt_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;margin: 10px;border-top-style: solid;border-top-width: 1px;border-top-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;overflow-x: hidden;}#jydtshcvtq .gt_stub {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 12px;}#jydtshcvtq .gt_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 
8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#jydtshcvtq .gt_first_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;}#jydtshcvtq .gt_grand_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#jydtshcvtq .gt_first_grand_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: double;border-top-width: 6px;border-top-color: #D3D3D3;}#jydtshcvtq .gt_striped {background-color: rgba(128, 128, 128, 0.05);}#jydtshcvtq .gt_table_body {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#jydtshcvtq .gt_footnotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#jydtshcvtq .gt_footnote {margin: 0px;font-size: 90%;padding: 4px;}#jydtshcvtq .gt_sourcenotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#jydtshcvtq .gt_sourcenote {font-size: 90%;padding: 4px;}#jydtshcvtq .gt_left {text-align: left;}#jydtshcvtq .gt_center {text-align: center;}#jydtshcvtq .gt_right {text-align: right;font-variant-numeric: tabular-nums;}#jydtshcvtq .gt_font_normal {font-weight: normal;}#jydtshcvtq .gt_font_bold {font-weight: bold;}#jydtshcvtq .gt_font_italic {font-style: italic;}#jydtshcvtq .gt_super {font-size: 65%;}#jydtshcvtq .gt_footnote_marks {font-style: italic;font-size: 
65%;}</style><div id="jydtshcvtq" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">date</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">location</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">location_type</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">location_code</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">location_code_type</th><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">data_type</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">value</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-11-28</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">Alabama</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">state</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">01</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">fips_code</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">cases_total</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">244993</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-11-28</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">Alabama</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">state</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">01</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">fips_code</td><td class="gt_row gt_left" style="font-family: Arial; font-size: 
small;">deaths_total</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">3572</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-11-28</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">Alaska</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">state</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">02</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">fips_code</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">cases_total</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">31279</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-11-28</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">Alaska</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">state</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">02</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">fips_code</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">deaths_total</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">115</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-11-28</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">Arizona</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">state</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">04</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">fips_code</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">cases_total</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">322774</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: 
small;">2020-11-28</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">Arizona</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">state</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">04</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">fips_code</td><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">deaths_total</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">6624</td></tr></tbody></table></div><p>The NY Times data is presented in a “long” format. When we start modeling, long will suit us well but first we have to add features to help us and that will require <code>pivot</code>ing to wide, adding features and then back to long. The daily data is so irregular the first features we will add are 7-day moving averages to smooth the series. We’ll also do a nation-level analysis first so we aggregate the state data as well.</p><pre class="r"><code># Create rolling average changes# pivot wider# this will also be needed when we create lagsus_states &lt;- us_states_long %&gt;%# discard dates before cases were tracked.filter(date &gt; as.Date(&quot;2020-03-01&quot;)) %&gt;%pivot_wider(names_from = &quot;data_type&quot;, values_from = &quot;value&quot;) %&gt;%rename(state = location) %&gt;%select(date, state, cases_total, deaths_total) %&gt;%mutate(state = as_factor(state)) %&gt;%arrange(state, date) %&gt;%group_by(state) %&gt;%# smooth the data with 7 day moving averagemutate(cases_7day = (cases_total - lag(cases_total, 7)) / 7) %&gt;%mutate(deaths_7day = (deaths_total - lag(deaths_total, 7)) / 7)# national analysis# ----------------------------------------------# aggregate state to nationalus &lt;- us_states %&gt;%group_by(date) %&gt;%summarize(across(.cols = where(is.double),.fns = function(x) sum(x, na.rm = T),.names = &quot;{col}&quot;))us[10:20, ] %&gt;%gt() %&gt;%tab_options(table.width = &quot;80%&quot;) 
%&gt;%tab_style(style = table_style,locations = cells_body()) %&gt;%opt_all_caps()</code></pre><style>html {font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;}#gqqexhllba .gt_table {display: table;border-collapse: collapse;margin-left: auto;margin-right: auto;color: #333333;font-size: 16px;font-weight: normal;font-style: normal;background-color: #FFFFFF;width: 80%;border-top-style: solid;border-top-width: 2px;border-top-color: #A8A8A8;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #A8A8A8;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;}#gqqexhllba .gt_heading {background-color: #FFFFFF;text-align: center;border-bottom-color: #FFFFFF;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#gqqexhllba .gt_title {color: #333333;font-size: 125%;font-weight: initial;padding-top: 4px;padding-bottom: 4px;border-bottom-color: #FFFFFF;border-bottom-width: 0;}#gqqexhllba .gt_subtitle {color: #333333;font-size: 85%;font-weight: initial;padding-top: 0;padding-bottom: 4px;border-top-color: #FFFFFF;border-top-width: 0;}#gqqexhllba .gt_bottom_border {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#gqqexhllba .gt_col_headings {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#gqqexhllba .gt_col_heading {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-left-style: none;border-left-width: 1px;border-left-color: 
#D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;overflow-x: hidden;}#gqqexhllba .gt_column_spanner_outer {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;padding-top: 0;padding-bottom: 0;padding-left: 4px;padding-right: 4px;}#gqqexhllba .gt_column_spanner_outer:first-child {padding-left: 0;}#gqqexhllba .gt_column_spanner_outer:last-child {padding-right: 0;}#gqqexhllba .gt_column_spanner {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;overflow-x: hidden;display: inline-block;width: 100%;}#gqqexhllba .gt_group_heading {padding: 8px;color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;}#gqqexhllba .gt_empty_group_heading {padding: 0.5px;color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: middle;}#gqqexhllba .gt_from_md > :first-child {margin-top: 0;}#gqqexhllba .gt_from_md > :last-child {margin-bottom: 0;}#gqqexhllba .gt_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;margin: 10px;border-top-style: solid;border-top-width: 1px;border-top-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: 
#D3D3D3;vertical-align: middle;overflow-x: hidden;}#gqqexhllba .gt_stub {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 12px;}#gqqexhllba .gt_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#gqqexhllba .gt_first_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;}#gqqexhllba .gt_grand_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#gqqexhllba .gt_first_grand_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: double;border-top-width: 6px;border-top-color: #D3D3D3;}#gqqexhllba .gt_striped {background-color: rgba(128, 128, 128, 0.05);}#gqqexhllba .gt_table_body {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#gqqexhllba .gt_footnotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#gqqexhllba .gt_footnote {margin: 0px;font-size: 90%;padding: 4px;}#gqqexhllba .gt_sourcenotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#gqqexhllba .gt_sourcenote {font-size: 90%;padding: 4px;}#gqqexhllba .gt_left {text-align: 
left;}#gqqexhllba .gt_center {text-align: center;}#gqqexhllba .gt_right {text-align: right;font-variant-numeric: tabular-nums;}#gqqexhllba .gt_font_normal {font-weight: normal;}#gqqexhllba .gt_font_bold {font-weight: bold;}#gqqexhllba .gt_font_italic {font-style: italic;}#gqqexhllba .gt_super {font-size: 65%;}#gqqexhllba .gt_footnote_marks {font-style: italic;font-size: 65%;}</style><div id="gqqexhllba" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">date</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">cases_total</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">deaths_total</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">cases_7day</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">deaths_7day</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-24</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">53906</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">784</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">6858</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">95.29</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-25</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">68540</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">1053</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">8600</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">127.29</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: 
small;">2020-03-26</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">85521</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">1352</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">10449</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">162.86</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-27</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">102847</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">1769</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">12121</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">213.14</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-28</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">123907</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">2299</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">14199</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">277.00</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-29</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">142426</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">2717</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">15626</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">322.86</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-30</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">163893</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">3367</td><td class="gt_row gt_right" style="font-family: Arial; font-size: 
small;">17202</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">398.43</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-31</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">188320</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">4302</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">19202</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">502.57</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-04-01</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">215238</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">5321</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">20957</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">609.71</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-04-02</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">244948</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">6537</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">22775</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">740.71</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-04-03</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">277264</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">7927</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">24917</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">879.71</td></tr></tbody></table></div><div id="exploratory-data-analysis" class="level2"><h3>Exploratory Data Analysis</h3><p>We might be tempted to simply plot deaths vs. 
cases, but a scatter plot shows us that this would not be satisfactory. As it turns out, the relationship of cases and deaths is strongly conditioned on date. This reflects the declining mortality rate as we have come to better understand the disease.</p><pre class="r"><code># does a simple scatterplot tell us anything
# about the relationship of deaths to cases? No.
g &lt;- us %&gt;%
  ggplot(aes(deaths_7day, cases_7day)) +
  geom_point() +
  labs(
    title = &quot;Not Useful: Simple Scatterplot of U.S. Cases vs. Deaths&quot;,
    caption = &quot;Source: NY Times, Arthur Steinmetz&quot;
  )
show(g)</code></pre><p><img src="silly-plot-1.png" width="672" /></p><p>We can get much more insight by plotting smoothed deaths and cases over time. It is generally bad form to use two different y axes on a single plot, but this example adds insight.</p><p>A couple of observations are obvious. First, when cases start to rise, deaths follow with a lag. Second, we have had three spikes in cases so far, and in each successive instance the mortality has risen by a smaller amount. This suggests that, thankfully, we are getting better at treating this disease. It is NOT a function of increased testing because <a href="http://91-divoc.com/pages/covid-visualization/?chart=countries&highlight=United%20States&show=highlight-only&y=highlight&scale=linear&data=testPositivity-daily-7&data-source=merged&xaxis=right-12wk#countries">positivity rates</a> have not been falling.</p><pre class="r"><code># visualize the relationship between rolling average of weekly cases and deaths
coeff &lt;- 30
g &lt;- us %&gt;%
  ggplot(aes(date, cases_7day)) +
  geom_line(color = &quot;orange&quot;) +
  theme(legend.position = &quot;none&quot;) +
  geom_line(aes(x = date, y = deaths_7day * coeff), color = &quot;red&quot;) +
  scale_y_continuous(
    labels = scales::comma,
    name = &quot;Cases&quot;,
    sec.axis = sec_axis(~ .
/ coeff,
      name = &quot;Deaths&quot;,
      labels = scales::comma
    )
  ) +
  theme(
    axis.title.y = element_text(color = &quot;orange&quot;, size = 13),
    axis.title.y.right = element_text(color = &quot;red&quot;, size = 13)
  ) +
  labs(
    title = &quot;U.S. Cases vs. Deaths&quot;,
    subtitle = &quot;7-Day Average&quot;,
    caption = &quot;Source: NY Times, Arthur Steinmetz&quot;,
    x = &quot;Date&quot;
  )
show(g)</code></pre><p><img src="unnamed-chunk-1-1.png" width="672" /></p></div></div><div id="modeling-cases-versus-deaths" class="level1"><h2>2. Modeling Cases versus Deaths</h2><p>This illustrates a problem for any modeling we might do. It looks like the more cases surge, the less the impact on deaths. This is NOT a valid conclusion. A simple regression of deaths vs. cases and time shows the passage of time has more explanatory power than cases in predicting deaths so we have to take that into account.</p><pre class="r"><code># passage of time affects deaths more than cases
lm(deaths_7day ~ cases_7day + date, data = us) %&gt;%
  tidy() %&gt;%
  gt() %&gt;%
  tab_options(table.width = &quot;60%&quot;) %&gt;%
  tab_style(
    style = table_style,
    locations = cells_body()
  ) %&gt;%
  opt_all_caps()</code></pre><style>html {font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;}#ohxjnjrdfj .gt_table {display: table;border-collapse: collapse;margin-left: auto;margin-right: auto;color: #333333;font-size: 16px;font-weight: normal;font-style: normal;background-color: #FFFFFF;width: 60%;border-top-style: solid;border-top-width: 2px;border-top-color: #A8A8A8;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #A8A8A8;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;}#ohxjnjrdfj .gt_heading {background-color: #FFFFFF;text-align: center;border-bottom-color: #FFFFFF;border-left-style: none;border-left-width:
1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#ohxjnjrdfj .gt_title {color: #333333;font-size: 125%;font-weight: initial;padding-top: 4px;padding-bottom: 4px;border-bottom-color: #FFFFFF;border-bottom-width: 0;}#ohxjnjrdfj .gt_subtitle {color: #333333;font-size: 85%;font-weight: initial;padding-top: 0;padding-bottom: 4px;border-top-color: #FFFFFF;border-top-width: 0;}#ohxjnjrdfj .gt_bottom_border {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#ohxjnjrdfj .gt_col_headings {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#ohxjnjrdfj .gt_col_heading {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;overflow-x: hidden;}#ohxjnjrdfj .gt_column_spanner_outer {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;padding-top: 0;padding-bottom: 0;padding-left: 4px;padding-right: 4px;}#ohxjnjrdfj .gt_column_spanner_outer:first-child {padding-left: 0;}#ohxjnjrdfj .gt_column_spanner_outer:last-child {padding-right: 0;}#ohxjnjrdfj .gt_column_spanner {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;overflow-x: hidden;display: inline-block;width: 100%;}#ohxjnjrdfj .gt_group_heading {padding: 8px;color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: 
uppercase;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;}#ohxjnjrdfj .gt_empty_group_heading {padding: 0.5px;color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: middle;}#ohxjnjrdfj .gt_from_md > :first-child {margin-top: 0;}#ohxjnjrdfj .gt_from_md > :last-child {margin-bottom: 0;}#ohxjnjrdfj .gt_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;margin: 10px;border-top-style: solid;border-top-width: 1px;border-top-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;overflow-x: hidden;}#ohxjnjrdfj .gt_stub {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 12px;}#ohxjnjrdfj .gt_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#ohxjnjrdfj .gt_first_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;}#ohxjnjrdfj .gt_grand_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#ohxjnjrdfj .gt_first_grand_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 
5px;border-top-style: double;border-top-width: 6px;border-top-color: #D3D3D3;}#ohxjnjrdfj .gt_striped {background-color: rgba(128, 128, 128, 0.05);}#ohxjnjrdfj .gt_table_body {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#ohxjnjrdfj .gt_footnotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#ohxjnjrdfj .gt_footnote {margin: 0px;font-size: 90%;padding: 4px;}#ohxjnjrdfj .gt_sourcenotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#ohxjnjrdfj .gt_sourcenote {font-size: 90%;padding: 4px;}#ohxjnjrdfj .gt_left {text-align: left;}#ohxjnjrdfj .gt_center {text-align: center;}#ohxjnjrdfj .gt_right {text-align: right;font-variant-numeric: tabular-nums;}#ohxjnjrdfj .gt_font_normal {font-weight: normal;}#ohxjnjrdfj .gt_font_bold {font-weight: bold;}#ohxjnjrdfj .gt_font_italic {font-style: italic;}#ohxjnjrdfj .gt_super {font-size: 65%;}#ohxjnjrdfj .gt_footnote_marks {font-style: italic;font-size: 65%;}</style><div id="ohxjnjrdfj" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">term</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">estimate</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">std.error</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" 
colspan="1">statistic</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">p.value</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">(Intercept)</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">7.654e+04</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">1.005e+04</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">7.613</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">5.153e-13</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">cases_7day</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">8.279e-03</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">1.118e-03</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">7.408</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">1.859e-12</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">date</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">-4.113e+00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">5.467e-01</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">-7.523</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">9.109e-13</td></tr></tbody></table></div><p>We’ll approach this by running regression models of deaths against varying lags (actually leads) of cases. We chose to lead deaths, as opposed to lagging cases, because it will allow us to make predictions about the future of deaths given cases today. We include the date as a variable as well. Once we’ve run regressions against each lead period, we’ll choose the lead period that has the best fit (highest R-squared) to the data.</p><p>This requires a lot of leads and a lot of models.
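The approach just described — regress deaths against cases at each candidate lead time, then keep the lead with the best fit — can be sketched in any language. Below is a minimal Python/pandas illustration on synthetic data (the post itself uses R with timetk and tidymodels; every name and number here is hypothetical, not from the post):

```python
# Minimal sketch of lead selection: build deaths led 0..40 days,
# regress each on today's cases, keep the lead with the highest R-squared.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n, true_lead = 200, 14
cases = np.cumsum(rng.normal(5, 2, n))  # smooth, rising synthetic case curve
df = pd.DataFrame({"cases": cases})
# deaths follow cases with a fixed 14-day lag plus a little noise
df["deaths"] = 0.02 * df["cases"].shift(true_lead) + rng.normal(0, 0.01, n)

def r_squared(x: pd.Series, y: pd.Series) -> float:
    """R-squared of a simple OLS regression of y on x (NaNs dropped)."""
    m = x.notna() & y.notna()
    X = np.column_stack([np.ones(m.sum()), x[m]])
    beta, *_ = np.linalg.lstsq(X, y[m], rcond=None)
    resid = y[m] - X @ beta
    return 1 - (resid ** 2).sum() / ((y[m] - y[m].mean()) ** 2).sum()

# shift(-k) turns a lag into a lead, mirroring the post's sign flip
fits = {k: r_squared(df["cases"], df["deaths"].shift(-k)) for k in range(41)}
best_lead = max(fits, key=fits.get)
print(best_lead)
```

Because the noise is small relative to the misalignment error, the best-fit lead recovers the 14-day lag baked into the synthetic data.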
Fortunately, R provides the tools to make this work simple and well organized. First, we add new columns for each lead period using <code>timetk::tk_augment_lags</code>. This one function call does all the work, but it only creates lags, so we have to adapt it a bit to get leads.</p><p>I chose to add forty days of leads. I don’t really think that long a lead is realistic and, given the pandemic has been around only nine months, there aren’t as many data points forty days ahead. Still, I want to see the behavior of the models. Once we have created the leads, we remove any dates for which we don’t have led deaths.</p><p>Here are the first ten rows, showing the first two lead columns:</p><pre class="r"><code># create columns for deaths led 0 to 40 days ahead
max_lead &lt;- 40
us_lags &lt;- us %&gt;%
  # create lags by day
  tk_augment_lags(deaths_7day, .lags = 0:-max_lead, .names = &quot;auto&quot;)
# fix names to remove minus sign
names(us_lags) &lt;- names(us_lags) %&gt;% str_replace_all(&quot;lag-|lag&quot;, &quot;lead&quot;)
# use only case dates where we have complete future knowledge of deaths for all lead times.
us_lags &lt;- us_lags %&gt;% filter(date &lt; cutoff_end - max_lead)

us_lags[1:10, 1:7] %&gt;%
  gt() %&gt;%
  tab_options(table.width = &quot;100%&quot;) %&gt;%
  tab_style(
    style = table_style,
    locations = cells_body()
  ) %&gt;%
  opt_all_caps()</code></pre><style>html {font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;}#rmldchqezd .gt_table {display: table;border-collapse: collapse;margin-left: auto;margin-right: auto;color: #333333;font-size: 16px;font-weight: normal;font-style: normal;background-color: #FFFFFF;width: 100%;border-top-style: solid;border-top-width: 2px;border-top-color: #A8A8A8;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #A8A8A8;border-left-style: none;border-left-width:
2px;border-left-color: #D3D3D3;}#rmldchqezd .gt_heading {background-color: #FFFFFF;text-align: center;border-bottom-color: #FFFFFF;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#rmldchqezd .gt_title {color: #333333;font-size: 125%;font-weight: initial;padding-top: 4px;padding-bottom: 4px;border-bottom-color: #FFFFFF;border-bottom-width: 0;}#rmldchqezd .gt_subtitle {color: #333333;font-size: 85%;font-weight: initial;padding-top: 0;padding-bottom: 4px;border-top-color: #FFFFFF;border-top-width: 0;}#rmldchqezd .gt_bottom_border {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#rmldchqezd .gt_col_headings {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#rmldchqezd .gt_col_heading {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;overflow-x: hidden;}#rmldchqezd .gt_column_spanner_outer {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;padding-top: 0;padding-bottom: 0;padding-left: 4px;padding-right: 4px;}#rmldchqezd .gt_column_spanner_outer:first-child {padding-left: 0;}#rmldchqezd .gt_column_spanner_outer:last-child {padding-right: 0;}#rmldchqezd .gt_column_spanner {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;overflow-x: hidden;display: 
inline-block;width: 100%;}#rmldchqezd .gt_group_heading {padding: 8px;color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;}#rmldchqezd .gt_empty_group_heading {padding: 0.5px;color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: middle;}#rmldchqezd .gt_from_md > :first-child {margin-top: 0;}#rmldchqezd .gt_from_md > :last-child {margin-bottom: 0;}#rmldchqezd .gt_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;margin: 10px;border-top-style: solid;border-top-width: 1px;border-top-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;overflow-x: hidden;}#rmldchqezd .gt_stub {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 12px;}#rmldchqezd .gt_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#rmldchqezd .gt_first_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;}#rmldchqezd .gt_grand_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 
8px;padding-left: 5px;padding-right: 5px;}#rmldchqezd .gt_first_grand_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: double;border-top-width: 6px;border-top-color: #D3D3D3;}#rmldchqezd .gt_striped {background-color: rgba(128, 128, 128, 0.05);}#rmldchqezd .gt_table_body {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#rmldchqezd .gt_footnotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#rmldchqezd .gt_footnote {margin: 0px;font-size: 90%;padding: 4px;}#rmldchqezd .gt_sourcenotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#rmldchqezd .gt_sourcenote {font-size: 90%;padding: 4px;}#rmldchqezd .gt_left {text-align: left;}#rmldchqezd .gt_center {text-align: center;}#rmldchqezd .gt_right {text-align: right;font-variant-numeric: tabular-nums;}#rmldchqezd .gt_font_normal {font-weight: normal;}#rmldchqezd .gt_font_bold {font-weight: bold;}#rmldchqezd .gt_font_italic {font-style: italic;}#rmldchqezd .gt_super {font-size: 65%;}#rmldchqezd .gt_footnote_marks {font-style: italic;font-size: 65%;}</style><div id="rmldchqezd" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">date</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">cases_total</th><th class="gt_col_heading 
gt_columns_bottom_border gt_right" rowspan="1" colspan="1">deaths_total</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">cases_7day</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">deaths_7day</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">deaths_7day_lead0</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">deaths_7day_lead1</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-15</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">3597</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">68</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-16</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">4504</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">91</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-17</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">5903</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">117</td><td class="gt_row gt_right" 
style="font-family: Arial; font-size: small;">0</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-18</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">8342</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">162</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-19</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">12381</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">212</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-20</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">17998</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">277</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td 
class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-21</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">24513</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">360</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.00</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">55.57</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-22</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">33046</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">457</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">4205</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">55.57</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">55.57</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">69.57</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-23</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">43476</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">578</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">5565</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">69.57</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">69.57</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">95.29</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">2020-03-24</td><td class="gt_row gt_right" 
style="font-family: Arial; font-size: small;">53906</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">784</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">6858</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">95.29</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">95.29</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">127.29</td></tr></tbody></table></div><p>Now we start the job of actually building the linear models, and here we see the real power of the <code>tidymodels</code> framework. Since we have our lead days in columns, we revert to long-form data. For each date we have a case count and 40 lead days with the corresponding death count. As will be seen below, the decline in the fatality rate has been non-linear, so we use a second-order polynomial of the <code>date</code> variable in the regression.</p><p>Our workflow looks like this:</p><ol style="list-style-type: decimal"><li>Create the leads using <code>tk_augment_lags</code> (above).</li><li><code>pivot</code> to long form.</li><li><code>nest</code> the data by lead day and state.</li><li><code>map</code> the data set for each lead day to a regression model.</li><li>Pull out the adjusted R-squared using <code>glance</code> for each model to determine the best-fit lead time.</li></ol><p>The result is a data frame with our lead times and, for each lead time, the nested raw data, the model, and its R-squared.</p><pre class="r"><code># make long form to nest
# initialize models data frame
models &lt;- us_lags %&gt;%
  ungroup() %&gt;%
  pivot_longer(
    cols = contains(&quot;lead&quot;),
    names_to = &quot;lead&quot;,
    values_to = &quot;led_deaths&quot;
  ) %&gt;%
  select(date, cases_7day, lead, led_deaths) %&gt;%
  mutate(lead = as.numeric(str_remove(lead, &quot;deaths_7day_lead&quot;))) %&gt;%
  nest(data = c(date, cases_7day, led_deaths)) %&gt;%
  # Run a regression on lagged cases and date vs deaths
  mutate(model =
map(data, function(df) {
    lm(led_deaths ~ cases_7day + poly(date, 2), data = df)
  }))
# Add regression coefficient
# get adjusted r squared
models &lt;- models %&gt;%
  mutate(adj_r = map(model, function(x) {
    glance(x) %&gt;%
      pull(adj.r.squared)
  }) %&gt;% unlist())
print(models)</code></pre><pre><code>## # A tibble: 41 x 4
##     lead data               model  adj_r
##    &lt;dbl&gt; &lt;list&gt;             &lt;list&gt; &lt;dbl&gt;
##  1     0 &lt;tibble [218 × 3]&gt; &lt;lm&gt;   0.164
##  2     1 &lt;tibble [218 × 3]&gt; &lt;lm&gt;   0.187
##  3     2 &lt;tibble [218 × 3]&gt; &lt;lm&gt;   0.212
##  4     3 &lt;tibble [218 × 3]&gt; &lt;lm&gt;   0.241
##  5     4 &lt;tibble [218 × 3]&gt; &lt;lm&gt;   0.272
##  6     5 &lt;tibble [218 × 3]&gt; &lt;lm&gt;   0.307
##  7     6 &lt;tibble [218 × 3]&gt; &lt;lm&gt;   0.343
##  8     7 &lt;tibble [218 × 3]&gt; &lt;lm&gt;   0.383
##  9     8 &lt;tibble [218 × 3]&gt; &lt;lm&gt;   0.424
## 10     9 &lt;tibble [218 × 3]&gt; &lt;lm&gt;   0.467
## # … with 31 more rows</code></pre><p>To decide the best-fit lead time we choose the model with the highest R-squared.</p><pre class="r"><code># Show model fit by lead time
# make predictions using best model
best_fit &lt;- models %&gt;%
  summarize(adj_r = max(adj_r)) %&gt;%
  left_join(models, by = &quot;adj_r&quot;)

g &lt;- models %&gt;%
  ggplot(aes(lead, adj_r)) +
  geom_line() +
  labs(
    subtitle = paste(&quot;Best fit lead =&quot;, best_fit$lead, &quot;days&quot;),
    title = &quot;Model Fit By Lag Days&quot;,
    x = &quot;Lead Time in Days for Deaths&quot;,
    caption = &quot;Source: NY Times, Arthur Steinmetz&quot;,
    y = &quot;Adjusted R-squared&quot;
  )
show(g)</code></pre><p><img src="unnamed-chunk-5-1.png" width="672" /></p><p>We can have some confidence that we are not overfitting the <code>date</code> variable because the significance of the case count remains.
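A quick aside on the fit measure: the adjusted R-squared that <code>glance</code> pulls out penalizes each added regressor, which is what makes it a fairer yardstick than plain R-squared when a polynomial term is in the model. A generic sketch of the textbook formula (not code from the post; the example inputs are illustrative only):

```python
# Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1),
# where n is the number of observations and p the number of regressors.
def adjusted_r_squared(r2: float, n_obs: int, n_predictors: int) -> float:
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Illustration with this post's sample size (218 rows per nested tibble) and
# 3 regressors (cases_7day plus the two columns of poly(date, 2)); the 0.467
# input is just an example value, and the penalty versus plain R^2 is small.
adj = adjusted_r_squared(0.467, 218, 3)
```

With n large relative to p, the adjustment barely moves the number, which is why the lead-time ranking above would look much the same under either measure.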
With a high enough degree polynomial on the <code>date</code> variable, cases would vanish in importance.</p><pre class="r"><code>best_fit$model[[1]] %&gt;%
  tidy() %&gt;%
  gt() %&gt;%
  tab_options(table.width = &quot;80%&quot;) %&gt;%
  tab_style(
    style = table_style,
    locations = cells_body()
  ) %&gt;%
  opt_all_caps()</code></pre><style>html {font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;}#fypngpikkj .gt_table {display: table;border-collapse: collapse;margin-left: auto;margin-right: auto;color: #333333;font-size: 16px;font-weight: normal;font-style: normal;background-color: #FFFFFF;width: 80%;border-top-style: solid;border-top-width: 2px;border-top-color: #A8A8A8;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #A8A8A8;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;}#fypngpikkj .gt_heading {background-color: #FFFFFF;text-align: center;border-bottom-color: #FFFFFF;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#fypngpikkj .gt_title {color: #333333;font-size: 125%;font-weight: initial;padding-top: 4px;padding-bottom: 4px;border-bottom-color: #FFFFFF;border-bottom-width: 0;}#fypngpikkj .gt_subtitle {color: #333333;font-size: 85%;font-weight: initial;padding-top: 0;padding-bottom: 4px;border-top-color: #FFFFFF;border-top-width: 0;}#fypngpikkj .gt_bottom_border {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#fypngpikkj .gt_col_headings {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width:
1px;border-right-color: #D3D3D3;}#fypngpikkj .gt_col_heading {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;overflow-x: hidden;}#fypngpikkj .gt_column_spanner_outer {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;padding-top: 0;padding-bottom: 0;padding-left: 4px;padding-right: 4px;}#fypngpikkj .gt_column_spanner_outer:first-child {padding-left: 0;}#fypngpikkj .gt_column_spanner_outer:last-child {padding-right: 0;}#fypngpikkj .gt_column_spanner {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;overflow-x: hidden;display: inline-block;width: 100%;}#fypngpikkj .gt_group_heading {padding: 8px;color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;}#fypngpikkj .gt_empty_group_heading {padding: 0.5px;color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: middle;}#fypngpikkj .gt_from_md > :first-child {margin-top: 0;}#fypngpikkj .gt_from_md > :last-child {margin-bottom: 0;}#fypngpikkj .gt_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;margin: 
10px;border-top-style: solid;border-top-width: 1px;border-top-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;overflow-x: hidden;}#fypngpikkj .gt_stub {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 12px;}#fypngpikkj .gt_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#fypngpikkj .gt_first_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;}#fypngpikkj .gt_grand_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#fypngpikkj .gt_first_grand_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: double;border-top-width: 6px;border-top-color: #D3D3D3;}#fypngpikkj .gt_striped {background-color: rgba(128, 128, 128, 0.05);}#fypngpikkj .gt_table_body {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#fypngpikkj .gt_footnotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#fypngpikkj .gt_footnote {margin: 0px;font-size: 90%;padding: 4px;}#fypngpikkj .gt_sourcenotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: 
none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#fypngpikkj .gt_sourcenote {font-size: 90%;padding: 4px;}#fypngpikkj .gt_left {text-align: left;}#fypngpikkj .gt_center {text-align: center;}#fypngpikkj .gt_right {text-align: right;font-variant-numeric: tabular-nums;}#fypngpikkj .gt_font_normal {font-weight: normal;}#fypngpikkj .gt_font_bold {font-weight: bold;}#fypngpikkj .gt_font_italic {font-style: italic;}#fypngpikkj .gt_super {font-size: 65%;}#fypngpikkj .gt_footnote_marks {font-style: italic;font-size: 65%;}</style><div id="fypngpikkj" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_left" rowspan="1" colspan="1">term</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">estimate</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">std.error</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">statistic</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">p.value</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">(Intercept)</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">4.363e+02</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">3.799e+01</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">11.49</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">4.207e-24</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">cases_7day</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">1.669e-02</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">9.925e-04</td><td 
class="gt_row gt_right" style="font-family: Arial; font-size: small;">16.81</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">5.448e-41</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">poly(date, 2)1</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">-7.306e+03</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">2.270e+02</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">-32.18</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">5.869e-84</td></tr><tr><td class="gt_row gt_left" style="font-family: Arial; font-size: small;">poly(date, 2)2</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">4.511e+03</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">1.674e+02</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">26.95</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">1.016e-70</td></tr></tbody></table></div><div id="making-predictions" class="level2"><h3>Making Predictions</h3><p>The best-fit lead time is 23 days but let’s use <code>predict</code> to see how well our model fits to the actual deaths.</p><pre class="r"><code># ------------------------------------------# see how well our model predicts# Function to create prediction plotshow_predictions &lt;- function(single_model, n.ahead) {predicted_deaths &lt;- predict(single_model$model[[1]], newdata = us)date &lt;- seq.Date(from = min(us$date) + n.ahead, to = max(us$date) + n.ahead, by = 1)display &lt;- full_join(us, tibble(date, predicted_deaths))gg &lt;- display %&gt;%pivot_longer(cols = where(is.numeric)) %&gt;%filter(name %in% c(&quot;deaths_7day&quot;, &quot;predicted_deaths&quot;)) %&gt;%ggplot(aes(date, value, color = name)) +geom_line() +labs(title = &quot;Actual vs. 
Predicted Deaths&quot;,x = &quot;Date&quot;,y = &quot;Count&quot;,caption = &quot;Source: NY Times, Arthur Steinmetz&quot;)gg}show_predictions(best_fit, best_fit$lead)</code></pre><p><img src="unnamed-chunk-7-1.png" width="672" /></p><p>This is a satisfying result, but sadly shows deaths about to spike. This is despite accounting for the improvements in treatment outcomes we’ve accomplished over the past several months. The 23-day lead time model shows a 1.7% mortality rate over the whole length of observations but conditioned on deaths falling steadily over time.</p></div><div id="understanding-the-declining-mortality-rate" class="level2"><h3>Understanding the Declining Mortality Rate</h3><p>Once we’ve settled on the appropriate lag time, we can look at the fatality rate per identified case. This is but one possible measure of fatality rate, certainly not THE fatality rate. Testing rate, positivity rate and others variables will affect this measure. We also assume our best-fit lag is stable over time so take the result with a grain of salt. The takeaway should be how it is declining, not exactly what it is.</p><p>Early on, only people who were very sick or met strict criteria were tested so, of course, fatality rates (on this metric) were much, much higher. 
To minimize this, we start our measure in the middle of April.</p><p>Sadly, we see that fatality rates are creeping up again.</p><pre class="r"><code>fatality &lt;- best_fit$data[[1]] %&gt;%
  filter(cases_7day &gt; 0) %&gt;%
  filter(date &gt; as.Date(&quot;2020-04-15&quot;)) %&gt;%
  mutate(rate = led_deaths / cases_7day)
g &lt;- fatality %&gt;% ggplot(aes(date, rate)) +
  geom_line() +
  geom_smooth() +
  labs(x = &quot;Date&quot;, y = &quot;Fatality Rate&quot;,
       title = &quot;Fatality Rates are Creeping Up&quot;,
       subtitle = &quot;Fatality Rate as a Percentage of Lagged Cases&quot;,
       caption = &quot;Source: NY Times, Arthur Steinmetz&quot;) +
  scale_y_continuous(labels = scales::percent)
show(g)</code></pre><p><img src="unnamed-chunk-8-1.png" width="672" /></p></div><div id="state-level-analysis" class="level2"><h3>State-Level Analysis</h3><p>One problem with the national model is that each state saw the virus arrive at a different time, which suggests there might also be different relationships between cases and deaths. Looking at a few selected states illustrates this.</p><pre class="r"><code># ------------------------------------------
# state-by-state analysis
state_subset &lt;- c(&quot;New York&quot;, &quot;Texas&quot;, &quot;California&quot;, &quot;Ohio&quot;)
# illustrate selected states
g &lt;- us_states %&gt;%
  filter(state %in% state_subset) %&gt;%
  ggplot(aes(date, cases_7day)) +
  geom_line(color = &quot;orange&quot;) +
  facet_wrap(~state, scales = &quot;free&quot;) +
  theme(legend.position = &quot;none&quot;) +
  geom_line(aes(y = deaths_7day * coeff), color = &quot;red&quot;) +
  scale_y_continuous(labels = scales::comma,
                     name = &quot;Cases&quot;,
                     sec.axis = sec_axis(deaths_7day ~ . / coeff,
                                         name = &quot;Deaths&quot;,
                                         labels = scales::comma)) +
  theme(axis.title.y = element_text(color = &quot;orange&quot;, size = 13),
        axis.title.y.right = element_text(color = &quot;red&quot;, size = 13)) +
  labs(title = &quot;U.S. Cases vs. Deaths&quot;,
       subtitle = &quot;7-Day Average&quot;,
       caption = &quot;Source: NY Times, Arthur Steinmetz&quot;,
       x = &quot;Date&quot;)
show(g)</code></pre><p><img src="unnamed-chunk-9-1.png" width="576" /></p><p>In particular, we note New York, where the virus arrived early and circulated undetected for weeks. Testing was rare, and we did not know much about the course of the disease, so the death toll was much worse. Tests were often not conducted until the disease was in an advanced stage, so we would expect the lag to be shorter.</p><p>In Texas, the virus arrived later. There it looks like the consequences of the first wave were less dire and the lag was longer.</p></div><div id="running-state-by-state-models" class="level2"><h3>Running State-by-State Models</h3><p>Now we can run the same workflow we used above over the state-by-state data. Our data set is much larger because we have a full set of lags for each state, but building our data frame of list columns is just as easy.</p><p>Looking at the lags by state shows similar results to the national model, on average, as we assume, but the dispersion is large.
Early in the pandemic, in New York, cases were diagnosed only for people who were already sick, so the lead time before death was much shorter.</p><pre class="r"><code># create lags
us_states_lags &lt;- us_states %&gt;%
  # create lags by day
  tk_augment_lags(deaths_7day, .lags = -max_lead:0, .names = &quot;auto&quot;)
# fix names to remove minus sign
names(us_states_lags) &lt;- names(us_states_lags) %&gt;% str_replace_all(&quot;lag-&quot;, &quot;lead&quot;)
# make long form to nest
# initialize models data frame
models_st &lt;- us_states_lags %&gt;%
  ungroup() %&gt;%
  pivot_longer(cols = contains(&quot;lead&quot;),
               names_to = &quot;lead&quot;,
               values_to = &quot;led_deaths&quot;) %&gt;%
  select(state, date, cases_7day, lead, led_deaths) %&gt;%
  mutate(lead = as.numeric(str_remove(lead, &quot;deaths_7day_lead&quot;)))
# make separate tibbles for each regression
models_st &lt;- models_st %&gt;%
  nest(data = c(date, cases_7day, led_deaths)) %&gt;%
  arrange(lead)
# run a linear regression on lagged cases and date vs. deaths
models_st &lt;- models_st %&gt;%
  mutate(model = map(data,
                     function(df) {
                       lm(led_deaths ~ cases_7day + poly(date, 2), data = df)
                     }))
# get adjusted r-squared
models_st &lt;- models_st %&gt;%
  mutate(adj_r = map(model, function(x) {
    glance(x) %&gt;%
      pull(adj.r.squared)
  }) %&gt;% unlist())
g &lt;- models_st %&gt;%
  filter(state %in% state_subset) %&gt;%
  ggplot(aes(lead, adj_r)) +
  geom_line() +
  facet_wrap(~state) +
  labs(title = &quot;Best Fit Lead Time&quot;,
       caption = &quot;Source: NY Times, Arthur Steinmetz&quot;)
show(g)</code></pre><p><img src="unnamed-chunk-10-1.png" width="576" /></p><p>To see how the fit looks for the data set as a whole, we look at a histogram of all the state R-squareds.
We see that many of the state models have worse accuracy than the national model.</p><pre class="r"><code># best-fit lag by state
best_fit_st &lt;- models_st %&gt;%
  group_by(state) %&gt;%
  summarize(adj_r = max(adj_r)) %&gt;%
  left_join(models_st)
g &lt;- best_fit_st %&gt;% ggplot(aes(adj_r)) +
  geom_histogram(bins = 10, color = &quot;white&quot;) +
  geom_vline(xintercept = best_fit$adj_r[[1]], color = &quot;red&quot;) +
  annotate(geom = &quot;text&quot;, x = 0.75, y = 18, label = &quot;Adj-R in National Model&quot;) +
  labs(y = &quot;State Count&quot;,
       x = &quot;Adjusted R-Squared&quot;,
       title = &quot;Goodness of Fit of State Models&quot;,
       caption = &quot;Source: NY Times, Arthur Steinmetz&quot;)
show(g)</code></pre><p><img src="unnamed-chunk-11-1.png" width="672" /></p><p>There are vast differences in the best-fit lead times across the states, but the distribution is in agreement with our national model.</p><pre class="r"><code>g &lt;- best_fit_st %&gt;% ggplot(aes(lead)) +
  geom_histogram(binwidth = 5, color = &quot;white&quot;) +
  scale_y_continuous(labels = scales::label_number(accuracy = 1)) +
  geom_vline(xintercept = best_fit$lead[[1]], color = &quot;red&quot;) +
  annotate(geom = &quot;text&quot;, x = best_fit$lead[[1]] + 7, y = 10,
           label = &quot;Lead in National Model&quot;) +
  labs(y = &quot;State Count&quot;,
       x = &quot;Best Fit Model Days from Case to Death&quot;,
       title = &quot;COVID-19 Lag Time From Cases to Death&quot;,
       caption = &quot;Source: NY Times, Arthur Steinmetz&quot;)
show(g)</code></pre><p><img src="unnamed-chunk-12-1.png" width="672" /></p></div></div><div id="validating-the-model-against-individual-case-data" class="level1"><h2>3. Validating the Model against Individual Case Data</h2><p>This whole exercise has involved proxying deaths by the timing and quantity of positive tests. Ideally, we should look at longitudinal data that follows each individual. The state of Ohio provides that, so we’ll look at just this one state to provide a reality check on the foregoing analysis.
In our proxy model, Ohio shows a best-fit lead time of 31 days, which is much longer than in our national-level model.</p><pre class="r"><code># ----------------------------------------------------
best_fit_st %&gt;%
  select(-data, -model) %&gt;%
  filter(state == &quot;Ohio&quot;) %&gt;%
  gt() %&gt;%
  tab_options(table.width = &quot;50%&quot;) %&gt;%
  tab_style(style = table_style,
            locations = cells_body()) %&gt;%
  opt_all_caps()</code></pre><style>html {font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;}#andyjdqprw .gt_table {display: table;border-collapse: collapse;margin-left: auto;margin-right: auto;color: #333333;font-size: 16px;font-weight: normal;font-style: normal;background-color: #FFFFFF;width: 50%;border-top-style: solid;border-top-width: 2px;border-top-color: #A8A8A8;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #A8A8A8;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;}#andyjdqprw .gt_heading {background-color: #FFFFFF;text-align: center;border-bottom-color: #FFFFFF;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#andyjdqprw .gt_title {color: #333333;font-size: 125%;font-weight: initial;padding-top: 4px;padding-bottom: 4px;border-bottom-color: #FFFFFF;border-bottom-width: 0;}#andyjdqprw .gt_subtitle {color: #333333;font-size: 85%;font-weight: initial;padding-top: 0;padding-bottom: 4px;border-top-color: #FFFFFF;border-top-width: 0;}#andyjdqprw .gt_bottom_border {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#andyjdqprw .gt_col_headings {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: 
#D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;}#andyjdqprw .gt_col_heading {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;padding-left: 5px;padding-right: 5px;overflow-x: hidden;}#andyjdqprw .gt_column_spanner_outer {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;padding-top: 0;padding-bottom: 0;padding-left: 4px;padding-right: 4px;}#andyjdqprw .gt_column_spanner_outer:first-child {padding-left: 0;}#andyjdqprw .gt_column_spanner_outer:last-child {padding-right: 0;}#andyjdqprw .gt_column_spanner {border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: bottom;padding-top: 5px;padding-bottom: 6px;overflow-x: hidden;display: inline-block;width: 100%;}#andyjdqprw .gt_group_heading {padding: 8px;color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;}#andyjdqprw .gt_empty_group_heading {padding: 0.5px;color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;vertical-align: middle;}#andyjdqprw .gt_from_md > :first-child {margin-top: 0;}#andyjdqprw .gt_from_md > 
:last-child {margin-bottom: 0;}#andyjdqprw .gt_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;margin: 10px;border-top-style: solid;border-top-width: 1px;border-top-color: #D3D3D3;border-left-style: none;border-left-width: 1px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 1px;border-right-color: #D3D3D3;vertical-align: middle;overflow-x: hidden;}#andyjdqprw .gt_stub {color: #333333;background-color: #FFFFFF;font-size: 80%;font-weight: bolder;text-transform: uppercase;border-right-style: solid;border-right-width: 2px;border-right-color: #D3D3D3;padding-left: 12px;}#andyjdqprw .gt_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#andyjdqprw .gt_first_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;}#andyjdqprw .gt_grand_summary_row {color: #333333;background-color: #FFFFFF;text-transform: inherit;padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;}#andyjdqprw .gt_first_grand_summary_row {padding-top: 8px;padding-bottom: 8px;padding-left: 5px;padding-right: 5px;border-top-style: double;border-top-width: 6px;border-top-color: #D3D3D3;}#andyjdqprw .gt_striped {background-color: rgba(128, 128, 128, 0.05);}#andyjdqprw .gt_table_body {border-top-style: solid;border-top-width: 2px;border-top-color: #D3D3D3;border-bottom-style: solid;border-bottom-width: 2px;border-bottom-color: #D3D3D3;}#andyjdqprw .gt_footnotes {color: #333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#andyjdqprw .gt_footnote {margin: 0px;font-size: 90%;padding: 4px;}#andyjdqprw .gt_sourcenotes {color: 
#333333;background-color: #FFFFFF;border-bottom-style: none;border-bottom-width: 2px;border-bottom-color: #D3D3D3;border-left-style: none;border-left-width: 2px;border-left-color: #D3D3D3;border-right-style: none;border-right-width: 2px;border-right-color: #D3D3D3;}#andyjdqprw .gt_sourcenote {font-size: 90%;padding: 4px;}#andyjdqprw .gt_left {text-align: left;}#andyjdqprw .gt_center {text-align: center;}#andyjdqprw .gt_right {text-align: right;font-variant-numeric: tabular-nums;}#andyjdqprw .gt_font_normal {font-weight: normal;}#andyjdqprw .gt_font_bold {font-weight: bold;}#andyjdqprw .gt_font_italic {font-style: italic;}#andyjdqprw .gt_super {font-size: 65%;}#andyjdqprw .gt_footnote_marks {font-style: italic;font-size: 65%;}</style><div id="andyjdqprw" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;"><table class="gt_table"><thead class="gt_col_headings"><tr><th class="gt_col_heading gt_columns_bottom_border gt_center" rowspan="1" colspan="1">state</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">adj_r</th><th class="gt_col_heading gt_columns_bottom_border gt_right" rowspan="1" colspan="1">lead</th></tr></thead><tbody class="gt_table_body"><tr><td class="gt_row gt_center" style="font-family: Arial; font-size: small;">Ohio</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">0.7548</td><td class="gt_row gt_right" style="font-family: Arial; font-size: small;">31</td></tr></tbody></table></div><p>The caveat here is the NY Times data uses the “case” date which is presumably the date a positive test is recorded. 
The Ohio data uses “onset” date, which is the date the “illness began.” That is not necessarily the same as the test date.</p><pre class="r"><code># source: https://coronavirus.ohio.gov/static/dashboards/COVIDSummaryData.csv
ohio_raw &lt;- read_csv(&quot;https://coronavirus.ohio.gov/static/dashboards/COVIDSummaryData.csv&quot;,
                     col_types = cols(`Admission Date` = col_date(format = &quot;%m/%d/%Y&quot;),
                                      `Date Of Death` = col_date(format = &quot;%m/%d/%Y&quot;),
                                      `Onset Date` = col_date(format = &quot;%m/%d/%Y&quot;)))
# helper function to fix column names to best practice
fix_df_colnames &lt;- function(df) {
  names(df) &lt;- names(df) %&gt;%
    str_replace_all(c(&quot; &quot; = &quot;_&quot;, &quot;,&quot; = &quot;&quot;)) %&gt;%
    tolower()
  return(df)
}
# clean up the data
ohio &lt;- ohio_raw %&gt;%
  rename(death_count = `Death Due to Illness Count`) %&gt;%
  filter(County != &quot;Grand Total&quot;) %&gt;%
  fix_df_colnames() %&gt;%
  # data not clean before the middle of March
  filter(onset_date &gt;= cutoff_start)</code></pre><p>How comparable are these data sets?
Let’s compare the NY Times case count and dates to the Ohio “Illness Onset” dates.</p><pre class="r"><code># create a rolling average function
mean_roll_7 &lt;- slidify(mean, .period = 7, .align = &quot;right&quot;)
comps &lt;- ohio %&gt;%
  group_by(onset_date) %&gt;%
  summarise(OH = sum(case_count), .groups = &quot;drop&quot;) %&gt;%
  mutate(OH = mean_roll_7(OH)) %&gt;%
  ungroup() %&gt;%
  mutate(state = &quot;Ohio&quot;) %&gt;%
  rename(date = onset_date) %&gt;%
  left_join(us_states, by = c(&quot;date&quot;, &quot;state&quot;)) %&gt;%
  transmute(date, OH, NYTimes = cases_7day)
g &lt;- comps %&gt;%
  pivot_longer(c(&quot;OH&quot;, &quot;NYTimes&quot;), names_to = &quot;source&quot;, values_to = &quot;count&quot;) %&gt;%
  ggplot(aes(date, count, color = source)) +
  geom_line() +
  labs(title = &quot;Case Counts from Different Sources&quot;,
       caption = &quot;Source: State of Ohio, NY Times&quot;,
       subtitle = &quot;NY Times and State of Ohio&quot;,
       x = &quot;Date&quot;,
       y = &quot;Daily Case Count (7-day Rolling Average)&quot;)
show(g)</code></pre><p><img src="unnamed-chunk-14-1.png" width="672" /></p><p>We clearly see the numbers line up almost exactly, but the Ohio data runs about 4 days ahead of the NY Times data.</p><p>For each individual death, we subtract the onset date from the death date. Then we aggregate the county-level data to statewide and the daily data to weekly.
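</p><p>As a quick aside, we can roughly quantify that offset rather than eyeballing it. The following is only a sketch, not run in the original analysis; it uses base R’s <code>ccf</code> on the <code>comps</code> data frame built above and assumes the two series overlap cleanly once missing dates are dropped. It picks the lag with the highest cross-correlation between the two case counts:</p><pre class="r"><code># rough check of the offset between the Ohio and NY Times series
aligned &lt;- na.omit(comps)
cc &lt;- ccf(aligned$OH, aligned$NYTimes, lag.max = 14, plot = FALSE)
# the lag with the largest correlation is the apparent offset in days
cc$lag[which.max(cc$acf)]</code></pre><p>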
Then take the weekly mean of deaths.</p><pre class="r"><code># aggregate the data to weekly
ohio &lt;- ohio %&gt;%
  mutate(onset_to_death = as.numeric(date_of_death - onset_date),
         onset_year = year(onset_date),
         onset_week = epiweek(onset_date))
onset_to_death &lt;- ohio %&gt;%
  filter(death_count &gt; 0) %&gt;%
  group_by(onset_year, onset_week) %&gt;%
  summarise(death_count_sum = sum(death_count),
            mean_onset_to_death = weighted.mean(onset_to_death,
                                                death_count,
                                                na.rm = TRUE)) %&gt;%
  mutate(date = as.Date(paste(onset_year, onset_week, 1), &quot;%Y %U %u&quot;))
g &lt;- onset_to_death %&gt;% ggplot(aes(date, death_count_sum)) +
  geom_col() +
  labs(title = &quot;Ohio Weekly Deaths&quot;,
       caption = &quot;Source: State of Ohio, Arthur Steinmetz&quot;,
       subtitle = &quot;Based on Illness Onset Date&quot;,
       x = &quot;Date of Illness Onset&quot;,
       y = &quot;Deaths&quot;)
show(g)</code></pre><p><img src="unnamed-chunk-15-1.png" width="672" /></p><p>When we measure the average lag, we find that it has been fairly stable over time in Ohio. Unfortunately, it differs substantially from our proxy model using untracked cases.</p><pre class="r"><code># helper function to annotate plots
pos_index &lt;- function(index_vec, fraction) {
  return(index_vec[round(length(index_vec) * fraction)])
}
avg_lag &lt;- round(mean(onset_to_death$mean_onset_to_death))
onset_to_death %&gt;% ggplot(aes(date, mean_onset_to_death)) +
  geom_col() +
  geom_hline(yintercept = avg_lag) +
  annotate(geom = &quot;text&quot;,
           label = paste(&quot;Average Lag =&quot;, round(avg_lag)),
           y = 20, x = pos_index(onset_to_death$date, .8)) +
  labs(x = &quot;Onset Date&quot;,
       y = &quot;Mean Onset to Death&quot;,
       title = &quot;Ohio Days from Illness Onset Until Death Over Time&quot;,
       caption = &quot;Source: State of Ohio, Arthur Steinmetz&quot;,
       subtitle = paste(&quot;Average =&quot;, avg_lag, &quot;Days&quot;))</code></pre><p><img src="unnamed-chunk-16-1.png" width="672" /></p><p>Note the drop-off at the end of the date range.
This is because we don’t yet know the outcome of the most recently recorded cases. Generally, while we have been successful in lowering the fatality rate of this disease, the duration from onset to death for those cases which are fatal has not changed much, at least in Ohio.</p><p>Since we have the actual number of deaths associated with every onset date, we can calculate the “true” fatality rate. As mentioned, the fatality rate of the more recent cases is not yet known. Also, the data is too sparse at the front of the series, so we cut off the head and the tail of the data.</p><pre class="r"><code>ohio_fatality_rate &lt;- ohio %&gt;%
  group_by(onset_date) %&gt;%
  summarize(case_count = sum(case_count),
            death_count = sum(death_count), .groups = &quot;drop&quot;) %&gt;%
  mutate(fatality_rate = death_count / case_count) %&gt;%
  mutate(fatality_rate_7day = mean_roll_7(fatality_rate)) %&gt;%
  # filter out the most recent cases where we don&#39;t know the outcome yet
  filter(onset_date &lt; max(onset_date) - 30)
ohio_fatality_rate %&gt;%
  filter(onset_date &gt; as.Date(&quot;2020-04-15&quot;)) %&gt;%
  ggplot(aes(onset_date, fatality_rate_7day)) +
  geom_line() +
  geom_smooth() +
  labs(x = &quot;Illness Onset Date&quot;, y = &quot;Ohio Fatality Rate&quot;,
       caption = &quot;Source: State of Ohio, Arthur Steinmetz&quot;,
       title = &quot;Ohio Fatality Rate as a Percentage of Tracked Cases&quot;) +
  scale_y_continuous(labels = scales::percent, breaks = seq(0, 0.12, by = .01))</code></pre><p><img src="unnamed-chunk-17-1.png" width="672" /></p><p>The fatality rate in Ohio seems to have been worse than our national model implies, but it is coming down. Again, this result comes from a different methodology than our proxy model.</p></div><div id="conclusion" class="level1"><h2>Conclusion</h2><p>Among the vexing aspects of this terrible pandemic is that we don’t know what the gold standard is for treatment and prevention. We are learning as we go. The good news is we ARE learning.
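</p><p>As one hedged sketch of where the modeling could go next (the model and engine choices here are illustrative assumptions, not something run in this post), the <code>lm</code> call in our list-column workflow could be swapped for a random forest via the <code>tidymodels</code> <code>parsnip</code> interface:</p><pre class="r"><code># sketch: a parsnip random forest in place of lm()
# assumes the tidymodels and ranger packages are installed
library(tidymodels)
rf_spec &lt;- rand_forest(mode = &quot;regression&quot;) %&gt;%
  set_engine(&quot;ranger&quot;)
rf_fit &lt;- rf_spec %&gt;%
  fit(led_deaths ~ cases_7day + poly(date, 2), data = best_fit$data[[1]])</code></pre><p>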
For a data analyst, the challenge is the evolving relationship among all of the disparate data. Here we have gotten some insight into the duration between a positive test and mortality. We can’t have high confidence that our proxy model using aggregate cases is strictly accurate, because the longitudinal data from Ohio shows a different lag. We have clearly seen that mortality has been declining, but our model suggests that deaths will nonetheless surge along with the autumn surge in cases.</p><p>What are the further avenues for modeling? There is a wealth of data around behavior and demographics with this disease that we don’t fully understand yet. On the analytics side, we might get more sophisticated with our modeling. We have only scratched the surface of the <code>tidymodels</code> framework, and we might apply fancier predictive models than linear regression. Is the drop in the fatality rate we saw early in the pandemic real? Only people who were actually sick got tested in the early days; now, many positive tests are from asymptomatic people. Finally, the disagreement between the case proxy model and the longitudinal data in Ohio shows there is more work to be done.</p><hr /><div id="about-art-steinmetz" class="level3"><h4>About Art Steinmetz</h4><p>Art Steinmetz is the former Chairman, CEO, and President of OppenheimerFunds. After joining the firm in 1986, Art was an analyst, portfolio manager, and Chief Investment Officer. Art was named President in 2013, CEO in 2014, and, in 2015, Chairman of the firm, with $250 billion under management. He stepped down when the firm was folded into Invesco.</p><p>Currently, Art is a private investor located in New York City.
He is an avid amateur data scientist and is active in the R statistical programming language community.</p></div></div></description></item><item><title>Winners of the 2020 RStudio Table Contest</title><link>https://www.rstudio.com/blog/winners-of-the-2020-rstudio-table-contest/</link><pubDate>Wed, 23 Dec 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/winners-of-the-2020-rstudio-table-contest/</guid><description><p>The inaugural 2020 RStudio Table Contest came to a close in mid-November and we were delighted to see that over 80 entries were submitted. With such enthusiasm for tables, it took some time to look over all of the entries but we do have a winner.</p><h2 id="the-winner-and-the-runner-up">The Winner and the Runner-up</h2><p><img src="van-der-welden.png" alt=""></p><p>The winner is an entry by Niels van der Velden called Editable Datatables in R Shiny Using SQL. Niels provides a <a href="https://www.nielsvandervelden.com/post/sql_datatable/editable-datatables-in-r-shiny-using-sql/">detailed tutorial here</a> (along with a <a href="https://niels-van-der-velden.shinyapps.io/employee_directory_crud_app/">live example</a> and <a href="https://github.com/nvelden/Employee_Directory_CRUD">GitHub repository</a>). It&rsquo;s a great demonstration of an interactive employee directory where a DT table is central in presenting the employee information. Shiny provides all of the interactivity and the {pool} package handles the storage of the entries with a database backend. Learn more at <a href="https://community.rstudio.com/t/employee-directory-editable-dt-table-contest-submission/81403">RStudio Community</a>.</p><hr><p><img src="gans.png" alt=""></p><p>A close runner-up is an entry by Maya Gans (previous RStudio intern!) 
that presents the Table Generator Shiny extension available in the {tidyCDISC} package (<a href="https://maya-gans.shinyapps.io/tablecontest-biogen/">live example</a>, <a href="https://github.com/biogen-inc/tidycdisc/tree/table-contest">GitHub repository</a>). This is a complete table-generation system where users can provide their own clinical data and make summary tables without any programming effort. The Shiny application allows for dragging pairs of columns and their summary statistic counterpart to create a {gt} table. There&rsquo;s great flexibility to customize the generated tables, as one can subgroup, filter, and reorder the underlying table structure. Crucially, the app supports table export to either HTML or CSV. Learn more at <a href="https://community.rstudio.com/t/tidycdisc-table-contest-submission/86688">RStudio Community</a>.</p><h2 id="honorable-mentions">Honorable Mentions</h2><p>There were so many great entries that we&rsquo;d like to highlight several more as honorable mentions. All of these were either in the form of a static table or a tutorial.</p><h3 id="static-tables">Static Tables</h3><p>A large portion of the contest entries were in the form of a static (i.e., not Shiny-based) table. In quite a few cases, interactive bits (e.g., sparkline plots) were interspersed in table cells, making for a delightful experience. Here are a few of the tables that deserve some attention.</p><p>Beyoncé and Taylor Swift Albums, by Georgios Karamanis (<a href="https://github.com/gkaramanis/tidytuesday/blob/master/2020-week40/plots/beyonce-swift.png">live example</a>, <a href="https://github.com/gkaramanis/tidytuesday/tree/master/2020-week40">GitHub repository</a>, <a href="https://community.rstudio.com/t/86399">RStudio Community</a>).</p><p><img src="gkaramanis.png" alt=""></p><br><p>What do I binge next? 
An overview of the top IMDb TV shows, by Cédric Scherer (<a href="https://cedricscherer.netlify.app/files/IMDb_Top250.png">live example</a>, <a href="https://github.com/Z3tt/Rstudio_TableContest_2020">GitHub repository</a>, <a href="https://community.rstudio.com/t/86409">RStudio Community</a>).</p><p><img src="scherer.png" alt=""></p><br><p>2019 NFL Team Ratings, by Kyle Cuilla (<a href="https://rpubs.com/kcuilla/nfl_team_ratings">live example</a>, <a href="https://github.com/kcuilla/2020-RStudio-Table-Contest">GitHub repository</a>, <a href="https://community.rstudio.com/t/81205">RStudio Community</a>).</p><p><img src="cuilla.png" alt=""></p><br><p>The Big Mac Index Table, by A. Calatroni, S. Lussier &amp; R. Krouse (<a href="https://rpubs.com/acalatroni/682678">live example</a>, <a href="https://github.com/agstn/RStudio_table_contest_2020">GitHub repository</a>, <a href="https://community.rstudio.com/t/86123">RStudio Community</a>).</p><p><img src="calatroni.png" alt=""></p><br><p>Technology Figures of the EU, by Florian Handke (<a href="https://rpubs.com/FlorianHandke/table_contest_2020">live example</a>, <a href="https://github.com/FlorianHandke/RStudio_Table_Contest_2020">GitHub repository</a>, <a href="https://community.rstudio.com/t/87855">RStudio Community</a>).</p><p><img src="handke.png" alt=""></p><br><p>Imperial March, by Bill Schmid (<a href="https://github.com/schmid07/R-Studio-Table-Contest-Submission">live example and repository</a>, <a href="https://community.rstudio.com/t/86345">RStudio Community</a>).</p><p><img src="schmid.png" alt=""></p><h3 id="tutorials">Tutorials</h3><p>Sometimes in order to learn a new package or feature it helps to get a more-involved walkthrough. 
Tutorials are long form articles that dive into a set of features, table-package, or multiple packages, with the goal of giving the reader a deeper understanding of how it all works.</p><p>Comparison Tutorial, by Evangeline &lsquo;Gina&rsquo; Reynolds (<a href="https://evamaerey.github.io/tables/about">tutorial</a>, <a href="https://github.com/EvaMaeRey/tables">GitHub repository</a>, <a href="https://community.rstudio.com/t/87978">RStudio Community</a>).</p><p>Replicating a New York Times Table of Swedish COVID-19 Deaths with gt, by Malcolm Barrett (<a href="https://malco.io/2020/05/16/replicating-an-nyt-table-of-swedish-covid-deaths-with-gt/">tutorial</a>, <a href="https://github.com/malcolmbarrett/malco.io/blob/master/content/post/2020-05-16-replicating-an-nyt-table-of-swedish-covid-deaths-with-gt/index.Rmd">GitHub repository</a>, <a href="https://community.rstudio.com/t/82423">RStudio Community</a>).</p><p>Recreating a Table by The Economist with Reactable, by Connor Rothschild (<a href="https://www.connorrothschild.com/post/economist-table-replication-using-reactable">tutorial</a>, <a href="https://github.com/connorrothschild/economist-table-replication">GitHub repository</a>, <a href="https://community.rstudio.com/t/84725">RStudio Community</a>).</p><p>Top of the Class: Public Spending on Education, by David Smale (<a href="https://davidsmale.netlify.app/portfolio/spending-on-education/">tutorial</a>, <a href="https://github.com/committedtotape/education-spending">GitHub repository</a>, <a href="https://community.rstudio.com/t/86113">RStudio Community</a>).</p><h3 id="table-packages">Table Packages</h3><p>A number of submissions were essentially submissions of the author&rsquo;s package, which supports creating or customizing tables in some way. In fact, the grand prize and runner up are both examples of this! 
Below are additional honorable mention entries of these table-package submissions.</p><p>DataEditR: Interactive Editor for Viewing, Filtering, Entering &amp; Editing Data, by Dillon Hammill (<a href="https://github.com/DillonHammill/DataEditR/">GitHub repository</a>, <a href="https://community.rstudio.com/t/87976">RStudio Community</a>).</p><p>A Not So Short Introduction to rtables, by Gabriel Becker, Adrian Waddell (<a href="https://waddella.github.io/RStudioTableContest2020/A_Not_So_Short_Introduction_to_rtables.html">post</a>, <a href="https://github.com/waddella/RStudioTableContest2020">GitHub repository</a>, <a href="https://community.rstudio.com/t/86538">RStudio Community</a>).</p><h2 id="in-closing">In Closing</h2><p>We want to thank you all for making this Table Contest so great. It is incredibly hard to judge submissions with such an overall high level of quality. We fully acknowledge that there are many other really great entries we did not highlight in this article. We encourage you to check out all of the entries at <a href="https://community.rstudio.com/tags/c/R-Markdown/tables/38/table-contest">RStudio Community</a>.</p><p>There were so many great submissions that our <a href="https://blog.rstudio.com/2020/12/21/rmd-news/">growing R Markdown team</a> is considering hosting a <em>tables gallery</em> with examples that others can learn from or use as launching pad for their own display tables. We&rsquo;d love to hear your comments and suggestions <a href="https://community.rstudio.com/t/2020-table-contest-winners/91517/2">in RStudio Community!</a></p></description></item><item><title>Latest News from the R Markdown Family</title><link>https://www.rstudio.com/blog/rmd-news/</link><pubDate>Mon, 21 Dec 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rmd-news/</guid><description><p>Seasons greetings from the R Markdown family! 
We know many folks are headed into holiday breaks, and we hope you all take time away from work to enjoy&hellip;the rest of 2020. If you are like most people, you may be having a hard time keeping up with the latest news and current events in the world. We cannot help much with that (we&rsquo;re in the same boat), but we are here to make your working lives easier! We wanted to take a moment to round up all the latest news from the R Markdown family of packages so that when you re-surface in 2021, you know all you need to know to take advantage of the newest features that will improve your knitting experience. Without further ado, let&rsquo;s get started!</p><h2 id="rmarkdown">rmarkdown</h2><table><thead><tr><th align="center">Last release</th></tr></thead><tbody><tr><td align="center"><img src="https://img.shields.io/badge/CRAN-2.6-brightgreen" alt="Last rmarkdown release 2.6 cran badge"></td></tr></tbody></table><p>We are happy to share that <strong>rmarkdown</strong> (<a href="https://rmarkdown.rstudio.com/docs/">https://rmarkdown.rstudio.com/docs/</a>) version 2.6 is now on CRAN. rmarkdown is a package that helps you create dynamic documents that combine code, rendered output (such as figures), and markdown-formatted text.</p><p>You can install rmarkdown from CRAN with:</p><pre><code>install.packages(&quot;rmarkdown&quot;)</code></pre><p>There have been three new versions released on CRAN since this fall. Below we share some highlights, but you may want to look at the <a href="https://rmarkdown.rstudio.com/docs/news/">release notes</a> for the full details.</p><h3 id="anchor-links">Anchor links</h3><p>An anchor link (also known as a &ldquo;page jump&rdquo;) can help you share the URL to a specific section of a web page. Anchor links can now easily be added to headers in <code>html_document()</code> by using <code>anchor_sections = TRUE</code> in your YAML. 
This option was first introduced in <strong>rmarkdown</strong> v2.5 as <code>TRUE</code> by default, but we reverted it to <code>FALSE</code> following community feedback. In v2.6 and later, you can enable anchor links in your YAML header:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-yaml" data-lang="yaml"><span style="color:#007020;font-weight:bold">output</span>:
<span style="color:#bbb">  </span><span style="color:#007020;font-weight:bold">html_document</span>:
<span style="color:#bbb">    </span><span style="color:#007020;font-weight:bold">anchor_sections</span>:<span style="color:#bbb"> </span>TRUE<span style="color:#bbb"> </span><span style="color:#60a0b0;font-style:italic"># default is FALSE</span></code></pre></div><p>This works for any HTML-based output format, including those from the <strong>bookdown</strong> package. The default symbol is a pound sign (<code>#</code>). You can customize the anchor link using CSS (see <a href="https://rmarkdown.rstudio.com/docs/reference/html_document.html"><code>?html_document()</code></a>).</p><h3 id="numbered-sections">Numbered sections</h3><p>Automatically numbered sections have long been available in HTML and PDF output formats, but were missing from other output formats. Not anymore! Users who knit to Word (<code>word_document()</code>), PowerPoint (<code>powerpoint_presentation()</code>), and markdown (e.g., <code>github_document()</code>) output formats can now benefit from automatic numbering of sections, too. 
This new feature also makes it easier for your report to look the same when you knit it to multiple output formats.</p><p>You can enable numbered sections in your YAML header:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-yaml" data-lang="yaml"><span style="color:#007020;font-weight:bold">output</span>:
<span style="color:#bbb">  </span><span style="color:#007020;font-weight:bold">word_document</span>:
<span style="color:#bbb">    </span><span style="color:#007020;font-weight:bold">number_sections</span>:<span style="color:#bbb"> </span>TRUE<span style="color:#bbb"> </span><span style="color:#60a0b0;font-style:italic"># default is FALSE</span></code></pre></div><h3 id="publishing-r-markdown-websites">Publishing R Markdown websites</h3><p>A new <a href="https://rmarkdown.rstudio.com/docs/reference/publish_site.html"><code>publish_site()</code></a> function is now included to easily publish your R Markdown website, similar to the &lsquo;One-Button&rsquo; publishing experience in the RStudio IDE. It will help you set up your environment for publishing, build your website, and deploy it to RStudio Connect. Under the hood, it uses the <a href="https://rstudio.github.io/rsconnect/"><strong>rsconnect</strong></a> package.</p><h3 id="and-more-little-things">And more little things&hellip;</h3><p>We have also made some smaller but important changes:</p><ul><li>Support for Pandoc 2.11&rsquo;s new citation processing system. <strong>rmarkdown</strong> 2.5 is required if you already upgraded Pandoc. Pandoc 2.11 is an important version update and it will be shipped with the next release of <a href="https://blog.rstudio.com/2020/09/30/rstudio-v1-4-preview-visual-markdown-editing/">RStudio 1.4</a>. 
We recommend using the latest Pandoc (currently 2.11.3.1) if you manage your Pandoc version yourself and want to update.</li><li>The previous default, <code>clean_site(preview = FALSE)</code>, has been changed to <code>preview = TRUE</code>, which means it will now show which files and folders would be deleted without actually deleting them. This is now also the default behavior of the &lsquo;Clean All&rsquo; button in the IDE Build pane. If you want to actually delete the files, you need to call <code>rmarkdown::clean_site(preview = FALSE)</code> in the R console.</li><li>Lua filters are now better handled internally. For developers of output formats for R Markdown, this means that you can now pass Lua filters as part of <code>output_format()</code> using <code>pandoc_options(lua_filters = )</code>. A new function <code>pandoc_lua_filter_args()</code> has been added to help build the right command-line arguments for Pandoc. If you are new to Lua filters and want to learn more, see <a href="https://rmarkdown.rstudio.com/docs/articles/lua-filters.html"><code>vignette(&quot;lua-filters&quot;, &quot;rmarkdown&quot;)</code></a>.</li></ul><p style="text-align: right;">• See the <a href="https://github.com/rstudio/rmarkdown/releases">release note</a> for full list of changes.</p><h2 id="bookdown">bookdown</h2><table><thead><tr><th align="center">Last release</th></tr></thead><tbody><tr><td align="center"><img src="https://img.shields.io/badge/CRAN-0.21-brightgreen" alt="Last bookdown release 0.21 cran badge"></td></tr></tbody></table><p>We are also happy to share that <strong>bookdown</strong> (<a href="https://bookdown.org/yihui/bookdown/">https://bookdown.org/yihui/bookdown/</a>) version 0.21 is now on CRAN. 
<strong>bookdown</strong> is a package that facilitates writing books and long-form articles/reports with R Markdown.</p><p>You can install bookdown from CRAN with:</p><pre><code>install.packages(&quot;bookdown&quot;)</code></pre><h3 id="numbered-sections-and-smarter-figure-numbering">Numbered sections and smarter figure numbering</h3><p>You always had the ability to number sections in HTML and PDF books (in fact, it was the default because it was so necessary), but thanks to the new feature added to <strong>rmarkdown</strong> described above, you now get nice numbered sections in <strong>bookdown</strong>&lsquo;s non-HTML output formats too, like <code>bookdown::word_document2()</code> or <code>bookdown::powerpoint_presentation2()</code>.</p><p>This is especially helpful because <strong>bookdown</strong>&lsquo;s output formats already have automatic figure numbering enabled (see: <a href="https://bookdown.org/yihui/rmarkdown-cookbook/figure-number.html">https://bookdown.org/yihui/rmarkdown-cookbook/figure-number.html</a>). For example, this code chunk:</p><pre><code>```{r cars, fig.cap = &quot;An amazing plot&quot;, echo = FALSE}
plot(cars)
```</code></pre><p>In a <strong>bookdown</strong> output format, this is rendered as:</p><p><img src="img/Fig-cap-num.png" alt="Examples of figure caption numbering"></p><p>With the new section numbering capability, your numbered figures will be numbered by chapter even in output formats other than HTML and PDF. For example, in a <code>word_document2</code>, the fourth figure of your document would previously have been labeled <code>Fig. 4</code>, even if it was really the second figure in Chapter 3. Now, the same figure will be labeled <code>Fig. 3.2</code>. 
If you prefer the former numbering behavior, you may deactivate this by setting <code>number_sections: FALSE</code> in the <code>index.Rmd</code> YAML or in your <code>_output.yaml</code> file.</p><h3 id="a-new-way-to-use-theorem-and-proof-environments">A new way to use theorem and proof environments</h3><p>You may already know (or not) that <strong>bookdown</strong> has <a href="https://bookdown.org/yihui/bookdown/markdown-extensions-by-bookdown.html">special Markdown extensions</a>, one of which is for theorems and proofs. They offer (un)numbered and labeled environments for your book. Currently, you can add them using one of <strong>knitr</strong>&rsquo;s custom engines:</p><pre><code>```{theorem, pyth, name=&quot;Pythagorean theorem&quot;}
For a right triangle, if $c$ denotes the length of the hypotenuse
and $a$ and $b$ denote the lengths of the other two sides, we have
$$a^2 + b^2 = c^2$$
```</code></pre><p><img src="img/bookdown-theorem-engine.png" alt="A theorem with knitr&rsquo;s engine"></p><p>Taking advantage of Pandoc <a href="https://pandoc.org/MANUAL.html#divs-and-spans" title="Pandoc's fenced divs">fenced <code>Div</code>s</a> to create <a href="https://bookdown.org/yihui/rmarkdown-cookbook/custom-blocks.html" title="creating custom blocks">custom blocks</a>, you now have another way of creating such an environment:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-markdown" data-lang="markdown">::: {.theorem <span style="color:#d55537;font-weight:bold">#pyth</span> name=&#34;Pythagorean theorem&#34;}
For <span style="font-weight:bold">**a right triangle**</span>, if $c$ denotes the length of the hypotenuse
and $a$ and $b$ denote the lengths of the other two sides, we have
$$a^2 + b^2 = c^2$$
:::</code></pre></div><p><img src="img/bookdown-new-theorem.png" alt="A theorem with the new fenced Div syntax"></p><p>It will produce the same type of environment 
that you can reference in your book. However, we gain a very important feature: with this new syntax, you can use any Markdown syntax inside the custom blocks, which was not possible using the <strong>knitr</strong> engine. Have you spotted the bold text in the last example?</p><p>This new syntax is currently only available for PDF and HTML output formats. To use this new syntax in your existing documents, <code>bookdown::fence_theorems()</code> can help you convert the old syntax to the new syntax.</p><pre><code># Shows the converted text
bookdown::fence_theorems(&quot;01-intro.Rmd&quot;)

# Convert the document, overwriting it
bookdown::fence_theorems(&quot;01-intro.Rmd&quot;, output = &quot;01-intro.Rmd&quot;)</code></pre><p style="text-align: right;">• See the <a href="https://github.com/rstudio/bookdown/blob/master/NEWS.md">release note</a> for full list of changes.</p><h2 id="tinytex">tinytex</h2><table><thead><tr><th align="center">Last release</th></tr></thead><tbody><tr><td align="center"><img src="https://img.shields.io/badge/CRAN-0.28-brightgreen" alt="Last tinytex release 0.28 cran badge"></td></tr></tbody></table><p>The latest version of the <strong>tinytex</strong> (<a href="https://yihui.org/tinytex/">https://yihui.org/tinytex/</a>) package is also now on CRAN. <strong>tinytex</strong> is the companion package of TinyTeX, a LaTeX distribution based on TeX Live, and the package allows R users to install and maintain their LaTeX distribution using R. You can install tinytex from CRAN with:</p><pre><code>install.packages(&quot;tinytex&quot;)</code></pre><p>In September, we made an important package update with three main improvements that should make the installation process run more smoothly for users.</p><ol><li><code>tinytex::install_tinytex()</code> will now install pre-built binaries of TinyTeX by default, instead of installing TeX Live with its source installer. 
The latter is not only slower, but also could end up installing a cutting-edge version of TeX Live that happens to be broken. With the change, these problems will be gone, because the binaries have been fully tested for common R Markdown documents and projects before a new release is made, and they include a set of pre-installed TeX packages required by common R Markdown projects.</li><li>You can now easily install a specific version using, for example, <code>tinytex::install_tinytex(&quot;2020.12&quot;)</code>. The TinyTeX releases live in the repo <a href="https://github.com/yihui/tinytex-releases">https://github.com/yihui/tinytex-releases</a>, where you could find the different versions. There will be a monthly release of 3 binaries for Linux, Windows and Mac. These binaries can also be used by non-R users (see the repo README for how to install for non-R users).</li><li>Lastly, <code>tinytex::install_tinytex()</code> will now detect and re-install all currently installed packages so that you don&rsquo;t lose anything when re-installing or upgrading TinyTeX.</li></ol><p style="text-align: right;"> • See the <a href="https://github.com/yihui/tinytex/releases">release note</a> for full list of changes. </p><h2 id="rticles">rticles</h2><table><thead><tr><th align="center">Last release</th></tr></thead><tbody><tr><td align="center"><img src="https://img.shields.io/badge/CRAN-0.17-brightgreen" alt="Last rticles release 0.17 cran badge"></td></tr></tbody></table><p>We are also happy to announce that <strong>rticles</strong> (<a href="https://github.com/rstudio/rticles">https://github.com/rstudio/rticles</a>) version 0.17 is now on CRAN. The <strong>rticles</strong> package provides a suite of custom R Markdown LaTeX formats and templates for journal articles. 
You can install <strong>rticles</strong> from CRAN with:</p><pre><code>install.packages(&quot;rticles&quot;)</code></pre><p>Most of the article templates are provided and maintained by the community, and anyone can contribute a new template. Since this summer, three new versions of this package have been released on CRAN with:</p><ul><li><p>New article formats available: <code>bioinformatics_article()</code> and <code>arxiv_article()</code></p></li><li><p>Improved <code>jss_article()</code> and <code>rjournal_article()</code> article templates to better follow submission guidelines.</p></li><li><p>A brand new <a href="https://github.com/rstudio/rticles">README</a> to show all available templates with their contributor and to explain how to contribute more. The function <code>rticles::journals()</code> allows you to quickly look at all available formats.</p></li><li><p>A reorganization of the templates internally which led to a breaking change when using <code>rmarkdown::draft()</code>. Now, only the journal name needs to be provided (e.g., <code>rjournal</code> instead of <code>rjournal_article</code>):</p><pre><code>rmarkdown::draft(&quot;MyArticle.Rmd&quot;, template = &quot;rjournal&quot;, package = &quot;rticles&quot;)</code></pre></li><li><p>Several bug fixes for existing templates and better support for Pandoc 2.11+</p></li></ul><p style="text-align: right;">• See the <a href="https://github.com/rstudio/rticles/blob/master/NEWS.md">release note</a> for full list of changes.</p><h2 id="xaringan">xaringan</h2><table><thead><tr><th align="center">Last release</th></tr></thead><tbody><tr><td align="center"><img src="https://img.shields.io/badge/CRAN-0.19-brightgreen" alt="Last xaringan release 0.19 cran badge"></td></tr></tbody></table><p>In 2016, Wojciech Francuzik filed <a href="https://github.com/yihui/xaringan/issues/3">a feature request</a> for self-contained <strong>xaringan</strong> slides. 
Over the years, several other users (notably, <a href="https://yihui.org/en/2019/01/rstudio-conf/#jared-lander-s-annual-feature-requests">Jared Lander</a>) requested the same feature. We are happy to announce that this has finally become possible since <strong>xaringan</strong> 0.18! With the option <code>self_contained: true</code>, your images and plots will be embedded in the <code>.html</code> output file.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-yaml" data-lang="yaml"><span style="color:#007020;font-weight:bold">output</span>:
<span style="color:#bbb">  </span><span style="color:#007020;font-weight:bold">xaringan::moon_reader</span>:
<span style="color:#bbb">    </span><span style="color:#007020;font-weight:bold">self_contained</span>:<span style="color:#bbb"> </span><span style="color:#007020;font-weight:bold">true</span></code></pre></div><p>We&rsquo;d like to thank Susan VanderPlas for contributing the pull request to implement this feature.</p><p style="text-align: right;">• See the <a href="https://github.com/yihui/xaringan/blob/master/NEWS.md">release note</a> for full list of changes.</p><h2 id="distill">distill</h2><table><thead><tr><th align="center">Last release</th></tr></thead><tbody><tr><td align="center"><img src="https://img.shields.io/badge/CRAN-1.1-brightgreen" alt="Last distill release 1.1 cran badge"></td></tr></tbody></table><p>Finally, we are proud to announce that version 1.0 of the <a href="https://pkgs.rstudio.com/distill/"><strong>distill</strong> package</a> is now on <a href="https://cran.r-project.org/package=distill">CRAN</a>. The goal of the <strong>distill</strong> package is to provide an R Markdown-based output format optimized for online scientific and technical communication. 
You can install the latest version from CRAN:</p><pre><code>install.packages(&quot;distill&quot;)</code></pre><p>In a post earlier this month, we shared some highlights from the <a href="https://www.rstudio.com/blog/distill/">latest version of the <strong>distill</strong> package</a>, which now includes site-wide search, a built-in themer, a syntax highlighter optimized for accessibility, and <strong>downlit</strong> integration, to name a few features we are excited about.</p><p style="text-align: right;">• See the <a href="https://pkgs.rstudio.com/distill/news/index.html#distill-v10-cran">release note</a> for full list of changes.</p><h2 id="last-but-not-least">Last but not least!</h2><p>Let&rsquo;s not forget other packages in the R Markdown family that have been updated within the last 6 months:</p><ul><li><p><a href="https://github.com/rstudio/tufte"><strong>tufte</strong></a> is now at version 0.9, with mainly updates for Pandoc 2.11+ support and a new <code>runningheader</code> variable for setting a running header different from the <code>title</code>. <small>See <a href="https://github.com/rstudio/tufte/blob/master/NEWS.md">Changelog</a></small></p></li><li><p><a href="https://yihui.org/knitr/"><strong>knitr</strong></a> is now at version 1.30, updated in September with, among internal and small changes, a set of new exported <code>knitr::hooks_*</code> functions. <small>See <a href="https://github.com/yihui/knitr/blob/master/NEWS.md">Changelog</a></small></p></li><li><p><a href="https://github.com/rstudio/pagedown"><strong>pagedown</strong></a> had three releases with mainly bug fixes but also two new features: a new <code>loa-title</code> variable to set a title for the list of abbreviations, and a new <code>outline</code> argument in <code>pagedown::chrome_print()</code> to generate a bookmarks outline in the PDF. 
<small>See <a href="https://github.com/rstudio/pagedown/blob/master/NEWS.md">Changelog</a></small></p></li><li><p><a href="https://github.com/rstudio/blogdown"><strong>blogdown</strong></a> had an important release with version 0.21 in October. It was just step one in <a href="https://github.com/rstudio/blogdown/blob/master/NEWS.md">a larger package overhaul</a> that is planned for release in January 2021. Now three years old, the package is currently under heavy development to better support beginners, improve the quality of life for existing users, and offer Hugo support with a versioning system. Version 1.0 will deserve its own blog post - stay tuned! We will definitely appreciate it if you could test the development version with <code>remotes::install_github(&quot;rstudio/blogdown&quot;)</code>, and <a href="https://github.com/rstudio/blogdown/issues">let us know if you have any feedback</a>! <small>See <a href="https://github.com/rstudio/blogdown/blob/master/NEWS.md">Changelog</a></small></p></li></ul><p>We hope this round-up helps you start thinking about some new possibilities for your 2021 knitting projects. A big thank you to all the contributors who helped with these releases by discussing problems, proposing features, and contributing code. Happy holidays!</p></description></item><item><title>RStudio Connect 1.8.6</title><link>https://www.rstudio.com/blog/rstudio-connect-1-8-6/</link><pubDate>Wed, 16 Dec 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-8-6/</guid><description><p>RStudio Connect helps teams of all sizes operationalize their data science work, and provides a single point of access to data products for decision makers. 
In this release, we have emphasized features that will help address maturing DevOps requirements within organizations seeking to deploy and scale data science.</p><p>This release of RStudio Connect builds on the existing Server API, making experimental endpoints officially supported and introducing a brand new slate of API improvements based on feedback we’ve received from the community.</p><ul><li><strong>Automate Deployments</strong> Learn how to implement CI/CD pipelines or programmatic publishing with the RStudio Connect Server API … <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6-deployment-api/">read more</a></li><li><strong>Audit Server Content</strong> Explore new auditing and content management workflows with the RStudio Connect Server API … <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6-server-api/">read more</a></li></ul><p><img src="template-report-showcase.png" alt="RStudio Connect Server API - Report Showcase Examples"></p><h5 align="center">Visit the <a href="https://solutions.rstudio.com/examples/rsc-server-api-overview/">RStudio Connect Server API Showcase</a> for access to code examples and template reports that can be deployed straight to your own Connect server.</h5><ul><li><strong>Bokeh and Streamlit</strong> support has now moved from Beta to being Generally Available … <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6-python-update/">read more</a></li><li><strong>LDAP / Active Directory</strong> Groups will now be synchronized via a background process on a scheduled interval. This change enables support for <code>session$groups</code> in Shiny applications … <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6-admin-digest/">read more</a></li></ul><h3 align="center"><a href="https://rstudio.chilipiper.com/book/rsc-demo">See RStudio Connect in Action</a></h3><blockquote><p>Note: This release includes some deprecations and breaking changes. 
Please read <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6-admin-digest/">more here</a>, or review the <a href="https://docs.rstudio.com/connect/news/#rstudio-connect-186">release notes</a>.</p></blockquote><p>To receive email notifications for RStudio professional product releases, patches, security information, and general product support updates, subscribe to the <strong>Product Information</strong> list by visiting the RStudio <a href="https://rstudio.com/about/subscription-management/">subscription management portal</a>.</p></description></item><item><title>RStudio Connect 1.8.6 - Administrator Digest</title><link>https://www.rstudio.com/blog/rstudio-connect-1-8-6-admin-digest/</link><pubDate>Wed, 16 Dec 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-8-6-admin-digest/</guid><description><h2 id="security--authentication">Security &amp; Authentication</h2><h3 id="ldap--active-directory-changes">LDAP / Active Directory Changes</h3><p>Group handling within RStudio Connect has significantly improved for LDAP / Active Directory in this release. Groups will now be synchronized via a background process on a scheduled interval. The group membership for a user is determined on login rather than at the time of content access, and permission checks will use synced data from the RStudio Connect database rather than making LDAP requests.</p><p>LDAP Groups can be automatically populated upon user login if the <code>LDAP.GroupsAutoProvision</code> configuration option is enabled. This option is disabled by default to prevent an unexpectedly large number of groups from being pulled in. If the number of groups is not a concern, enabling this option is recommended for the optimal user experience.</p><p>As a result of these changes, RStudio Connect will support <code>session$groups</code> (via the HTTP header <code>Shiny-Server-Credentials</code>) in Shiny apps when using LDAP or Active Directory. 
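</p><p>On the application side, a minimal sketch of a Shiny app reading these groups might look like the following (the <code>groups_info</code> output name is just for illustration; <code>session$groups</code> is <code>NULL</code> when the app runs outside a server that provides it):</p><pre><code>library(shiny)

ui &lt;- fluidPage(textOutput(&quot;groups_info&quot;))

server &lt;- function(input, output, session) {
  output$groups_info &lt;- renderText({
    # On RStudio Connect with LDAP / Active Directory, session$groups
    # holds the viewer's group names; it is NULL when running locally
    paste(&quot;Your groups:&quot;, paste(session$groups, collapse = &quot;, &quot;))
  })
}

shinyApp(ui, server)</code></pre><p>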
Groups are listed by name according to the setting <code>LDAP.GroupNameAttribute</code>. LDAP groups are also available to other content types via the HTTP header <code>RStudio-Connect-Credentials</code>.</p><h3 id="groups-page-update">Groups Page Update</h3><p>The Groups page will now be available in the RStudio Connect dashboard under the &ldquo;People&rdquo; tab for all authentication types except those that return Unique IDs instead of group names. Using the Groups page, authorized users can add, remove, and rename groups when necessary. The Groups page can also be used to inspect groups for their user membership lists and perform group searches.</p><h2 id="deprecations--breaking-changes">Deprecations &amp; Breaking Changes</h2><ul><li><p><strong>Breaking Change</strong> The <code>Applications.TempMounting</code> configuration flag has been removed. Previously, disabling the flag would permit R processes to inspect the temporary data of other R processes.</p></li><li><p><strong>Breaking Change</strong> When using Postgres, RStudio Connect verifies that a minimum version of 9.5 is used.</p></li><li><p><strong>Breaking Change</strong> <code>GroupsByUniqueId</code> and <code>GroupsAutoProvision</code> cannot be enabled at the same time. IDs received from the authentication provider are not immediately useful for users when group auto provisioning is enabled. Please see this section of the Admin Guide for more information.</p></li><li><p><strong>Deprecation</strong> The <code>Server.SourcePackageDir</code> setting is deprecated and will be removed in a future release. Administrators should consider migrating to RStudio Package Manager or setting up a private package repository. 
Please review this section of the Admin Guide for <a href="https://docs.rstudio.com/connect/1.8.6/admin/r/package-management/#private-packages">instructions</a>.</p></li><li><p><strong>Deprecation</strong> The following Groups management settings have been deprecated and will be removed in a future release:</p><ul><li><code>LDAP.GroupsAutoRemoval</code></li><li><code>OAuth2.GroupsAutoRemoval</code></li><li><code>Proxy.GroupsAutoRemoval</code></li><li><code>SAML.GroupsAutoRemoval</code></li></ul></li></ul><p>Please review the <a href="https://docs.rstudio.com/connect/news/#rstudio-connect-186">full release notes</a>.</p><h2 id="upgrade-planning">Upgrade Planning</h2><h3 id="upgrade-notes-for-ldap--active-directory-authentication">Upgrade Notes for LDAP / Active Directory Authentication</h3><p>In RStudio Connect 1.8.6, LDAP user groups are determined on login, and group information is synced from the LDAP server to the Connect database in configured intervals.</p><p><strong>What to expect when upgrading to the new LDAP Sync process:</strong></p><ul><li>RStudio Connect enters &ldquo;upgrade mode&rdquo;</li><li>All LDAP users start without any group memberships</li><li>Users are divided into batches sized according to the total number of users</li><li>RStudio Connect will attempt to obtain group memberships for all batches within the configured update interval (default 4 hours), making the best effort to not disrupt users’ normal usage of the system</li><li>Once all users are synced, RStudio Connect enters regular operation where users are updated throughout a configured interval (default 4 hours)</li></ul><p>In some cases, administrators may need to increase the update interval to be longer than 4 hours so that updates can be more spread out throughout the day.</p><p><strong>Learn more about the changes and upgrades in the <a href="https://docs.rstudio.com/connect/admin/authentication/">updated Admin Guide</a>.</strong></p><h3 id="upgrade-rstudio-connect">Upgrade 
RStudio Connect</h3><p>To perform an upgrade, download and run the installation script. The script installs a new version of RStudio Connect on top of the earlier one. Existing configuration settings are respected. Additional documentation can be <a href="https://docs.rstudio.com/rsc/upgrade/">found here</a>.</p><pre><code># Download the installation script
curl -Lo rsc-installer.sh https://cdn.rstudio.com/connect/installer/installer-v1.5.1.sh
# Run the installation script
sudo bash ./rsc-installer.sh 1.8.6</code></pre><hr><p>To receive email notifications for RStudio professional product releases, patches, security information, and general product support updates, subscribe to the <strong>Product Information</strong> list by visiting the RStudio subscription management portal linked below.</p><h3 align="center"><a href="https://rstudio.com/about/subscription-management/">Subscribe to RStudio Product Information Updates</a></h3></description></item><item><title>RStudio Connect 1.8.6 - Deployment API</title><link>https://www.rstudio.com/blog/rstudio-connect-1-8-6-deployment-api/</link><pubDate>Wed, 16 Dec 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-8-6-deployment-api/</guid><description><h2 id="automate-deployment-with-the-rstudio-connect-server-api">Automate deployment with the RStudio Connect Server API</h2><p>With RStudio Connect, your team can publish straight from the desktop or server IDE with the push of a button or orchestrate a fully customized deployment pipeline. While push-button publishing is powerful and convenient, it&rsquo;s not the ideal solution for all organizations. Data science teams need tools that can help traditional IT administrators understand how to provide sophisticated oversight for the deployment and management of data science artifacts. 
RStudio Connect aims to be the solution for those challenges.</p><p>This API update makes programmatic deployment workflows more useful by introducing new content management options like the ability to set environment variables, custom URL paths, and detailed access permissions on first publish. We’ve made many of the internal RStudio Connect Server API capabilities public with this release, so there should be something new and exciting for all the analytic administrators, deployment engineers, and DevOps-minded folks to enjoy.</p><h3 id="public-apis-for-content-deployment">Public APIs for Content Deployment</h3><p>Programmatic deployment workflows are now fully supported with the release of <code>/v1</code> API endpoints (previously <code>/v1/experimental</code>). The pattern for basic deployment is unchanged and can be used for any type of content supported by RStudio Connect.</p><p>Content deployment can be customized, but follows a general framework:</p><ol><li>Create a new content item (<code>POST /v1/content</code>) or identify an existing content item to update.</li><li>Create a bundle capturing your code and its dependencies.</li><li>Upload the bundle archive to RStudio Connect (<code>POST /v1/content/{guid}/bundles</code>).</li><li><strong>New!</strong> Optionally, set environment variables that the content needs at runtime (<code>PATCH /v1/content/{guid}/environment</code>).</li><li>Deploy (activate) that bundle (<code>POST /v1/content/{guid}/deploy</code>) and monitor its progress.</li><li>Poll for updates to a task; obtain the latest information about a dispatched operation (<code>GET /v1/tasks/{id}</code>).</li><li><strong>New!</strong> Optionally, add viewer groups or collaborators (<code>POST /v1/content/{guid}/permissions</code>), set a custom vanity URL path (<code>PUT /v1/content/{guid}/vanity</code>), and add tags for organization and discoverability (<code>POST /v1/content/{guid}/tags</code>).</li></ol><p>To learn more, follow along with basic 
deployment scenarios and example code in the RStudio Connect API Cookbook:</p><ul><li><a href="https://docs.rstudio.com/connect/1.8.6/cookbook/deploying/">Deploying Content</a></li><li><a href="https://docs.rstudio.com/connect/1.8.6/cookbook/content/">Managing Content</a></li><li><a href="https://docs.rstudio.com/connect/1.8.6/cookbook/organizing/">Organizing Content</a></li><li><a href="https://docs.rstudio.com/connect/1.8.6/cookbook/sharing/">Sharing Content</a></li></ul><h3 id="new-environment-variable-management-api">New Environment Variable Management API</h3><p>When developing content for RStudio Connect, publishers should never place secrets (keys, tokens, passwords, etc.) in the code itself. Sensitive information should be protected through the use of environment variables. These variables have traditionally required configuration through the RStudio Connect dashboard, a method which can result in a failed initial deployment.</p><p>The new API can be used to configure environment variables for a specified content item programmatically:</p><ul><li>Set environment variables with <code>PUT /v1/content/{guid}/environment</code> (removes any existing environment variables)</li><li>Add, update, or delete environment variables with <code>PATCH /v1/content/{guid}/environment</code></li></ul><p>Read more about setting environment variables programmatically in the <a href="https://docs.rstudio.com/connect/1.8.6/cookbook/deploying/#setting-environment-variables">RStudio Connect API Cookbook</a>.</p><h3 id="redeployments">Redeployments</h3><p>Based on feedback we received from users of the experimental deployment APIs, improvements have been made to the workflow for updating existing content items. Redeployment requires that an API client provide the correct unique content identifier for the item you want to update. 
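</p><p>Put together, the lookup-and-redeploy sequence is short. The sketch below only composes the HTTP calls as <code>(method, path, payload)</code> tuples rather than sending them, so the order of operations is easy to inspect; the payload shapes and query parameters are illustrative assumptions, not a verified specification.</p>

```python
# Sketch of a redeployment sequence against the RStudio Connect Server API.
# Calls are composed as (method, path, payload) tuples rather than sent, so
# the order of operations is easy to inspect. Payload shapes and query
# parameters are illustrative assumptions, not a verified specification.

def redeploy_calls(name, owner_guid, guid, bundle_id):
    """Ordered API calls to push a new bundle to an existing content item."""
    return [
        # 1. Resolve the content item from its name and owner
        ("GET", f"/v1/content?name={name}&owner_guid={owner_guid}", None),
        # 2. Upload a new bundle archive (tar.gz of code plus manifest)
        ("POST", f"/v1/content/{guid}/bundles", "<bundle.tar.gz bytes>"),
        # 3. Activate (deploy) the uploaded bundle
        ("POST", f"/v1/content/{guid}/deploy", {"bundle_id": bundle_id}),
        # 4. Poll the dispatched task until it completes
        ("GET", "/v1/tasks/{task_id}", None),
    ]

calls = redeploy_calls("sales-report", "owner-guid", "content-guid", "42")
```

<p>An API client would send each call with its RStudio Connect API key and stop early if any response reports an error.</p><p>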
For convenience, the RStudio Connect API now provides a method for retrieving that content identifier using the combination of content name and owner.</p><p>Read more about deploying new versions of content in the <a href="https://docs.rstudio.com/connect/1.8.6/cookbook/deploying/#deploying-versions/">RStudio Connect API Cookbook</a>.</p><h3 id="advanced-deployments">Advanced Deployments</h3><p>The experimental bundle management APIs for moving content items from one Connect server to another are now fully supported <code>v1</code> workflows as well. In situations where your organization has more than one RStudio Connect server for different stages of development, this pattern can be used to automate the promotion of content (e.g., from staging to production). Once content exists on the production server, you may want to reduce the risk of pushing updates to it by adopting a blue-green deployment strategy. Blue-green is a system for creating separation between deployment and release by maintaining two copies of a content item in production (a blue and a green). The new <code>/vanity</code> endpoint can be used to assign a custom URL path to one version while making changes to the other, swapping the URL assignment whenever you want to redirect user traffic.</p><p>Read more about advanced deployment patterns in the <a href="https://docs.rstudio.com/connect/1.8.6/cookbook/promoting/">RStudio Connect API Cookbook</a>.</p><h3 id="a-note-about-experimental-endpoints">A note about <code>/experimental</code> endpoints</h3><p>Those who are familiar with the existing content deployment API patterns may have questions about what these new API changes mean. 
Questions like,</p><p><em>&ldquo;I already built deployment pipelines using the experimental APIs &ndash; will everything break?&rdquo;</em></p><p><strong>Your scripts will not break upon upgrading to 1.8.6.</strong> The <code>/v1/experimental</code> endpoints for content deployment are now labeled as <strong>&ldquo;Deprecated&rdquo;</strong>, but they have not been removed. In most cases the update from experimental to <code>/v1</code> should not require extensive changes. Please refer to the API documentation site to learn more about our <a href="https://docs.rstudio.com/connect/1.8.6/api/#overview--versioning-of-the-api">API versioning and deprecation policies</a>.</p><h3 align="center"><a href="https://rstudio.chilipiper.com/book/rsc-demo">See RStudio Connect in Action</a></h3><blockquote><h4 id="rstudio-connect-186">RStudio Connect 1.8.6</h4><ul><li>Return to the general announcement post to learn about more features and <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6/">updates here</a>.</li><li>For upgrade planning notes, continue reading <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6-admin-digest/">more here</a>.</li></ul></blockquote></description></item><item><title>RStudio Connect 1.8.6 - Python Update</title><link>https://www.rstudio.com/blog/rstudio-connect-1-8-6-python-update/</link><pubDate>Wed, 16 Dec 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-8-6-python-update/</guid><description><h2 id="bokeh-and-streamlit-support-is-now-generally-available">Bokeh and Streamlit support is now generally available</h2><p>We are happy to announce that the newest interactive Python content types, Bokeh and Streamlit, are now generally available in RStudio Connect.</p><p><strong>Thank You!</strong> to everyone who reached out to provide feedback on Bokeh and Streamlit during the Beta period.</p><p>The RStudio Connect User Guide contains information about our support for Bokeh and Streamlit, 
including detailed deployment instructions, example applications, and known limitations/compatibility requirements for each framework:</p><ul><li><a href="https://docs.rstudio.com/connect/1.8.6/user/bokeh/">RStudio Connect User Guide for Bokeh</a></li><li><a href="https://docs.rstudio.com/connect/1.8.6/user/streamlit/">RStudio Connect User Guide for Streamlit</a></li></ul><h3 align="center"><a href="https://rstudio.chilipiper.com/book/rsc-demo">Request a demo of Python in RStudio Connect</a></h3><p>For a hands-on approach to learning about Python content in RStudio Connect, try exploring the Jump Start examples, which now contain tutorials for Bokeh and Streamlit application publishing under the Python tab.</p><p><img src="jumpstart-186.png" alt="Screenshots of the RStudio Connect Jump Start publishing tutorial"></p><h3 align="center"><a href="https://rstudio.com/solutions/r-and-python/">Learn how data science teams use RStudio for R and Python</a></h3><blockquote><h4 id="rstudio-connect-186">RStudio Connect 1.8.6</h4><ul><li>Return to the general announcement post to learn about more features and <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6/">updates here</a>.</li><li>For upgrade planning notes, continue reading <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6-admin-digest/">more here</a>.</li></ul></blockquote></description></item><item><title>RStudio Connect 1.8.6 - Server API</title><link>https://www.rstudio.com/blog/rstudio-connect-1-8-6-server-api/</link><pubDate>Wed, 16 Dec 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-8-6-server-api/</guid><description><h2 id="audit-server-content-with-the-rstudio-connect-api">Audit Server Content with the RStudio Connect API</h2><p>In 1.8.6, RStudio Connect administrators have the ability to create reports that track and manage the content on their servers. 
If you’ve ever wanted the answers to questions like:</p><ul><li>How do I produce a list of all the content we have published to RStudio Connect?</li><li>Which applications can be accessed by which people and which groups?</li><li>Which versions of R and Python are actually being used, and by which publishers?</li><li>How much unpublished content is on the server, and can it be removed?</li><li>Which vanity URLs do we have in use across the server, and how do I list them out?</li></ul><p>You’re not alone. We’ve had a flood of requests to make the APIs for accessing this information publicly available. Using an API key and the content enumeration endpoint, RStudio Connect administrators can now build custom reports to answer all these questions and more.</p><p><img src="reports-186.gif" alt="RStudio Connect Server API Showcase"></p><p><strong>Visit the RStudio Connect <a href="https://solutions.rstudio.com/examples/rsc-server-api-overview/">Server API Showcase</a> for access to code examples and template reports that can be deployed straight to Connect.</strong></p><h3 id="content-enumeration-api">Content Enumeration API</h3><p>Retrieve detailed information about the content that is available on your Connect instance using the <code>GET /v1/content/</code> endpoint. 
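</p><p>Once that JSON listing is in hand, an audit report reduces to simple aggregation. A minimal sketch, assuming each item carries <code>guid</code>, <code>r_version</code>, and <code>access_type</code> fields (the sample data is invented, not real server output):</p>

```python
from collections import Counter

# Summarize a content listing (shaped like the response of GET /v1/content)
# into the counts an audit report needs. The field names are assumptions
# about the response shape, and the sample items are invented.
def audit_summary(items):
    return {
        "total": len(items),
        "by_r_version": Counter(i.get("r_version") or "none" for i in items),
        "by_access_type": Counter(i["access_type"] for i in items),
    }

sample = [
    {"guid": "a1", "r_version": "4.0.3", "access_type": "acl"},
    {"guid": "b2", "r_version": "3.6.3", "access_type": "all"},
    {"guid": "c3", "r_version": None, "access_type": "acl"},  # e.g. a Python app
]
summary = audit_summary(sample)
```

<p>The same pattern extends to any of the questions above: enumerate the content once, then filter and count in the report.</p><p>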
Administrators can retrieve all content items regardless of visibility and permissions.</p><p><img src="content-report.png" alt="Example Content Audit Report"></p><ul><li><strong>Get the code for a <a href="https://solutions.rstudio.com/examples/rsc-apis/basic-audit-report">Basic Server Content Audit Report</a></strong></li></ul><p>Read more about content enumeration in the <a href="https://docs.rstudio.com/connect/1.8.6/cookbook/content/#content-listing">RStudio Connect API Cookbook</a>.</p><h3 id="content-permissions-api">Content Permissions API</h3><p>This set of endpoints will let you manage the user permissions associated with a content item:</p><ul><li>List the permissions for a specified content item with <code>GET /v1/content/{guid}/permissions</code></li><li>Grant access to a user or group for a content item with <code>POST /v1/content/{guid}/permissions</code></li></ul><p>These permissions are used when the content item&rsquo;s <code>access_type</code> is <code>acl</code> (Access Control List).</p><p><img src="acl-report.png" alt="Example Content Access Audit Report"></p><ul><li><strong>Get the code for a <a href="https://solutions.rstudio.com/examples/rsc-apis/acl-audit-report">Content Access Settings Audit Report</a></strong></li></ul><p>Read more about managing content access in the <a href="https://docs.rstudio.com/connect/1.8.6/cookbook/sharing">RStudio Connect API Cookbook</a>.</p><h3 id="vanity-auditing-and-management-apis">Vanity Auditing and Management APIs</h3><p>All RStudio Connect content receives a URL that includes its numerical ID at the time of deployment. Administrators and publishers (if allowed) can create “vanity paths” for content which make the content available at an additional, customized URL.</p><p><code>GET /v1/vanities</code> can be used to list all defined vanity URLs on a server. 
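</p><p>With that list in hand, a vanity URL audit is a small indexing exercise. A minimal sketch, assuming each record exposes <code>path</code> and <code>content_guid</code> fields (the field names and sample records are illustrative guesses at the response shape):</p>

```python
# Index vanity URL records (shaped like a GET /v1/vanities response) by
# normalized path. Field names ("path", "content_guid") and the sample
# records are illustrative assumptions about the response shape.
def vanity_index(vanities):
    return {v["path"].rstrip("/"): v["content_guid"] for v in vanities}

sample = [
    {"path": "/sales-dashboard/", "content_guid": "a1"},
    {"path": "/quarterly-report/", "content_guid": "b2"},
]
index = vanity_index(sample)
```

<p>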
You must have an API key with administrator privileges to call this endpoint.</p><p><img src="vanity-report.png" alt="Example Vanity URL Audit Report"></p><ul><li><strong>Get the code for a <a href="https://solutions.rstudio.com/examples/rsc-apis/vanity-audit-report">Vanity URL Audit Report</a></strong></li></ul><p>In addition to auditing vanity URLs, there are also API methods for setting and deleting vanities on individual content items:</p><ul><li>Use <code>GET /v1/content/{guid}/vanity</code> to get the vanity URL, if any, for a single content item.</li><li>Use <code>PUT /v1/content/{guid}/vanity</code> to set the vanity URL for a content item.</li><li>Use <code>DELETE /v1/content/{guid}/vanity</code> to remove the vanity URL for a content item and revert to using its numerical ID for URL construction.</li></ul><p>If <code>Authorization.PublishersCanManageVanities</code> is enabled, publishers can set and delete the vanity URL for content items that they have permission to change. Otherwise, administrator permissions are required.</p><p>Read more about content organization and vanity URL management in the <a href="https://docs.rstudio.com/connect/1.8.6/cookbook/organizing/#vanities">RStudio Connect API Cookbook</a>.</p><h3 id="tag-auditing-and-management-apis">Tag Auditing and Management APIs</h3><p>Tags are the primary content organization concept available in RStudio Connect. Tags and content items have a many-to-many relationship: any content item may be associated with multiple tags, and any tag may be associated with multiple content items. 
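</p><p>Because the relationship is many-to-many, a tag audit is mostly an inversion problem: collect the content list for each tag, then flip it to see every tag on a given item. A minimal sketch over invented sample data:</p>

```python
# Invert a tag -> content mapping (as collected from GET /v1/tags plus
# GET /v1/tags/{id}/content) into a content -> tags index. The sample
# data is invented to illustrate the many-to-many relationship.
def content_index(tag_to_content):
    index = {}
    for tag, guids in tag_to_content.items():
        for guid in guids:
            index.setdefault(guid, set()).add(tag)
    return index

tag_to_content = {
    "finance": ["a1", "b2"],
    "quarterly": ["b2", "c3"],
}
by_content = content_index(tag_to_content)
```

<p>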
This gives you the flexibility to group and organize your content in whatever way best suits your organization.</p><p>The new tag API endpoints introduce management tools for tags as well as their associations with content items.</p><h4 id="tag-and-tagged-content-auditing">Tag and Tagged Content Auditing:</h4><ul><li>List all tags with <code>GET /v1/tags</code></li><li>List all of the content for a specified tag with <code>GET /v1/tags/{id}/content</code></li><li>List all the tags for a specified content item with <code>GET /v1/content/{guid}/tags</code></li></ul><p><img src="tag-report.png" alt="Example Tag Usage Audit Report"></p><ul><li><strong>Get the code for a <a href="https://solutions.rstudio.com/examples/rsc-apis/tag-audit-report">Tag Usage Audit Report</a></strong></li></ul><p>Read more about content organization and tag management in the <a href="https://docs.rstudio.com/connect/1.8.6/cookbook/organizing/#tags">RStudio Connect API Cookbook</a>.</p><blockquote><h4 id="rstudio-connect-186">RStudio Connect 1.8.6</h4><ul><li>Return to the general announcement post to learn about more features and <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6/">updates here</a>.</li><li>For upgrade planning notes, continue reading <a href="https://blog.rstudio.com/2020/12/16/rstudio-connect-1-8-6-admin-digest/">more here</a>.</li></ul></blockquote></description></item><item><title>Announcing the 2020 R Community Survey</title><link>https://www.rstudio.com/blog/2020-survey-announcement/</link><pubDate>Fri, 11 Dec 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2020-survey-announcement/</guid><description><p>Today, RStudio is launching our third annual R Community Survey (formerly known as the Learning R Survey) to better understand how and why people learn and use the R language and associated tools. We encourage anyone who is interested in R to respond. 
The survey should only require 5 to 10 minutes to complete, depending on how little or how much information you choose to share with us. You can find the survey here:</p><ul><li>English version: <a href="https://rstd.io/r-survey-en" target="_blank" rel="noopener noreferrer">https://rstd.io/r-survey-en</a></li><li>Spanish version: <a href="https://rstd.io/r-survey-es" target="_blank" rel="noopener noreferrer">https://rstd.io/r-survey-es</a></li></ul><p>If you don&rsquo;t know R yet or use Python more than R, that&rsquo;s fine too! The survey has specific questions for you, and your responses will help us better understand how we can be more encouraging to you and others like you.</p><p>Data and analysis from the 2018 and 2019 community surveys can be found on GitHub at <a href="https://github.com/rstudio/r-community-survey" target="_blank" rel="noopener noreferrer">https://github.com/rstudio/r-community-survey</a> in the 2018/ and 2019/ folders. Results from the 2020 survey will also be posted as free and open source data in that GitHub repo in February 2021.</p><p>Please ask your students, Twitter followers, Ultimate Frisbee team, and anyone else you think may be interested to complete the survey. Your efforts will help RStudio, educators, and users understand and grow our data science community.</p><p>You will find a full disclosure of what information will be collected and how it will be used on the first page of the survey. The survey does not collect personally identifiable information or email addresses, but it does have optional demographic questions.</p><p>Thank you in advance for your consideration and time. 
We look forward to sharing the results with you next year!</p></description></item><item><title>(Re-)introducing Distill for R Markdown</title><link>https://www.rstudio.com/blog/distill/</link><pubDate>Mon, 07 Dec 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/distill/</guid><description><p>We are proud to announce that version 1.0 of the <a href="https://pkgs.rstudio.com/distill/">distill package</a> is now on <a href="https://cran.r-project.org/web/packages/distill/index.html">CRAN</a>. While Distill has been around for a while (and you may remember it by its original name, <a href="https://www.rstudio.com/2018/09/19/radix-for-r-markdown/">Radix</a>), this latest version includes so many new features that we wanted to take a moment to re-introduce you to Distill.</p><p>If you just want to jump in and get started using Distill, you can install the latest version from CRAN:</p><pre class="r"><code>install.packages(&quot;distill&quot;)</code></pre><p>The package website (also built with Distill) is the best place to start: <a href="https://rstudio.github.io/distill/" class="uri">https://rstudio.github.io/distill/</a></p><div class="figure" style="text-align: center"><span id="fig:unnamed-chunk-2"></span><img src="site.png" alt="The Distill website" width="50%" /><p class="caption">Figure 1: The Distill website</p></div><p>If you are not ready to commit <em>just</em> yet, read on to <a href="#what-is-distill">find out more about Distill</a> and <a href="#distill-v1.0">discover new features</a> from the latest release.</p><div id="what-is-distill" class="level2"><h2>What is Distill?</h2><p>Distill is a package built for <a href="https://rmarkdown.rstudio.com/">R Markdown</a>, an ecosystem of packages for creating computational documents in R. The goal of the Distill package is to provide an output format optimized for online scientific and technical communication. 
The Distill R package provides two HTML <a href="https://pkgs.rstudio.com/distill/reference/index.html#section-output-formats">output formats</a> for R Markdown documents:</p><ul><li><p>Single HTML articles (<a href="https://pkgs.rstudio.com/distill/reference/distill_article.html"><code>distill_article</code></a>), and</p></li><li><p>Multi-page HTML websites, including blogs (<a href="https://pkgs.rstudio.com/distill/reference/distill_website.html"><code>distill_website</code></a>).</p></li></ul><p>These formats are based on the Distill web framework used by the Distill Machine Learning Journal <span class="citation">(<a href="#ref-olah2018the" role="doc-biblioref">Olah et al. 2018</a>)</span>. The Distill web framework was originally developed to catalyze more engaging and effective scientific, technical communication. The idea was to create a platform to help scientists harness the benefits of modern HTML-based communication, which digital designers and journalists have been using to create interactive and engaging articles that meet readers where they are: online.</p><div class="figure" style="text-align: center"><span id="fig:unnamed-chunk-3"></span><img src="distill-pub.png" alt="A Distill publication" width="75%" /><p class="caption">Figure 2: A Distill publication</p></div><p>If you are keen to learn more about how visual aesthetics and interactivity can improve readers’ engagement and learning, we recommend reading <span class="citation"><a href="#ref-hohman2020communicating" role="doc-biblioref">Hohman et al.</a> (<a href="#ref-hohman2020communicating" role="doc-biblioref">2020</a>)</span>.</p><div id="distill-article" class="level3"><h3>Distill for single HTML articles</h3><p>If you have ever knit an R Markdown file to <code>html_document()</code>, then you can think of <code>distill_article()</code> as its scientific alter-ego.</p><div class="figure" style="text-align: center"><span id="fig:unnamed-chunk-4"></span><img 
src="https://media.giphy.com/media/w73hhH9c8vmbS/giphy.gif" alt="PBS Full-Time Kid Egg in a Bottle Experiment" /><p class="caption">Figure 3: PBS Full-Time Kid Egg in a Bottle Experiment</p></div><p>Distill articles offer users an R Markdown format with built-in bells and whistles that make scientific communication easier, including:</p><ul><li><p>Ability to incorporate JavaScript and D3-based <a href="https://rstudio.github.io/distill/interactivity.html">interactive visualizations</a> made with R.</p></li><li><p>Support for <a href="https://rstudio.github.io/distill/#citations">citations</a>, <a href="https://rstudio.github.io/distill/#appendices">appendices</a>, <a href="https://rstudio.github.io/distill/#footnotes-and-asides">hover-able footnotes, and asides.</a></p></li><li><p>Tools for making articles <a href="https://rstudio.github.io/distill/citations.html">easily citeable</a>, as well as for generating <a href="https://rstudio.github.io/distill/citations.html#google-scholar">Google Scholar</a> compatible citation metadata.</p></li><li><p>Auto-numbering of <a href="https://bookdown.org/yihui/rmarkdown-cookbook/figure-number.html">figures</a> and <a href="https://bookdown.org/yihui/bookdown/tables.html">tables</a>, and <a href="https://bookdown.org/yihui/rmarkdown-cookbook/cross-ref.html">cross-referencing</a> of figures and tables.</p></li><li><p>Built-in support for <a href="https://rstudio.github.io/distill/basics.html#creating-an-article">multiple authors</a> with affiliations and <a href="https://orcid.org/">ORCID iD</a>.</p></li><li><p>Adding <a href="https://rstudio.github.io/distill/metadata.html#creative-commons">creative commons licensed content</a> with specific reuse instructions.</p></li></ul><p>Each of these features are designed to help scientists use the web and R to communicate about their work more effectively. 
But you can also use Distill to publish any HTML content, like <a href="https://stats.andrewheiss.com/bread/bagels.html">instructions for making bagels</a> &ndash; Distill works great for that too.<a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a><br /></p></div><div id="distill-site" class="level3"><h3>Distill for websites and blogs</h3><p>While a single article may often be all that you need, many data science projects involve a <em>collection</em> of multiple R Markdown documents. When you have more than one R Markdown file knitted to HTML in your collection, it’s a good time to think about knitting them together into a single, navigable website. It is much easier for folks to engage with your work when you can share a direct link that they can visit and explore themselves.</p><div class="figure" style="text-align: center"><span id="fig:unnamed-chunk-5"></span><img src="https://media.giphy.com/media/3ohze1V5m8q6Fs2d7a/giphy.gif" alt="PBS Full Time Kid DIY Bird Feeder" width="50%" /><p class="caption">Figure 4: PBS Full Time Kid DIY Bird Feeder</p></div><p>Distill can knit together a collection of distill articles into a cohesive and navigable website. Distill sites feature an <a href="https://rstudio.github.io/distill/website.html#site-navigation">upper navigation bar</a> with links (which may also include dropdown menus). In addition, Distill blogs offer page layout options like <a href="https://rstudio.github.io/distill/blog.html#listing-pages">listing pages</a> and the ability to <a href="https://rstudio.github.io/distill/blog.html#custom-listings">customize them</a>. A listing page doesn’t have to be populated manually; instead, it creates a clickable, sequential list of all your posts (usually with nice thumbnail images and some post metadata like author, date, etc.). 
Blog posts also get <a href="https://rstudio.github.io/distill/blog.html#website-or-blog">special treatment</a> by Distill — they are never automatically re-rendered when your site is re-built.</p><p>Here is a Distill blog in action, from the <a href="https://blogs.rstudio.com/ai/">RStudio AI blog</a>:</p><div class="figure" style="text-align: center"><span id="fig:unnamed-chunk-6"></span><img src="https://pkgs.rstudio.com/distill/articles/images/rstudio-ai.png" alt="The RStudio AI blog, built with distill" width="75%" /><p class="caption">Figure 5: The RStudio AI blog, built with distill</p></div><p>Distill adds several built-in features to make websites better-suited for scientific communication:</p><ul><li><p><a href="https://rstudio.github.io/distill/website.html#site-search">Site-wide search.</a></p></li><li><p><a href="https://rstudio.github.io/distill/metadata.html#preview-images">Per-article sharing preview images</a> (for OpenGraph, Twitter, Slack, etc.).</p></li><li><p><a href="https://rstudio.github.io/distill/blog_workflow.html#importing-posts">Import posts from other external sources.</a></p></li><li><p><a href="https://rstudio.github.io/distill/blog_workflow.html#canonical-urls">Canonical urls</a> - handy for when your own post is re-published elsewhere.</p></li><li><p>Customizable <a href="https://rstudio.github.io/distill/blog.html#rss-feed">RSS feeds</a> (either with summaries or full post content).</p></li><li><p><a href="https://rstudio.github.io/distill/website.html#site-navigation">Site logo</a> in the upper navbar.</p></li><li><p>Integrated <a href="https://rstudio.github.io/distill/website.html#google-analytics">Google Analytics</a> support.</p></li><li><p><a href="https://rstudio.github.io/distill/website.html#site-metadata">Favicon support</a>.</p></li></ul><p>These are all on top of the features we listed for individual <a href="#distill-article">Distill articles</a>. 
You can also use RStudio’s <a href="https://www.rstudio.com/2020/09/30/rstudio-v1-4-preview-visual-markdown-editing/">visual markdown editor</a><a href="#fn2" class="footnote-ref" id="fnref2"><sup>2</sup></a> to compose Distill articles, which provides a WYSIWYG (What You See Is What You Get) writing experience as well as the ability to <a href="https://www.rstudio.com/2020/11/09/rstudio-1-4-preview-citations/">insert citations</a> from a document bibliography, reference management software, and even open-source bibliographic databases like <a href="https://pubmed.ncbi.nlm.nih.gov/">PubMed</a>.</p><p>Importantly, Distill websites are built without any kind of static site generator (like <a href="https://jekyllrb.com/">Jekyll</a> or <a href="https://gohugo.io/">Hugo</a>), which means that Distill websites offer users the bliss of building a website without any additional software dependencies (this means, all you need is R and the distill package to make it work).</p></div></div><div id="distill-v1.0" class="level2"><h2>Distill version 1.0</h2><p>So, what is new with Distill? In the rest of this post, we’ll share some highlights from the latest release, but you might want to look at the <a href="https://pkgs.rstudio.com/distill/news/index.html#distill-v10-cran">release notes</a> for the full details.</p><div id="theming" class="level3"><h3>Theming</h3><p>In this latest release, we have introduced the ability to theme your Distill article, website, or blog without needing to write your own <a href="https://en.wikipedia.org/wiki/CSS">CSS</a> from scratch. Distill output formats have a modern and streamlined theme “out of the box,” but we also know that it is important for users to be able to create a non-cookie-cutter site with less friction (i.e., less CSS).</p><div class="figure" style="text-align: center"><span id="fig:unnamed-chunk-7"></span><img src="https://media.giphy.com/media/13FrpeVH09Zrb2/giphy.gif" alt="A totally relatable CSS editing experience." 
width="30%" /><p class="caption">Figure 6: A totally relatable CSS editing experience.</p></div><p>To sidestep CSS wrangling, you can now <a href="https://rstudio.github.io/distill/website.html?panelset=themed-site&amp;panelset1=theme.css#create-theme">create</a> and <a href="https://rstudio.github.io/distill/website.html?panelset=themed-site&amp;panelset1=theme.css#apply-theme">apply</a> a Distill theme, which allows you to customize common elements without needing to create a CSS file from scratch. To get started, you can use the new <a href="https://distill.netlify.app/reference/create_theme.html"><code>create_theme()</code></a> function:</p><pre class="r"><code>create_theme(name = &quot;bespoke-theme&quot;) </code></pre><p>Follow the docs on <a href="https://rstudio.github.io/distill/website.html#theming">theming</a> to learn more. There, we provide a demo with code for going from the default theme (left) to a bespoke theme (right) by changing only a few fields in <code>bespoke-theme.css</code>:</p><p><img src="theme-before.png" width="49%" /><img src="theme-after.png" width="49%" /></p><p>But themes aren’t just for full websites! You can also theme a <a href="https://rstudio.github.io/distill/basics.html#theming">single Distill article</a>, and you can change the theme for an <a href="https://rstudio.github.io/distill/website.html#apply-theme">individual article</a> within a Distill website. As before, you can always use <a href="https://rstudio.github.io/distill/website.html#custom-style">custom CSS styles</a> to go fully custom; the theme allows you to bypass the detective work typically involved in discovering which CSS selectors are needed to change the key elements most users wish to control. 
Finally, we provided some <a href="https://rstudio.github.io/distill/website.html#example-themes">example Distill themes</a> for inspiration.</p></div><div id="other-highlights" class="level3"><h3>Other highlights</h3><ul><li><p>Added built-in search functionality, based on <a href="https://fusejs.io/">Fuse.js</a>. Search will be enabled by default for new distill blogs, and can be <a href="https://rstudio.github.io/distill/website.html#site-search">enabled</a> on websites as well.</p></li><li><p>Headings provide anchor links upon hover, making it easier to find and share the URL for a specific section of a webpage or article.</p></li><li><p>Improved default syntax highlighting theme, optimized for accessibility based on the <a href="https://github.com/ericwbailey/a11y-syntax-highlighting">a11y syntax highlighting themes</a> by Eric Bailey. All colors in the theme meet the minimum <a href="https://www.w3.org/TR/UNDERSTANDING-WCAG20/visual-audio-contrast-contrast.html">WCAG 2.0 guidelines for contrast accessibility</a> of &gt; 4.5 (AA).</p></li><li><p>Improved handling and display of article categories for blogs, allowing readers to more easily see the categories for each individual post on listing pages.</p></li><li><p>Support for ORCID integration in article metadata:</p><pre class="yaml"><code>---
authors:
  - name: Dianne Cook
    affiliation: Monash University
    orcid_id: 0000-0002-3813-7155
---</code></pre></li><li><p><a href="https://downlit.r-lib.org/">Downlit</a> integration for automatic linking of R code in code chunks to function reference documentation.</p><div class="figure"><video class="w-100" src="downlit.mp4" controls=""><a href="downlit.mp4"></a></video><p class="caption">Downlit links to any package’s reference pages directly.</p></div></li></ul></div></div><div id="distill-reference-site" class="level2"><h2>Distill reference site</h2><p>In addition to the <a href="https://rstudio.github.io/distill/">documentation site</a>, Distill also gained a <a 
href="https://pkgs.rstudio.com/distill">reference site</a>, built with <a href="https://pkgdown.r-lib.org/">pkgdown</a>. There, you’ll find a <a href="https://pkgs.rstudio.com/distill/reference/index.html">reference section</a>, <a href="https://pkgs.rstudio.com/distill/articles/examples.html">example gallery</a>, and the <a href="https://pkgs.rstudio.com/distill/news/index.html">latest news</a>. Our sincere thanks to the members of the #rstats community who agreed to have their sites featured in the <a href="https://pkgs.rstudio.com/distill/articles/examples.html">example gallery</a>:</p><div class="figure" style="text-align: center"><span id="fig:unnamed-chunk-11"></span><img src="examples.png" alt="The Distill example gallery" width="75%" /><p class="caption">Figure 7: The Distill example gallery</p></div></div><div id="a-hex-sticker" class="level2"><h2>A hex sticker</h2><p>Last but definitely not least, distill also gained a hex sticker — thanks to our artist <a href="https://www.jungjulie.com/">Julie Jung</a>!</p><div class="figure" style="text-align: center"><span id="fig:unnamed-chunk-12"></span><img src="distill.png" alt="distill hex by Julie Jung" width="25%" /><p class="caption">Figure 8: distill hex by Julie Jung</p></div><p>Phew! 
If you’ve made it this far, thanks for reading and we hope you give Distill version 1.0 a whirl!</p><div class="figure" style="text-align: center"><span id="fig:unnamed-chunk-13"></span><img src="https://media.giphy.com/media/ToMjGpns6x9Fv3b5C00/giphy.gif" alt="PBS Full Time Kid" width="50%" /><p class="caption">Figure 9: PBS Full Time Kid</p></div></div><div id="acknowledgements" class="level2"><h2>Acknowledgements</h2><p>A big thanks to the 31 contributors who helped with this release by discussing problems, proposing features, and contributing code:</p><p><a href="https://github.com/ADernild">@ADernild</a>, <a href="https://github.com/clauswilke">@clauswilke</a>, <a href="https://github.com/CRLNP">@CRLNP</a>, <a href="https://github.com/henry090">@henry090</a>, <a href="https://github.com/jarrodscott">@jarrodscott</a>, <a href="https://github.com/javierluraschi">@javierluraschi</a>, <a href="https://github.com/jenrichmond">@jenrichmond</a>, <a href="https://github.com/jmbuhr">@jmbuhr</a>, <a href="https://github.com/jthomasmock">@jthomasmock</a>, <a href="https://github.com/kevinushey">@kevinushey</a>, <a href="https://github.com/m-clark">@m-clark</a>, <a href="https://github.com/maelle">@maelle</a>, <a href="https://github.com/mfdf">@mfdf</a>, <a href="https://github.com/michiexile">@michiexile</a>, <a href="https://github.com/mihagazvoda">@mihagazvoda</a>, <a href="https://github.com/MilesMcBain">@MilesMcBain</a>, <a href="https://github.com/mondpanther">@mondpanther</a>, <a href="https://github.com/mrcaseb">@mrcaseb</a>, <a href="https://github.com/mrworthington">@mrworthington</a>, <a href="https://github.com/mvuorre">@mvuorre</a>, <a href="https://github.com/nunompmoniz">@nunompmoniz</a>, <a href="https://github.com/pneuvial">@pneuvial</a>, <a href="https://github.com/relund">@relund</a>, <a href="https://github.com/RodrigoCerqueira">@RodrigoCerqueira</a>, <a href="https://github.com/RRemelgado">@RRemelgado</a>, <a 
href="https://github.com/slopp">@slopp</a>, <a href="https://github.com/stared">@stared</a>, <a href="https://github.com/taraskaduk">@taraskaduk</a>, <a href="https://github.com/tonytrevisan">@tonytrevisan</a>, <a href="https://github.com/umarcor">@umarcor</a>, and <a href="https://github.com/wordsmith189">@wordsmith189</a>.</p></div><div id="references" class="level2 unnumbered"><h2>References</h2><div id="refs" class="references csl-bib-body hanging-indent"><div id="ref-hohman2020communicating" class="csl-entry">Hohman, Fred, Matthew Conlen, Jeffrey Heer, and Duen Horng (Polo) Chau. 2020. <span>“Communicating with Interactive Articles.”</span> <em>Distill</em>. <a href="https://doi.org/10.23915/distill.00028">https://doi.org/10.23915/distill.00028</a>.</div><div id="ref-olah2018the" class="csl-entry">Olah, Chris, Arvind Satyanarayan, Ian Johnson, Shan Carter, Ludwig Schubert, Katherine Ye, and Alexander Mordvintsev. 2018. <span>“The Building Blocks of Interpretability.”</span> <em>Distill</em>. <a href="https://doi.org/10.23915/distill.00010">https://doi.org/10.23915/distill.00010</a>.</div></div></div><div class="footnotes"><hr /><ol><li id="fn1"><p>Thank you, <a href="https://www.andrewheiss.com/">Andrew Heiss</a>!<a href="#fnref1" class="footnote-back">↩︎</a></p></li><li id="fn2"><p>At the time of publishing, the visual markdown editor is included in the RStudio IDE Preview v1.4. 
Read the documentation for the most up-to-date status of the editor: <a href="https://rstudio.github.io/visual-markdown-editing/#/" class="uri">https://rstudio.github.io/visual-markdown-editing/#/</a><a href="#fnref2" class="footnote-back">↩︎</a></p></li></ol></div></description></item><item><title>RStudio Package Manager 1.2.0 - Bioconductor & PyPI </title><link>https://www.rstudio.com/blog/package-manager-1-2-0/</link><pubDate>Mon, 07 Dec 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/package-manager-1-2-0/</guid><description><p>Packages are the heart of open source data science, but we know they aren&rsquo;t always easy. Data scientists need access to ever-evolving tools to do their best work, and IT needs to understand the risk of new software while providing a stable environment for reproducible work. <a href="https://rstudio.com/products/package-manager">RStudio Package Manager</a> helps teams work together to accomplish these goals. Today we are excited to announce a greatly expanded focus, enabling teams to realize these benefits across languages and ecosystems by adding support for Bioconductor, beta support for Python packages from PyPI, and new options for managing historical CRAN snapshots.</p><p>Ready to start? Visit the <a href="https://packagemanager.rstudio.com">RStudio Public Package Manager</a>, a free and hosted service, or <a href="https://rstudio.com/products/package-manager">evaluate Package Manager</a> for use within your organization.</p><h2 id="bioconductor">Bioconductor</h2><p>1.2.0 adds <a href="https://docs.rstudio.com/rspm/1.2.0/admin/getting-started/configuration/#quickstart-bioconductor">first class support for Bioconductor</a>, an ecosystem of R packages used in the life sciences. This release makes it possible for data scientists in regulated or restricted environments to install and manage Bioconductor, even in environments without direct internet access. 
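</p><p>As a quick sketch (the repository URL below is illustrative; substitute your own Package Manager server&rsquo;s repository URL), pointing an R session at a repository that serves Bioconductor might look like this:</p><pre class="r"><code># Illustrative repository URL -- use your Package Manager server's URL
options(repos = c(CRAN = &quot;https://packagemanager.example.com/bioconductor/latest&quot;))
install.packages(&quot;BiocManager&quot;)
BiocManager::install(&quot;limma&quot;)  # resolved through Package Manager</code></pre><p>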
Teams can access Bioconductor packages using the <code>BiocManager</code> client, or they can use <code>install.packages</code>. Bioconductor packages can be combined with local packages and CRAN packages seamlessly. Package Manager also makes Bioconductor analyses more reproducible by helping combine a Bioconductor release with a corresponding CRAN snapshot. You no longer have to manually manage incompatibilities between older Bioconductor releases and rolling CRAN updates.</p><p><img src="images/bioc-in-rspm.png" alt="Bioconductor in Package Manager"></p><h2 id="beta-support-for-pypi">Beta Support for PyPI</h2><p>This release adds <a href="https://docs.rstudio.com/rspm/1.2.0/admin/getting-started/configuration/#quickstart-pypi-packages">beta support for mirroring Python packages</a> from PyPI. By adding PyPI support, we intend to make it much easier for multilingual teams to work together, and to mitigate the challenges IT organizations face in maintaining their own PyPI mirrors. One exciting capability of Package Manager&rsquo;s PyPI support is the addition of PyPI snapshots, enabling teams to time travel in Python just like they can in R. You can also search for Python packages, explore documentation, and track package downloads. Package Manager is fully compatible with <code>pip</code> and tools like <code>virtualenv</code> and <code>pyenv</code>.</p><p>PyPI support is in Beta. We hope you will try the feature and give us your feedback, but please be cautious about integrating with production systems. 
There are a few known limitations we will be addressing: adding the option to share local Python packages and enabling fully air-gapped users.</p><h2 id="historic-cran-snapshots">Historic CRAN Snapshots</h2><p>In this release, we&rsquo;ve made <a href="https://docs.rstudio.com/rspm/1.2.0/admin/getting-started/configuration/#quickstart-cran">significant improvements for teams navigating CRAN</a>. It is now possible to have granular control over which CRAN snapshots are available, allowing administrators to present CRAN as it existed on certain days. This addition unlocks new management strategies. For instance, you can provide regular updates, but always stay one week behind the latest changes.</p><p>More granular snapshot access also makes it easier to start using Package Manager. Many teams have an existing set of packages they installed from MRAN or a package library they created when they installed R, and now you can easily recreate the same set in Package Manager.</p><p>When compatible, we&rsquo;ve also enabled date-based URLs, making it more intuitive for data scientists to install packages from an understood point in time.</p><p>These changes all extend to curated subsets of CRAN as well, so that admins can provide approved packages from any point in time.</p><p>There are many additional improvements and bug fixes in this release; please review the <a href="https://docs.rstudio.com/rspm/news">full release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>Upgrading to 1.2.0 from 1.1.6 is a major upgrade. The upgrade will be automatic, but we recommend reviewing the release notes and ensuring the new desired capabilities are applied along with any relevant upgrades to storage requirements. 
If you are upgrading from an earlier version, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><h4 id="new-to-rstudio-package-manager">New to RStudio Package Manager?</h4><p><a href="https://rstudio.com/products/package-manager/">Download</a> the 45-day evaluation today to see how RStudio Package Manager can help you, your team, and your entire organization access and organize R packages. Or take a look at the free <a href="https://packagemanager.rstudio.com">Public Package Manager</a>.</p><p>To stay up to date with new releases of RStudio products, as well as information on minor updates, patches, and potential security notifications, we encourage you to subscribe to the Product Information email list at <a href="https://rstudio.com/about/subscription-management/">https://rstudio.com/about/subscription-management/</a>.</p></description></item><item><title>RStudio v1.4 Preview: The Little Things</title><link>https://www.rstudio.com/blog/rstudio-v1-4-preview-little-things/</link><pubDate>Wed, 02 Dec 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v1-4-preview-little-things/</guid><description><p><em>This post is part of a series on new features in RStudio 1.4, currently available as a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a>.</em></p><p>Today, we continue a <a href="https://www.rstudio.com/blog/2017-09-13-rstudio-v1-1-little-things/">long</a> <a href="https://www.rstudio.com/blog/rstudio-1-3-the-little-things/">tradition</a> of concluding our preview blog series with a look at some of the smaller features we&rsquo;ve added to the IDE.</p><h3 id="deeper-outlines-for-r-scripts">Deeper Outlines for R Scripts</h3><p>If you write R scripts longer than a couple of pages, you probably already use RStudio&rsquo;s document outline pane, which makes it easy to see an overview of your R script and navigate quickly to any section. 
To make it easier to navigate in longer and more nested scripts, we&rsquo;ve added support for subsections in the outline.</p><img align="center" style="padding: 35px;" src="document-outline.png"><p>Add subsections to your code by using Markdown-style comment headers, with the label followed by four or more dashes. For example:</p><pre><code># Section ----
## Subsection ----
### Sub-subsection ----</code></pre><p>More information on code sections can be found on our help site: <a href="https://support.rstudio.com/hc/en-us/articles/200484568-Code-Folding-and-Sections">Code Folding and Sections</a>.</p><h3 id="history-navigation-with-the-mouse">History Navigation with the Mouse</h3><p>If you have a mouse with side buttons, you can now use them to jump backwards and forwards through your source history (recent editing locations).</p><img align="center" style="padding: 35px;" src="mouse-navigation.png"><h3 id="render-plots-with-agg">Render Plots with AGG</h3><p>AGG (Anti-Grain Geometry) is a high-performance, high-quality 2D drawing library. RStudio 1.4 integrates with a new AGG-powered graphics device provided by the <a href="https://github.com/r-lib/ragg">ragg R package</a> to render plots and other graphical output from R. It&rsquo;s faster than the one built into R, and it does a better job of rendering fonts and anti-aliasing. Its output is also very consistent, no matter where you run your R code.</p><img align="center" style="padding: 35px;" src="ragg.png"><p>To start using this new device, go to <em>Options -&gt; General -&gt; Graphics -&gt; Backend</em> and select &ldquo;AGG&rdquo;.</p><h3 id="pane-focus-and-navigation">Pane Focus and Navigation</h3><p>If you primarily use the keyboard to navigate the IDE, we&rsquo;ve introduced a couple of new tools that will make it easier to move around. Check <em>Options -&gt; Accessibility -&gt; Highlight Focused Panel</em> and RStudio will draw a subtle dotted border around the part of the IDE that has focus. 
For example, when your cursor is in the Console panel:</p><img align="center" style="padding: 35px;" src="focused-panel.png"><p>We&rsquo;ve also added a new keyboard shortcut, <kbd>F6</kbd>, which moves focus to the next panel. Using these together makes it much easier to move through the IDE without the mouse!</p><h3 id="natural-sorting-in-files-pane">Natural Sorting in Files Pane</h3><p>Do you find yourself giving your R scripts names like <code>step_001.R</code> so that they are sorted correctly in the Files pane? It&rsquo;s no longer necessary: RStudio 1.4 uses <a href="https://en.wikipedia.org/wiki/Natural_sort_order">natural sort order</a> instead of alphabetical sort order for the Files pane, so that <code>step10.R</code> comes after <code>step9.R</code>, not after <code>step1.R</code>.</p><img align="center" style="padding: 35px;" src="natural-sort-order.png"><p>(See also <a href="https://speakerdeck.com/jennybc/how-to-name-files">Jenny Bryan&rsquo;s advice on naming things</a>.)</p><h3 id="show-grouping-information-in-notebooks">Show Grouping Information in Notebooks</h3><p><a href="https://dplyr.tidyverse.org/articles/grouping.html">Grouping data</a> is a very useful operation, but it isn&rsquo;t always obvious how data has been grouped. R Notebooks now show you information about grouping when displaying data:</p><img align="center" style="padding: 35px;" src="grouping.png"><h3 id="custom-fonts-on-rstudio-server">Custom Fonts on RStudio Server</h3><p>RStudio Desktop can use any font you have installed on your system, but if you use RStudio Server you&rsquo;ve always been stuck with the default. No longer! 
RStudio Server can now use popular coding fonts like <a href="https://github.com/tonsky/FiraCode">Fira Code</a>, <a href="https://fonts.google.com/specimen/Source+Code+Pro">Source Code Pro</a>, and <a href="https://www.jetbrains.com/lp/mono/">JetBrains Mono</a>.</p><img align="center" style="padding: 35px;" src="custom-fonts.png"><p>These fonts can even be <a href="https://docs.rstudio.com/ide/server-pro/1.4.912-1/r-sessions.html#fonts">installed on the server itself</a> so it isn&rsquo;t necessary to have them installed on the machine running the web browser you use to access RStudio Server.</p><p>You can try out all these features by installing the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.4 Preview Release</a>. If you do, we welcome your feedback on the <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>. We look forward to getting a stable release of RStudio 1.4 in your hands soon!</p></description></item><item><title>rstudio::global(2021) Diversity Scholarships</title><link>https://www.rstudio.com/blog/diversity-scholarships/</link><pubDate>Mon, 30 Nov 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/diversity-scholarships/</guid><description><p><img src="https://d33wubrfki0l68.cloudfront.net/7139b26ade68d57777287c324f87834291c1032e/43b2b/2020/10/16/rstudio-global-2021/global-logo-dark_hueec0ce32c918272640b157f9b6a86e05_48888_800x0_resize_q75_box.jpg" alt="rstudio global"></p><p>rstudio::global(2021) will be a very different conference from previous years. We will miss being together physically, but we are enthusiastic about planning this free, virtual event designed to be inclusive of R users in every time zone. 
Even though the conference itself is free, we are continuing our tradition of diversity scholarships, but with a different focus.</p><p>This year, we have planned for 70 diversity scholarships available to individuals around the world who are a member of a group that is underrepresented at rstudio::conf(). These groups include people of color, those with disabilities, elders/older adults, LGBTQ folks, and women/minority genders. In past years, we have had to limit our scholarships geographically due to visa issues and we are happy to have no such limitations this year.</p><p>The scholarships will have three main components:</p><ul><li>Opportunities for online networking and support before and during the virtual conference</li><li>Two workshops, taught online the week after rstudio::global(2021)</li><li>Practical support, if needed, to enable participation in the virtual conference (such as an accessibility aid, a resource for internet access, or childcare)</li></ul><p>The two workshops will be taught by some of RStudio’s most skilled and experienced educators, focusing on topics about sharing knowledge and teaching others.</p><p><a href="http://mine-cr.com">Mine Çetinkaya-Rundel</a> will lead a workshop titled: “What they forgot to teach you about teaching R”:</p><blockquote><p>In this workshop, you will learn about using the RStudio IDE to its full potential for teaching R. Whether you&rsquo;re an educator by profession, or you do education as part of collaborations or outreach, or you want to improve your workflow for giving talks, demos, and workshops, there is something for you in this workshop. 
During the workshop we will cover live coding best practices, tips for using RStudio Cloud for teaching and building learnr tutorials, and R Markdown based tools for developing instructor and student facing teaching materials.</p></blockquote><p><a href="https://alison.rbind.io/">Alison Hill</a> will lead a workshop on building websites using R Markdown:</p><blockquote><p>“You should have a website!” You may have heard this one before, or even said it yourself. In this workshop, you’ll learn how to build and customize a website from the comfort of the RStudio IDE using the blogdown package. We’ll also cover basic website care and feeding like using R Markdown to create content, and how to use GitHub and Netlify to improve your workflow. Pre-work will be shared with participants ahead of time, but to get the most out of this workshop, you’ll want to have a GitHub account and be able to push and pull files from a GitHub repository using your local RStudio IDE.</p></blockquote><p>Since this year’s diversity scholarships focus on skills for teaching and sharing, applications will be evaluated considering experience and plans relevant to those skills. We know that people with marginalized identities are often experts and leaders investing in our communities, not only beginners. There are two main criteria:</p><ul><li>Do you already have some experience with R and GitHub? The workshops will assume some working knowledge of both, so show us some of your current work. (If you are new to GitHub, check out <a href="https://guides.github.com/activities/hello-world/">GitHub’s Hello World</a> to get some of your content up in under an hour.)</li><li>What are your current experiences and future plans for knowledge sharing and community building? 
This is the main theme and focus of our diversity scholarships this year, and we want to multiply our impact through individuals who will spread the love.</li></ul><p>Even with 70 scholarships this year, we expect them to remain competitive, so be sure to highlight your own unique perspective. The application deadline is December 18, 2020.</p><p><a href="https://forms.gle/XPhS7YhbBEJyUoVAA" style="padding: 12px 20px; border: none; font-size: 18px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0 1px 3px 0 rgba(0,0,0,0.10);">Apply now!</a></p></description></item><item><title>Custom Google Analytics Dashboards with R: Downloading Data</title><link>https://www.rstudio.com/blog/google-analytics-part1/</link><pubDate>Fri, 27 Nov 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/google-analytics-part1/</guid><description><script src="index_files/header-attrs/header-attrs.js"></script><style type="text/css">img.screenshot { border: 0.5px solid #888; padding: 3px; background-color: #eee;}</style><p>This week, I’m taking a break from our regular blog content to write about some of the nitty-gritty data science we do within RStudio. Specifically, I’m going to address a question I was asked soon after joining the blogging team:</p><blockquote><p>Which of your blog articles received the most views in the first 15 days they were posted?</p></blockquote><p>I have access to Google Analytics for the blog, and it’s a powerful tool for understanding the flow of visitors through our web site. Nevertheless, the web interface makes it very difficult to compare the 15- and 30-day windows of traffic that we need to evaluate blog posts. 
And given my need to report on the success of the blog to our stakeholders, this turned into a very tedious manual chore.</p><div id="this-sounds-like-a-job-for-code-based-data-science" class="level2"><h2>This Sounds Like A Job For Code-Based Data Science</h2><p>If you’ve been reading our blog over the past few months, we’ve been writing about how we like to use <a href="https://blog.rstudio.com/2020/11/17/an-interview-with-lou-bajuk/" target="_blank" rel="noopener noreferrer">code-based data science</a> to hide complexity and improve reproducibility. I decided that our tedious process for extracting Google Analytics data posed a great opportunity to practice what we preach and build a custom dashboard in R that will:</p><ol style="list-style-type: decimal"><li><strong>Download raw Google Analytics visitor data locally for analysis.</strong> Google provides fairly easy access to sampled visitor data through its Application Programming Interface (API), but I wanted to use the full data set.</li><li><strong>Give stakeholders a simple graphical interface to interact with the data.</strong> I wanted to hide the complexity available in the Google Analytics user interface and give stakeholders one-click access to blog metrics. I also wanted to give them a way to interact with and drill into the data. This also meant writing the code in such a way that it could run in production on a server.</li><li><strong>Provide measurements of blog post viewership over their first 15 days of availability.</strong> Blog posts receive most of their traffic immediately after being posted and then experience a long tail of diminishing visits. To properly compare the popularity of blog posts, we had to aggregate their metrics over fixed windows of time beginning on their posting dates.</li></ol><p>While it took a few weeks to get the first dashboard working, we now regularly use these R-based dashboards to measure our blog post effectiveness. 
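</p><p>To give a flavor of where this series is headed: once the credentials described below are in place, the download step reduces to a short call like the following (a sketch only; the view ID and date range are placeholders you would replace with your own):</p><pre class="r"><code>library(googleAnalyticsR)
ga_auth()  # authenticate; later posts in this series cover service-account credentials
page_views &lt;- google_analytics(
  viewId     = &quot;123456789&quot;,                  # placeholder Google Analytics view ID
  date_range = c(&quot;2020-10-01&quot;, &quot;2020-10-15&quot;),
  metrics    = &quot;pageviews&quot;,
  dimensions = &quot;pagePath&quot;
)</code></pre><p>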
However, the process of getting the Google Analytics API working was tricky enough that I thought others might find documentation of the process useful.</p><p>To achieve this goal, I’ll address each of these steps listed above in its own blog post over the coming weeks. I’ll provide both code and screen shots along the way as well.</p></div><div id="getting-started-how-to-download-data-from-google-analytics" class="level2"><h2>Getting Started: How To Download Data From Google Analytics</h2><p>Before we begin, I want to start with a few caveats:</p><ol style="list-style-type: decimal"><li><strong>Google APIs often change over time.</strong> The approach I’m showing you works with <a href="https://developers.google.com/analytics/devguides/reporting/core/v4/" target="_blank" rel="noopener noreferrer">Version 4 of the Google Analytics API</a>. Because these APIs evolve, I can’t guarantee that the code I show will work months or years from now.</li><li><strong>R isn’t one of the official Google languages listed in their documentation.</strong> I use the excellent <a href="https://code.markedmondson.me/googleAuthR/" target="_blank" rel="noopener noreferrer"><code>googleAnalyticsR</code></a> package written by Mark Edmondson in my dashboard, but Google only officially documents Java, Python, and PHP interfaces.</li><li><strong>This may not be the most efficient or best solution.</strong> I hope readers who find simpler, easier, or better methods of authorizing the use of the Google Analytics API will document their methods as well. 
I encourage you to share them with the RStudio community using the googleAnalyticsR tag at <a href="https://community.rstudio.com/tag/googleanalyticsr" target="_blank" rel="noopener noreferrer">https://community.rstudio.com/tag/googleanalyticsr</a>.</li></ol><p>With all that said, the dashboards that use this API provide insights into our blog use that would require a great deal of manual work to reproduce using the GA web interface.</p><div id="install-these-packages-if-you-dont-have-them-already" class="level3"><h3>Install These Packages If You Don’t Have Them Already</h3><p>While the finished dashboards use 16 different R packages, the essential ones I use are:</p><ul><li><code>gargle</code>. This package helps us set up our Google Analytics (henceforth abbreviated as GA) authorization and credentials.</li><li><code>googleAnalyticsR</code>. This essential package allows us to download the raw visitor data using the Google Analytics API.</li><li><code>flexdashboard</code>. This package allows us to present the results in a simple Web interface using R Markdown.</li><li><code>reactable</code>. This package allows users of the dashboard to browse, search, reorder, and interact with the data presented.</li></ul><p>All of these packages are available for download at CRAN using <code>install.packages()</code>.</p><p>You should also create an R project for your dashboard at this time. We will need a place to store our Google Analytics credentials, and having a project ready to store them will keep things organized.</p></div><div id="google-analytics-credentials-the-secrets-to-success" class="level3"><h3>Google Analytics credentials: The secrets to success</h3><hr /><p><strong>IMPORTANT UPDATE</strong>: Mark Edmondson, the author of <code>googleAnalyticsR</code>, has created a new version of his package that eliminates the need for OAuth credentials when running on a server. 
Once that update is available on CRAN, I’ll update this post to document the simpler process of only submitting service account credentials. In the meantime, all the code shown here works; it just does more authentication than is required.</p><hr /><p>I want to begin by talking about what I found to be one of the most challenging pieces of the entire project: Creating, authorizing, and applying Google Analytics credentials. It’s not hugely difficult, but it does have a lot of steps you must get right before you can get any data.</p><p>Here’s a high-level overview of what we’ll need to get the visitor data for our Google Analytics dashboard. To use the Google Analytics API, we need to present two types of credentials that represent:</p><ol style="list-style-type: decimal"><li><strong>The user represented by the client:</strong> The API only provides service on behalf of an authorized <em>user</em>. Most people are pretty familiar with this type of authorization; it’s the equivalent of logging into Google with an email address and password. The trick in this case is that the email address we’ll use will be one Google creates for our particular client.</li><li><strong>The client requesting service:</strong> The API requires that we authorize and provide credentials for each <em>client</em> making requests. In our case, our client will be an R program, which is considered a desktop client. We’ll request a <em>service account</em> to represent our R program and allow direct server-to-server interactions without human interaction.</li></ol><p>For any of this to work, the author of the dashboard has to be an authorized user of Google Analytics. You can test this by going to the <a href="https://analytics.google.com" target="_blank" rel="noopener noreferrer">Google Analytics Home page (analytics.google.com)</a>. If you are an authorized user, you’ll see the web dashboard. 
If you aren’t, you’ll get an error message and will have to ask for access from your Google Analytics Administrator. Keep the contact information for your Google Analytics administrator handy; we’ll need that information again later.</p><p>We must perform six steps to download data using the GA API. We need to:</p><ul><li><a href="#step1">Step 1:</a> Request a service account from Google.</li><li><a href="#step2">Step 2:</a> Download the service account key and securely store its JSON file.</li><li><a href="#step3">Step 3:</a> Enable API access to your project.</li><li><a href="#step4">Step 4:</a> Add your service account credentials to your project.</li><li><a href="#step5">Step 5:</a> Create and download your project’s OAUTH credentials from Google. This step will no longer be required after the new version of <code>googleAnalyticsR</code> is available on CRAN (see Important Update notice at the beginning of this section)</li><li><a href="#step6">Step 6:</a> Submit both pieces of information to the Google Analytics API and make a test data request.</li></ul><p>I’ll walk you through each step individually in the following sections. For readers not interested in the gory details, you can skip ahead to <a href="#conclusion">the conclusion of this piece</a> where I’ll recap what we got out of this process and what the next steps are.</p><h3 id="step1">Step 1: Request a service account from Google</h3><p>Google has written <a href="https://developers.google.com/analytics/devguides/reporting/core/v4/authorization" target="_blank" rel="noopener noreferrer">a comprehensive document on how to do API authentication</a>. Because we want to build a stand-alone dashboard, we’re going to use the service account option, which Google describes this way:</p><blockquote><p>Service accounts are useful for automated, offline, or scheduled access to Google Analytics data for your own account. 
For example, to build a live dashboard of your own Google Analytics data and share it with other users.</p></blockquote><p>This sounds like exactly what we want, so let’s use that option. It will take a few sub-steps, but they are fairly straightforward. Jenny Bryan has written a nice overview about how this process works as part of her <a href="https://gargle.r-lib.org/articles/get-api-credentials.html" target="_blank" rel="noopener noreferrer">gargle package</a>; the description of service accounts is at the bottom of the page.</p><p>To create your service account, you should:</p><ol style="list-style-type: decimal"><li><strong>Go to <a href="https://console.cloud.google.com/cloud-resource-manager" target="_blank" rel="noopener noreferrer">https://console.cloud.google.com/cloud-resource-manager</a> and click on <em>Create Project</em></strong> (see figure below). While you could also use the <a href="https://console.developers.google.com/start/api?id=analytics&credential=client_key" target="_blank" rel="noopener noreferrer">web-based Google setup tool</a> recommended by the Google document, I find that using the cloud resource manager page referenced above simplifies naming your project something other than “My Project”.</li></ol><p><img class="screenshot" src="images/11-create-project.jpg" style="width: 400px;"/></p><ol start="2" style="list-style-type: decimal"><li><strong>Give your project a name.</strong> Here, we’ve named our project <em>Test Project.</em> Click <em>Create</em> once you’ve entered a name.</li></ol><p><img class="screenshot" src="images/12-name-project.jpg" style="width: 400px;"/></p><ol start="3" style="list-style-type: decimal"><li><strong>Click on <em>Go to project settings</em> on the Google Cloud Dashboard project card.</strong> Usually the new project’s card will be at the top left of the page and should have the project name and number. 
You’ll now go to your project settings page to create a service account to access this project.</li></ol><p><img class="screenshot" src="images/13-project-card.jpg" style="width: 300px;"/></p><ol start="4" style="list-style-type: decimal"><li><strong>Select <em>Create Service Account</em>.</strong> You now have a project created, but you don’t yet have a Google user account that can be used with that project. We’ll create that on the <em>Service Account Details</em> screen.</li></ol><p><img class="screenshot" src="images/14-create-service-account.jpg" style="width: 400px;"/></p><ol start="5" style="list-style-type: decimal"><li><strong>Give your service account a name.</strong> The name will automatically populate the Service Account ID field. <strong>Record the generated Service Account ID somewhere; we’ll need to register that account with your Google Analytics administrator.</strong> It’s also a good idea to provide a long description of what you intend to do with this account. When you’ve finished filling in these fields, click <em>Create.</em></li></ol><p><img class="screenshot" src="images/15-service-account-details.jpg" style="width: 400px;"/></p><p>This completes the creation of our service account.</p><h3 id="step2">Step 2: Download the service account key</h3><ol style="list-style-type: decimal"><li><strong>Now that your service account exists, download your key from the three vertical dots menu.</strong> Once your account is created, the dashboard will take you back to the <em>Service Accounts</em> page as shown below.</li></ol><p><img class="screenshot" src="images/21-service-account-created.jpg" style="width: 600px;"/></p><ol start="2" style="list-style-type: decimal"><li><strong>Create your private service account key</strong>. Make your browser window wide enough to see the <em>Actions</em> menu with the three vertical dots. Click on those three vertical dots, and you’ll see a pop-up menu. 
Click on <em>Create key.</em></li></ol><p><img class="screenshot" src="images/22-create-service-account-key.jpg" style="width: 600px;"/></p><ol start="3" style="list-style-type: decimal"><li><strong>Select JSON as your key format.</strong> The <code>googleAnalyticsR</code> package requires the key to be in JSON format. Once you’ve selected that format, click <em>Create.</em></li></ol><p><img class="screenshot" src="images/23-select-JSON-form.jpg" style="width: 400px;"/></p><ol start="4" style="list-style-type: decimal"><li><strong>Store the downloaded key in a folder within your R project.</strong> I typically create a folder in my dashboard project named <em>.secrets</em> where I keep such keys.</li></ol><p><img class="screenshot" src="images/24-key-downloaded.jpg" style="width: 400px;"/></p><p>At this point, you have the service key credentials you need to make requests. However, we still have a couple more steps to do before we can use the API.</p><h3 id="step3">Step 3: Enable API access to your project</h3><p>The fact you have a valid service key is not enough to start making requests. You still need to enable the API from the Google Dashboard. 
To do this, you:</p><ol style="list-style-type: decimal"><li><strong>Go to <a href="https://console.cloud.google.com/apis" target="_blank" rel="noopener noreferrer">https://console.cloud.google.com/apis</a> as shown in the screenshot below and then click on <em>Enable APIs and Services</em>.</strong></li></ol><p><img class="screenshot" src="images/30-google-api-dashboard.jpg" style="width: 400px;"/></p><ol start="2" style="list-style-type: decimal"><li><strong>Search for and click on the Google Analytics API.</strong></li></ol><p><img class="screenshot" src="images/31-google-analytics-api-selection.jpg" style="width: 400px;"/></p><ol start="3" style="list-style-type: decimal"><li><strong>Click on <em>Enable</em> to make the API for your project active.</strong></li></ol><p><img class="screenshot" src="images/32-google-analytics-api-enable.jpg" style="width: 400px;"/></p><p>Sadly, the fact that you have a valid service key is still not enough to start making requests. We still need to authorize the user account with GA.</p><h3 id="step4">Step 4: Add your service account user credentials to your project</h3><p>You now need to add the email associated with that key to the list of authenticated project users. To do this, we’re going to return to the Cloud Resource Manager pane at <a href="https://console.cloud.google.com/cloud-resource-manager" target="_blank" rel="noopener noreferrer">https://console.cloud.google.com/cloud-resource-manager</a>.</p><p>Please note that for many Google Analytics configurations, <strong>only GA administrators may add new members to a project.</strong> If that is the case for you, you will not see the screens shown below. 
Instead, you must contact your GA administrator and ask them to add your service account email to the project with Viewer rights.</p><p>If you do have the appropriate permissions, however, perform the following 3 tasks:</p><ol style="list-style-type: decimal"><li><strong>Click on the <em>IAM</em> selection on the left-hand-side menu and select <em>ADD</em> from the top submenu</strong> as shown below:</li></ol><p><img class="screenshot" src="images/41-IAM-dash.jpg" style="width: 400px;"/></p><ol start="2" style="list-style-type: decimal"><li><strong>Add your service account user to the project.</strong> Enter the email address for your service account, select <em>Viewer</em> as the role, and click <em>Save</em> as shown below.</li></ol><p><img class="screenshot" src="images/42-IAM-add-email.jpg" style="width: 400px;"/></p><ol start="3" style="list-style-type: decimal"><li><strong>Verify that your service account email has now been added</strong> by observing it in the list of members for this project.</li></ol><p><img class="screenshot" src="images/43-IAM-show-users.jpg" style="width: 400px;"/></p><h3 id="step5">Step 5: Create and download your project’s OAUTH credentials from Google</h3><p>While you may be questioning why you ever started this seemingly endless project at this point, fear not; we’re almost done. All that remains to do is to create and download the OAUTH credentials for your service key.</p><p>Now if you’re anything like me, you’re probably thinking “Wait a minute, I created a service key to bypass all this OAUTH complexity. 
Why do I need an OAUTH project file now?” I’m glad you asked; it’s because Google:</p><ul><li><strong>Gathers API statistics on a per-project basis.</strong> Google needs to know what project to aggregate your Google Analytics API calls under for reporting and accounting purposes.</li><li><strong>Needs to defend against excessive API calls.</strong> Because you are accessing the API from a computer program, Google has to defend its API against infinite loops and automated attacks. Should Google detect excessive API calls associated with your project, it can throttle its responses to you without affecting other users.</li></ul><p>You don’t actually need a project client ID for debugging purposes because the <code>googleAnalyticsR</code> package has a default project associated with it. However, this project ID is shared among all programs using the package, and you may find your API calls denied because too many users are actively using the package. You can avoid this issue entirely by setting your own project client ID as shown below.</p><p>In my opinion, acquiring an OAuth 2.0 client ID for a service account is poorly documented on the Google API dashboard, in the Google documentation, and in the <code>googleAnalyticsR</code> package. I found this process difficult to reproduce for our test project even though I’d been through it for my own dashboards. With that said, it’s fairly straightforward if you start in the proper place as shown below:</p><ol style="list-style-type: decimal"><li><strong>Go to the site <a href="https://console.developers.google.com/apis/api/analyticsreporting.googleapis.com/" target="_blank" rel="noopener noreferrer">https://console.developers.google.com/apis/api/analyticsreporting.googleapis.com/</a>.</strong> Please note that this is not the Google Cloud API dashboard we went to in Step 3; this is the Google Analytics Report API dashboard. You probably will have no OAuth 2.0 client IDs shown. 
Click on <em>+ CREATE CREDENTIALS</em> at the top of the page.</li></ol><p><img class="screenshot" src="images/51-GA-dashboard.jpg" style="width: 600px;"/></p><ol start="2" style="list-style-type: decimal"><li><strong>Select <em>OAuth client ID</em></strong> as the credential you wish to create.</li></ol><p><img class="screenshot" src="images/52-GA-add-credentials.jpg" style="width: 400px;"/></p><ol start="3" style="list-style-type: decimal"><li><strong>Select <em>Desktop app</em> as the application type and enter a name for your client.</strong> I chose the name “Test Google Analytics script.”</li></ol><p><img class="screenshot" src="images/53-GA-add-client-id.jpg" style="width: 400px;"/></p><ol start="4" style="list-style-type: decimal"><li><strong>Click <em>OK</em> to acknowledge the ID being created</strong>, which will return you to the Google Analytics dashboard.</li></ol><p><img class="screenshot" src="images/54-GA-client-id-created.jpg" style="width: 400px;"/></p><ol start="5" style="list-style-type: decimal"><li><strong>Click the down arrow button next to your new Client ID to download the client ID JSON file.</strong> I typically put this file into my <em>.secrets</em> folder where I also keep my service account private key.</li></ol><p><img class="screenshot" src="images/55-download-ID.jpg" style="width: 600px;"/></p><h3 id="step6">Step 6: Submit both pieces of information to the Google Analytics API and make a test data request.</h3><p>While this multi-step process may have seemed like something out of <em>Lord of the Rings</em>, you now have all the credentials and permissions you need to make API requests to Google Analytics. 
So let’s write code to fetch one day’s Google Analytics data for the rstudio.com site.</p><pre class="r"><code>library(googleAnalyticsR)
library(dplyr)
library(ggplot2)
library(lubridate)
library(reactable)
library(stringr)

## First, authenticate with our client OAUTH credentials from step 5 of the blog post.
googleAuthR::gar_set_client(json = &quot;secrets/oauth-account-key.json&quot;)

## Now, provide the service account email and private key
ga_auth(email = &quot;ga-analysis@test-project-291617.iam.gserviceaccount.com&quot;,
        json_file = &quot;secrets/service-account-key.json&quot;)

## At this point, we should be properly authenticated and ready to go. We can test this
## by getting a list of all the accounts that this test project has access to. Typically,
## this will be only one if you&#39;ve created your own service key. If it isn&#39;t your only
## account, select the appropriate viewId from your list of accounts.
my_accounts &lt;- ga_account_list()
my_id &lt;- my_accounts$viewId  ## Modify this if you have more than one account

## Let&#39;s look at all the visitors to our site. This segment is one of several provided
## by Google Analytics by default.
all_users &lt;- segment_ga4(&quot;AllTraffic&quot;, segment_id = &quot;gaid::-1&quot;)

## Let&#39;s look at just one day.
ga_start_date &lt;- today()
ga_end_date &lt;- today()

## Make the request to GA
data_fetch &lt;- google_analytics(my_id,
                               segments = all_users,
                               date_range = c(ga_start_date, ga_end_date),
                               metrics = c(&quot;pageviews&quot;),
                               dimensions = c(&quot;landingPagePath&quot;),
                               anti_sample = TRUE)

## Let&#39;s just create a table of the most viewed posts
most_viewed_posts &lt;- data_fetch %&gt;%
  mutate(Path = str_trunc(landingPagePath, width = 40)) %&gt;%
  count(Path, wt = pageviews, sort = TRUE)
head(most_viewed_posts, n = 5)</code></pre><p>Assuming you have the appropriate permissions, client ID, and service key, you should get a result that looks similar to this one I pulled from the rstudio.com web site.</p><table><thead><tr class="header"><th></th><th>Path</th><th align="right">n</th></tr></thead><tbody><tr class="odd"><td>1</td><td>rstudio.cloud/index.html</td><td align="right">22173</td></tr><tr class="even"><td>2</td><td>rstudio.com/index.html</td><td align="right">18240</td></tr><tr class="odd"><td>3</td><td>rstudio.com/products/rstudio/download…</td><td align="right">16120</td></tr><tr class="even"><td>4</td><td>www.shinyapps.io/admin/index.html</td><td align="right">8327</td></tr><tr class="odd"><td>5</td><td>support.rstudio.com/hc/en-us/articles…</td><td align="right">7486</td></tr></tbody></table><h2 id="conclusion">Wrapping Up Our First Custom Google Analytics Script</h2><p>While many of the details of the Google Analytics API may seem elaborate and arcane, I want to emphasize some of the main ideas behind this process:</p><ul><li><strong>You don’t have to settle for what the Google Analytics user interface gives you.</strong> The GA UI contains many general-purpose analytical views. 
However, if your organization wants to manage web metrics that its interface doesn’t provide, the GA API and custom code allow you to create your own web metrics from raw GA data.</li><li><strong>Two credentials unlock your ability to create your own web analytics.</strong> While the setup process to access the GA API seems complicated, it really boils down to agreeing on two basic credentials: one for the user authorizing the request and the other for the client program running it.</li><li><strong>Once you can download your own GA data, you can apply ordinary R code to understand it.</strong> While the Google Analytics UI may take you days to learn, once you can download the raw GA data, you can return to R and Python tools to wrangle that data into the web characteristics you want to measure. Best of all, once that code is written, you can hand it to others who don’t have to understand anything about GA or its APIs. Your program becomes a useful tool instead of just a big toolbox.</li></ul><p>This post has focused entirely on getting authorized to download Google Analytics data. The next post will focus on how to create a flex dashboard for stakeholders to interact with the data. The last post in this series will show how to create windowed views of this data and publish a self-contained dashboard that can be used on demand from RStudio Connect.</p></div></div><div id="to-learn-more" class="level2"><h2>To Learn More</h2><p>This post only focused on using <code>googleAnalyticsR</code> to download data through the GA API, but the package is capable of much, much more. 
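</p><p>For example, here’s a minimal sketch of one more thing you can do with the same authentication, <code>my_id</code>, and <code>all_users</code> segment established above: pull a daily pageview trend instead of a single day. The 30-day window is an arbitrary illustration, though <code>date</code> and <code>pageviews</code> are standard Google Analytics dimension and metric names.</p><pre class="r"><code>## Sketch: daily pageviews for the past 30 days, reusing my_id and all_users from above
trend &lt;- google_analytics(my_id,
                          segments = all_users,
                          date_range = c(today() - 30, today()),
                          metrics = c(&quot;pageviews&quot;),
                          dimensions = c(&quot;date&quot;),
                          anti_sample = TRUE)

## Plot the trend with ordinary ggplot2 code
ggplot(trend, aes(x = date, y = pageviews)) + geom_line()</code></pre><p>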
I highly recommend taking a look at <a href="http://code.markedmondson.me/googleAnalyticsR/index.html" target="_blank" rel="noopener noreferrer">the extensive package documentation</a> and its <a href="https://github.com/MarkEdmondson1234/googleAuthR" target="_blank" rel="noopener noreferrer">github repository</a> as well as <a href="https://code.markedmondson.me" target="_blank" rel="noopener noreferrer">author Mark Edmondson’s blog</a>.</p></div></description></item><item><title>How California Uses Shiny in Production to Fight COVID-19</title><link>https://www.rstudio.com/blog/using-shiny-in-production-to-monitor-covid-19/</link><pubDate>Thu, 19 Nov 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/using-shiny-in-production-to-monitor-covid-19/</guid><description><figure><img align="center" src="hero.jpg"><figcaption><i>Short term forecast from the <a href="https://calcat.covid19.ca.gov/cacovidmodels/" target="_blank" rel="noopener noreferrer">California COVID Assessment Tool (CalCAT)</a></i></figcaption></figure><blockquote><p>&ldquo;Things move along so rapidly nowadays that people saying: &ldquo;It can&rsquo;t be done,&rdquo; are always being interrupted by somebody doing it.&rdquo; <em>&ndash; Puck magazine, 1903.</em></p></blockquote><p>As we at RStudio have talked about the topic of <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">serious data science</a>, we often field questions about the suitability of R for use in large-scale, production environments. 
Those questions typically coalesce around:</p><ol><li><strong>Speed:</strong> Is R fast enough to run production workloads?</li><li><strong>Scalability:</strong> Can R be used for large-scale production?</li><li><strong>Infrastructure:</strong> What kind of R infrastructure do administrators need to run production applications?</li></ol><p>Instead of debating these questions in theory, we&rsquo;ll turn to an organization that is not just talking about deploying Shiny dashboards in large-scale production, but is actually &ldquo;doing it&rdquo;.</p><p>Many definitions exist for what constitutes an application being in large-scale production. For the purposes of this article, we&rsquo;ll define large-scale production as:</p><div style="padding:0px 30px 0px 30px; margin:20px 0;"><div class="quote-spacing"><div class="quote-size"><i>Applications serving thousands of users on a daily basis.</i></div></div></div><p>One application that fits this definition nicely is the California COVID Assessment Tool (CalCAT), which serves 32 million Californians. CalCAT is a Shiny dashboard written in R by a group of data scientists within the California Department of Public Health (CDPH) and is hosted on an array of commercial <a href="https://rstudio.com/products/team/" target="_blank" rel="noopener noreferrer">RStudio Team</a> servers.</p><p>RStudio recently talked with members of the team who deployed this dashboard to understand how this public, large-scale Shiny app came to be. 
The following sections present some of our takeaways from those discussions.</p><style type="text/css">.quote-spacing { padding:0 80px; }.quote-size { font-size: 160%; line-height: 34px; }.speaker-quote { padding-left: 50px; text-indent: -50px; }.no-speaker-quote { padding-left: 50px; }@media only screen and (max-width: 600px) {.quote-spacing { padding-top:0; }.quote-size { font-size: 120%; line-height: 28px; }}</style><h2 id="cdphs-first-shiny-dashboard-tracked-opiod-use">CDPH&rsquo;s First Shiny Dashboard Tracked Opioid Use</h2><figure><div style="padding: 35px 0 0 0;"><a href="https://skylab.cdph.ca.gov/ODdash/" target="_blank" rel="noopener noreferrer"><img align="center" src="opioid-dashboard.png"></a></div><figcaption><i>CDPH's <a href="https://skylab.cdph.ca.gov/ODdash/" target="_blank" rel="noopener noreferrer">Opioid Overdose Surveillance application</a></i></figcaption></figure><p>The CalCAT dashboard project was born out of CDPH&rsquo;s experience fielding a prior public-facing Shiny dashboard in 2016, namely the <a href="https://skylab.cdph.ca.gov/ODdash/" target="_blank" rel="noopener noreferrer">CDPH Opioid Overdose Surveillance application</a>. That application evolved largely from:</p><ul><li><strong>A need to get data out quickly.</strong> CDPH didn&rsquo;t really have an enterprise-level dashboarding solution secured in 2016. When the opioid crisis arrived, the department realized it needed to get data out quickly and update it as needed as the epidemic gripped the state.</li><li><strong>The ability to deploy a dashboard using free software and cloud resources.</strong> When looking for a dashboarding solution, one of the developers evaluated Shiny, realized it was free and open source, and that RStudio offered shinyapps.io as a very low-cost way for CDPH to deploy it. 
Without the need for a capital investment to get started, they created some basic visualizations, shopped them to leadership including the director of the department, and got the full go-ahead to develop and deploy shortly thereafter. This allowed them to get their opioid dashboard out in 3 or 4 months, which was unheard of at the time.</li><li><strong>A positive reception by users.</strong> California was one of the first states in the country to have a public opioid overdose dashboard. This positive experience with Shiny and shinyapps.io generated interest in R and encouraged the building of more internal infrastructure for hosting and deploying these apps.</li></ul><h2 id="covid-19s-arrival-made-sharing-data-mission-critical">COVID-19&rsquo;s Arrival Made Sharing Data Mission Critical</h2><p>When COVID-19 arrived in the United States in early 2020, many organizations, both inside and outside of the California Department of Public Health, suddenly found themselves wanting data to respond to the pandemic. That demand led to:</p><ul><li><strong>The formation of the CalCAT development team.</strong> CalCAT evolved out of some early work with Johns Hopkins University regarding scenario-based models. Initially, CalCAT just wanted to develop a quick lightweight app to explore the simulations that Johns Hopkins was providing and to share it using an RStudio Connect server with other CDPH staff.</li><li><strong>Creation of an extranet-hosted Shiny dashboard for COVID-19.</strong> Based on their experience with the Opioid Dashboard, the team developed an internal Shiny app to provide visualizations of what was going on throughout the state. 
As the dashboard evolved, CDPH moved it to the state government extranet for others to access.</li><li><strong>Expanding the dashboard to serve other departments with data.</strong> While the app began as an effort to share data with county health officers and local epidemiologists, people from other departments started asking, &ldquo;How did you get this number? We can&rsquo;t replicate it.&rdquo; That led the team to expand the app to allow users to download the code and data behind the visualizations and do their own analyses.</li></ul><p>Once other departments gained access to the data, the app quickly became a vital source of COVID information throughout the state because it:</p><ul><li><strong>Allowed authenticated access to internal confidential data.</strong> Because the COVID dashboard authenticated county health officers to gain access to the Shiny app, it could include aggregated confidential data beyond what would normally be available to the general public.</li><li><strong>Supported county-based dashboards.</strong> County health jurisdictions found that they could download their county&rsquo;s data and republish it on their own dashboards, thereby giving their users visibility into their local situation.</li><li><strong>Drove county-level pandemic actions.</strong> California established <a href="https://covid19.ca.gov/safer-economy/" target="_blank" rel="noopener noreferrer">hard metrics such as case and infection rates</a> to guide which businesses were allowed to open. 
The data published by this extranet dashboard ensured everyone was working from a consistent set of measurements and actions that were authorized by the state.</li></ul><h2 id="responding-to-the-emergency-creating-a-public-dashboard-for-california-citizens">Responding to the Emergency: Creating A Public Dashboard for California Citizens</h2><figure><div style="padding: 35px 0 0 0;"><a href="https://calcat.covid19.ca.gov/cacovidmodels/" target="_blank" rel="noopener noreferrer"><img align="center" src="covid-dashboard.jpg"></a></div><figcaption><a href="https://calcat.covid19.ca.gov/cacovidmodels/" target="_blank" rel="noopener noreferrer"><i>The CalCAT public dashboard</i></a></figcaption></figure><p>The extranet site helped CDPH and the county health officers understand both the depth and breadth of pandemic infections within California. However, on March 4, 2020, the following announcement spurred the department to build a public site.</p><blockquote><p>&ldquo;As part of the state&rsquo;s response to address the global COVID-19 outbreak, Governor Gavin Newsom today declared a State of Emergency to make additional resources available, formalize emergency actions already underway across multiple state agencies and departments, and help the state prepare for broader spread of COVID-19. The proclamation comes as the number of positive California cases rises and following one official COVID-19 death.&rdquo; <i>&ndash; Gavin Newsom, Governor of California, March 4, 2020</i></p></blockquote><p>In response to the Governor&rsquo;s mandate, the team:</p><ul><li><strong>Deployed the public COVID dashboard app you see today.</strong> Based on their work with their internal county-based dashboard and with advice from DJ Patil, the Chief Data Scientist of the United States in the Obama administration, the team modified and upgraded the internal county-based app into what you currently see today. 
This dashboard allows people to explore both the California models and an ensemble of estimates from other organizations to provide a single picture for the state and its counties. The team used R to do some statistical work in the background while also creating interactive visualizations to share those results.</li><li><strong>Made their code open source.</strong> The CalCAT team made <a href="https://github.com/StateOfCalifornia/CalCAT" target="_blank" rel="noopener noreferrer">the source code for the site public on GitHub</a> so anybody in the world could access and improve on it. In addition to the website, they also created an open data portal for the state that includes additional aggregated data.</li></ul><h2 id="cdpds-r-infrastructure-evolved-to-support-the-pandemic-efforts">CDPH&rsquo;s R Infrastructure Evolved to Support the Pandemic Efforts</h2><p>As CalCAT gained popularity and the team gained experience, the infrastructure supporting the team evolved to meet the new demands by adding:</p><ul><li><strong>Multiple hosting environments.</strong> The CalCAT environment now features both a public-facing environment and an extranet environment that requires authentication with partners and staff. CDPH now also has internal testing platforms on which they run apps before they go out to the public-facing and extranet servers.</li><li><strong>Professional products</strong>. While the project started off with open source Shiny Servers and shinyapps.io for the Opioid Dashboard, the team later moved to RStudio Server Pro for development and then added RStudio Connect and RStudio Package Manager for publishing. They now run multiple instances of those products to spread the workload out and accommodate the millions of users who access the public site.</li><li><strong>Collaborative workflows</strong>. 
Once the team grew beyond just one or two developers, it created <a href="https://github.com/StateOfCalifornia/CalCAT" target="_blank" rel="noopener noreferrer">a GitHub repository</a> where it could collaboratively work on code, push changes, and adopt changes from others. While this workflow required scientists within the department to learn basic DevOps software development techniques, the team decided the benefits from collaboration were worth climbing that learning curve.</li></ul><h3 id="calcats-success-has-encouraged-r-use-within-cdph">CalCAT&rsquo;s Success Has Encouraged R Use Within CDPH</h3><p>The project team noted how much the Opioid dashboard changed CDPH&rsquo;s thinking about how R could be used to deliver data to the public by:</p><ul><li><strong>Providing examples of what was possible.</strong> The Opioid dashboard expanded the scope of what could be done with CDPH data. The CalCAT dashboard proved that, with the help of their infrastructure and IT team, such applications could be scaled up to provide service to the public. Collaborating with IT also introduced the CalCAT scientists to software tools they wouldn&rsquo;t have discovered themselves.</li><li><strong>Rapidly deploying new apps.</strong> After the COVID dashboard was up, other groups started asking for new apps that could tackle other aspects of the crisis. One such application was a very simple program to create unique IDs for COVID tests, which was mandated and published within a week. The ability to respond quickly to department needs burnished R&rsquo;s reputation within CDPH.</li><li><strong>Creating an internal R community.</strong> The team is already seeing real expansion in personnel with R skills, especially in hiring. Their job descriptions now ask for R skills, and people are being recruited from other disciplines. 
Increasingly, the personnel within the department are coming in with R experience.</li><li><strong>Embracing a code-based approach.</strong> One developer noted that writing code to do data science instead of using a point-and-click tool was analogous to a team doing rock climbing. Working code creates a path and anchors for others to use, and new developers then can use those anchors to follow in their footsteps.</li></ul><h2 id="takeaways">Takeaways</h2><p>The CalCAT experience shows that, despite claims to the contrary, R can be used for large-scale production applications. When we re-examine the three categories of concern about R with which we started the piece, we discover that:</p><ul><li><strong>Speed of development was the key to success.</strong> This was an application that had to be deployed quickly in response to a national emergency. Using R and Shiny allowed the team to deploy an interactive app that provided access to COVID data in weeks, not months.</li><li><strong>Scaling up production use was an evolutionary process.</strong> The team took advantage of its prior experience with the Opioid Dashboard to deploy both the extranet and public versions of the COVID-19 application. The team had already deployed public apps on shinyapps.io and had deployed server infrastructure in house as part of their extranet application. 
When the time came to go public with the public CalCAT dashboard, scaling up became mostly a matter of replicating servers they already had experience with.</li><li><strong>Infrastructure to support this application was available off the shelf.</strong> Instead of having to roll their own deployment process, the group was able to use RStudio&rsquo;s server product suite to do the app development as well as the large-scale deployment on an array of RStudio Connect servers.</li></ul><p>By using a code-based approach, the California Department of Public Health has built a repository of human and intellectual capital around building public health dashboards. This small team&rsquo;s work and open source code can now be passed on to others both within and outside of California government. Their efforts will likely spawn new projects that will better inform citizens and continue to help them stay safe throughout this unprecedented pandemic.</p><h2 id="to-learn-more">To Learn More</h2><p>You can learn about each of RStudio&rsquo;s commercial products by following the links below.</p><ul><li><a href="https://rstudio.com/products/rstudio-server-pro/" target="_blank" rel="noopener noreferrer">RStudio Server Pro</a> delivers fully integrated development environments for R and Python accessible via a browser.</li><li><a href="https://rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a> connects data scientists with decision makers with a one-button publishing solution from the RStudio IDE.</li><li><a href="https://rstudio.com/products/package-manager/" target="_blank" rel="noopener noreferrer">RStudio Package Manager</a> controls package distribution for reproducible data science.</li><li><a href="https://rstudio.com/products/team/" target="_blank" rel="noopener noreferrer">RStudio Team</a> bundles RStudio Server Pro, RStudio Connect, and RStudio Package Manager products to ease purchasing and 
administration.</li></ul></description></item><item><title>Why RStudio Focuses on Code-Based Data Science</title><link>https://www.rstudio.com/blog/an-interview-with-lou-bajuk/</link><pubDate>Tue, 17 Nov 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/an-interview-with-lou-bajuk/</guid><description><p>Michael Lippis of The Outlook podcast recently interviewed RStudio&rsquo;s Lou Bajuk to discuss data science with R and Python, and why RStudio encourages its customers to adopt a multi-lingual data science approach. During the interview, Michael and Lou examined three main topics:</p><ol><li><a href="#mission">RStudio&rsquo;s mission to support open source data science</a></li><li><a href="#randpy">How and why RStudio supports R and Python within its products</a></li><li><a href="#dsinvest">How business leaders are delivering value from data science investments</a></li></ol><p>I&rsquo;ve extracted the most interesting parts of the podcast interview below and edited the quotes for clarity and length. <a href="https://www.rstudio.com/collections/additional-talks/r-and-python-build-greater-analytic-investment-value-outlook-podcast-2020/">You can listen to the entire interview here</a>.</p><h2><a name="mission">RStudio's Mission To Benefit Open Source Data Science</a></h2><div class="question-quote"><span class="speaker-name">Mike:</span>What has been the focus of RStudio since its inception?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>From the beginning, our primary purpose has been to create free and open source software for data science, scientific research, and technical communication. We do this because free and open source software enhances the production and consumption of knowledge and really facilitates collaboration and reproducible research, not only in science, but in education and industry as well. 
To support this, we spend over half of our engineering resources developing free and open-source software as well as providing extensive support to the open-source data science community.</div><div class="question-quote"><span class="speaker-name">Mike:</span>How does RStudio help organizations make sense of data regardless of their ability to pay?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>We do this as part of our primary mission around supporting open source data science software. It allows anyone with access to a computer to participate freely in a global economy that really rewards and demands data literacy. So the core of our offerings which enables everyone to do data science is, and will always be, free and open source.</div><div class="no-speaker-quote">However for those organizations that want to take the data science that they do in R and Python and deploy it at scale, our professional products provide an enterprise-ready modular platform to help them do that. This platform addresses the security, scalability, and other enterprise requirements organizations need to allow their team to deploy their work, collaborate within their team, and communicate with the decision makers that they ultimately support.</div><div class="question-quote"><span class="speaker-name">Mike:</span>So what is RStudio's public benefit?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>We announced in January that we're now registered as a Public Benefit Corporation (PBC) in Delaware. We believe that corporations should strive to fulfill a public beneficial purpose and that they should be run for the benefit of all of our stakeholders. 
And this is something that's really a critical part of our founder and CEO JJ Allaire's philosophy.</div><div class="no-speaker-quote">Our stated public benefit is to create open source software for scientific and technical computing, which means that the open source mission we've been talking about is codified into our corporate charter. And as a PBC, we are committed to considering the needs of not only our shareholders, but all our stakeholders including our community, our customers and employees.</div><div class="no-speaker-quote">And as part of this, we're now also a Certified B Corporation®, which means we've met the certification requirements set out by the nonprofit <a href="https://bcorporation.net/about-b-lab" target="_blank" rel="noopener noreferrer">B Lab®</a>. That means that we've met the highest verified standards of things like social and environmental performance, transparency, and accountability.</div><h2><a name="randpy">Multilingual Data Science</a></h2>The interview continued with Lou diving into why RStudio has committed to supporting both R and Python within its products.<div class="question-quote"><span class="speaker-name">Mike:</span>Why are R and Python in RStudio, and what challenges are you addressing?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>In talking to our many customers and others in the data science field, we've seen that many data science teams today are bilingual, leveraging both R and Python in their work. And while both languages have unique strengths, these teams frequently struggle to use them together.</div><div class="no-speaker-quote">So for example, a data scientist might find themself constantly needing to switch contexts between multiple development environments. 
The leader of a data science team might be wrestling with how to share results from their team consistently, so they could deliver value to the larger organization while promoting collaboration between the R and Python users on their team. The DevOps and IT admins spend time and resources attempting to maintain, manage, and scale separate environments for R and Python in a cost-effective way.</div><div class="no-speaker-quote">To help data science teams and the organizations they're in solve these challenges, and in line with our ongoing mission to support the open source data science ecosystem, we've focused our professional products on providing a single centralized infrastructure for bilingual teams using R and Python.</div><div class="question-quote"><span class="speaker-name">Mike:</span>Is it possible for a data scientist to use R and Python in a single project?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>Absolutely. And there are multiple ways that can be done. One of the most popular is <a href="https://rstudio.github.io/reticulate/"><code>reticulate</code></a>, an open source package we've developed that is available to anyone using R. It provides a comprehensive set of tools for interoperability between Python and R, including things like:</div><ul class="no-speaker-quote"><li>Calling Python from R in a variety of ways, whether you're doing something with R Markdown, importing Python modules, or using Python interactively within an R session.</li><li>Translating data objects between R and Python.</li><li>Binding to different versions of Python, including virtual and Conda environments.</li></ul><div class="no-speaker-quote"></div><div class="question-quote"><span class="speaker-name">Mike:</span>What about data scientists whose primary language is Python?
What does RStudio provide for them?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>First off, we've been working on making the RStudio IDE a better environment for Python coding. In addition to the <code>reticulate</code> package we just discussed, we've just announced some new features in the upcoming release of our IDE, RStudio 1.4. This includes displaying Python objects in the environment pane, viewing Python data frames, and tools for configuring Python versions and different Conda virtual environments. All this is going to make life easier for someone who wants to code Python within the RStudio IDE.</div><div class="no-speaker-quote">Secondly, for a team where you might have multiple different data scientists who have different preferences for what IDEs they want to use, our pro platform provides a centralized workbench supporting multiple different development environments. In addition to our own IDE we support Jupyter notebooks and Jupyter Lab as development environments, and we're working on more options for the near future. This includes Visual Studio Code, which we're going to be announcing a beta of very shortly. 
</div><div class="no-speaker-quote">And finally, with our platform, Python-oriented data scientists can create data products and interactive web applications in their framework of choice, such as Plotly, Streamlit, or Bokeh, and then directly share those analyses with their stakeholders.</div><div class="no-speaker-quote">We believe this ability for Python data scientists to share their results in a single place alongside the data products created by the R data scientists is critical to actually impacting decision making at an organization.</div><div class="question-quote"><span class="speaker-name">Mike:</span>How can data science leaders promote collaboration across a bilingual team?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>Data science leaders often see their teams struggle to collaborate and share work across disparate, open source tools. Often they waste time translating code from one language to another to put it into production. These activities really distract them from their core work. And as a result, their business stakeholders are less likely to see results or must wait longer for them.</div><div class="no-speaker-quote">With RStudio products, a bilingual team can work together, building off each other's work. Best of all, it can publish, schedule, and email regular updates for interactive analyses and custom reports built in both languages. So you, the data science team, and your stakeholders will always know where to look for these valuable insights.</div><div class="question-quote"><span class="speaker-name">Mike:</span>Has RStudio done anything to help DevOps engineers and IT administrators deal with the difficulty of maintaining separate environments for each data science language?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>Absolutely. DevOps and IT are critical stakeholders in the whole process of doing data science effectively in your organization.
So with RStudio products, DevOps and IT can maintain a single infrastructure for provisioning, scaling, and managing environments for both R and Python users. This means that IT only needs to configure, maintain, and secure a single system.</div><div class="no-speaker-quote">A single system also makes it easy for IT to leverage their existing automation tools and other analytic investments and provide data scientists with transparent access to their servers or Kubernetes or SLURM clusters, directly from the development tools those data scientists prefer. They can easily configure all the critical capabilities around access, monitoring, and environment management. And of course RStudio's Support, Customer Success, and Solutions Engineering teams are here to help and advise these teams as they scale out their applications.</div><div class="question-quote"><span class="speaker-name">Mike:</span>How do business stakeholders view bilingual data science teams?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>Ultimately, most decision makers really don't care what language a data science insight was created in. They just want to be able to trust the information and use it to make the right decision.
That's why we're so focused on making it easy for data scientists to create these data products, regardless of whether they're R or Python, and then easily share them with their different stakeholders.</div><h2><a name="dsinvest">Delivering Value from Data Science Investments</a></h2>Mike and Lou wrapped up by discussing how businesses can improve the value they derive from their data science.<div class="question-quote"><span class="speaker-name">Mike:</span>What would you say to business leaders that are worried about the value of their data science investment?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>One of the big challenges that organizations face with data science is not just how they solve today's problems, but how they ensure that they continue to deliver value over time. Too many organizations find themselves either struggling to maintain the value of their legacy systems, reinventing the wheel year after year, or being held over a barrel by vendor lock-in.</div><div class="no-speaker-quote">To address this, we recommend a few approaches:</div><div class="no-speaker-quote">One is to build your analyses with code, not clicks. Data science teams should use a code-oriented approach because code can be developed and adapted to solve similar problems in the future. This reusable and extensible code then becomes the core intellectual property for an organization. It'll make it easier over time to solve new problems and to increase the aggregate value of your data science work. This is why code-first data science is really a critical part of RStudio's philosophy and our roadmap.</div><div class="no-speaker-quote">The second major approach is to manage your data science environments for reproducibility. Organizations need a way to reproduce reports and dashboards as projects, tools, and dependencies change. 
You'll often hear about repeatability and reproducibility when talking about a heavily regulated environment like pharmaceuticals, and it's certainly particularly critical there. However, it's critical for every industry; otherwise your team may spend far too much time attempting to recreate old results. Worse, you may get different answers to the same questions at different points in time, which really undermines your team's credibility.</div><div class="no-speaker-quote">And third, deploy tools and interactive applications to keep insights up to date, because no one wants to make a decision based on old data. Publishing your insights on web-based interactive tools such as the RStudio Connect platform helps keep your business stakeholders up to date and gives them on-demand access and scheduled updates. By deploying insights in this way, your data scientists are free to spend their time solving new problems rather than solving the same problem again and again.</div><div class="question-quote"><span class="speaker-name">Mike:</span>Has RStudio done anything to empower business stakeholders with better decision-making?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>This is really a key focus of ours. Many data science vendors out there focus on creating models and then putting these models into "production," which typically means integrating these models into some system for automated decision-making. For example, a model might determine what marketing offer to present to someone who visits a website.</div><div class="no-speaker-quote">Though our products certainly support this through the ability to deploy R and Python models as APIs to plug into other systems, our focus is broader. We want to make it easy for a data science team to create tailored reports, dashboards, and interactive web-based applications, using frameworks like Shiny that they can then easily and iteratively share with their decision makers.
This iterative and interactive aspect is critical because decision-makers will invariably come back with questions like "What if you run this analysis on a different time period?" or "What if this parameter is different?".</div><div class="no-speaker-quote">Interactive applications give these decision-makers tremendous flexibility to answer their own "what if?" questions. When it's easy for the data scientist to create a new version, tweak the code, and redeploy it, it's also more convenient for the decision maker. It allows them to get a timely answer that's really super focused on what they actually need as opposed to a generic report.</div><div class="no-speaker-quote">We call these reports tailored or curated because of their flexibility. Open source data science means that these teams can provide their stakeholders with exactly the information they need in the best format for presenting that information rather than being constrained by the black box limitations of a BI reporting tool.</div><div class="question-quote"><span class="speaker-name">Mike:</span>Can you provide the Outlook series audience with an overview of RStudio Team?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>RStudio Team is a bundle of our professional software for data analysis, package management, and data product sharing. The Team product is a way of getting all three products, but each of these products can also be purchased individually to fit into and complement an organization's existing data science investments.</div><div class="no-speaker-quote">The first component is RStudio Server Pro, which provides a centralized work bench for analyzing data and then developing and sharing new data products and interactive applications. This is the platform where the data scientists develop their insights. 
</div><div class="no-speaker-quote">Secondly, RStudio Connect is a centralized portal for distributing these dashboards, reports, and applications created by the data scientists, whether they're written in R or Python. This includes the ability to schedule and send email reports to your community of users and to provide all the access control, scalability, and reproducibility that a modern enterprise really needs. </div><div class="no-speaker-quote">Thirdly, RStudio Package Manager supports both the development side (RStudio Server Pro) and the deployment side (RStudio Connect) by managing the wealth of open source data science packages you might need to create and run these analyses. The open source data science community is full of people creating great packages on the cutting edge of statistics and data science, but managing these packages over time can be very difficult. RStudio Package Manager makes maintenance and reproducibility much easier.</div><div class="question-quote"><span class="speaker-name">Mike:</span>All right, Lou. So, can you share a use case with our audience?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>We have a ton of <a href="https://www.rstudio.com/about/customer-stories/">great customer stories at rstudio.com</a>, but one of my favorites is Redfin. Redfin is a technology-powered real estate brokerage that serves more than 90 metropolitan areas across the U.S. and Canada. Now when Redfin was smaller, they used to do a lot of planning using basic data models implemented in spreadsheets and gathering input from emails or files saved in Google Drive.</div><div class="no-speaker-quote">But Redfin wanted to get better, smarter answers. They wanted to make these models more complex and to scale these models to handle the increasing scope of the business. And they found that spreadsheets just wouldn't work anymore.
They weren't able to apply the more advanced statistical approaches to forecasting that they wanted, and maintaining the formulas and spreadsheets was error-prone and slow. Plus, the amount of time that it took to consolidate user input into these spreadsheets limited how many iterations of their models they could run. These workbooks would be painfully slow, sometimes taking 10 or more minutes to open up and use. Sometimes they would crash, leaving people unable to use them at all. </div><div class="no-speaker-quote">Redfin used RStudio products to move their data models from spreadsheets to a much more reproducible and scalable data science environment. They saw our products as a way to replicate the interactivity that users loved in spreadsheets, but host all this on a server that was easy to access and maintain. This approach allowed them to build in all those complex statistical approaches that they wanted while still keeping the interface simple for the end users.</div><div class="question-quote"><span class="speaker-name">Mike:</span>All right, Lou. Where can the audience get more information on RStudio's solutions?</div><div class="speaker-quote"><span class="speaker-name">Lou:</span>All the information is available on our website at <a href="https://www.rstudio.com/">rstudio.com</a> and there we talk about our products. We also make it easy to either download our products and try them out, or set up a call with our great sales team to help provide some guidance and answer any questions you have.
I also encourage your listeners to follow <a href="https://blog.rstudio.com">the RStudio blog at blog.rstudio.com</a>, where we write about many of the themes I talked about today as well as share updates on our products and our company.</div><h2><a name="moreinfo">For More Information</a></h2><p>If you&rsquo;d like to learn more about some of the topics discussed in this interview, we recommend exploring:</p><ul><li>An overview of how RStudio helps multi-lingual data science teams at <a href="https://www.rstudio.com/solutions/r-and-python/">R & Python: A Love Story</a>.</li><li><a href="https://www.rstudio.com/about/">RStudio's mission and status</a> as a Public Benefit Corporation.</li><li><a href="https://www.rstudio.com/about/customer-stories/">More examples</a> of the problems RStudio's customers are solving with our products.</li></ul></description></item><item><title>RStudio 1.4 Preview: New Features in RStudio Server Pro</title><link>https://www.rstudio.com/blog/rstudio-1-4-preview-server-pro/</link><pubDate>Mon, 16 Nov 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-4-preview-server-pro/</guid><description><p><em>This blog post is part of a series on new features in RStudio 1.4, currently available as a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a>.</em></p><p>Today, we&rsquo;re going to talk about what&rsquo;s new in RStudio Server Pro (RSP) 1.4. The 1.4 release includes integration with a frequently requested editor (VS Code), several quality-of-life improvements for working with Launcher environments, new user administration commands, and long-awaited SAML support! Let&rsquo;s get started!</p><h2 id="rstudio-server-pro">RStudio Server Pro</h2><h3 id="single-sign-on-authentication-with-saml-20--openid-connect">Single Sign-On Authentication with SAML 2.0 &amp; OpenID Connect</h3><p>RSP 1.4 comes with native support for SAML and OpenID authentication for Single Sign-On.
This allows RSP to leverage any authentication capabilities provided by your organization&rsquo;s Identity Management such as multi-factor authentication.</p><p><strong>Even when using SSO authentication with SAML or OpenID, RSP continues to require local system accounts.</strong> Similar to the authentication mechanisms supported previously by RSP, automatic account creation (provisioning) can be done via <code>sssd</code> integration with your LDAP or Active Directory and with RSP configured to use PAM sessions. You can find more information in the admin guide <a href="https://docs.rstudio.com/ide/server-pro/1.4.1021-2/authenticating-users.html#user-provisioning">here</a> and <a href="https://docs.rstudio.com/ide/server-pro/1.4.1021-2/authenticating-users.html#pam-basics">here</a>.</p><p>If you already have LDAP or Active Directory integration working with RSP with PAM or proxied authentication, getting SAML or OpenID working is just a matter of configuring both RSP and your organization&rsquo;s Identity Management to trust each other. We have some migration recommendations described <a href="https://docs.rstudio.com/ide/server-pro/1.4.1021-2/authenticating-users.html#authentication-migration">here</a>.</p><p>When configuring your Identity Management, the only information RSP needs to know about each user is their local account username, so this information is required in assertions or claims sent during authentication. By default, RSP expects an attribute called &ldquo;Username&rdquo; (case-sensitive) for SAML and a claim called &ldquo;preferred_username&rdquo; for OpenID, but those can be customized if necessary.</p><p>Note that RSP will not be able to use email addresses or any other user identifier for authentication purposes. 
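<p>As a sketch of what the attribute customization described above might look like, the following <code>/etc/rstudio/rserver.conf</code> fragment is illustrative only; the option names here are assumptions and should be verified against the SAML section of the admin guide before use:</p>

```ini
# /etc/rstudio/rserver.conf -- illustrative SAML snippet only.
# Option names are assumptions; confirm spellings in the RSP 1.4 admin guide.
auth-saml=1
auth-saml-metadata-url=https://idp.example.com/saml/metadata
# RSP expects a "Username" attribute by default (case-sensitive);
# override it here if your IdP sends the local account name differently.
auth-saml-sp-attribute-username=uid
```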
If <code>sssd</code> integration is used, the username received by RSP must exactly match the one provided by <code>sssd</code> for the same user.</p><p>The admin guide contains more information on how to configure <a href="https://docs.rstudio.com/ide/server-pro/1.4.1021-2/authenticating-users.html#saml-single-sign-on-authentication">SAML</a> and <a href="https://docs.rstudio.com/ide/server-pro/1.4.1021-2/authenticating-users.html#openid-connect-authentication">OpenID</a>.</p><blockquote><p>Note: SAML and OpenID cannot yet be configured with Google because it does not provide usernames, only emails. If Google is your preferred authentication, you can keep using it, but be aware it will be deprecated in a future release. We will provide a migration path from Google accounts to OpenID at that time.</p></blockquote><h3 id="vs-code-sessions-preview">VS Code Sessions (Preview)</h3><p>Many data science teams use <a href="https://code.visualstudio.com/">VS Code</a> side by side with RStudio as a tool for reproducible research. In this RSP update, we&rsquo;re making it easier to use these tools together; you can now run VS Code sessions in addition to RStudio and Jupyter sessions inside RSP, providing your data scientists with all of the editing tools they need to do their data science more effectively!</p><p>Just like RStudio sessions, RSP manages all of the authentication and supervision of VS Code sessions, while providing you a convenient dashboard of running sessions. Starting a new VS Code session is as easy as choosing <code>VS Code</code> when you start a new session.</p><img align="center" src="vscode-session.png"><p>Note that RStudio does not bundle VS Code (it must be installed separately) and that VS Code is only available when RSP is configured with the Job Launcher. The VS Code editing experience is provided by the use of the open source <a href="https://github.com/cdr/code-server">code-server</a> which must be installed and configured in order to be used. 
This setup can be done by running the command <code>sudo rstudio-server install-vs-code &lt;install path&gt;</code>, which will download all the necessary binaries and automatically configure the <code>/etc/rstudio/vscode.conf</code> file, which enables VS Code integration. See the <a href="https://docs.rstudio.com/ide/server-pro/1.4.1021-2/vs-code-sessions-preview.html">admin guide</a> for more details.</p><p>Currently, VS Code Sessions are a Preview feature. The feature itself is stable and usable, but you may find some bugs, and we are still working to complete some aspects of the VS Code development workflow. We highly encourage you to submit your feedback to let us know how we can improve!</p><h3 id="job-launcher-project-sharing">Job Launcher Project Sharing</h3><p>In previous versions of RSP, use of the Job Launcher automatically prevented you from using the Project Sharing and Realtime Collaboration features within RStudio sessions. We&rsquo;re excited to announce that this limitation has now been removed, and you can share projects within Launcher sessions just the same as with regular sessions.</p><p>By default, when selecting the users to share projects with from within a session, only users that have signed in and used RSP will be shown, whereas previously the entire system&rsquo;s users were displayed. Enumerating every system user could be exhausting in some environments, and makes no sense in containerized deployments (e.g., Kubernetes). The old behavior can be restored by setting <code>project-sharing-enumerate-server-users=1</code> in the <code>/etc/rstudio/rsession.conf</code> configuration file.</p><img align="center" src="project-sharing.png"><h3 id="local-launcher-load-balancing">Local Launcher Load Balancing</h3><p>In previous versions of RSP, if you wanted to load balance your sessions between multiple nodes running the Local Job Launcher plugin, you had to use an external load balancer to balance traffic between Job Launcher nodes.
In RSP 1.4, load balancing has been improved when used with the Local Launcher to ensure that sessions are automatically load balanced across Launcher nodes that are running RSP and configured in the load balancer configuration file <code>/etc/rstudio/load-balancer</code>. Simply ensure that each RSP instance is configured to connect to its node-local Launcher instance. For more details, see the <a href="https://docs.rstudio.com/ide/server-pro/1.4.1021-2/job-launcher.html#load-balancing-1">admin guide</a>.</p><h3 id="user-administration">User Administration</h3><p>RSP 1.3 introduced the ability to track named user licenses visually in the admin dashboard, as well as the ability to lock users that are no longer using RSP to free up license slots. In 1.4, we have added new admin commands to perform these operations from the command line instead of having to use the GUI. These commands allow you to:</p><ul><li>List all RSP users</li><li>Add new users before they have signed in, indicating whether or not they should have administrator privileges</li><li>Change the admin status of a user</li><li>Lock and unlock users</li></ul><p>Documentation for these commands can be found in the <a href="https://docs.rstudio.com/ide/server-pro/1.4.1021-2/server-management.html#listing-users">admin guide</a>.</p><hr><p>If you&rsquo;re interested in giving the new RStudio Server Pro features a try, please <a href="https://www.rstudio.com/products/rstudio/download/preview">download the RStudio 1.4 preview</a>. 
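<p>As a sketch of the command-line user administration described above, invocations might look like the following; the command names and arguments here are illustrative assumptions, so check the server management section of the admin guide for the exact syntax on your version:</p>

```
# Illustrative only -- verify exact command names and arguments
# against the RSP 1.4 admin guide before running.
sudo rstudio-server list-users            # list all RSP users
sudo rstudio-server add-user jdoe 0       # add "jdoe" as a non-admin user
sudo rstudio-server set-admin jdoe 1      # grant jdoe admin privileges
sudo rstudio-server lock-user jdoe        # lock jdoe to free a license slot
sudo rstudio-server unlock-user jdoe      # restore access for jdoe
```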
Note that RStudio Server Pro 1.4 requires database connectivity; see the <a href="http://docs.rstudio.com/ide/server-pro/1.4.1021-2/database.html">admin guide</a> for full documentation on prerequisites.</p></description></item><item><title>Where Does RStudio Fit into Your Cloud Strategy?</title><link>https://www.rstudio.com/blog/cloud-strategy/</link><pubDate>Thu, 12 Nov 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/cloud-strategy/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@mantashesthaven?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="blank" rel="noopener noreferrer"> Mantas Hesthaven</a> on <a href="https://unsplash.com/s/photos/journey?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="blank" rel="noopener noreferrer">Unsplash</a></sup></p><p>Over the last few years, more companies have begun migrating their data science work to the cloud. As they do, they naturally want to bring along their favorite data science tools, including RStudio, R, and Python. In this blog post, we discuss the various ways RStudio products can help you along that journey.</p><h2 id="why-do-organizations-want-to-move-to-the-cloud">Why Do Organizations Want to Move to the Cloud?</h2><p>There are many reasons why organizations are looking to use cloud services more widely for data science. They include:</p><ul><li><strong>Long delays and high startup costs for new data science teams:</strong> When you bring a new team of data scientists onboard, it can be costly and time consuming to spin up the necessary hardware for the team. New hardware might be needed for developing data science analyses or for sharing interactive Shiny applications for stakeholders. 
These burdens tend to fall either on the individual data scientists or on DevOps and IT administrators who are responsible for configuring servers.</li><li><strong>Obstacles to collaboration between organizations or groups:</strong> If a team is restricted to operating within their organization&rsquo;s firewall, it can be very difficult to support collaboration or instruction between groups that don&rsquo;t normally interact with each other. For example, running a data science workshop or statistics class can be unwieldy if everyone is working within their own separate environments.</li><li><strong>High costs of computing infrastructure:</strong> Another key challenge is the potentially high costs of setting up and maintaining an organization&rsquo;s computing infrastructure, including both hardware and software. These costs include the initial investments, maintenance and upgrade fees, and the related manpower costs.</li><li><strong>Difficulty scaling to meet variable demand:</strong> Scaling server resources to satisfy highly variable data science demands can be very difficult because organizations rarely maintain excess capacity. For example, an organization may want to publish a news article or a COVID dashboard for which they expect high demand, only to discover that it needs the IT organization to spin up a back-end Kubernetes cluster to handle the load.</li><li><strong>Excessive time and costs moving the data to the analysis:</strong> If an organization&rsquo;s data is already stored on one of the major cloud providers or in a remote data center, moving that data to your laptop for analysis can be slow and expensive. Ideally, you should perform the data access, transformation and analysis as close to where the data lives as possible. 
Not doing so could subject you to excessive data transfer charges to move the data.</li></ul><h2 id="let-your-data-science-goals-drive-your-cloud-strategy">Let Your Data Science Goals Drive Your Cloud Strategy</h2><p>Depending on the circumstances of your organization and what specific challenges you are trying to address, you should consider four possible options for your data science cloud strategy:</p><ul><li><strong>Hosted and Software as a Service (SaaS) offerings:</strong> A fully hosted service can minimize the cost and time required to start up a new project. However, functionality may be limited compared to on-premise offerings, and integration with your internal data and infrastructure can be challenging.</li><li><strong>Deployment to a Virtual Private Cloud (VPC) provider:</strong> Deploying software on a major cloud platform such as Amazon Web Services (AWS) or Azure can provide the full flexibility and customization of on-premise software. However, setting up a virtual private cloud application often requires more management overhead to integrate with your internal systems, as well as careful administration of usage to avoid unexpected usage charges.</li><li><strong>Cloud Marketplace Offerings:</strong> Pre-built applications offered on services such as the AWS and Azure Marketplaces make it easy to get started at a pay-as-you-go hourly cost, but require careful management to ensure the software is available and running only when needed.</li><li><strong>Data science in your data lake:</strong> By embedding your data science tools into your existing data platform, your computations can run close to the data, minimize overhead, and tie easily into your data pipeline. However, this adds additional complexity and potential limitations.</li></ul><p>We&rsquo;ve provided the table below to help you assess the various RStudio cloud offerings. It matches up problems and potential solutions with specific RStudio options and resources to consider.
The options are arranged in order of increasing complexity of configuration and administration.</p><div class="text-center mt-5"><strong>Table 1: Summary of Cloud Options for RStudio Software</strong></div><table><thead><tr><th class="problem"> Problem </th><th class="solution"> Potential Solution </th><th class="proscons"> Pros and Cons </th><th class="options"> Options to consider </th></tr></thead><tr><td>Simplify and reduce startup costs </td><td> SaaS/Hosted offering </td><td><div class="procon">Pros:</div><ul><li>Simplest and lowest cost to deploy</li><li>Hardware and software managed by the provider</li><li>Costs may be fixed, variable or a mix of the two</li></ul><div class="procon">Cons:</div> <ul><li>Limited integration with your organization’s internal data and security protocols. </li><li>May not be cost efficient for large groups</li><li>May have limited options for custom configuration</li></ul></td><td><div class="action">Create data science analyses with <a href="https://rstudio.cloud/" target="blank" rel="noopener noreferrer">RStudio Cloud</a></div><div class="action">Share Shiny applications with <a href="https://www.shinyapps.io/" target="blank" rel="noopener noreferrer">shinyapps.io</a></div><div class="action">Manage packages with <a href="https://packagemanager.rstudio.com/client/#/" target="blank" rel="noopener noreferrer">RStudio Public Package Manager</a>, a free service to provide easy installation of package binaries, and access to previous package versions</div></td></tr><tr><td> Promote collaboration or instruction between organizations or groups </td><td> SaaS/Hosted offering </td><td> <div class="procon">Pros:</div><ul><li>Same pros as above, plus the ability to easily share projects</li></ul><div class="procon">Cons:</div><ul><li>Same cons as above</li></ul></td><td> Share projects or teach classes/workshops with <a href="https://rstudio.cloud/" target="blank" rel="noopener noreferrer">RStudio Cloud</a> </td></tr><tr><td rowspan="2"> 
Mitigate high costs of computing infrastructure </td><td> Marketplace Offerings </td><td> <div class="procon">Pros:</div><ul><li>Easy to get started at minimal, pay-as-you-go (hourly) cost.</li><li>Access to specialized hardware (e.g GPUs)</li></ul><div class="procon">Cons:</div><ul><li>To manage hourly costs, careful management is required to ensure software is running only when needed </li></ul></td><td>RStudio products on <a href="https://aws.amazon.com/marketplace/seller-profile?id=6185573f-e9d3-4df1-a8da-2cd4996a3561" target="blank" rel="noopener noreferrer">AWS Marketplace</a>, <a href="https://azuremarketplace.microsoft.com/en-us/marketplace/apps?search=rstudio" target="blank" rel="noopener noreferrer">Azure Marketplace</a>, and <a href="https://console.cloud.google.com/marketplace/partners/rstudio-launcher-public" target="blank" rel="noopener noreferrer">Google Cloud Platform</a>.</td></tr><tr><td> Deployment to a VPC on a major cloud provider </td><td> <div class="procon">Pros:</div><ul><li>Outsources hardware costs</li><li>Integrates with existing analytic assets on cloud platforms</li><li>Allows easy customization and configuration</li><li>Provides access to specialized hardware (e.g GPUs)</li><li>Ensures data sovereignty by running your processes in a local cloud region</li></ul><div class="procon">Cons:</div><ul><li>Complexity of managing software configuration and integration with your organization’s on-premise data and security protocols. </li><li>Costs may be highly variable, based on usage</li></ul></td><td> <div class="action">Deploy RStudio products in a VPC, using cloud formation templates for AWS and Azure ARM template (See <a href="https://github.com/rstudio/rstudio-cloud-tools" target="blank" rel="noopener noreferrer">RStudio Cloud Tools</a>)</div><div class="action">Deploy RStudio products via Docker e.g. use EKS (Elastic Kubernetes Service) on AWS. 
(See <a href="http://github.com/rstudio/rstudio-docker-products" target="blank" rel="noopener noreferrer">Docker images for RStudio Professional Products</a>)</div><div class="action"><a href="https://docs.rstudio.com/pro-drivers/" target="blank" rel="noopener noreferrer">Connect to cloud based data storage</a>, such as Redshift or S3.</div></td></tr><tr><td> Scale to meet variable demand </td><td> Clustering approaches, including Kubernetes </td><td> <div class="procon">Pros:</div><ul><li>Cloud-deployed applications can be easily scaled to meet demand, since cloud providers provide container resources on demand.</li></ul><div class="procon">Cons:</div><ul><li>Careful management required to avoid unnecessary compute costs, while still matching job requirements to computational needs.</li></ul></td><td><div class="action">In addition to the points above, <a href="https://solutions.rstudio.com/launcher/kubernetes/" target="blank" rel="noopener noreferrer">RStudio Server Pro's Launcher</a> integrates with Kubernetes, an industry-standard clustering solution that allows efficient scaling.</div><div class="action">RStudio Connect provides <a href="https://support.rstudio.com/hc/en-us/articles/231874748-Scaling-and-Performance-Tuning-in-RStudio-Connect" target="blank" rel="noopener noreferrer">many options to scale and tune performance</a>, including being part of an autoscaling group. 
These options allow Connect to deliver dashboards, Shiny applications, and other types of content to large numbers of users.</div></td></tr><tr><td> Minimize data movement </td><td> Data lakes </td><td> <div class="procon">Pros:</div><ul><li>Run your computations close to the data, minimizing overhead</li><li>Tie your data science directly into your data pipeline</li></ul><div class="procon">Cons:</div><ul><li>Adds additional complexity and potential limitations</li></ul></td><td><div class="action"><a href="https://www.qubole.com/qubole-supercharges-capabilities-for-data-science-and-exploration-via-rstudio-integration/" target="blank" rel="noopener noreferrer">RStudio Server Pro in Qubole Data Platform</a>, for Azure, AWS and GCP</div><div class="action"><a href="https://spark.rstudio.com/examples/databricks-cluster/" target="blank" rel="noopener noreferrer">Use sparklyr with DataBricks</a></div><div class="action"><a href="https://docs.rstudio.com/pro-drivers/" target="blank" rel="noopener noreferrer">Connect to cloud based data storage</a>, such as Redshift or S3.</div><div class="action">Managed RStudio Server Pro on Spark and Hadoop on Azure and AWS (<a href="https://cazena.com/data-lake-solutions/rstudio" target="blank" rel="noopener noreferrer">Cazena</a>) </div></td></tr></table><h2 id="ready-to-take-rstudio-to-the-cloud">Ready to Take RStudio to the Cloud?</h2><p>If you&rsquo;d like to take RStudio along on your journey to the cloud, you can start by exploring the resources linked in the table above. We also invite you to join us on December 2 for a webinar, &ldquo;<a href="https://rstudio.com/registration/why-data-science-in-the-cloud/" target="blank" rel="noopener noreferrer">What does it mean to do data science in the cloud?</a>&rdquo;, conducted with our partner <a href="https://www.procogia.com/" target="blank" rel="noopener noreferrer">ProCogia</a>. 
You can <a href="https://rstudio.com/registration/why-data-science-in-the-cloud/" target="blank" rel="noopener noreferrer">register for the webinar here</a>.</p><p>Our product team is also happy to provide advice and guidance along this journey. If you&rsquo;d like to set up a time to talk with us, you can <a href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio" target="blank" rel="noopener noreferrer">book a time here</a>. We look forward to being your guide.</p><style>table thead th {border-bottom: 1px solid #ddd;}th {font-size: 90%;background-color: #4D8DC9;color: #fff;vertical-align: center}td {font-size: 80%;background-color: #F6F6FF;vertical-align: top;line-height: 16px;}caption {padding: 0 0 16px 0;}table {width: 100%;}th.problem {width: 15%;}th.solution {width: 15%;}th.proscons {width: 35%;}th.options {width: 35%;}div.action {padding: 0 0 16px 0;}div.procon {padding: 0 0 0 0;}td.ul {padding: 0 0 0 0;margin-block-start: 0em;}table {border-top-style: hidden;border-bottom-style: hidden;border-collapse: separate;text-indent: initial;border-spacing: 2px;}table>thead>tr>th, .table>thead>tr>th {font-size: 0.7em !important;}table>tbody>tr>td {line-height: inherit;vertical-align: baseline;}table tbody td {font-size: 14px;}</style><blockquote></blockquote></description></item><item><title>The Appsilon shiny.semantic PoContest</title><link>https://www.rstudio.com/blog/the-appsilon-shiny-semantic-pocontest/</link><pubDate>Tue, 10 Nov 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/the-appsilon-shiny-semantic-pocontest/</guid><description><p>One of our Full Service Partners, <a href="https://appsilon.com/shiny" target="_blank" rel="noopener noreferrer">Appsilon</a>, recently held an internal competition to help test an open source R package they developed called <a href="https://github.com/Appsilon/shiny.semantic" target="_blank" rel="noopener noreferrer"><code>shiny.semantic</code></a>. 
This package is designed to quickly help create proof-of-concept Shiny apps using the Fomantic UI library. To make the competition a little more interesting, Appsilon reached out and asked if we would judge the submissions on their technical and creative merit. Here’s a sneak peek of the apps we got to review, and a summary of the winners.</p><p>The goal of the <code>shiny.semantic</code> library is to help developers create beautiful, sophisticated apps rapidly for demoing or proof-of-concept (PoC) purposes. By quickly creating a visually appealing PoC, users are able to showcase the capabilities of Shiny without having to invest in a lot of development time up front.</p><p>While <code>shiny</code> works with the Bootstrap library under the hood, <code>shiny.semantic</code> uses <a href="https://fomantic-ui.com/" target="_blank" rel="noopener noreferrer">Fomantic UI</a> (formerly Semantic UI) to develop its layout structure. Fomantic UI groups elements with similar layout concepts together, letting users create beautiful reactive HTML outputs. Because the structure is semantically grouped, users can add related elements more easily, allowing them to create complex layouts with little development overhead.</p><p>Appsilon wanted to prove that with <code>shiny.semantic</code>, it’s possible to create a great looking and high quality Shiny app in under 24 hours. 
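To give a sense of what this looks like in practice, here is an illustrative sketch (not one of the contest entries) of a minimal <code>shiny.semantic</code> app. It assumes only the package&rsquo;s core <code>semanticPage()</code> wrapper plus standard Shiny functions; the class names come from Fomantic UI:

```r
# Illustrative sketch: shiny.semantic's semanticPage() stands in for
# shiny::fluidPage(), so Fomantic UI class names such as
# "ui raised segment" drive the layout.
library(shiny)
library(shiny.semantic)

ui <- semanticPage(
  title = "Proof-of-concept demo",
  div(class = "ui raised segment",
      h2(class = "ui header", "Hello, Fomantic UI"),
      textInput("name", "Your name"),
      textOutput("greeting"))
)

server <- function(input, output) {
  output$greeting <- renderText(paste0("Hello, ", input$name, "!"))
}

shinyApp(ui, server)
```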
As a result, they ran an internal competition to test this theory.</p><p>Contest Rules:</p><ol><li>You must use the <code>shiny.semantic</code> package.</li><li>Your demonstration must be built by a single person.</li><li>Development time must not exceed 24 hours.</li></ol><p>Below are the winners of the competition along with some honorable mentions.</p><h3 id="most-technically-impressive-a-hrefhttpsdemoappsilonaiappspolluter-target_blank-relnoopener-noreferrerpolluter-alerta">Most Technically Impressive: <a href="https://demo.appsilon.ai/apps/polluter/" target="_blank" rel="noopener noreferrer">Polluter Alert</a></h3><img align="center" src="polluter-alert.png"><p>This is a Shiny dashboard created by Appsilon co-founder <a href="https://appsilon.com/author/pawel/" target="_blank" rel="noopener noreferrer">Pawel Przytula</a> that allows the user to view and report sources of air pollution in a particular area over time. The app is made up of a Leaflet map that shows live pollution sources and additional data about the sources as gathered from the <a href="https://developer.airly.eu/" target="_blank" rel="noopener noreferrer">Airly API</a>. In addition to exploring current air pollution sources, the user can also upload their own polluters using their webcam and GPS location.</p><p>This application is quite sophisticated in that it uses multiple APIs and leverages different tools to connect them. 
Even though it’s just a proof of concept, this app would be widely helpful for local governments.</p><p>Development Time: 17 hours<br><a href="https://github.com/Appsilon/shiny.semantic-hackathon-2020/tree/master/polluter-alert" target="_blank" rel="noopener noreferrer">Github Repository</a></p><h3 id="runner-up-a-hrefhttpsdemoappsilonaiappsfifa19-target_blank-relnoopener-noreferrerfifa-19a">Runner up: <a href="https://demo.appsilon.ai/apps/fifa19/" target="_blank" rel="noopener noreferrer">FIFA &lsquo;19</a></h3><img align="center" src="fifa.png"><p>Inspired by <a href="https://ekrem-bayar.shinyapps.io/FifaDash/" target="_blank" rel="noopener noreferrer">Ekrem Bayar’s FIFA &lsquo;19 Shiny dashboard</a> and using data from the <a href="https://sofifa.com/" target="_blank" rel="noopener noreferrer">SoFIFA dataset</a>, this application by <a href="https://appsilon.com/author/dominik/" target="_blank" rel="noopener noreferrer">Dominik Krzemiński</a> lets users explore FIFA &lsquo;19 data by comparing teams, players, and leagues. This app is a great example of all the different visualizations available in Shiny, and for being developed in under 10 hours, we are quite impressed! 
Dominik leads the development of <a href="http://shiny.tools" target="_blank" rel="noopener noreferrer">open source packages at Appsilon</a>, so it’s no surprise that he has such a great handle on <code>shiny.semantic</code>’s capabilities.</p><p>Development Time: 9 hours<br><a href="https://github.com/Appsilon/shiny.semantic-hackathon-2020/tree/master/fifa19" target="_blank" rel="noopener noreferrer">Github Repository</a></p><h3 id="most-creatively-impressive-a-hrefhttpsdemoappsilonaiappspixelator-target_blank-relnoopener-noreferrersemantic-pixelatora">Most Creatively Impressive: <a href="https://demo.appsilon.ai/apps/pixelator/" target="_blank" rel="noopener noreferrer">Semantic Pixelator</a></h3><img align="center" src="semantic-pixelator.png"><p>Created by <a href="https://appsilon.com/author/pedro/" target="_blank" rel="noopener noreferrer">Pedro Silva</a>, one of the Grand Prize Winners of the <a href="https://blog.rstudio.com/2020/07/13/winners-of-the-2nd-shiny-contest/" target="_blank" rel="noopener noreferrer">2020 RStudio Shiny Contest</a>, Semantic Pixelator is a clever way for users to explore images by composing them into a mosaic using loaders, icons, and other semantically-related UI elements.</p><p>The user can start with a random image or upload their own picture (as you might have recognized in the header photo at the top of this article) to start. The user can then use the sidebar to refine different parameters such as the size of the generated grid, the base element type, and different color options. The user can then use the palette generator to generate a color palette based on the result, as well as download both the current palette details and the generated composition. 
The app is also full of Easter eggs, so try exploring and even typing in random words!</p><p>Development Time: 24 hours<br><a href="https://github.com/Appsilon/shiny.semantic-hackathon-2020/tree/master/semantic.pixelator" target="_blank" rel="noopener noreferrer">Github Repository</a></p><h2 id="honorable-mentions">Honorable Mentions</h2><h3 id="a-hrefhttpsdemoappsilonaiappsmosaic-target_blank-relnoopener-noreferrershiny-mosaica"><a href="https://demo.appsilon.ai/apps/mosaic/" target="_blank" rel="noopener noreferrer">Shiny Mosaic</a></h3><img align="center" src="shiny-mosaic.png"><p>This application allows users to create a photo mosaic using their own photo. First, users select a theme for the final image, either dogs, cats, or a custom one. Then, either by using their webcam or uploading a photo, the user can generate and download a photo collage of their image generated from an image library of the selected theme. So, for instance, you could make a photo mosaic of a picture of your dog that is composed of hundreds of images of other dogs.</p><p>Development Time: 24 hours<br><a href="https://github.com/Appsilon/shiny.semantic-hackathon-2020/tree/master/mosaic" target="_blank" rel="noopener noreferrer">Github Repository</a></p><h3 id="a-hrefhhttpsdemoappsilonaiappssquaremantic-target_blank-relnoopener-noreferrersquaremantica"><a href="https://demo.appsilon.ai/apps/squaremantic/" target="_blank" rel="noopener noreferrer">Squaremantic</a></h3><img align="center" style="padding: 0px;" src="squaremantic.png"><p>This application, developed by Jakub Chojna, creates a visually appealing square layout from text input. The user can update the formatting of the letters via the sidebar input and eventually download the final result as a PDF file. We can definitely see a graphic design team using this application to generate ideas and proof of concepts.
Now if only it helped to create hex-shaped images of R package logos&hellip;</p><p>Development Time: 20 hours<br><a href="https://github.com/Appsilon/shiny.semantic-hackathon-2020/tree/master/squaremantic" target="_blank" rel="noopener noreferrer">Github Repository</a></p><h3 id="a-hrefhttpsdemoappsilonaiappssemantic_memory-target_blank-relnoopener-noreferrersemantic-memorya"><a href="https://demo.appsilon.ai/apps/semantic_memory/" target="_blank" rel="noopener noreferrer">Semantic Memory</a></h3><img align="center" style="padding: 0px;" src="semantic-memory.png"><p>Inspired by one of the 2019 Shiny Contest winners, <a href="https://community.rstudio.com/t/shiny-contest-submission-hex-memory-game/25336" target="_blank" rel="noopener noreferrer">Hex Memory Game</a>, Semantic Memory was created by <a href="https://appsilon.com/author/kuba/" target="_blank" rel="noopener noreferrer">Jakub Nowicki</a>, using <code>shiny.semantic</code>. Two players try to find as many pairs of RStudio and Appsilon package hex logos as they can, while the app tallies scores and displays a winner for each game.</p><p>Development Time: 12 hours<br><a href="https://github.com/Appsilon/shiny.semantic-hackathon-2020/tree/master/semantic_memory" target="_blank" rel="noopener noreferrer">Github Repository</a></p><h3 id="vote-for-the-peoples-choice-winner">Vote for the People’s Choice Winner</h3><p>Which <code>shiny.semantic</code> PoC app is your favorite? Click <a href="https://forms.gle/KPvtkdKSTsY94h6a8" target="_blank" rel="noopener noreferrer">here</a> to browse all six submissions and vote for the People’s Choice Award. 
The PoC app with the most votes from the R community will win a special prize from the Appsilon team.</p><h3 id="to-learn-more">To Learn More</h3><p>If you’d like to learn more about using and deploying Shiny applications, we encourage you to check out the following links:</p><ul><li>The ever-popular <a href="https://github.com/rstudio/cheatsheets/raw/master/shiny.pdf" target="_blank" rel="noopener noreferrer">Shiny Cheatsheet</a> provides a tour of the Shiny package and explains how to build and customize an interactive app.</li><li>The <a href="https://shiny.rstudio.com/" target="_blank" rel="noopener noreferrer">Shiny project site</a> provides tutorials, resources, and many examples of Shiny applications.</li><li>If you need to deploy and share your Shiny applications, <a href="https://www.shinyapps.io/" target="_blank" rel="noopener noreferrer">shinyapps.io</a> lets you deploy your apps on the web in minutes, while <a href="https://rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a> allows you to share Shiny apps (and many other types of data science content) within your organization.</li></ul></description></item><item><title>RStudio 1.4 Preview: Citations</title><link>https://www.rstudio.com/blog/rstudio-1-4-preview-citations/</link><pubDate>Mon, 09 Nov 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-4-preview-citations/</guid><description><p><em>This post is part of a series on new features in RStudio 1.4, currently available as a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a>.</em></p><p>A few weeks ago we blogged about the new <a href="https://blog.rstudio.com/2020/09/30/rstudio-v1-4-preview-visual-markdown-editing/">visual markdown editor</a> included in RStudio v1.4. Today we&rsquo;ll go into more depth on the <a href="https://rstudio.github.io/visual-markdown-editing/#/citations">citation features</a> included in visual mode, including easy insertion
of citations from:</p><ol><li>Your document bibliography.</li><li><a href="#citations-from-zotero">Zotero</a> personal or group libraries.</li><li><a href="#citations-from-dois">DOI</a> (Digital Object Identifier) references.</li><li>Searches of <a href="https://www.crossref.org/">Crossref</a>, <a href="https://datacite.org/">DataCite</a>, or <a href="https://pubmed.ncbi.nlm.nih.gov/">PubMed</a>.</li></ol><h2 id="inserting-citations">Inserting Citations</h2><p>You may insert citations using the <strong>Insert -&gt; Citation</strong> command, after placing your cursor in the body of your document where you&rsquo;d like to insert the citation. Alternatively, you can use markdown syntax directly (e.g. by typing <code>[@cite]</code> or <code>@cite</code>).</p><p>Use the <kbd><img src="images/citation_2x.png" width="15" height="14"/></kbd> toolbar button or the <kbd>⇧⌘ F8</kbd> keyboard shortcut to show the <strong>Insert Citation</strong> dialog:</p><img src="images/visual-editing-citation-search.png" class="illustration" width="918"/><p>Note that you can insert multiple citations by using the add button on the right side of the item display.</p><p>If you insert citations from Zotero, DOI look-up, or a search, then they are automatically added to your document bibliography.</p><h3 id="markdown-syntax">Markdown Syntax</h3><p>You can also insert citations directly using markdown syntax (e.g.
<code>[@cite]</code>). When you do this, a completion interface is provided for searching available citations:</p><img src="images/visual-editing-citations.png" width="700"/><p>If you aren&rsquo;t familiar with Pandoc&rsquo;s citation syntax, here&rsquo;s a quick refresher. Citations go inside square brackets and are separated by semicolons. Each citation must have a key, composed of &lsquo;@&rsquo; + the citation identifier from the database, and may optionally have a prefix, a locator, and a suffix. Here are some examples:</p><div class="illustration"><div>Blah Blah <span class="citation">[</span><span class="citation">@doe99</span>, pp. 33-35, 38-39<span class="citation">]</span>.</div><div>Blah Blah <span class="citation">[</span><span class="citation">@smith04</span>;<span class="citation">@doe99</span><span class="citation">]</span>.</div><div>Smith says blah <span class="citation">[</span><span class="citation">-@smith04</span><span class="citation">]</span>.</div><div><span class="citation">@smith04</span> <span class="citation">[</span>p.
33<span class="citation">]</span> says blah.</div></div><p>See the <a href="https://pandoc.org/MANUAL.html#citation-syntax">Pandoc Citations</a> documentation for additional information on Pandoc citation syntax.</p><h3 id="citation-ids">Citation IDs</h3><p>Before inserting a citation from an external source, you may wish to customize its ID. Within the <strong>Insert Citation</strong> dialog, click the edit button on the right side of citations to change their ID:</p><img src="images/visual-editing-citations-id.png" class="illustration"/><p>If you insert a new citation via code completion, you will also be provided with the opportunity to change its default citation ID.</p><p>For citations inserted from Zotero, you can also use the <a href="https://retorque.re/zotero-better-bibtex/">Better BibTeX</a> plugin to generate citation IDs and handle BibTeX export (this can be enabled via <a href="https://rstudio.github.io/visual-markdown-editing/#/options?id=citation-options">Citation Options</a> if you have Better BibTeX installed).</p><h3 id="citation-preview">Citation Preview</h3><p>Once you&rsquo;ve inserted a citation, place the cursor over it to see a preview of it along with a link to the source if one is available:</p><img src="images/visual-editing-cite-popup.png" width="700"/><p>The preview (and generated bibliography) will use the currently defined <a href="https://citationstyles.org/">CSL style</a> for the document (as specified in the <code>csl</code> metadata field).
A repository of CSL styles can be found at <a href="http://zotero.org/styles">http://zotero.org/styles</a>.</p><h2 id="citations-from-zotero">Citations from Zotero</h2><p><a href="https://zotero.org">Zotero</a> is a popular free and open source reference manager. If you use Zotero, you can also insert citations directly from your Zotero libraries. If you have Zotero installed locally, its location will be detected automatically and citations from your main library (<strong>My Library</strong>) will be available:</p><img src="images/visual-editing-citations-zotero-browse.png" class="illustration" width="918"/><p>Note that while they aren&rsquo;t enabled by default, you can also insert citations from Zotero <a href="#group-libraries">Group Libraries</a> (see the next section for details).</p><p>Zotero references will also show up automatically in completions:</p><img src="images/visual-editing-citation-completions.png" width="426"/><p>Items from Zotero will appear alongside items from your bibliography with a small &ldquo;Z&rdquo; logo juxtaposed over them. If you insert a citation from Zotero that isn&rsquo;t already in your bibliography, then it will be automatically added to the bibliography.</p><p>If you are running both RStudio and Zotero on your desktop, then no additional configuration is required for connecting to your Zotero library. If, however, you are using RStudio Server and/or want to access your Zotero library over the web, then a few more steps are required (see the <a href="https://rstudio.github.io/visual-markdown-editing/#/citations?id=zotero-web-api">Zotero Web API</a> documentation for details).</p><h3 id="group-libraries">Group Libraries</h3><p><a href="https://www.zotero.org/support/groups">Zotero Groups</a> provide a powerful way to share collections with a class or work closely with colleagues on a project. By default, Zotero Group Libraries are not included in the <strong>Insert Citation</strong> dialog or citation completions. However, there are
options available to use group libraries at a global, per-project, or per-document level.</p><p>For example, here we specify a project-level option to use the <em>Reproducible Research Series (Year 1)</em> group library:</p><img src="images/visual-editing-citation-zotero-group.png" class="illustration" width="543"/><p>You can also specify one or more libraries within YAML.For example:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-yaml" data-lang="yaml">---<span style="color:#bbb"></span><span style="color:#bbb"></span><span style="color:#007020;font-weight:bold">title</span>:<span style="color:#bbb"> </span><span style="color:#4070a0">&#34;Reproducible Research&#34;</span><span style="color:#bbb"></span><span style="color:#bbb"></span><span style="color:#007020;font-weight:bold">zotero</span>:<span style="color:#bbb"> </span><span style="color:#4070a0">&#34;Reproducible Research Series (Year 1)&#34;</span><span style="color:#bbb"></span><span style="color:#bbb"></span>---<span style="color:#bbb"></span></code></pre></div><p>Note that you can also turn off Zotero entirely for a document using <code>zotero: false</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-yaml" data-lang="yaml">---<span style="color:#bbb"></span><span style="color:#bbb"></span><span style="color:#007020;font-weight:bold">title</span>:<span style="color:#bbb"> </span><span style="color:#4070a0">&#34;Reproducible Research&#34;</span><span style="color:#bbb"></span><span style="color:#bbb"></span><span style="color:#007020;font-weight:bold">zotero</span>:<span style="color:#bbb"> </span><span style="color:#007020;font-weight:bold">false</span><span style="color:#bbb"></span><span style="color:#bbb"></span>---<span style="color:#bbb"></span></code></pre></div><h2 id="citations-from-dois">Citations from DOIs</h2><p>Use the <strong>From 
DOI</strong> pane of the <strong>Insert Citation</strong> dialog to insert a citation based on a <a href="https://www.doi.org/">DOI</a> (e.g. one that you have retrieved from a PubMed or other search):</p><img src="images/visual-editing-citation-insert-doi.png" width="918"/><p>If you are using markdown syntax, you can also paste a DOI after the <code>[@</code> and it will be looked up:</p><img src="images/visual-editing-citations-doi.png" width="700"/><p>Once you&rsquo;ve confirmed that it&rsquo;s the correct work (and possibly modified the suggested ID), the citation will be inserted into the document and an entry for the work added to your bibliography.</p><h2 id="citations-from-search">Citations from Search</h2><p>Use the <strong>Crossref</strong>, <strong>DataCite</strong>, and <strong>PubMed</strong> panes of the <strong>Insert Citation</strong> dialog to search one of those services for a citation:</p><img src="images/visual-editing-citations-crossref.png" class="illustration" width="918"/><p>Items inserted from a search will automatically be added to your bibliography.</p><p>Note that for PubMed queries you can use the full supported query syntax. For example, this query searches on the author and title fields: <code>Peterson[Author] AND Embolism[Title]</code>. You can learn more about building PubMed queries here: <a href="https://pubmed.ncbi.nlm.nih.gov/advanced/">https://pubmed.ncbi.nlm.nih.gov/advanced/</a>.</p><h2 id="try-it-out">Try it out!</h2><p>You can try out the new features from this blog series by installing the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.4 Preview Release</a>. If you do, we very much welcome your feedback on our <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>. Complete documentation for using citations can be found at <a href="https://rstudio.github.io/visual-markdown-editing/#/citations">https://rstudio.github.io/visual-markdown-editing/#/citations</a>.</p><style
type="text/css">kbd {display: inline-block;text-align: center;padding: 0em 0.4em;border: 1px solid hsl(113, 0%, 89%);border-radius: 4px;background: hsl(113, 0%, 97%);}.illustration {border: 1px solid rgb(230, 230, 230);padding: 6px;}.citation {color: blue;}</style></description></item><item><title>RStudio 1.4 Preview: Rainbow Parentheses</title><link>https://www.rstudio.com/blog/rstudio-1-4-preview-rainbow-parentheses/</link><pubDate>Wed, 04 Nov 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-4-preview-rainbow-parentheses/</guid><description><p><img src="colorfulHeader.png" alt=""></p><p><em>This post is part of a series on new features in RStudio 1.4, currently available as a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a>.</em></p><p>Beautiful code themes and rainbow parentheses, a tale as old as&hellip;well <a href="https://github.com/rstudio/rstudio/issues/1888">at least 2017</a>. Being able to color your parentheses (and brackets and braces) based on the level of nesting has been a highly requested feature for years and we&rsquo;re happy to announce that it&rsquo;s available in the upcoming 1.4 release of RStudio.</p><h3 id="enabling-rainbow-parentheses">Enabling Rainbow Parentheses</h3><p>Rainbow parentheses are turned off by default. 
To enable them:</p><ol><li><p>Open Global Options from the Tools menu</p></li><li><p>Select Code -&gt; Display</p></li><li><p>Enable the Rainbow Parentheses option at the bottom</p></li></ol><p><img src="rainbowOptions.png" alt=""></p><h3 id="optional-use">Optional Use</h3><p>If you would prefer to only use the Rainbow Parentheses option on a per-file basis (just for specific debugging, for example), you can toggle this option by using the Command Palette.</p><ol><li><p>Open the Command Palette by either using the keyboard shortcut (Default: Control/Command + Shift + P) or through the Tools -&gt; Command Palette menu option.</p></li><li><p>Type <code>rainbow</code> to quickly highlight the <code>Toggle Rainbow Parentheses Mode</code> option and select it to toggle the option.</p></li></ol><p><img src="toggleRainbow.png" alt=""></p><p>This toggle applies only to the file itself, so the rest of your environment will continue to respect the global setting.</p><h3 id="configuring">Configuring</h3><p>If you don&rsquo;t like the default colors, or they don&rsquo;t quite work for your theme, you can customize them to whatever you like. See <a href="https://rstudio.github.io/rstudio-extensions/rstudio-theme-creation.html">this article</a> on writing your own RStudio theme. The relevant classes to change are <code>.ace_paren_color_0</code> to <code>.ace_paren_color_6</code>.</p><p><img src="colorfulCode.png" alt=""></p><h3 id="try-it-out"><strong>Try it out!</strong></h3><p>You can try out the new features from this blog series by installing the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.4 Preview Release</a>. 
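</p><p>As a concrete illustration, a custom theme could override these classes with CSS along the following lines. This is only a sketch: the class names are the ones noted above, but the color values here are placeholders rather than RStudio defaults.</p><pre><code>/* Hypothetical excerpt from a custom .rstheme file */
.ace_paren_color_0 { color: #cc2222; } /* outermost nesting level */
.ace_paren_color_1 { color: #cc7722; }
.ace_paren_color_2 { color: #aaaa22; }
.ace_paren_color_3 { color: #22aa22; }
.ace_paren_color_4 { color: #2277cc; }
.ace_paren_color_5 { color: #7722cc; }
.ace_paren_color_6 { color: #cc22aa; } /* deepest distinct level */
</code></pre><p>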
If you do, we very much welcome your feedback on our <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>!</p></description></item><item><title>2020 Table Contest Deadline Extended</title><link>https://www.rstudio.com/blog/table-contest-deadline-extended/</link><pubDate>Fri, 30 Oct 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/table-contest-deadline-extended/</guid><description><p>The original deadline for the <a href="https://www.rstudio.com/blog/announcing-the-2020-rstudio-table/">2020 Table Contest</a> was scheduled for October 31, 2020.</p><p>We know you&rsquo;ve been busy, and that&rsquo;s okay. Because we&rsquo;ve had a number of requests for extensions — including some interest in summarizing election data — the deadline has been extended by two weeks to November 14, 2020.</p><p>If you have already submitted an entry, you are free to update it up to the closing date. Find all <a href="https://community.rstudio.com/tag/table-contest">table contest submissions on RStudio Community</a>.</p></description></item><item><title>Why RStudio Supports Python for Data Science</title><link>https://www.rstudio.com/blog/why-rstudio-supports-python/</link><pubDate>Fri, 30 Oct 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/why-rstudio-supports-python/</guid><description><p>As RStudio&rsquo;s products have increasingly supported Python over the past year, some of our seasoned customers have given us quizzical looks and asked, &ldquo;Why are you adding Python support? I thought you were an R company!&rdquo;</p><p>Just to set the record straight, RStudio does love R and the R community, and we have no plans to change that. 
However, if RStudio&rsquo;s goal is to &ldquo;enhance the production and consumption of knowledge by everyone, regardless of economic means&rdquo; (which is what we say in <a href="https://rstudio.com/about/">our mission statement</a>), that means we have to be open to all ways of approaching that goal, not just the R-based ones.</p><p>This still leaves open the question of why we would embrace a language that some in the data science world think of as a competitor. And while I can&rsquo;t claim we have a definitive answer, we do have something more than anecdotes to encourage R users to embrace Python as well. We have data.</p><h2 id="survey-data-says-r-and-python-are-used-for-different-things">Survey Data Says &ldquo;R and Python Are Used for Different Things&rdquo;</h2><blockquote><p>&ldquo;In God we trust; others must provide data.&rdquo;</p><p><i> &ndash; Attributed to W. Edwards Deming and others, <a href="https://quoteinvestigator.com/2017/12/29/god-data/" target="_blank" rel="noopener noreferrer"> including Anonymous</a></i></p></blockquote><p>RStudio has run a broad-based survey of people who use or intend to use R over the past two years. In the <a href="https://community.rstudio.com/t/help-us-better-understand-the-r-community-by-completing-the-2nd-annual-r-community-survey/47242" target="_blank" rel="noopener noreferrer">2019 edition of the survey</a>, we asked our more than 2,000 respondents to answer two questions:</p><blockquote><p>&ldquo;What applications do you use R for most?&rdquo;</p></blockquote><p>and</p><blockquote><p>&ldquo;What applications do you use Python for most?&rdquo;</p></blockquote><p>Respondents were allowed to check as many answers as they wished in both cases. They also were allowed to enter their own application categories as an open-ended response. 
It is important to note that while this data is indicative of user attitudes, it is by no means conclusive.</p><p>Below are the summary plots for the results of these survey questions.</p><p><img src="thumbnail.jpg" alt="Summary bar chart showing that R is used most commonly for visualization, statistical analysis, and data transformation"></p><caption>Figure 1: R is used most commonly for visualization, statistical analysis, and data transformation.</caption><p><img src="python.jpg" alt="Summary bar chart showing that R users employ Python most commonly for data transformation and machine learning."></p><caption>Figure 2: R users employ Python most commonly for data transformation and machine learning.</caption><p>Taking these charts at face value (again, read the next section before you do that), we can draw some interesting conclusions:</p><ul><li><p><strong>R users use Python!</strong> We had just over 2,000 survey respondents who said they use R, while nearly 1,100 said they use Python. Because our survey is focused on R users, this means that <strong>roughly half of our respondents are using Python as well as R.</strong></p></li><li><p><strong>Visualization and statistical analysis are R&rsquo;s most common uses.</strong> Nearly 9 out of 10 R users apply it in these ways. Data transformation is a close third.</p></li><li><p><strong>Data transformation and machine learning are Python&rsquo;s most common applications.</strong> A majority of Python users do data transformation and machine learning with the language. 
No other applications are as common; only a third of Python users use it for statistical analysis or modeling.</p></li></ul><h2 id="think-of-these-results-as-directional-instead-of-hard-numbers">Think of These Results As Directional Instead of Hard Numbers</h2><p>While these analyses are interesting and the sample sizes reasonable, readers should understand that these results aren&rsquo;t really representative of all data scientists. As the creator and primary analyst for this survey, I can give you several reasons why you shouldn&rsquo;t put too much stock in these numbers beyond their overall direction:</p><ul><li><p><strong>We only surveyed people interested in R.</strong> The introduction to the survey specifically says that it is open to &ldquo;anyone who is interested in R, regardless of whether they have learned the language.&rdquo; If a Python-only user looked at the survey, it&rsquo;s unlikely they would have completed it, which means they aren&rsquo;t represented in the results.</p></li><li><p><strong>We didn&rsquo;t do a random sample.</strong> We solicited responses by asking RStudio employees to invite their Twitter and RStudio Community followers to fill it out. It&rsquo;s highly unlikely that our friends and followers are representative of the larger data science or statistics community, and it undoubtedly leaves out broad swaths of the population of programmers and data scientists.</p></li><li><p><strong>None of the data has been weighted to be representative of any broad population.</strong> We have not weighted the anonymous demographic information collected in the survey to represent any larger population. That means the survey may have significant gender, ethnic, industry, and educational biases that we haven&rsquo;t corrected for.</p></li></ul><p>The best way to think of this survey is that it represents the views of a few thousand of RStudio&rsquo;s friends and customers. 
While this doesn&rsquo;t give us any conclusions about the general population of data scientists or programmers, we can use it to think about what we can do to make those people more productive.</p><h2 id="rstudio-should-and-does-support-both-r-and-python">RStudio Should (and Does) Support Both R and Python</h2><p>Despite the fact that we can&rsquo;t use this survey for general conclusions, we can use this data to think about how RStudio should support our customers and data science community in their work:</p><ul><li><p><strong>We should reject the myth that users must choose between R or Python.</strong> We had always hypothesized that R users use more than one language to do data science. The data we collected from this survey supports that hypothesis. Becoming an R-only company would only make data science jobs more difficult.</p></li><li><p><strong>We should embrace Python because fully half of our community uses it in addition to R.</strong> With more than 50% of R users applying Python to various applications, not supporting Python would force those users to use other tools to get their jobs done.</p></li><li><p><strong>Embracing Python as well as R means that our products should support it too.</strong> Forcing data scientists to swap back and forth between different programming environments is inefficient and lowers productivity. By supporting Python in all our products, both free and commercial, we can help our customers get results faster and more seamlessly. That, in turn, will help RStudio achieve our broader mission: &ldquo;to enhance the production and consumption of knowledge by everyone, regardless of economic means&rdquo;.</p></li></ul><p>While RStudio already <a href="https://rstudio.com/solutions/r-and-python/" target="_blank" rel="noopener noreferrer">offers Python support in its products</a>, we&rsquo;ll be adding to that support in new versions that will be released in the coming months. 
Those announcements will appear both here on blog.rstudio.com and on the main web site, so check regularly for when those are released.</p><hr><h3 id="survey-details">Survey Details</h3><p>RStudio fielded its 2019 R community survey beginning on December 13, 2019. We closed the survey on January 10, 2020, after it had accumulated 2,176 responses. Its details are as follows:</p><ul><li>The survey was fielded in both English and Spanish. Of the 2,176 responses, 1,838 were in English and 338 were in Spanish. All Spanish results were translated into English for analysis.</li><li>The survey consists of 52 questions, but it includes branching so not all respondents answer all questions. It also includes questions to detect survey-completing robots.</li><li>Respondents were solicited from posts on community.rstudio.com and Twitter followers of RStudio employees.</li><li>Survey results are not representative of any broader population.</li><li>Complete data and incomplete processing scripts can be found in the survey&rsquo;s <a href="https://github.com/rstudio/learning-r-survey" target="_blank" rel="noopener noreferrer">GitHub repository</a>.</li><li>The data and scripts are open source and available to anyone interested.</li><li>RStudio expects to field this year&rsquo;s survey in December 2020.</li></ul></description></item><item><title>RStudio 1.4 Preview: Multiple Source Columns</title><link>https://www.rstudio.com/blog/rstudio-1-4-preview-multiple-source-columns/</link><pubDate>Wed, 21 Oct 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-4-preview-multiple-source-columns/</guid><description><p><em>This post is part of a series on new features in RStudio 1.4, currently available as a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a>.</em></p><p>RStudio v1.4 introduces the capability to configure your workbench with multiple source columns. Multiple source columns give you the ability to view your source 
editors side-by-side and quickly reference code and data without switching tabs.</p><h3 id="configuring-a-side-by-side-view">Configuring a Side-by-Side View</h3><p>Your default RStudio view will not change after installing RStudio v1.4. Follow the instructions below to configure your new layout:</p><ol><li><p>In the menu bar, open the &ldquo;Tools&rdquo; menu.</p></li><li><p>From the drop down menu, choose &ldquo;Global Options&rdquo;.</p><p><img src="toolsMenu.png" alt=""></p></li><li><p>In the pane on the left-hand side of the options window, click &ldquo;Pane Layout&rdquo;.</p></li><li><p>To add a column, click on the &ldquo;Add Column&rdquo; button.</p><ol><li><p>You&rsquo;ll see a column labeled &ldquo;Source&rdquo; appear in the layout preview.</p></li><li><p>You can add up to three additional columns.</p></li></ol><p><img src="globalOptions.png" alt=""></p></li><li><p>Select &ldquo;Apply&rdquo; and you&rsquo;ll see the columns appear on your workbench, each with an empty R Script open and ready to go.</p></li><li><p>Select &ldquo;Ok&rdquo; to close the &ldquo;Global Options&rdquo; pane and start working with your new columns.</p></li></ol><p>Alternatively, you can quickly add columns with the new commands <em>Add Source Column</em> and <em>Open File in New Column</em>. <em>Add Source Column</em> can be accessed with the keyboard shortcut <code>Ctrl + F7</code> and can be found in the <em>View</em> menu under <em>Panes (View -&gt; Panes -&gt; Add Source Column).</em> To choose a specific file to open in a new column, navigate to the <em>File</em> menu <em>(File -&gt; Open File in New Column)</em>. If you want to be able to pull open the file chooser for a new column via the keyboard, we encourage you to <a href="https://support.rstudio.com/hc/en-us/articles/206382178-Customizing-Keyboard-Shortcuts">use the Modify Keyboard Shortcuts command</a> to add one.</p><h3 id="navigating-multiple-columns"><strong>Navigating Multiple Columns</strong></h3><p>Within your new 
layout, you can easily move tabs between columns with the same drag-and-drop convention you use to organize tabs today. Any command you run outside of a specific column (e.g. via the menu bar or keyboard) interacts with the last column you used. For example, if you choose Open a New File, the file will be opened in the last column you&rsquo;ve selected.</p><p>We&rsquo;ve added two commands to aid the experience of moving between columns: <em>Focus Next Pane</em> (<code>F6</code>) and <em>Focus Previous Pane</em> (<code>Shift + F6</code>). These commands not only enable you to move between source columns, they also move focus to any other pane you have open. <em>Focus Next Pane</em> gives focus to the pane to the right of your active one, while <em>Focus Previous Pane</em> moves focus to the left. We&rsquo;ve added a new accessibility option which allows you to easily visualize which panel has focus; while it is off by default, you can turn it on in the &ldquo;Global Options&rdquo; pane (<em>Global Options -&gt; Accessibility -&gt; Highlight focused Panel</em>).</p><h3 id="removing-source-columns"><strong>Removing Source Columns</strong></h3><p>To remove a column, simply close or drag out all tabs within that column. Alternatively, you can close the columns as well as any open tabs by navigating to the <em>Pane Layout -&gt; Global Options</em> pane described above and choosing <em>Remove Column</em>. This closes the leftmost column and will prompt you to save any unsaved files.</p><h3 id="try-it-out"><strong>Try it out!</strong></h3><p>You can try out the new side-by-side source columns by installing the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.4 Preview Release</a>. If you do, we very much welcome your feedback on our <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>!</p></description></item><item><title>rstudio::global(2021) 
Updates</title><link>https://www.rstudio.com/blog/rstudio-global-2021-updates/</link><pubDate>Fri, 16 Oct 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-global-2021-updates/</guid><description><p>Back in July we made the difficult decision to cancel rstudio::conf(2021) for the health and safety of our attendees and the broader community.</p><p>Today, we are thrilled to announce that rstudio::global(2021), our first ever virtual event focused on all things R and RStudio, now has a date locked in: <strong>January 21, 2021!</strong></p><p>Our goal is to make rstudio::global(2021) our most inclusive and global event, making the most of the freedom from geographical and economic constraints that comes with an online event. That means that the conference will be free, designed around participation from every time zone, and have speakers from around the world.</p><h2 id="what-is-rstudioglobal2021">What is rstudio::global(2021)?</h2><ul><li><p>A <strong>24-hour global virtual event</strong> (so make sure you don&rsquo;t run out of coffee!)</p></li><li><p><strong>3 awesome keynotes</strong>. More on that below!</p></li><li><p><strong>30 talks with live Q&amp;A.</strong> Learn from a diverse array of experts about a wide range of topics of interest to data scientists.</p></li><li><p><strong>20 rapid-fire lightning talks.</strong></p></li><li><p><strong>30+ birds of a feather (BoF) sessions</strong>. One of the best parts of rstudio::conf is connecting with so many amazing people in the data science community. The BoFs provide an opportunity to connect with people who are working on the same problems, in the same fields, and using the same tools.</p></li><li><p><strong>Social events.</strong> Fun and interesting opportunities to connect with your peers.</p></li><li><p><strong>Diversity scholar program</strong>. We believe in building on-ramps so that people from diverse backgrounds can learn R, build their knowledge, and then contribute back to the 
community. Since this conference will be virtual, we will be opening up the program to participants from all over the world. Expect to hear more on this soon!</p></li></ul><h2 id="who-are-the-keynote-speakers">Who are the keynote speakers?</h2><div class="row"><div class="column1"><img src="vicki-boykis.jpg" style="border-radius: 50%;width: 140px;height: auto;"></div><div class="column2"><p class="name">Vicki Boykis</p><p>Vicki Boykis is a machine learning engineer at Automattic, the company behind WordPress.com. She works mostly in Python, R, Spark, and SQL, and really enjoys building end-to-end data products. Outside of work she publishes the <a href="https://vicki.substack.com">Normcore Tech newsletter</a> and blogs at <a href="https://veekaybee.github.io/">https://veekaybee.github.io/</a>. In her "spare time", she blogs, reads, and <a href="https://twitter.com/vboykis">writes terrible joke tweets about data</a>.</p><p>Vicki will discuss how, as people who can write code and analyze data, we have a lot of input and power over what our digital and work worlds look like, and can therefore act as agents of change and repair.</p></div></div><div class="row"><div class="column1"><img src="john-burn-murdoch.jpg" style="border-radius: 50%;width: 140px;height: auto;"></div><div class="column2"><p class="name">John Burn-Murdoch</p><p>John Burn-Murdoch is the Financial Times' senior data visualisation journalist, and creator of the FT's coronavirus trajectory tracker charts. He has been leading the FT's data-driven coverage of the pandemic, exploring its impacts on health, the economy, and wider society. 
When pandemics are not happening, he also uses data and graphics to tell stories on topics including politics, economics, climate change and sport, and is a visiting lecturer at the London School of Economics.</p><p>John will discuss the lessons he's learned reporting on and visualising the pandemic, including the world of difference between making charts for a technical audience and making charts for a mass audience. You'll learn from his experience navigating the highly personal and political context within which people consume and evaluate graphics and data, and how that can help us better design and communicate with visualisations in the future.</p></div></div><div class="row"><div class="column1"><img src="hadleywickham.jpg" style="border-radius: 50%;width: 140px;height: auto;"></div><div class="column2"><p class="name">Hadley Wickham</p><p>Hadley is Chief Scientist at RStudio, a member of the R Foundation, and Adjunct Professor at Stanford University and the University of Auckland. He builds tools (both computational and cognitive) to make data science easier, faster, and more fun. His work includes packages for data science (the tidyverse: including ggplot2, dplyr, tidyr, purrr, and readr) and principled software development (roxygen2, testthat, devtools, pkgdown).</p><p>Hadley will talk about how the tidyverse has evolved since its creation (just five years ago!). 
You'll learn about our greatest successes, learn from our biggest failures, and get some hints of what's coming down the pipeline for the future.</p></div></div><h2 id="registration-for-rstudioglobal2021-will-open-in-early-december">Registration for rstudio::global(2021) will open in early December</h2><p>If you would like to receive notifications about the details, please subscribe below.</p><style type="text/css">.form-design {margin: 40px 0;padding: 22px 80px 40px 80px;margin: 40px 0;background-color: #F8F8F8;}@media only screen and (max-width: 767px) {.form-design {padding: 20px;margin: 30px 0;}}@media only screen and (max-width: 480px) {.mktoForm .mktoFormCol .mktoLabel {width: 87%;}}.mktoHtmlText.mktoHasWidth {width: 100% !important;}.mktoForm.mktoHasWidth.mktoLayoutLeft {width: 100% !important;color: #404040 !important;}.mktoForm.mktoHasWidth.mktoLayoutLeft h3,#confirmform h3 {font-weight: 400 !important;font-size: 22px !important;padding-bottom: 8px !important;}.mktoEmailField {max-width: 100%;}.mktoForm .mktoAsterix {display: none !important;}input {background-color: transparent !important;}.mktoForm {padding: initial !important;}.mktoButton {background-color: #4287c7 !important;background-image: none !important;border: 1px #4287c7 !important;color: #fff !important;height: 45px !important;text-transform: uppercase !important;font-weight: 600 !important;font-size: 18px !important;box-shadow: none !important;float: left;margin-top: 12px !important;}.mktoForm input[type=email] {box-shadow: none !important;border-radius: 0 !important;border: 1px solid #999999 !important;padding: 15px 15px !important;color: #404040 !important;margin-top: 7px;font-size: 18px;}.mktoForm .mktoLabel {padding-top: 1em !important;}.mktoForm select.mktoField {font-size: 14px !important;}.mktoForm textarea.mktoField {height: 5em !important;}body .mktoForm .mktoCheckboxList>label {margin-left: 0;}.mktoFieldDescriptor .mktoFormCol>.mktoOffset {display: none;}.mktoForm 
input[type=text],.mktoForm input[type=url],.mktoForm input[type=email],.mktoForm input[type=tel],.mktoForm input[type=number],.mktoForm input[type=date],.mktoForm input[type=tel],select {height: 44px !important;}.mktoForm .mktoRequiredField label.mktoLabel {font-weight: 400 !important;}label {margin-bottom: 8px !important;}.mktoForm fieldset legend {margin: 0 0 0.5em 0 !important;font-size: 16px;font-weight: 600;}.mktoForm .mktoLabel {padding-top: 0.5em;}.mktoFormRow>.mktoFieldDescriptor.mktoFormCol>.mktoFieldWrap.mktoRequiredField>textarea {width: 100% !important;}fieldset.mktoFormCol {width: 100% !important;}.mktoForm {width: 100% !important;}.mktoFormRow,.mktoFieldWrap,.mktoButtonRow {width: 100%;}.mktoForm input[type=url],.mktoForm input[type=text],.mktoForm input[type=date],.mktoForm input[type=tel],.mktoForm input[type=email],.mktoForm input[type=number],.mktoForm textarea.mktoField,.mktoForm select.mktoField {width: 100% !important;}.mktoFieldWrap {box-sizing: border-box;}.mktoFormCol:nth-child(even) .mktoFieldWrap {padding-right: 0 !important;}.mktoButton:hover {background-color: #ffffff !important;color: #5a9ddb !important;border: 1px solid #5a9ddb !important;-webkit-transition: all .3s;transition: all .3s;}.mktoButton {background-color: #5a9ddb !important;background-image: none !important;border: 1px solid #5a9ddb !important;color: #fff !important;width: 200px !important;margin-bottom: 7px;text-transform: uppercase !important;font-weight: bold !important;font-size: 14px !important;height: 45px;font-family: "Source Sans Pro", Arial, Helvetica, sans-serif !important;}.mktoButtonRow {}.mktoButtonWrap {margin-left: 0 !important;}.mktoForm .mktoCheckboxList {padding: 1.2em 0.3em 0.3em 0.3em;}@media only screen and (max-width: 480px) {.mktoFormCol {width: 100% !important;}.mktoFieldWrap {padding-right: 0 !important;}.mktoForm .mktoRadioList,.mktoForm .mktoCheckboxList {width: 13%;}}</style><div class="form-design"><script 
src="//pages.rstudio.net/js/forms2/js/forms2.js"></script><form id="mktoForm_3297"></form><script>MktoForms2.loadForm("//pages.rstudio.net", "709-NXN-706", 3297);</script><script>MktoForms2.whenReady(function (form){form.onSuccess(function(values, followUpUrl){form.getFormElem().hide();document.getElementById('confirmform').style.display = 'block';return false;});});</script><div id="confirmform" style="display:none;"><h3>Thank you!</h3><p>Your email address has been submitted. We will keep you updated regarding rstudio::global!</p></div></div><style type="text/css">.name {font-size: 22px;font-weight: 400;margin-top: 0;margin-bottom: 12px;}.column1 {float: left;width: 24%;text-align: center;}.column2 {float: left;width: 76%;}@media only screen and (max-width: 600px) {.column1 {float: none;width: 100%;margin-bottom: 24px;}.column2 {float: none;width: 100%;}}.row {margin-top: 40px;}.row:after {content: "";display: table;clear: both;}h2 {margin-top: 50px;}</style></description></item><item><title>RStudio v1.4 Preview: Command Palette</title><link>https://www.rstudio.com/blog/rstudio-v1-4-preview-command-palette/</link><pubDate>Wed, 14 Oct 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v1-4-preview-command-palette/</guid><description><p><em>This post is part of a series on new features in RStudio 1.4, currently available as a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a>.</em></p><h2 id="whats-a-command-palette">What&rsquo;s a Command Palette?</h2><p>Just as a paint palette gives the artist instant access to all their colors, a command palette is a software affordance that gives instant, searchable access to all of a program&rsquo;s commands. RStudio 1.4 introduces this very popular tool to our workbench.</p><img align="center" style="padding: 35px;" src="command-palette-ide.png"><p>Command palettes have become a fixture of modern IDEs, and with good reason. 
They improve:</p><ul><li><strong>Keyboard accessibility</strong>; even commands that do not have keyboard shortcuts are easily invoked from the palette.</li><li><strong>Speed</strong>; it is often much faster to invoke a command from the palette with a few quick keystrokes than to reach for the mouse or drill into a menu.</li><li><strong>Discoverability</strong>; since the palette lists all the commands, it can be browsed to find a command for a task by name without trying to figure out which menu or toolbar might contain it.</li></ul><h2 id="invoking-the-palette">Invoking the Palette</h2><p>The palette can be invoked with the keyboard shortcut <kbd>Ctrl</kbd> + <kbd>Shift</kbd> + <kbd>P</kbd> (<kbd>Cmd</kbd> + <kbd>Shift</kbd> + <kbd>P</kbd> on macOS).</p><p>It&rsquo;s also available on the <em>Tools</em> menu (<em>Tools</em> -&gt; <em>Show Command Palette</em>).</p><h2 id="content">Content</h2><p>RStudio&rsquo;s command palette has three main types of content:</p><h3 id="commands">Commands</h3><p>First and foremost, the command palette serves as a way to search for and invoke RStudio commands quickly with just a few keystrokes. Every RStudio command is in the palette, unless it&rsquo;s been explicitly hidden in the current mode.</p><p>To find a command, enter a word or short sequence of characters from the command. For example, to create a new script, start typing <code>new scr</code>.</p><img align="center" style="padding: 35px;" src="new-script.png"><p>You can keep typing to filter the list, or press <kbd>Up</kbd>/<kbd>Down</kbd> to choose a command from the list and then <kbd>Enter</kbd> to execute the chosen command. 
Commands are displayed with their bound keyboard shortcuts, if any, so that you know how to invoke the command directly with the keyboard next time.</p><p>If your command doesn&rsquo;t have a shortcut, you can <a href="https://support.rstudio.com/hc/en-us/articles/206382178-Customizing-Keyboard-Shortcuts">use the <em>Modify Keyboard Shortcuts</em> command</a> to add one.</p><h3 id="settings">Settings</h3><p>In addition to all of RStudio&rsquo;s commands, the command palette provides easy access to most of its settings. You&rsquo;ll see the word <code>Setting</code> in front of settings, along with a small control that allows you to change the setting.</p><p>For example, you can turn RStudio&rsquo;s code margin indicator off and on or move it to a different column. If you have a code editor open, you&rsquo;ll see these changes reflected in real time as you make them.</p><img align="center" style="padding: 35px;" src="margin.png"><p>Note that the settings displayed are your personal (user-level) settings. Just like the settings in Global Options, they can be overridden by project-level settings, and some settings don&rsquo;t take effect until after a restart.</p><h3 id="rstudio-addins">RStudio Addins</h3><p>Finally, the command palette shows all of the commands exposed by any installed RStudio add-ins. You can find these by typing any part of the add-in name and/or part of the command. 
For example, to use a command from the excellent <a href="https://github.com/r-lib/styler">styler addin</a>:</p><img align="center" style="padding: 35px;" src="style-selection.png"><p>This makes the palette user-extensible; if you want to add your own commands to the palette, you can <a href="https://rstudio.github.io/rstudioaddins/">create an RStudio Addin</a> to do so with just a few lines of code, or use the <a href="https://www.garrickadenbuie.com/blog/shrtcts/">shrtcts addin</a> to do so in even fewer lines of code!</p><h2 id="search-syntax">Search Syntax</h2><p>The command palette&rsquo;s search syntax is simple; it looks for complete matches for each space-separated term you enter. So, for example, a query for <code>new proj</code> will find all of the entries that contain the term <code>new</code> AND the term <code>proj</code>.</p><p>In the future, we hope to improve the matching heuristics by prioritizing complete matches and recently or frequently used commands.</p><p>You can try out the new Command Palette by installing the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.4 Preview Release</a>. 
If you do, please let us know how we can make it better on the <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>!</p></description></item><item><title>Open Source Data Science in Investment Management</title><link>https://www.rstudio.com/blog/open-source-data-science-in-investment-management/</link><pubDate>Tue, 13 Oct 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/open-source-data-science-in-investment-management/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@kellysikkema?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Kelly Sikkema</a> on <a href="https://unsplash.com/s/photos/flood?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Unsplash</a></sup></p><h2 id="surviving-the-data-deluge">Surviving the Data Deluge</h2><p>Many of the strategies at my old investment shop were thematically oriented. Among them was the notion of the &ldquo;data deluge.&rdquo; We sought to invest in companies that were positioned to help other companies manage the exponentially growing torrent of data arriving daily and turn that data into actionable business intelligence.</p><p>Ironically, we ourselves struggled to effectively use our own data. At the same time our employees throughout the company were taking it upon themselves to find the tools and do the work to better our processes. The tools they were using were open source, and it was only once we officially embraced an open source workflow that we dramatically sped up our business intelligence development.</p><h2 id="legacy-system-migration-started-the-flood">Legacy System Migration Started the Flood</h2><p>You might be surprised to see how old-school much of the investment management world is. Venerable firms like mine have legacy systems with customer records going back decades. 
These platforms run on COBOL, and our gray-bearded IT folks did a magnificent job of keeping them running. That was a good thing, because it took three years running in parallel before we could flip our CRM entirely to SalesForce.</p><p>A benefit of this migration was opening up an incredible volume of data, but that data posed a deeper question: What should we do with it? SalesForce has a plethora of analytic reports, but we needed serious data science for deeper insights. While SalesForce will happily create custom analysis for a fee, this is a brittle, slow process and, once it is done, the learning about HOW to do the work would have belonged to SalesForce, not us.</p><p>Fortunately, open source tools allowed us to jump to a higher plane of insight very quickly, iterate continuously, and build our institutional knowledge about how to do this kind of work. Critically, an open source workflow allowed us to keep all the code for our projects in-house and usable for subsequent analysis, even if the original authors have moved on.</p><h2 id="data-science-helped-tame-the-fund-redemption-flow">Data Science Helped Tame The Fund Redemption Flow</h2><style type="text/css">.quote-spacing { padding:0 80px; }.quote-size { font-size: 160%; line-height: 34px; }@media only screen and (max-width: 600px) {.quote-spacing { padding:0; }.quote-size { font-size: 120%; line-height: 28px; }}</style><div style="background-color: #4476bb; color: #ffffff; padding:20px 30px 30px 30px; margin:50px 0;"><div class="quote-spacing"><p class="quote-size"><i>“Ultimately the project allowed us to retain several hundred million dollars of assets we would have otherwise lost, dropping millions of dollars to the bottom line.”</i></p></div></div><p>As an investment management firm, we earn less than one percent a year on client assets, yet sales people are typically paid a commission of several percent up front.
That means we have to hold onto the customer&rsquo;s investment for several years before we recoup the upfront commission. Meanwhile, the client can redeem their assets at any time. A dirty little secret of our industry is that annual redemptions run about 25% of the total asset base. It always bothered me that redemptions were treated as an exogenous factor, so when I became CEO I decided to pay some attention to it. The possibility of reducing redemption rates gave us a strong incentive to deliver a great customer experience. It is always cheaper to keep a customer than to find a new one.</p><p>Mutual fund redemption rates are an ideal subject for data science research. Financial firms have hundreds of sales people who talk to thousands of financial advisers who talk to hundreds of thousands of clients on a routine basis. We have literally millions of customer accounts. So why do people redeem? Bad fund performance is one reason. Changing goals or life circumstances are another. How can we help the customers make a better informed decision about whether getting out is the right choice? At the simplest level, we expect more communication with the customer would be better, but should that communication be a visit, a call or an email? What should we say? A visit or a call to every adviser is expensive, so how should we identify the most at-risk accounts? Ultimately, our intervention program must be cheaper than what it costs to attract new accounts.</p><p>As we thought about these questions, we learned that our corporate parent had recently put together a data science team. As a new group, it had excess capacity which we were able to use, and the team members were enthusiastic about a project that would drop results directly to the bottom line.</p><p>First, the sales group sat down with the data scientists to discuss their assumptions about why people redeem. Then the science team went to work on the mountains of data we had and tested those assumptions.
As expected, we found that fund performance was a big driver of redemptions, but also that market performance, which affects all funds, was a bigger one.</p><p>When we examined communication, we found that an increase in inbound calls was predictive of redemptions, even when those calls only concerned routine matters. We discovered that in-person visits helped reduce redemptions, but that email outreach was worthless. There were also some surprises. We found that advisers we call &ldquo;churners&rdquo; were more likely to redeem if we contacted them. Finally, since we had so many customers, we designed A/B tests to find the most effective treatments.</p><p>Most of the work the data scientists did used the R language. They did a great job satisfying management&rsquo;s constant barrage of questions because iterative analysis is so easy with tools like R, and the powerful visualization tools made communication of results easy for sales people to grasp. As the CEO, I was gratified at how clear the presentations were and at how quickly presenters answered my difficult questions, in some cases on the fly during the presentations. As an R user myself, I know its code-based workflow lends itself to rapid iteration while, at the same time, documenting the process used. It was easy to unroll the tape to see every step that led to any conclusion.</p><p>While this project was not a panacea for all redemptions, we did manage to bend the redemption curve downward a couple percentage points while making our customers happier. Our sales people knew better how to spend their time speaking to the right people with the right message. 
Ultimately the project allowed us to retain several hundred million dollars of assets we would have otherwise lost, dropping millions of dollars to the bottom line.</p><h2 id="our-quantitative-strategy-focus-on-long-term-insights">Our Quantitative Strategy: Focus on Long-Term Insights</h2><div style="background-color: #4476bb; color: #ffffff; padding:20px 30px 30px 30px; margin:50px 0;"><div class="quote-spacing"><p class="quote-size"><i>“The data scientists used Python for this project but, as with the sales project, the rapid prototyping, easy iterating and powerful tools for clear visualization enabled by open source tools created a jazz band of creativity between the investment analysts and the data scientists.”</i></p></div></div><p>The sales side was pretty eager to embrace anything that might give them an edge with the customer. However, the senior leaders of the investment teams, who are generally analytical and quantitative, were quite wary of &ldquo;the robots taking over.&rdquo; Seeing how many purely quantitative funds there are out there, I understand their concern.</p><p>Our challenge was to convince the investment teams to see our fuller quantitative effort as a force multiplier. One day I was making the pitch for a more &ldquo;quant-y&rdquo; approach to a portfolio manager, and my argument was &ldquo;You&rsquo;ve been using a shovel to dig ditches for 20 years. While those may have been nice ditches, I&rsquo;m now giving you a backhoe. Do you really want to reject it?&rdquo; Fortunately, even as I was trying to convince this manager, the analysts in the trenches were already taking matters into their own hands thanks to the availability of open source tools.</p><p>The challenge for us was to recognize what data science could do to enhance our process rather than remake ourselves into a whole new company. OppenheimerFunds (now Invesco) is fundamentally a long-term investment manager, not a high-frequency trader.
Our quantitative insights need to persist over a duration that is actionable on our human-centered decision timeline and be large enough to matter when we hold investments over months and years, not seconds and minutes.</p><p>One avenue that we explored was text mining of corporate regulatory filings and earnings call transcripts. The question was &ldquo;Do changes in company reports over time signal future stock returns?&rdquo; Modern open-source data science tools allow near-instant processing of vast collections of documents, and these transcripts fit into a somewhat standard template, making it easy to compare changes over time.</p><p>Our security analysts started noodling around with this problem on their own using open source tools, but they recognized their expertise was limited. We didn&rsquo;t have the data scientists in-house who could do this work, so we partnered with Draper Labs in Boston, which had a large team and was looking to branch out from its traditional work with defense contractors. One fun aspect of this project was the mutual learning that occurred. Our portfolio managers began to understand the force-multiplier feature of these techniques, our analysts upped their data science game, and the data scientists learned the practical challenges of real-world trading beyond the results in the academic literature.</p><p>We sold our company before this project came to full fruition, but we were excited by the early results. While we were not the only firm working on text mining, there still seemed to be plenty of gold in the ground. The data scientists used Python for this project but, as with the sales project, the rapid prototyping, easy iterating and powerful tools for clear visualization enabled by open source tools created a jazz band of creativity between the investment analysts and the data scientists.</p><p>This project also illustrated the importance of support from the top.
The security analysts were the driving force behind this project but their bosses were ambivalent at first. Air cover from the office of the CEO made clear this project was a priority for the firm. It would not have been done otherwise.</p><h2 id="in-conclusion-let-water-find-its-own-level">In Conclusion: Let Water Find Its Own Level</h2><p>Many techniques used by modern data scientists were developed decades ago. Now, two factors have converged to make serious data science possible for every firm:</p><ol><li><p>The exponential decline in computing costs, with cloud computing being the most recent evolutionary step, and</p></li><li><p>Open source data science tools such as R and Python.</p></li></ol><p>Open source tools put incredible power in the hands of everyone in the organization. Not everyone should or will use these tools but, as we found at OppenheimerFunds, once people see what is possible on their own, the push to advance business intelligence gathers its own momentum, and internal support to devote serious resources to the effort will grow. Managers within investment firms will inevitably want to support these efforts within their own teams because the competitive pressures are too strong, and the insights gained too great. Those that ignore the trend risk being swept away.</p><hr /><h3 id="about-art-steinmetz">About Art Steinmetz</h3><p>Art Steinmetz is the former Chairman, CEO and President of OppenheimerFunds. After joining the firm in 1986, Art was an analyst, portfolio manager and Chief Investment Officer. Art was named President in 2013, CEO in 2014, and, in 2015, Chairman of the firm with $250 billion under management. He stepped down when the firm was folded into Invesco.</p><p>Currently, Art is a private investor located in New York City. 
He is an avid amateur data scientist and is active in the R statistical programming language community.</p></description></item><item><title>Using R to Drive Agility in Clinical Reporting: Questions and Answers</title><link>https://www.rstudio.com/blog/driving-agility-in-clinical-reporting-q-a/</link><pubDate>Thu, 08 Oct 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/driving-agility-in-clinical-reporting-q-a/</guid><description><p>Following our recent RStudio webinar, <a href="https://rstudio.com/resources/webinars/using-r-to-drive-agility-in-clinical-reporting/" target="_blank" rel="noopener noreferrer">Using R to Drive Agility in Clinical Reporting</a>, we received an unprecedented number of questions from the audience. In this blog post, we attempt to answer as many of the 70+ questions that we received as possible. In some cases, we have grouped multiple questions into one to streamline the answers. The opinions are our own and do not necessarily reflect GlaxoSmithKline plc&rsquo;s (GSK) position or strategy.</p><h3 id="can-you-speak-to-any-specific-advantages-of-r-over-sas-with-respect-to-stakeholder-engagement-efficiency-or-development-time">Can you speak to any specific advantages of R over SAS with respect to stakeholder engagement, efficiency, or development time?</h3><p>The main selling point for R is that it provides additional capabilities through tools like Shiny, Rmarkdown and the <code>officer</code> package, which allow us to improve the way we communicate with other functions. We&rsquo;ve never really tried to make any speed comparisons or claims when promoting the use of R. It can be faster to develop in R, especially when simulating data, but for a large part of what we do there&rsquo;s not much difference in speed of development, and the execution time differences are negligible. 
That said, the licensing model (R being free) allows us to scale more easily when we use parallelism, which does reduce execution times.</p><h3 id="can-you-provide-further-details-on-your-approaches-to-validation">Can you provide further details on your approaches to validation?</h3><p>Although it wasn&rsquo;t the main focus of the talk, we received a lot of questions relating to validation. It&rsquo;s important to note that we&rsquo;re still in the midst of our journey and still have some open questions. At this moment in time, the Working Area for R Programming (WARP) environment is <strong>qualified</strong>. We are now developing a GxP batch execution model that we will validate. The R Validation Hub&rsquo;s <a href="https://www.pharmar.org/white-paper/" target="_blank" rel="noopener noreferrer">white paper</a> highlights some of the challenges in this space and describes a risk-based approach for validation of R packages which is quite similar to the approach we will be taking.</p><h3 id="are-you-currently-submitting-results-from-your-r-environment-to-regulatory-agencies">Are you currently submitting results from your R environment to regulatory agencies?</h3><p>We have not yet used the environment to produce study tables, listing and figures for production. However, our aim is to use R in production beginning in 2021.</p><h3 id="can-you-speak-more-to-the-challenges-faced-with-respect-to-the-adoption-of-shiny-is-it-used-for-regulatory-work">Can you speak more to the challenges faced with respect to the adoption of Shiny? Is it used for regulatory work?</h3><p>Building enthusiasm for Shiny has not been difficult at all. The bigger challenge is training and supporting new Shiny users. Thus far, this has mostly been webinar-based. We are active in promoting good practice and the use of training modules and have begun to share template code via our GitHub Enterprise instance. 
Thankfully, we don&rsquo;t generally have to worry about deployment, as RStudio Connect takes care of that for us. But some additional support for our users has been required to ensure that they understand how the deployments work, particularly with respect to data. We haven&rsquo;t yet attempted to include a Shiny app in a submission, but this is a very interesting area that GSK and several other companies have been looking at.</p><p>GSK Biostatistics is currently building a Shiny-based application for clinical data analysis and reporting. Given that this is part of a GxP compliant workflow, it has raised many interesting questions around audit trails, quality control (QC), and so on. The longer term hope is that dynamic visualisation applications will replace many static outputs for reporting out clinical trial data and analyses.</p><h3 id="why-did-gsk-decide-to-first-integrate-r-rather-than-other-open-source-languages-such-as-python">Why did GSK decide to first integrate R rather than other open source languages such as Python?</h3><p>R offered a better starting point at GSK. Many of our Research Statisticians and Data Scientists were already using R. While we recognize the value of different programming languages for clinical reporting, we chose to begin our open source journey with R to leverage the synergies of the efforts coming from Statistical Data Sciences who already are driving broader R use within Biostatistics.</p><h3 id="what-is-your-strategy-for-python">What is your strategy for Python?</h3><p>We have taken a strategic decision to treat R as a gateway to additional open source languages such as Python. The process for Python will likely be similar, but we will learn a lot from our journey with R. 
Given that our Python user base is much smaller, it is likely that we will initially explore the use of Python through R and RStudio (which is also a great IDE for Python), focusing on its strengths in the Machine Learning space.</p><h3 id="as-gsk-integrates-into-r-what-is-the-migration-strategy-for-sas-macro-libraries-into-r">As GSK integrates into R, what is the migration strategy for SAS macro libraries into R?</h3><p>We are not planning to migrate or translate our SAS reporting tools into R. There are a number of open source R packages that already provide fundamental capabilities required along the conventional clinical data pipeline from collection to clinical study report. Packages that were developed in-house for the R4QC project were primarily designed to reduce the volume of code used by streamlining common functionality with the reporting pipeline. In addition, there are a number of groups in the industry that are developing open source R packages that could collectively deliver the pipeline without the need for translating our SAS macro libraries.</p><h3 id="will-you-be-open-sourcing-your-packages">Will you be open sourcing your packages?</h3><p>I am pleased to say that we have very recently agreed upon an approach that will enable us to open source several of the packages that we have developed. More news to follow later in the year.</p><h3 id="how-do-you-address-the-underlying-discrepancies-between-basic-statistical-calculation-in-sas-and-r">How do you address the underlying discrepancies between basic statistical calculation in SAS and R?</h3><p>This is a great question. In R4QC, we noticed differences in both default rounding methodology and quantile calculations. In survival analyses, we also noticed a difference in the reported standard error. 
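</p><p>Such differences are easy to demonstrate. As a quick illustration (shown in Python purely for convenience, not because it was part of the comparison: Python happens to share R&rsquo;s round-half-to-even default, while SAS&rsquo;s ROUND() rounds half away from zero, and the standard library exposes two of the many textbook quantile definitions):</p>

```python
from statistics import quantiles

# Round-half-to-even ("banker's rounding") is the default in both Python
# and R; software that rounds half away from zero reports 3 for round(2.5).
print(round(2.5), round(3.5))  # 2 4

# Two classical quantile estimators disagree on the same data, just as
# R's quantile() types and SAS's PCTLDEF options can.
data = list(range(1, 11))
print(quantiles(data, n=4, method="inclusive"))  # [3.25, 5.5, 7.75]
print(quantiles(data, n=4, method="exclusive"))  # [2.75, 5.5, 8.25]
```

<p>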
In all cases, results from both languages were consistent with their respective documentation.</p><style type="text/css">.quote-spacing { padding:0 80px; }.quote-size { font-size: 160%; line-height: 34px; }@media only screen and (max-width: 600px) {.quote-spacing { padding:0; }.quote-size { font-size: 120%; line-height: 28px; }}</style><div style="background-color: #4476bb; color: #ffffff; padding:50px 30px 30px 30px; margin:50px 0;"><div class="quote-spacing"><p class="quote-size"><i>“Fundamentally, we believe that the challenge is not to ensure that the results from R and SAS match completely for a given analysis. Indeed, some analyses in one language may not even be possible to easily replicate in the other. Rather, the analysis should be based on sound statistical theory, taking into account such aspects as the underlying statistical hypothesis and distribution of the data.”</i></p></div></div><h3 id="why-not-combine-all-potential-tools-including-sas-r-and-python-altogether-to-gain-the-best-insights">Why not combine all potential tools, including SAS, R and Python altogether to gain the best insights?</h3><p>Our ultimate objective is exactly this: to expand the analyst&rsquo;s toolkit in order to facilitate more agility in our delivery of insights. In the GxP world where we must demonstrate reproducibility, traceability, and data integrity, it is not simply the flip of a switch. Along with other leaders in the industry, we are working toward a modern analytics environment where our capabilities are not constrained by the capabilities of a single piece of software.</p><h3 id="what-is-the-percentage-uptake-of-r-among-sas-programmers">What is the percentage uptake of R among SAS programmers?</h3><p>This is a great question, one that we are still learning through.
Our operational uptake in the clinical reporting space has been slower than expected, but we believe it is primarily due to the timing of study milestones and the availability of our training offerings.</p><p>When we green-lighted using R for Independent QC programming of non-statistical displays, many teams found themselves actively working on studies that were already in progress. SAS programs had already been developed for Independent QC, and the teams could not justify the time and effort required to reprogram using R without putting timelines at risk.</p><p>In addition, our initial training modules were designed as intensive, face-to-face sessions with hands-on programming opportunities. It may go without saying, but COVID-19 threw a wrench into the 2020 plans for making these trainings available to Biostatistics staff. The training team has worked hard to redesign the sessions to be delivered virtually, and we have successfully restarted the training effort.</p><p>We expect to see a stronger rate of uptake in the near future.</p><h3 id="how-will-the-move-to-r-reflect-in-gsks-recruitment-strategy">How will the move to R reflect in GSK&rsquo;s recruitment strategy?</h3><p>The importance of R as a key skill will continue to grow. We are already actively seeking programmers with skills in R, for conventional reporting activities (datasets, tables, listings, figures), future activities (Shiny apps), and conventional Data Science roles. Further roles within our Statistical Data Sciences team will start to appear on our website soon.</p><h3 id="conclusion">Conclusion</h3><p>We&rsquo;re confident that the answers above address the overwhelming majority of questions that we received, but if we still haven&rsquo;t managed to respond to your questions then we can only apologize. Thank you for listening to the webinar and reading through this Q&amp;A.
We look forward to sharing more in the future!</p></description></item><item><title>RStudio v1.4 Preview: Python Support</title><link>https://www.rstudio.com/blog/rstudio-v1-4-preview-python-support/</link><pubDate>Wed, 07 Oct 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v1-4-preview-python-support/</guid><description><p>Last week, we introduced RStudio&rsquo;s new <a href="https://blog.rstudio.com/2020/09/30/rstudio-v1-4-preview-visual-markdown-editing/">visual markdown editor</a>. Today, we&rsquo;re excited to introduce some of the expanded support for Python in the next release of RStudio.</p><h2 id="python-support">Python Support</h2><p>The RStudio 1.4 release introduces a number of features that will further improve the Python editing experience in RStudio:</p><ul><li><p>The default Python interpreter to be used by RStudio / <code>reticulate</code> can now be customized in the Global Options pane,</p></li><li><p>The Environment pane now displays a summary of Python objects available in the main module when the <code>reticulate</code> REPL is active,</p></li><li><p>Python objects can now be viewed and explored within the RStudio data viewer and object explorer,</p></li><li><p><code>matplotlib</code> plots are now displayed within the Plots pane when <code>show()</code> is called.</p></li></ul><h3 id="configuring-the-default-python-interpreter">Configuring the Default Python Interpreter</h3><p>When working with <code>reticulate</code>, one normally selects a Python interpreter using <code>reticulate</code> functions &ndash; for example, via <code>reticulate::use_python(…, required = TRUE)</code> or by setting the <code>RETICULATE_PYTHON</code> environment variable. 
(Or, alternatively, they trust <code>reticulate</code> to find and activate an appropriate version of Python as available on their system.)</p><p>However, one might want to control the version of Python without explicitly using <code>reticulate</code> to configure the active Python session. RStudio now provides a Python options pane, available both globally (via <code>Tools -&gt; Global Options…</code>), or per-project (via <code>Tools -&gt; Project Options…</code>), which can be used to configure the default version of Python to be used in RStudio.</p><p>Within the Python preferences pane, the default Python interpreter to be used by RStudio can be viewed and modified:</p><img src="images/python-preferences-pane.png" alt="The Python preferences pane." width="700"/><p>When the <code>Select…</code> button is pressed, RStudio will find and display the available Python interpreters and environments:</p><img src="images/python-interpreter-list.png" alt="The list of Python interpreters available on the system." width="700"/><p>RStudio will display system interpreters, Python virtual environments (created by either the Python <code>virtualenv</code> or <code>venv</code> modules), and Anaconda environments (if Anaconda is installed). Once an environment has been selected, RStudio will instruct <code>reticulate</code> to use that environment by default for future Python sessions.</p><p>Note that the <code>RETICULATE_PYTHON</code> environment variable still takes precedence over the default interpreter set here. If you&rsquo;d like to use RStudio to configure the default version of Python, but are setting <code>RETICULATE_PYTHON</code> within your <code>.Renviron</code> / <code>.Rprofile</code> startup files, you may need to unset it.</p><h3 id="environment-pane-support">Environment Pane Support</h3><p>The RStudio environment pane is now capable of displaying the contents of Python modules when the <code>reticulate</code> REPL is active. 
By default, the contents of the main module are displayed.</p><img src="images/python-environment-pane.png" alt="The RStudio IDE surface, with the Environment pane displaying Python objects." width="700"/><p>Similar to how R environments are displayed within the Environment pane, one can also view the contents of other loaded Python modules.</p><img src="images/python-environment-pane-numpy.png" alt="The environment pane, currently viewing the contents of the numpy module." width="700"/><p>In addition, <a href="https://pandas.pydata.org/" title="pandas - Python Data Analysis Library">pandas</a> <code>DataFrame</code> objects can be opened and viewed similarly to R <code>data.frame</code> objects, and other Python objects can be viewed in the object explorer.</p><h3 id="exploring-python-objects">Exploring Python Objects</h3><p>Python objects can be explored either by calling the <code>View()</code> function from the <code>reticulate</code> REPL, or by using the associated right-most buttons in the Environment pane.</p><img src="images/python-object-explorer.png" alt="The object explorer, used to view a simple Python dictionary." width="700"/><h3 id="displaying-matplotlib-plots">Displaying <code>matplotlib</code> Plots</h3><p><a href="https://matplotlib.org/"><code>matplotlib</code></a> is a popular Python module, used to create visualizations in Python. With RStudio 1.4, the IDE can now also display <code>matplotlib</code> plots within the Plots pane.</p><img src="images/python-matplotlib-example.png" alt="A heart drawn within the Plots pane via the matplotlib package." width="700"/><p>Data scientists using Python might also be familiar with the <a href="https://seaborn.pydata.org/"><code>seaborn</code></a> module, which provides a higher-level interface on top of <code>matplotlib</code> for producing high quality data visualizations. 
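</p><p>A minimal script of the kind that would appear in the Plots pane might look like the following (an illustrative sketch; it assumes <code>matplotlib</code> is installed in the active Python environment):</p>

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs headless;
                       # inside RStudio this line is unnecessary
import matplotlib.pyplot as plt

# In RStudio 1.4, calling show() from the reticulate REPL routes the
# figure to the Plots pane instead of opening a separate window.
fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])
ax.set_title("A minimal matplotlib figure")
plt.show()
```

<p>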
RStudio can also render plots generated by the <code>seaborn</code> package:</p><img src="images/python-seaborn-example.png" width="700"/><p>Currently, only static (non-interactive) plots are supported &ndash; we hope to support interactive graphics in a future release of RStudio.</p><h3 id="getting-started">Getting Started</h3><p>You can try out the new Python features by installing the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.4 Preview Release</a>. If you do, please let us know how we can make it better on the <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>!</p></description></item><item><title>RStudio v1.4 Preview: Visual Markdown Editing</title><link>https://www.rstudio.com/blog/rstudio-v1-4-preview-visual-markdown-editing/</link><pubDate>Wed, 30 Sep 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v1-4-preview-visual-markdown-editing/</guid><description><p>Today we&rsquo;re excited to announce availability of our first <a href="https://www.rstudio.com/products/rstudio/download/preview/">Preview Release</a> for RStudio 1.4, a major new release which includes the following new features:</p><ul><li>A <a href="https://rstudio.github.io/visual-markdown-editing">visual markdown editor</a> that provides improved productivity for composing longer-form articles and analyses with R Markdown.</li><li>New <a href="https://github.com/rstudio/rstudio/pull/6862">Python capabilities</a>, including display of Python objects in the Environment pane, viewing of Python data frames, and tools for configuring Python versions and conda/virtual environments.</li><li>The ability to add <a href="https://github.com/rstudio/rstudio/pull/7339">source columns</a> to the IDE workspace for side-by-side text editing.</li><li>A new <a href="https://github.com/rstudio/rstudio/pull/6848">command palette</a> (accessible via <kbd>Ctrl+Shift+P</kbd>) that provides easy keyboard access to all RStudio commands, 
add-ins, and options.</li><li>Support for <a href="https://github.com/rstudio/rstudio/pull/7027">rainbow parentheses</a> in the source editor (enabled via <strong>Options -&gt; Code -&gt; Display</strong>).</li><li>New RStudio Server Pro features including SAML authentication, local launcher load-balancing, and support for project sharing when using the launcher.</li><li>Dozens of other small improvements and bugfixes.</li></ul><p>You can try out these new features now in the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio v1.4 Preview Release</a>.</p><p>Over the next few weeks we&rsquo;ll be blogging about each of these new features in turn. Today we&rsquo;ll take a quick tour of the new visual markdown editor (see the full <a href="https://rstudio.github.io/visual-markdown-editing">Visual Markdown Editing</a> documentation for more details).</p><h2 id="visual-markdown-editing">Visual Markdown Editing</h2><p>R Markdown users frequently tell us that they&rsquo;d like to see more of their content changes in real time as they write, both to reduce the time required by the edit/preview cycle, and to improve their flow of composition by having a clearer view of what they&rsquo;ve already written.</p><p>To switch into visual mode for a markdown document, use the <kbd><img src="images/visual_mode_2x.png" width="15" height="13"/></kbd> button with the compass icon at the top-right of the editor toolbar:</p><img src="images/visual-editing.png" width="700"/><p>With visual mode, we&rsquo;ve tried to create a <a href="https://en.wikipedia.org/wiki/WYSIWYM">WYSIWYM</a> editor for people who love markdown. The editor maintains a lightweight feel that emphasizes semantics over styling. You can also still use most markdown constructs (e.g., <code>##</code> or <code>**bold**</code>) directly for formatting, and when switching between visual and source mode your editing location and undo/redo state are fully preserved:</p><img
src="images/visualmode-demo.gif" style="border-top: 1px solid black; border-bottom: 1px solid black;" width="700"/><p>You can also configure visual mode to write markdown using <a href="https://rhodesmill.org/brandon/2012/one-sentence-per-line/">one sentence per line</a>, which makes working with markdown files on GitHub much easier (enabling line-based comments for sentences and making diffs more local to the actual text that has changed). See the documentation on markdown <a href="https://rstudio.github.io/visual-markdown-editing/#/markdown?id=writer-options">writing options</a> for additional details.</p><p>Anything you can express in pandoc markdown (including tables, footnotes, attributes, etc.) can be edited in visual mode. Additionally, there are many productivity enhancements aimed at authoring technical content like embedded code, equations, citations, cross-references, and inline HTML/LaTeX.</p><h2 id="embedded-code">Embedded Code</h2><p>R, Python, SQL, and other code chunks can be edited using the standard RStudio source editor. You can execute the currently selected code chunk using either the run button at the top right of the code chunk or using the <kbd>Cmd+Shift+Enter</kbd> keyboard shortcut:</p><img src="images/visual-editing-execute-code.png" width="700"/><p>Chunk output is displayed inline (you can switch to show the output in the console instead using the Options toolbar button, accessible via the gear icon), and all of the customary commands from source mode for executing multiple chunks, clearing chunk output, etc.
are available.</p><h2 id="tables">Tables</h2><p>You can insert a table using the <strong>Table</strong> menu. You can then use either the main menu or a context menu to insert and delete table rows and columns:</p><img src="images/visual-editing-table-context.png" width="700"/><p>Note that if you select multiple rows or columns, the Insert or Delete command will behave accordingly. For example, to insert two rows, first select two rows and then use the Insert command.</p><p>Try editing a table in visual mode, then see what it looks like in source mode: all of the table columns will be perfectly aligned (with cell text wrapped as required).</p><h2 id="citations">Citations</h2><p>Visual mode uses the standard Pandoc markdown representation for citations (e.g. <code>[@citation]</code>). Citations can be inserted from a variety of sources:</p><ol><li>Your document bibliography.</li><li><a href="#citations-from-zotero">Zotero</a> personal or group libraries.</li><li><a href="#citations-from-dois">DOI</a> (Digital Object Identifier) references.</li><li>Searches of <a href="https://www.crossref.org/">Crossref</a>, <a href="https://datacite.org/">DataCite</a>, or <a href="https://pubmed.ncbi.nlm.nih.gov/">PubMed</a>.</li></ol><p>Use the <kbd><img src="images/citation_2x.png" width="15" height="14"/></kbd> toolbar button or the <kbd>Cmd+Shift+F8</kbd> keyboard shortcut to show the <strong>Insert Citation</strong> dialog:</p><img src="images/visual-editing-citation-search.png" class="illustration" width="700"/><p>If you insert citations from Zotero, DOI look-up, or a search, they are automatically added to your document bibliography.</p><p>You can also insert citations directly using markdown syntax (e.g.
<code>[@cite]</code>). When you do this, a completion interface is provided for searching available citations:</p><img src="images/visual-editing-citations.png" width="700"/><h2 id="equations">Equations</h2><p>LaTeX equations are authored using standard Pandoc markdown syntax (the editor will automatically recognize the syntax and treat the equation as math). When you aren&rsquo;t directly editing an equation, it will appear as rendered math:</p><img src="images/visual-editing-math.png" width="700"/><p>As shown above, when you select an equation with the keyboard or mouse, you can edit the equation&rsquo;s LaTeX. A preview of the equation will be shown below it as you type.</p><h2 id="images">Images</h2><p>You can insert images using either the <strong>Insert -&gt; Image</strong> command (<kbd>Ctrl+Shift+I</kbd> keyboard shortcut) or by dragging and dropping images from the local filesystem. If an image isn&rsquo;t already in your markdown document&rsquo;s directory, it will be copied to an <code>images/</code> folder in your project.</p><p>Select an image to re-size it in place (automatically preserving its aspect ratio if you wish):</p><img src="images/visual-editing-images.png" width="700"/><h2 id="cross-references">Cross References</h2><p>The <a href="https://bookdown.org">bookdown</a> package includes markdown extensions for cross-references and part headers. The <a href="https://bookdown.org/yihui/blogdown/">blogdown</a> package also supports bookdown-style cross-references, as does the <a href="https://rstudio.github.io/distill/">distill</a> package.</p><p>Bookdown cross-references enable you to easily link to figures, equations, and even arbitrary labels within a document. In raw markdown, you would, for example, write a cross-reference to a figure like this: <code>\@ref(fig:label)</code>, where the <code>label</code> is the name of the code chunk used to make the figure. For figure cross-referencing to work, you&rsquo;ll also need to add a figure caption to the same
code chunk using the knitr chunk option <code>fig.cap</code>, such as <code>fig.cap=&quot;A good plot&quot;</code>.</p><p>Cross-references are largely the same in visual mode, but you don&rsquo;t need the leading <code>\</code> (which in raw markdown is used to escape the <code>@</code> character). For example:</p><img src="images/visual-editing-xref.png" width="700"/><p>As shown above, when entering a cross-reference, you can search across all cross-references in your project to easily find the right reference ID.</p><p>Similar to hyperlinks, you can also navigate to the location of a cross-reference by clicking the popup link that appears when it&rsquo;s selected:</p><img src="images/visual-editing-xref-navigate.png" width="700"/><p>You can also navigate directly to any cross-reference using IDE global search:</p><img src="images/visual-editing-xref-search.png" class="illustration" width="700"/><p>See the bookdown documentation for more information on <a href="https://bookdown.org/yihui/bookdown/cross-references.html">cross-references</a>.</p><h2 id="footnotes">Footnotes</h2><p>You can include footnotes using the <strong>Insert -&gt; Footnote</strong> command (or the <kbd>Cmd+Shift+F7</kbd> keyboard shortcut). Footnote editing occurs in a pane immediately below the main document:</p><img src="images/visual-editing-footnote.png" class="illustration" width="700"/><h2 id="emojis">Emojis</h2><p>To insert an emoji, you can use either the <strong>Insert</strong> menu or use the requisite markdown shortcut plus auto-complete:</p><table><thead><tr class="header"><th><p><strong>Insert -&gt; Special Characters -&gt; Emoji...</strong></p></th><th><p>Markdown Shortcut</p></th></tr></thead><tbody><tr class="odd"><td><p><img src="images/visual-editing-emoji-dialog.png" /></p></td><td><p><img src="images/visual-editing-emoji-completion.png" /></p></td></tr></tbody></table><p>For markdown formats that support text representations of emojis (e.g.
<code>:grinning:</code>), the text version will be written. For other formats, the literal emoji character will be written. Currently, GitHub Flavored Markdown and Hugo (with <code>enableEmoji = true</code> in the site config) both support text representation of emojis.</p><h2 id="latex-and-html">LaTeX and HTML</h2><p>You can include raw LaTeX commands or HTML tags when authoring in visual mode. The raw markup will be automatically recognized and syntax highlighted. For example:</p><img src="images/visual-editing-raw.png" width="700"/><p>The above examples utilize <em>inline</em> LaTeX and HTML. You can also include blocks of raw content using the commands on the <strong>Format -&gt; Raw</strong> menu. For example, here is a document with a raw LaTeX block:</p><img src="images/visual-editing-latex-block.png" width="700"/><h2 id="learning-more">Learning More</h2><p>See the <a href="https://rstudio.github.io/visual-markdown-editing">Visual Markdown Editing</a> documentation to learn more about using visual mode.</p><p>You can try out the visual editor by installing the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.4 Preview Release</a>. If you do, please let us know how we can make it better on the <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>!</p><style type="text/css">p {hyphens: none !important;}table {margin-top: 0;border-top: none;border-bottom: none;}thead th {text-align: left;padding-left: 8px;}th, td {padding: 1px;}table thead th {border-bottom: none;}.illustration {border: 1px solid rgb(230, 230, 230);}kbd {display: inline-block;margin: 0;padding: 0.25em 0.25em;border: 1px solid rgb(230, 230, 230);border-radius: 3px;background: #f9f9f9;font-family: monospace;font-size: 1em;text-align: center;letter-spacing: 0;line-height: 1;color: inherit;}</style></description></item><item><title>Introducing torch for R</title><link>https://www.rstudio.com/blog/torch/</link><pubDate>Tue, 29 Sep 2020 00:00:00
+0000</pubDate><guid>https://www.rstudio.com/blog/torch/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@ilepilin?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Igor Lepilin</a> on <a href="https://unsplash.com/s/photos/torch?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup></p><p>As of this writing, two deep learning frameworks are widely used in the Python community: <a href="https://www.tensorflow.org/">TensorFlow</a> and <a href="https://pytorch.org/">PyTorch</a>. TensorFlow, together with its high-level API Keras, has been usable from R since 2017, via the <a href="https://tensorflow.rstudio.com/">tensorflow</a> and <a href="https://keras.rstudio.com/">keras</a> packages. Today, we are thrilled to announce that now, you can use Torch <a href="http://torch.mlverse.org/">natively from R</a>!</p><p>This post addresses three questions:</p><ul><li>What is deep learning, and why might I care?</li><li>What&rsquo;s the difference between <code>torch</code> and <code>tensorflow</code>?</li><li>How can I participate?</li></ul><p>If you are already familiar with deep learning &ndash; or all you can think right now is &ldquo;show me some code&rdquo; &ndash; you might want to head directly over to the <a href="https://blogs.rstudio.com/ai/posts/2020-09-29-introducing-torch-for-r/">more technical introduction on the AI blog</a>. Otherwise, you may find it more useful to hear about the context first, and then play with the step-by-step example in that complementary post.</p><h2 id="what-is-deep-learning-and-why-might-i-care">What is deep learning, and why might I care?</h2><p>If you&rsquo;re a data scientist, and your data normally comes in tabular, mostly-numerical form, a toolbox of linear and non-linear methods like those presented in James et al.&rsquo;s <em>Introduction to Statistical Learning</em> may be all you need.
This holds even more strongly if the number of data points is limited, as tends to be the case in some academic fields, such as anthropology or ethnology. In this case, Bayesian modeling, as taught by Richard McElreath&rsquo;s <em>Statistical Rethinking</em>, may be the best approach. Carrying the argument to the extreme: Yes, we <em>can</em> construct deep learning models to predict penguin species based on biometric attributes, and doing this may be very useful in teaching, but this type of task is not really where deep learning shines.</p><p>In contrast, deep learning has seen its greatest successes when there are <em>lots</em> of data of a type that is often (misleadingly) called &ldquo;unstructured&rdquo; &ndash; images, text, heterogeneous data resisting unification. Over the last decade, public triumphs have spread from image classification and related tasks, like segmentation and detection (important in many sciences), to natural language processing (NLP); prominent examples are translation, summarization, and dialogue generation. Beyond these areas of benchmark datasets and official, academically organized competitions, deep learning is pervasively employed in generative art, recommendation systems, and probabilistic modeling. Needless to say, current research is working to expand its limits even more, striving to integrate capabilities for e.g. concept learning or causal inference.</p><p>Many readers are likely to work in a field that could benefit from deep learning. But even if you don&rsquo;t, learning about how a technology works yields power, power to look behind appearances and make up your own mind and decisions.</p><h2 id="whats-the-difference-between-torch-and-tensorflow">What&rsquo;s the difference between <code>torch</code> and <code>tensorflow</code>?</h2><p>In the Python world, as of 2020, which framework you end up using for a project may be largely a matter of chance and context. (Admittedly, to say so takes the fun out of &ldquo;TensorFlow vs.
PyTorch&rdquo; debates, but that&rsquo;s no different from other popular &ldquo;comparison games&rdquo;. Take <em>vim vs. emacs</em>, for example. How many people, among those who use one of them preferentially, have come to do so because &ldquo;that&rsquo;s what I learned first&rdquo; or &ldquo;that&rsquo;s what was used in my first company&rdquo;?).</p><p>Not too long ago, there was a big difference, though. Before the introduction of TensorFlow 2 (the current release is 2.3), TensorFlow code was compiled to a static graph, and raw TensorFlow code was hard to write. Many users didn&rsquo;t have to write low-level code, however: The high-level API <a href="http://keras.io">Keras</a> provided concise, declarative idioms to define, train, and evaluate a neural network. On the other hand, Keras did not, at that time, offer a way to easily customize the training process. Ease of customization, then, used to be PyTorch&rsquo;s competitive advantage, relevant to researchers in particular. On the other hand, PyTorch did not, initially, excel in production and deployment facilities. Historically, thus, the respective strengths used to be seen as ease of experimentation on the one side, and production readiness on the other.</p><p>Today, however, with TensorFlow having become more flexible and PyTorch being increasingly employed in production settings, the traditional dichotomy has weakened. For the R user, this means that practical considerations are likely to prevail.</p><p>One such practical consideration that, for some users, may be of tremendous importance, is the following. <code>tensorflow</code> and <code>keras</code> are based on <a href="https://github.com/rstudio/reticulate">reticulate</a>, that helpful genie which lets you use Python packages seamlessly from R.
In other words, they do not <em>replace</em> Python TensorFlow/Keras; instead, they wrap its functionality and, in many cases, add syntactic sugar, resulting in more R-like, aesthetically-pleasing (to the R user) code.</p><p><code>torch</code> is different. It is built directly on <a href="https://github.com/pytorch/pytorch/blob/master/docs/libtorch.rst">libtorch</a>, PyTorch&rsquo;s C++ backend. There is no dependency on Python, resulting in a leaner software stack and more straightforward installation. This should make a huge difference, especially in environments where users have no control over, or are not allowed to modify, the software their organization provides.</p><p>Otherwise, at the current point in time, maturity of the ecosystem (on the R side) naturally constitutes a major difference. As of this writing, a lot more functionality &ndash; as well as documentation &ndash; is available in the <code>tensorflow</code> ecosystem than in the <code>torch</code> one. But time doesn&rsquo;t stand still, and we&rsquo;ll get to that in a second.</p><p>To wrap up, let&rsquo;s quickly mention another aspect, to be explained in more detail in a dedicated article. Due to its in-built facility to do automatic differentiation, <code>torch</code> can also be used as an R-native, high-performing, highly-customizable optimization tool, beyond the realm of deep learning. For now though, back to our hopes for the future.</p><h2 id="how-can-i-participate">How can I participate?</h2><p>As with other projects, we sincerely hope that the R community will find the new functionality useful. But that is not all. We also hope that you, many of you, will take part in the journey. There is not just a whole framework to be built. There is not just a whole &ldquo;bag of data types&rdquo; to be taken care of (images, text, audio&hellip;), each of which requires its own pre-processing functionality.
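</p><p>If you&rsquo;d like a quick feel for the API before diving in, here is a minimal sketch of tensor creation and automatic differentiation (assuming the <code>torch</code> package is installed; see the AI blog post linked above for complete, worked examples):</p><pre><code># A minimal taste of native torch in R &ndash; no Python involved.
library(torch)

x &lt;- torch_tensor(c(1, 2, 3))               # tensor backed by libtorch
w &lt;- torch_tensor(2, requires_grad = TRUE)  # track operations on w for autograd

y &lt;- (w * x)$sum()
y$backward()   # automatic differentiation
w$grad         # gradient of y with respect to w
</code></pre><p>Everything above runs inside the R process against libtorch directly &ndash; the same autograd machinery used for neural-network training, and, as noted above, usable for optimization tasks well beyond deep learning.</p><p>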
There is also the expanding, flourishing ecosystem of libraries built on top of PyTorch: <a href="https://github.com/OpenMined/PySyft">PySyft</a> and <a href="https://github.com/facebookresearch/CrypTen">CrypTen</a> for privacy-preserving machine learning, <a href="https://github.com/rusty1s/pytorch_geometric">PyTorch Geometric</a> for deep learning on manifolds, and <a href="http://pyro.ai/">Pyro</a> for probabilistic programming, to name just a few.</p><p>Whether small PRs for <a href="https://github.com/mlverse/torch">torch</a> or <a href="https://github.com/mlverse/torchvision">torchvision</a>, or model implementations, or help with porting some of the PyTorch ecosystem &ndash; we welcome any participation and support from the R community!</p><p>Thanks for reading, and have fun with <code>torch</code>!</p></description></item><item><title>RStudio Named Strong Performer in the Forrester Wave™: Notebook-Based Predictive Analytics and Machine Learning, Q3 2020</title><link>https://www.rstudio.com/blog/forrester-wave/</link><pubDate>Fri, 25 Sep 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/forrester-wave/</guid><description><figure><img align="center" style="padding: 35px;" src="pbc-hero2.jpg"></figure><p>As we&rsquo;ve discussed in recent blog posts on <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">Serious Data Science</a>, <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/" target="_blank" rel="noopener noreferrer">Interoperability</a>, and <a href="https://blog.rstudio.com/2020/07/28/practical-interoperability/" target="_blank" rel="noopener noreferrer">R and Python</a>, RStudio, PBC is dedicated to investing in free and open-source software for data science and helping people understand and improve the world through data.
We use our commercial software to fund this investment and focus on helping our customers leverage their investment in open source languages, such as R and Python, and scale them across their entire organization while meeting their enterprise requirements.</p><p>Generally, we measure our success by the satisfaction of our users and customers. We have been gratified and humbled to have millions of people using our open source offerings (such as the <a href="https://rstudio.com/products/rstudio/" target="_blank" rel="noopener noreferrer">RStudio IDE</a>, <a href="https://shiny.rstudio.com/" target="_blank" rel="noopener noreferrer">Shiny</a>, and the <a href="https://www.tidyverse.org/" target="_blank" rel="noopener noreferrer">tidyverse</a> packages), and thousands of organizations using our commercial software (such as <a href="https://rstudio.com/products/team/" target="_blank" rel="noopener noreferrer">RStudio Team</a> and <a href="https://rstudio.com/products/cloud/" target="_blank" rel="noopener noreferrer">RStudio Cloud</a>).</p><p>However, as our commercial products are used more widely by larger enterprises, there are times when our product champions look for third-party validation of our products, especially when discussing them with buyers and decision makers in IT and lines of business.
With that in mind, we are excited and gratified to announce that Forrester Research has named us a Strong Performer in <a href="https://rstudio.com/forrester" target="_blank" rel="noopener noreferrer">The Forrester Wave™: Notebook-Based Predictive Analytics and Machine Learning, Q3 2020</a>.</p><h2 id="rstudio-named-a-strong-performer">RStudio Named a Strong Performer</h2><blockquote><p>&ldquo;RStudio Team provides the core functionality for enterprise teams that primarily want to use R and some Python to develop and deploy their models&hellip;at a rock-bottom price.&rdquo;</p></blockquote><p>The Forrester Wave™: Notebook-Based Predictive Analytics and Machine Learning, Q3 2020 focuses on code-first data science platforms, which &ldquo;&hellip;bring software development discipline to otherwise unruly data science processes, increasing productivity, boosting collaboration, and improving time-to-value.&rdquo; These platforms also make &ldquo;&hellip;individual data scientists dramatically more productive, especially by providing collaboration capabilities for teams.&rdquo; The report also describes the importance of open source data science, including R and Python, to give organizations access to the latest and greatest in machine learning algorithms and to help attract top data science talent.</p><p>In this report, we were named a Strong Performer by this independent research firm, and we received the highest scores possible in the evaluation criteria of security, apps, open source, and platform infrastructure.
We believe these results validate our approach to our core market, which focuses on open source, code-first, serious data science.</p><p>We are excited to share these results with you and encourage you to read <a href="https://rstudio.com/forrester" target="_blank" rel="noopener noreferrer">Forrester&rsquo;s full report</a> for details.</p><style type="text/css">button.btn-white {background-color: transparent;color: #ffffff;border: 1px solid #ffffff;text-transform: uppercase;font-size: 12px;margin-right: 15px;}button.btn-white:hover {background-color: #ffffff;color: #00563f;border: 1px solid #ffffff;}</style><div style="background-color: #00563f;text-align: center;margin-top:30px;padding-top:24px;padding-bottom:50px;color: #ffffff;font-weight: 400;"><p style="font-size: 32px;margin-bottom: 15px;">To learn more</p><p style="margin-bottom: 35px;">Schedule a meeting with RStudio or read the full report from Forrester®</p><a href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio" target="_blank" rel="noopener noreferrer" ><button class="btn-white">Schedule your meeting</button></a><a href="https://rstudio.com/forrester/" target="_blank" rel="noopener noreferrer"><button class="btn-white">Read the Report</button></a></div></description></item><item><title>Ease Uncertainty by Boosting Your Data Science Team's Skills</title><link>https://www.rstudio.com/blog/ease-uncertainty/</link><pubDate>Wed, 23 Sep 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/ease-uncertainty/</guid><description><p><sup>Artwork by <a href="https://www.allisonhorst.com" target="_blank" rel="noopener noreferrer">Allison Horst</a></sup></p><p>As we all begin the fall planning and budgeting season, everyone I know is feeling a lot of uncertainty. 
I think that&rsquo;s natural; after all, many of us still don&rsquo;t know when we&rsquo;ll be able to:</p><ul><li>Return to work in an office.</li><li>Send our children to school five days a week.</li><li>Go out to eat in a restaurant normally (whatever that means).</li></ul><p>What hasn&rsquo;t changed, though, is expectations for data science teams to deliver results. The global economic downturn due to the COVID-19 virus has hurt business revenues and profits for many organizations. That economic pressure suggests that data science leaders should think about how they can best demonstrate the value of their teams to the business, especially as we look forward to 2021.</p><p>In the past week, I&rsquo;ve found a few new learning resources that I thought could help data science teams learn new skills and communicate their value better. They are:</p><ol><li><a href="https://www.tmwr.org" target="_blank" rel="noopener noreferrer"><strong>Tidy Modeling with R:</strong></a> On September 17, 2020, RStudio&rsquo;s <a href="https://juliasilge.com/blog/tidymodels-book/" target="_blank" rel="noopener noreferrer">Julia Silge announced this new book</a> she is co-authoring with Max Kuhn. Julia and Max have put the first eleven chapters up online at <a href="https://www.tmwr.org" target="_blank" rel="noopener noreferrer">www.tmwr.org</a> and will be adding new chapters as they become available. This book is particularly important because it teaches how to use R and the <code>tidymodels</code> package while encouraging good methodology and statistical processes.</li><li><a href="https://www.infoworld.com/article/3411819/do-more-with-r-video-tutorials.html" target="_blank" rel="noopener noreferrer"><strong>50 <em>Do More With R</em> videos:</strong></a> IDG&rsquo;s Sharon Machlis has been creating short how-to videos for a couple of years now, and her collection has become quite comprehensive.
The videos cover a wide variety of topics, from spiffing up your ggplots to how to search, sort, and filter tweets by hashtag with <code>rtweet</code> and <code>reactable</code>. I think these videos could be particularly useful as the basis for a weekly &ldquo;Lunch and Learn&rdquo; program to build data science techniques.</li><li><a href="https://medium.com/data-science-and-machine-learning-at-pluralsight/data-studio-pluralsight-f1b4b5d8a6e3" target="_blank" rel="noopener noreferrer"><strong>Knowledge Sharing is the New Gold:</strong></a> In this article, Shan Huang writes about how her team built a Data Studio using RStudio Connect to share their insights better with others in her company. The system they&rsquo;ve built achieves many of the goals of what we view as serious data science, including allowing their data scientists to work interoperably between R and Python and communicating better with stakeholders. To my mind, this piece does a great job describing the outcomes data science leaders should be aiming for with their teams; if nothing else, it provides a good set of talking points for your data science budget planning meeting.</li></ol><p>We here at RStudio are also ramping up our efforts this fall to support more serious data science and interoperability in both our open source and commercial products. Check back with the blog regularly to catch all the announcements; we expect it to be a very busy fall.</p><p>And don&rsquo;t forget to subscribe to receive updates on our very first <a href="https://blog.rstudio.com/2020/07/17/rstudio-global-2021/" target="_blank" rel="noopener noreferrer">rstudio::global(2021)</a> virtual conference scheduled for early 2021. This will be our first completely virtual event, featuring 24 hours of speakers from around the world sharing their reflections on how they use R and extend it into new packages and communities.
You&rsquo;ll be hearing more about this exciting event in the coming weeks.</p></description></item><item><title>Learning Data Science with RStudio Cloud: A Student's Perspective</title><link>https://www.rstudio.com/blog/rstudio-cloud-a-student-perspective/</link><pubDate>Thu, 17 Sep 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-cloud-a-student-perspective/</guid><description><p>On August 5, 2020, <a href="https://blog.rstudio.com/2020/08/05/rstudio-cloud-announcement/" target="_blank" rel="noopener noreferrer">RStudio announced the general availability of RStudio Cloud</a>, its cloud-based platform for doing, teaching, and learning data science using only a browser. We often recommend RStudio Cloud to teachers because it simplifies course setup, makes it easy to distribute and manage exercises, and helps the instructor track student progress. With COVID-19 driving more courses online this fall, RStudio has developed a number of resources for instructors to use with RStudio Cloud, including:</p><ul><li><a href="https://rstudio.com/resources/webinars/rstudio-cloud-in-the-classroom/" target="_blank" rel="noopener noreferrer"><strong>RStudio Cloud in the Classroom</strong></a>, a webinar conducted by RStudio&rsquo;s Mine Çetinkaya-Rundel.</li><li><a href="https://rstudio.com/resources/webinars/teaching-r-online-with-rstudio-cloud/" target="_blank" rel="noopener noreferrer"><strong>Teaching R online with RStudio Cloud</strong></a>, another webinar by Mine Çetinkaya-Rundel. She also published <a href="https://education.rstudio.com/blog/2020/04/teaching-with-rstudio-cloud-q-a/" target="_blank" rel="noopener noreferrer">a follow-up post</a> that provides answers to questions raised during the webinar and additional resources available to teachers.</li></ul><p>However, while we often hear from teachers about how RStudio Cloud has helped them, we don&rsquo;t often get to hear what the experience is like from the student&rsquo;s point of view.
We recently had the opportunity to have an in-depth discussion with Lara Zaremba, a Master&rsquo;s degree student in International Economics at Johann Wolfgang Goethe-Universität in Frankfurt, Germany. In the video below, Lara describes some of what she experienced collaborating with other students and being a mentor to others on the platform.</p><script src="https://fast.wistia.com/embed/medias/ipsl06x9s1.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_ipsl06x9s1 seo=false videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/ipsl06x9s1/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><p>An earlier part of our conversation with Lara also highlighted the need for increased diversity within the R community. While Lara felt that RStudio Cloud empowered her to become successful as a student and mentor, she mentioned that she had not seen herself as a data scientist because of the stereotypes associated with &ldquo;computer people&rdquo;.</p><p>One of the most powerful forces helping overcome those stereotypes today is <a href="https://rladies.org" target="_blank" rel="noopener noreferrer">R-Ladies Global (rladies.org)</a>, a world-wide organization dedicated to promoting gender diversity through meetups, code, teaching, and leadership. We encourage all to support their work and to help everyone see themselves as full members of the R community.
You can learn more about how you can participate and make our community more welcoming to all at <a href="https://rladies.org/about-us/help/" target="_blank" rel="noopener noreferrer">https://rladies.org/about-us/help/</a>.</p></description></item><item><title>Announcing the 2020 RStudio Table Contest</title><link>https://www.rstudio.com/blog/announcing-the-2020-rstudio-table/</link><pubDate>Tue, 15 Sep 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-the-2020-rstudio-table/</guid><description><p>Here at RStudio, we love tables. No, not tables like the one pictured above, but data tables. Did you know that data tables have been used since the 2nd century? They’ve really been around.</p><p>Tables are a fantastic way to communicate lists of quantitative and qualitative information. Sometimes, tables can fall very short of their potential for greatness. But that was the past: we now have some excellent R packages at our disposal to generate well-designed and functional presentation tables. And because of this renaissance of table-making in R, we’re announcing a contest: The 2020 RStudio Table Contest. It will run from September 15th to October 31st, 2020.</p><p>One thing we love about the R community is how open and generous you are in sharing the code and process you use to solve problems. This lets others learn from your experience and invites feedback to improve your work. We hope this contest encourages more sharing and helps to recognize the many outstanding ways people work with and display data with R.</p><h2 id="contest-judging-criteria">Contest Judging Criteria</h2><p>Tables will be judged based on technical merit, artistic design, and quality of documentation. We recognize that some tables may excel in only one category and others in more than one or all categories.
Honorable mentions will be awarded with this in mind.</p><p>We are working with maintainers of many of the R community’s most popular R packages for building tables, including Yihui Xie of <a href="https://rstudio.github.io/DT/" target="_blank" rel="noopener noreferrer"><code>DT</code></a>, Rich Iannone of <a href="https://gt.rstudio.com/" target="_blank" rel="noopener noreferrer"><code>gt</code></a>, Greg Lin of <a href="https://glin.github.io/reactable/" target="_blank" rel="noopener noreferrer"><code>reactable</code></a>, David Gohel of <a href="https://davidgohel.github.io/flextable/articles/overview.html" target="_blank" rel="noopener noreferrer"><code>flextable</code></a>, David Hugh-Jones of <a href="https://hughjonesd.github.io/huxtable/" target="_blank" rel="noopener noreferrer"><code>huxtable</code></a>, and Hao Zhu of <a href="https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html" target="_blank" rel="noopener noreferrer"><code>kableExtra</code></a>. Many of these maintainers will help review submissions built with their packages.</p><h2 id="requirements">Requirements</h2><p>A submission must include all code and data used to replicate your entry. This may be a fully knitted R Markdown document with code (for example published to RPubs or shinyapps.io), a repository, or an rstudio.cloud project.</p><p>A submission can use any table-making package available in R, not just the ones mentioned above.</p><p><strong>Submission Types</strong> - We are looking for three types of table submissions:</p><ol><li><strong>Single Table Example</strong>: This may highlight interesting structuring of content, useful and tricky features – for example, enabling interaction – or serve as an example of a common table popular in a specific field. Be sure to document your code for clarity.</li><li><strong>Tutorial</strong>: It’s all about teaching us how to craft an excellent table or understand a package’s features.
This may include several tables and narrative.</li><li><strong>Other</strong>: For submissions that do not easily fit into one of the types above.</li></ol><p><strong>Category</strong> - Given that tables have different features and purposes, we’d also like you to further categorize the submission table. There are four categories: static-HTML, interactive-HTML, static-print, and interactive-Shiny. Simply choose the one that best fits your table.</p><p>You can submit your entry for the contest by filling out the form at <a href="https://rstd.io/table-contest-2020" target="_blank" rel="noopener noreferrer">rstd.io/table-contest-2020</a>. The form will generate a post on RStudio Community, which you can then edit further at a later date. You may make multiple entries.</p><p>The deadline for submissions is October 31st, 2020, at midnight Pacific Time.</p><h2 id="prizes">Prizes</h2><p><strong>Grand Prize</strong></p><ul><li>One year of the shinyapps.io Basic plan or one year of <a href="https://rstudio.cloud/plans/premium" target="_blank" rel="noopener noreferrer">RStudio.Cloud Premium</a>.</li><li>Additionally, any number of RStudio t-shirts, books, and mugs (worth up to $200).<br>Unfortunately, we may not be able to send t-shirts, books, or other items larger than stickers to non-US addresses for which shipping and customs costs are high.</li></ul><p><strong>Honorable Mention</strong></p><ul><li>A good helping of hex stickers for RStudio packages plus a side of hexes for table-making packages, and other goodies.</li></ul><h3 id="tables-gallery">Tables Gallery</h3><p>Previous Shiny Contests have driven significant improvement to the popular <a href="https://shiny.rstudio.com/gallery/" target="_blank" rel="noopener noreferrer">Shiny Gallery</a> and we hope this contest will spur development of a similar Tables Gallery.
Winners and other participants may be invited to feature their work on such a resource.</p><p>We will announce the winners and their submissions on the RStudio blog, RStudio Community, and also on Twitter.</p></description></item><item><title>Debunking R and Python Myths: Answering Your Questions</title><link>https://www.rstudio.com/blog/dispelling-r-and-python-myths-qanda/</link><pubDate>Thu, 10 Sep 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dispelling-r-and-python-myths-qanda/</guid><description><p>Here&rsquo;s a quick quiz: which of the following statements are true:</p><ol><li>Henry Ford invented the automobile.</li><li>Bob Marley recorded and sang the song, <em>Don&rsquo;t Worry, Be Happy</em>.</li><li>Medieval Europeans believed the earth was flat.</li></ol><p>All of these statements are common myths and are false (see the end of this article for sources for the true answers). It was in this context that members of the RStudio team sat down with partner Lander Analytics to tackle another modern-day myth, namely that data scientists must choose between R and Python for their data science work.</p><img align="center" style="padding: 35px;" src="lego-keyboard.jpg"><p style="text-align: right"><em>Photo by <a href="https://unsplash.com/@jamesponddotco" target="_blank" rel="noopener noreferrer">James Pond</a> on <a href="https://unsplash.com/photos/26vBUtlufFo" target="_blank" rel="noopener noreferrer">Unsplash</a></em></p><p>Our panelists for this webinar were:</p><ul><li><strong>Daniel Chen</strong>, Ph.D. student and author of <em>Pandas for Everyone</em>, the Python/Pandas complement to <em>R for Everyone</em>.</li><li><strong>Jared P.
Lander</strong>, Chief Data Scientist of Lander Analytics</li><li><strong>Sean Lopp</strong>, Product Manager at RStudio</li><li><strong>Carl Howe</strong>, Content Lead at RStudio</li><li><strong>Samantha Toet</strong>, Partner Marketing Specialist at RStudio.</li></ul><p>You can view the recording of our webinar at <a href="https://rstudio.com/resources/webinars/debunking-the-r-vs-python-myth/" target="_blank" rel="noopener noreferrer">Debunking the R vs. Python Myth</a>. That site also has complete biographies of our panelists.</p><p>The sections below summarize many of the questions and answers brought up during the panel. We have paraphrased and distilled many of these responses for brevity and narrative quality. We&rsquo;ve also added answers to questions that were asked by our attendees but that we answered after the webinar aired.</p><h2 id="whats-behind-the-r-and-python-myth">What&rsquo;s behind the R and Python Myth?</h2><h3 id="what-is-interoperability">What is interoperability?</h3><p><strong>Samantha:</strong> I like starting with this question because it gets at the heart of the R and Python myth. The myth assumes that data scientists must choose between R and Python. The reality is that interoperability allows the two languages to work together.</p><p>This interoperability idea aligns with the concept of serious data science (see <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">this article</a> and its successor posts for definitions and details) in that we gain credibility because the prior work has already been proven, we gain agility because we don&rsquo;t have to reinvent the wheel, and we gain durability because the work is built on existing knowledge.
While this concept of interoperability and its importance isn&rsquo;t new, it&rsquo;s especially significant right now with the wealth of different open source tools at hand.</p><p><strong>Jared:</strong> The user of the tool doesn&rsquo;t need to know that there are other languages at play under the hood. For example, the R language itself is built on C and FORTRAN code and libraries.</p><p><strong>Carl:</strong> I like to think of interoperability as our ability to build on prior work. This isn&rsquo;t a new idea &ndash; it&rsquo;s the foundation of all science. However, data science is one of the first scientific disciplines to be built on a foundation of code. Eliminating building blocks for our work simply because they aren&rsquo;t written in our favorite language leads toward reinventing the wheel again and again. The good news is that R and Python now both have features that facilitate interoperability, and that means we can build on top of them faster.</p><h3 id="are-there-any-specific-best-practices-for-interoperability-and-building-a-multi-lingual-team">Are there any specific best practices for interoperability and building a multi-lingual team?</h3><p><strong>Daniel:</strong> In terms of tooling, one of the best things you can do is to use a version control system, the most popular one of which is <code>git</code>. That&rsquo;s a powerful way to create data science that&rsquo;s credible. Version control literally leaves you an audit trail of your code and allows you to go back in time to track down bugs.</p><p><strong>Jared:</strong> I always like to keep all the steps in my process as isolated as possible. You can use git to keep everything under control and then apply a CI/CD (Continuous Integration/Continuous Delivery) system as a way to kick off workflows.</p><p>Workflows have become more popular for managing data science, especially working in a multi-tool environment.
We use documents like RMarkdown to kick off many individual steps and run those in sequence. And best of all, you don&rsquo;t care what language your steps are using. You can have one step in R, another in Python, and yet another in Julia. That allows you to take advantage of parallelism, and, when combined with Docker, allows you to dictate the appropriate execution environment for each step.</p><p><strong>Sean:</strong> If we take a step back, best practices are the same in any language. Most were refined by software developers way before data science was a thing, and now they are being adopted by data scientists. Those practices include using tools such as version control systems, testing infrastructure, continuous integration tools, and a documented workflow such as Jared was referring to.</p><p>More interesting is thinking about the potential differences between practices for software engineering and for data analysts. It&rsquo;s kind of a lot to take someone graduating from Excel to R and Python and convince them to adopt all these complex tools. So for me, the first steps for analytic software come down to two things:</p><ol><li><strong>Working in a common server environment.</strong> The server environment gives us a centrally managed execution environment that provides a shared playbook for folks to work in. Using a server allows everyone to share software versions, packages, and network access, all under a consistent set of organizational policies.</li><li><strong>Having a safety net that tracks the common environment.</strong> That safety net includes having version control for the source code and a system for managing the execution environment. We recently released a package called <code>renv</code> for R; on the Python side, you might be familiar with <code>virtualenv</code> or perhaps <code>conda</code> to do something similar.</li></ol><p><strong>Carl:</strong> It&rsquo;s important to ask where is your data going to live and in what format. 
An unsung open source project trying to create better answers to those questions for R, Python, C++, Rust, and a host of other systems is the <a href="https://arrow.apache.org/docs/python/feather.html" target="_blank" rel="noopener noreferrer">Feather file format</a>, which is part of <a href="https://arrow.apache.org/" target="_blank" rel="noopener noreferrer">Apache Arrow</a>. The project&rsquo;s goal is to implement a fast common data representation which can be used in memory and doesn&rsquo;t require serialization steps. When both languages can read and write data in a common format, interoperability becomes both easier and faster.</p><h2 id="what-are-the-preferred-tools-for-each-step-in-the-data-science-lifecycle">What are the preferred tools for each step in the data science lifecycle?</h2><p><strong>Daniel:</strong> I don&rsquo;t really think projects are language-specific. Short of programming microcontrollers, which tends to be done in Python or C, both languages can do pretty much anything. If you find yourself asking, &ldquo;which language should I use for this project?&rdquo;, you should also ask yourself:</p><ol><li><strong>Who am I developing this <em>for</em>?</strong> Think about who will use the results you create. Will they just be looking at them or will they want to tinker with them?</li><li><strong>Who am I developing this <em>with</em>?</strong> If I&rsquo;m part of a team or group, what is their preferred language? Who will provide support long-term and what language skills do they have?</li></ol><p>For me, I really like how R does publication and communication, including R Markdown report generation, Shiny dashboards, and <code>ggplot</code> for plotting; I personally find that ecosystem easier to use. So when I&rsquo;m doing a final presentation, I usually drop down into those tools.</p><p>Prior to presentation, though, I find myself thinking more about the people on the team and not so much about the tools or data.
You have to even think beyond your team; your IT department that has to manage the infrastructure, for example. Waging a language &ldquo;holy war&rdquo; within a team sets the stage for an attitude that does not embrace learning and makes creating value less efficient. And at the end of the day, we can call R from Python and Python from R, so the choice isn&rsquo;t nearly as critical as you think.</p><p><strong>Carl:</strong> When I think about preferred tools, I like to break it down into the process that Garrett Grolemund and Hadley Wickham put forth in <a href="https://r4ds.had.co.nz/explore-intro.html" target="_blank" rel="noopener noreferrer"><em>R for Data Science</em></a>. They have a 5-step process where you start with data ingestion, you tidy up your data, you then go through a loop of transformation and analysis, and you finally do the presentation.</p><img align="center" style="padding: 35px;" src="data-science-explore.png"><p>Before we get there, I think we all usually start with the RStudio Integrated Development Environment (IDE) or some other equivalent tool that helps us be more productive writing code, regardless of language. Not everyone has to &ndash; I spent a lot of my career writing R in Emacs.</p><p>Now I&rsquo;m a mostly R guy, so when I think of this process, I have some favorite R packages:</p><ol><li><strong>ingestion</strong>: <code>readr</code> for ingesting all kinds of files, not just .csv files. There are also other important packages like <code>DBI</code> and <code>odbc</code> that provide access to databases, which is a best practice for doing ingestion.</li><li><strong>tidy</strong>: <code>tidyr</code> is an awesome package for a variety of tidying tools.</li><li><strong>analysis</strong>: This is pretty wide open.
Many of the 17,000+ packages available in R live in this category, so I&rsquo;m not going to arbitrarily choose a few.</li><li><strong>communication</strong>: My go-to packages here are <code>ggplot2</code>, <code>rmarkdown</code>, <code>flexdashboard</code>, and <code>Shiny</code>. These are killer apps for presenting to business people because they provide pictures and text. I&rsquo;ll also provide a shout out for the daily build versions of the RStudio IDE which feature what-you-see-is-what-you-get editing for R Markdown. They make it as easy to edit R Markdown as it is to edit ordinary text in Microsoft Word. If you have anyone who is allergic to all the funny annotations to use R Markdown, get the latest version of the RStudio IDE.</li></ol><h3 id="what-should-a-data-science-manager-think-about-team-composition-when-considering-interoperability">What should a data science manager think about team composition when considering interoperability?</h3><p><strong>Jared:</strong> The first and most important thing is that the team is the people and you have to manage the people more than the tools and more than the data and everything else. They are everything. That might sound a little saccharine, but they are the people who get the work done.</p><p>It&rsquo;s only recently that we have companies who say that &ldquo;Everything has to be done in the same language.&rdquo; Banks have been operating for decades and they&rsquo;re running C, COBOL, Java, SQL, and a host of others. It&rsquo;s very common in most industries to have a multitude of languages. It may not be as common for a single team as we&rsquo;re seeing in data science teams, but it happens.</p><p>There&rsquo;s often talk of this holy war between R and Python in data science. You&rsquo;ve got to stop that. You can&rsquo;t make any one employee feel worse or better based on their language choice. 
It may come down from the leader of the team that one language is better than another, but you can&rsquo;t make your team members feel like second-class citizens. It&rsquo;s not good for the people.</p><p>The most important thing is for everyone to feel valued and not to defer to pseudo-standardization.</p><p><strong>Sean:</strong> There is an assumption that if we can standardize on a single tool, our team will be more efficient. That is, there will be lower costs associated with infrastructure, maintenance, and development. This is entirely false.</p><p>If you look at your costs, the most valuable and expensive resource on a data science team is the data scientist. They are what bring differentiating capabilities to the business. For that reason, providing them with the right tools to be successful is essential. You wouldn&rsquo;t tell a mechanic not to use all the tools in their toolbox, or a musician not to play different instruments, so why should a data scientist standardize on one programming language?</p><p>We see this a lot at RStudio, this myth that if data scientists would only converge on one tool, it will be cheaper. In reality, by converging on a single tool you may be missing out on being able to recruit a diverse group of team members that bring a wealth of different skills and backgrounds.</p><h2 id="what-are-some-real-world-examples-of-r-and-python-working-together">What are some real-world examples of R and Python working together?</h2><p>Finally, we discussed some examples of successful R and Python data products. Daniel walked us through a <a href="https://github.com/chendaniely/2020-08-26-rstudio_debunk" target="_blank" rel="noopener noreferrer">Shiny app that has a Python scikit-learn machine learning model</a> running inside of it.
Carl shared some insights about <a href="https://blog.rstudio.com/2020/07/28/practical-interoperability/" target="_blank" rel="noopener noreferrer">3 wild-caught R &amp; Python interoperability examples</a>, including an example he built to monitor solar panel and weather data using asynchronous interoperability on a Raspberry Pi.</p><p>The key takeaway from this discussion is that data science should not be stifled for the sake of language loyalty. By embracing the differences between R and Python, data science teams can expand their capabilities and therefore deliver the most significant results.</p><h2 id="attendee-questions">Attendee Questions</h2><p>We didn&rsquo;t have time during the webinar to address attendee questions, so we thought we&rsquo;d follow up on a few of the most important ones here. Multiple questions on the same topic have been consolidated into one whenever possible.</p><h3 id="languages-for-data-science">Languages for data science</h3><p>Despite our best efforts, our attendees still wanted us to recommend a single best language.</p><h4 id="can-you-share-the-blog-where-you-define-serious-data-science">Can you share the blog where you define &ldquo;serious data science&rdquo;?</h4><p><strong>Carl:</strong> Absolutely. The initial article is titled, <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">Driving Real, Lasting Value with Serious Data Science </a> and it defines serious data science work as having 3 essential attributes, each of which we discussed in detail in subsequent articles.
RStudio believes that for work to be considered serious data science, the work has to be:</p><ol><li><strong>Credible:</strong> <a href="https://blog.rstudio.com/2020/06/02/is-your-data-science-credible-enough/" target="_blank" rel="noopener noreferrer">Is Your Data Science Credible Enough?</a>.</li><li><strong>Agile:</strong> <a href="https://blog.rstudio.com/2020/06/09/is-your-data-science-team-agile/" target="_blank" rel="noopener noreferrer">Is Your Data Science Team Agile?</a></li><li><strong>Durable:</strong> <a href="https://blog.rstudio.com/2020/06/24/delivering-durable-value/" target="_blank" rel="noopener noreferrer">Does your Data Science Team Deliver Durable Value?</a></li></ol><h4 id="when-trying-to-help-an-analyst-move-from-excel-to-r-or-python-which-would-you-suggest-teaching-first-for-a-beginner-who-is-trying-to-learn-data-analytics-which-tool-would-you-recommend-r-or-python">When trying to help an analyst move from Excel to R or Python, which would you suggest teaching first? For a beginner who is trying to learn Data Analytics. Which tool would you recommend? R or Python?</h4><p><strong>Carl:</strong> Either will work. If the analyst has written code before, then Python may appear more familiar. If the analyst has not been a programmer in the past, I&rsquo;d have them learn R first because it is easier to learn for non-programmers who just want to get something done and don&rsquo;t want to learn computer science along the way.</p><h4 id="weve-focused-on-r-and-python-here-what-about-julia">We&rsquo;ve focused on R and Python here. What about Julia?</h4><p><strong>Carl:</strong> In my view, Julia turns the 2 language problem into a 3 language problem. Once again, if Julia is the fastest way for you to solve a problem, use it. However, I don&rsquo;t think most people believe that Julia eliminates the 2 (or 3 or N) language problem.</p><p>The good news, though, is that you can call Julia within R and Python as well. 
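As a small illustration (assuming the <code>JuliaCall</code> package, which supplies knitr&rsquo;s Julia engine, is installed alongside a Julia binary), an R Markdown document can mix R and Julia chunks side by side:

````markdown
---
title: "Mixing R and Julia"
output: html_document
---

A Julia chunk (run by the `JuliaCall`-backed knitr engine):

```{julia}
sqrt(2)
```

An R chunk in the same document:

```{r}
sqrt(2)
```
````

This is a sketch rather than a recipe: chunk-engine availability depends on your local Julia installation and how `JuliaCall` discovers it.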
Both R Markdown and Jupyter notebooks allow you to create notebooks and chunks that run Julia code.</p><h3 id="tooling">Tooling</h3><h4 id="whats-the-easiest-way-to-set-up-a-shared-environment-for-my-team-assuming-im-the-only-one-with-programming-experience">What&rsquo;s the easiest way to set up a shared environment for my team (assuming I&rsquo;m the only one with programming experience)?</h4><p><strong>Daniel:</strong> If you&rsquo;re in a shared cloud environment, <code>renv</code> can be used to completely set up the R environment. Python has <code>virtualenv</code> and <code>conda</code> environments that you would set up for everyone to use on the Python side. You then need to tell people to run a few lines of code, in the beginning, to &ldquo;set everything up&rdquo;.</p><p>Fortunately, many companies and people have their own internal R packages to make the setup part easier for everyone. If you don&rsquo;t have such a system at your organization, then you should build such a standard setup package. While it may take more time now, it will save huge amounts of time later.</p><h4 id="are-we-talking-purely-about-using-the-open-source-tools-r-studio-offers">Are we talking purely about using the Open Source tools R Studio offers?</h4><p><strong>Carl:</strong> Everything we discussed applies to the open source tools and to RStudio&rsquo;s professional tools. The differences between the two are largely related to enterprise features and scalability for large installations. Functionally, though, they cover almost all the same capabilities.</p><h4 id="are-there-other-ways-to-incorporate-python-in-r-other-than-with-the-reticulate-package">Are there other ways to incorporate Python in R other than with the <code>reticulate</code> package?</h4><p><strong>Carl:</strong> While <code>reticulate</code> is probably the best known, <code>rPython</code>, <code>SnakeCharmR</code>, and <code>PythonInR</code> all provide the same functionality for R to call Python.
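To make this concrete, here is a minimal sketch of the <code>reticulate</code> pattern, assuming the package is installed and a Python interpreter with NumPy is available on the machine (the <code>model.py</code> file at the end is a hypothetical name, shown commented out):

```r
# Minimal reticulate sketch: calling Python from R.
library(reticulate)

# Import a Python module; its functions become callable R objects,
# with `$` standing in for Python's `.` accessor.
np <- import("numpy")
result <- np$mean(c(1, 2, 3, 4))  # an R vector converts to a Python sequence

# You can also source an entire Python script and call its functions from R.
# "model.py" is a hypothetical file for illustration:
# source_python("model.py")
# preds <- predict_model(my_r_dataframe)
```

The design point is that conversion between R and Python data structures happens automatically at the boundary, so neither side needs to serialize data by hand.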
You can also use R from Python with the <code>PypeR</code>, <code>pyRserve</code>, and <code>rpy2</code> packages. You might find <a href="https://towardsdatascience.com/from-r-vs-python-to-r-and-python-aa25db33ce1" target="_blank" rel="noopener noreferrer">this site helpful</a> in sorting out the pros and cons of each language.</p><h4 id="i-dont-understand-how-does-version-control-affect-the-development-process">I don&rsquo;t understand. How does version control affect the development process?</h4><p><strong>Daniel:</strong> Version control is related in the sense that when you have teams working in different languages in the same project, you need some way to coordinate the work of multiple people. Version control systems like <code>git</code> allow for multiple developers to work together without stepping on each other&rsquo;s toes by keeping track of who has changed what and detecting when changes are likely to conflict. They are also often used to drive automated testing and continuous integration tools farther down the deployment pipeline.</p><h4 id="do-you-guys-containerize-models-eg-using-docker-and-k8s-does-anyone-have-any-other-good-ways-of-exposing-models-in-production">Do you guys containerize models e.g. using Docker and k8s? Does anyone have any other good ways of exposing models in production?</h4><p><strong>Daniel:</strong> Yes, containers are a common way to share models. You can expose those containerized models in production with the <code>plumber</code> package in R and make the model callable with a REST API.
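A minimal sketch of that pattern, assuming the <code>plumber</code> package is installed and a trained model has been saved with <code>saveRDS()</code> to a hypothetical <code>model.rds</code> (the endpoint, parameter names, and file names are all illustrative, not from the webinar):

```r
# Hypothetical plumber.R: expose a pre-trained model as a REST endpoint.
model <- readRDS("model.rds")  # load the trained model once, at API startup

#* Return a prediction for the supplied inputs
#* @param x1 numeric predictor
#* @param x2 numeric predictor
#* @post /predict
function(x1, x2) {
  newdata <- data.frame(x1 = as.numeric(x1), x2 = as.numeric(x2))
  predict(model, newdata = newdata)
}

# To serve it (e.g. as the entrypoint of a Docker container):
# plumber::pr("plumber.R") |> plumber::pr_run(port = 8000)
```

Because the API is just an HTTP service, the caller can be Python, a dashboard, or anything else that can POST JSON, which is exactly the language-agnostic hand-off Daniel describes.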
Depending on how frequently the model needs to be updated, you can also just save out the model object once it is trained and reference that saved model in production.</p><p><strong>Carl:</strong> While this may not seem to directly answer your question, I recently wrote an article titled <a href="https://blog.rstudio.com/2020/08/27/expand-your-data-science-resources/" target="_blank" rel="noopener noreferrer"><em>3 Ways to Expand Your Data Science Compute Resources</em></a> which talks about how the Launcher feature in RStudio Server Pro can run R jobs independently of interactive sessions. That same Launcher feature also is capable of launching containerized versions of R applications on SLURM and Kubernetes, allowing you to train models using a centralized computational cluster.</p><h3 id="applications-and-use">Applications and Use</h3><h4 id="i-heard-from-python-people-that-python-is-better-at-doing-ml-at-scale-on-large-datasets-and-that-python-is-better-at-creating-ml-models-that-can-be-put-into-production-can-you-comment-on-those-claims">I heard from &ldquo;Python people&rdquo; that Python is better at doing ML at scale on large datasets and that Python is better at creating ML models that can be put into production. Can you comment on those claims?</h4><p><strong>Carl</strong>: I think this is one of the myths that keeps getting promoted without any evidence to support it. Yes, Python has many machine learning libraries. So does R. Yes, Python can use the <code>keras</code> and <code>tensorflow</code> packages for building models. So can R. Yes, Python can run on large Spark clusters at scale. 
So can R (and we might argue that the <code>sparklyr</code> package provides a more programmer-friendly way of doing so than the native Python and SparkR interfaces).</p><p>To my mind, the key to any sort of lasting and robust solution to a computational problem is to decompose the problem into clear components and to provide simple elegant interfaces to those components. That sort of solution design doesn&rsquo;t care whether you create the solution in R, Python, Basic, COBOL, or System 370 assembly code. And as Jared noted in our webinar, multi-billion-dollar banks around the world run production systems written in all those languages every day of the week. That suggests to me that the key to good production systems is in how you build them, not what language you write them in.</p><h4 id="at-present-how-efficiently-is-r-optimized-for-running-deep-learning-algorithms-in-tensorflow-as-compared-to-python-">At present, how efficiently is R optimized for running Deep Learning algorithms in TensorFlow as compared to Python ?</h4><p><strong>Carl:</strong> I don&rsquo;t think any of us feel like we know a definitive answer to that question. Both R and Python have Tensorflow interface libraries. The Tensorflow code they run is actually identical; the interface libraries are simply wrappers around the native Tensorflow code. 
So to first order, I&rsquo;d expect the two languages to be roughly the same in terms of efficiency if you are invoking Tensorflow directly.</p><p>For those interested in the efficiency of solving AI problems using R and Python, I highly recommend visiting the <a href="https://blogs.rstudio.com/ai/" target="_blank" rel="noopener noreferrer">RStudio AI blog</a> which has regular articles on this topic and more.</p><p>With that said, though, I&rsquo;d emphasize that for many problems, your real challenge is not the efficiency of the code, but how clearly and quickly you can express the problem you are going to solve in whatever programming language you use. In most applications, the bottleneck to finding a solution to a problem is not the computer, but the person at the keyboard trying to write the program to solve the problem. You&rsquo;ll ultimately achieve the greatest efficiency if you use the best tool for the job in expressing that program first and worry about optimizing how fast it runs only when you find you need to.</p><h2 id="debunking-r-and-python-myths-summary">Debunking R and Python Myths Summary</h2><p>All of us find it difficult to fight conventional wisdom. In this article, we&rsquo;ve argued that not everything that people believe is true. Specifically, we&rsquo;ve debunked the following myths:</p><ol><li>Henry Ford invented the automobile. Actually, <a href="https://www.daimler.com/company/tradition/company-history/1885-1886.html" target="_blank" rel="noopener noreferrer">Karl Benz invented the automobile in 1886</a>, many years prior to <a href="https://www.thehenryford.org/collections-and-research/digital-collections/artifact/252049/l" target="_blank" rel="noopener noreferrer">Henry Ford&rsquo;s 1896 Runabout.</a></li><li>Bob Marley recorded and sang the song, <em>Don&rsquo;t Worry, Be Happy</em>.
<a href="https://en.wikipedia.org/wiki/Bob_Marley" target="_blank" rel="noopener noreferrer">Bob Marley died in 1981</a>, roughly 7 years before <a href="https://en.wikipedia.org/wiki/Bobby_McFerrin" target="_blank" rel="noopener noreferrer">Bobby McFerrin wrote the song in 1988</a>. Most people remember a YouTube video featuring Bob Marley that was overdubbed with the Bobby McFerrin song in 2011.</li><li>Medieval Europeans believed the earth was flat. Actually, <a href="https://www.washingtonpost.com/blogs/answer-sheet/post/busting-a-myth-about-columbus-and-a-flat-earth/2011/10/10/gIQAXszQaL_blog.html" target="_blank" rel="noopener noreferrer">Pythagoras and later Aristotle and Euclid wrote about the earth being round</a> in the sixth century B.C.</li><li>Data science teams have to choose between R and Python. Successful data science teams know that development time is almost always the most costly part of any solution and will therefore use the best tools and environments for their particular job, regardless of language.</li></ol><h3 id="packages-discussed--supplemental-resources"><strong>Packages Discussed &amp; Supplemental Resources</strong></h3><ul><li><a href="https://rstudio.github.io/renv/articles/renv.html" target="_blank" rel="noopener noreferrer">renv:</a> A package for managing R environments for reproducibility.</li><li><a href="https://rstudio.github.io/reticulate/" target="_blank" rel="noopener noreferrer">reticulate:</a> A package for running Python from R.</li><li><a href="https://github.com/chendaniely/2020-08-26-rstudio_debunk" target="_blank" rel="noopener noreferrer">Sample code for Python ML model with Shiny: </a> Daniel Chen&rsquo;s GitHub repository for the programs he references in his examples.</li><li><a href="https://blog.rstudio.com/2020/07/28/practical-interoperability/" target="_blank" rel="noopener noreferrer">3 wild-caught R &amp; Python interoperability examples: </a> Carl Howe&rsquo;s blog post illustrating 3 R and Python
interoperability applications submitted by other data scientists.</li><li><a href="https://rstudio.com/resources/webinars/debunking-the-r-vs-python-myth/" target="_blank" rel="noopener noreferrer">Debunking the R vs. Python Myth: </a> The original webinar that this article summarizes and expands on.</li><li><a href="https://blogs.rstudio.com/ai/" target="_blank" rel="noopener noreferrer">The RStudio AI blog: </a> The RStudio blog that discusses machine learning applications with both R and Python.</li></ul></description></item><item><title>3 Fun Shiny Apps for Your Long Labor Day Weekend</title><link>https://www.rstudio.com/blog/3-fun-shiny-apps-for-your-long-labor-day-weekend/</link><pubDate>Fri, 04 Sep 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/3-fun-shiny-apps-for-your-long-labor-day-weekend/</guid><description><p>Photo by <a href="https://unsplash.com/@vincent_keiman_nl?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Vincent Keiman</a> on <a href="https://unsplash.com/s/photos/barbeque-horizontal?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Unsplash</a></p><p>Many of us in the United States have a long Labor Day weekend coming up, which, contrary to its name, we usually celebrate by not doing labor. That fact leaves us facing a conundrum on the Friday before the weekend: we don&rsquo;t want to start a new project that might require a lot of research or thinking, yet we really feel like we should do something before we knock off for the week.</p><p>In this spirit, I&rsquo;ve selected three submissions from the recent Second Annual Shiny Contest whose content is <em>NOT</em> directly related to data science or work. I feature them only to illustrate their creativity and, possibly, to suggest light-hearted ways to enjoy your time off. 
If you&rsquo;d like to make your own choices, please check out all the entries in the <a href="https://rpubs.com/minebocek/shiny-contest-2020-submissions" target="_blank" rel="noopener noreferrer">full listing of the 2nd Annual RStudio Shiny contest submissions</a>.</p><p>So without further ado, here are our three recommendations for ways to relax with Shiny on your Labor Day weekend. If you want to try out the apps for yourself, just click on the images and a live version of the app will open in a new window.</p><h2 id="learn-to-paint-with-bob-ross">Learn To Paint With Bob Ross</h2><a href="https://karamanis.shinyapps.io/bob_ross/" target="_blank" rel="noopener noreferrer"><img align="center" style="padding: 35px;" src="bob-ross-app.jpg"></a><p>With all the ennui of our current business and political worlds, perhaps you need a soothing and calm voice to help you learn to paint. I&rsquo;m referring, of course, to Bob Ross&rsquo; <em>Joy of Painting</em> videos that appeared on the Public Broadcasting System from 1983 to 1994 and are <a href="https://www.youtube.com/user/BobRossInc" target="_blank" rel="noopener noreferrer">now available on YouTube</a>.</p><p>The Shiny app submitted by Georgios Karamanis is minimalist in appearance, but that&rsquo;s part of its charm. It reads in a database of the Bob Ross episodes created by Walt Hickey of FiveThirtyEight.com and combines the painting elements being taught into a simple, hand-drawn painting. This synthesis allows the user to visually browse through the paintings created in each episode and decide which painting techniques they wish to learn. 
You then take the season and episode number from the app and it&rsquo;s off to YouTube you go to be soothed, reassured, and taught that all you really need to start painting is a nice wet wash of platinum white.</p><p>You can read more about this app on <a href="https://community.rstudio.com/t/bob-ross-painting-by-the-elements-2020-shiny-contest-submission/56922" target="_blank" rel="noopener noreferrer">its submission page at community.rstudio.com</a>, and you can learn about Bob Ross from <a href="http://bobross.com" target="_blank" rel="noopener noreferrer">his web site</a> and <a href="https://en.wikipedia.org/wiki/Bob_Ross" target="_blank" rel="noopener noreferrer">wikipedia page</a>.</p><h2 id="play-some-hangman">Play Some Hangman</h2><a href="https://smirnovayu.shinyapps.io/hangman_en/" target="_blank" rel="noopener noreferrer"><img align="center" style="padding: 35px;" src="hangman.jpg"></a><p>I think pretty much everyone has played Hangman at one time or another. The goal is pretty simple: guess the letters that make up the target word in as few guesses as possible. Each wrong guess adds another element to the hangman figure. If you guess wrong 10 times, you&rsquo;re hanged.</p><p>Ten tries is quite a few, so I&rsquo;m sure most readers of this blog will find themselves winning most games. 
Should you find yourself bored with so much winning in English, try the unadvertised <a href="https://smirnovayu.shinyapps.io/hangman_ru/" target="_blank" rel="noopener noreferrer">Russian version</a>.</p><p>You can learn more about the author and the app in its <a href="https://community.rstudio.com/t/hangman-2020-shiny-contest-submission/54937" target="_blank" rel="noopener noreferrer">submission description</a>.</p><h2 id="adopt-a-cat">Adopt a Cat</h2><a href="https://nsilbiger.shinyapps.io/AdoptDontShop/" target="_blank" rel="noopener noreferrer"><img align="center" style="padding: 35px;" src="adopt-dont-shop.jpg"></a><p>You know that the internet is made of cats, right? Maybe it&rsquo;s time to put away your computer and adopt one. If you live in the Los Angeles, California area, Nyssa Silbiger and Margaret Siple have you covered: you can browse the collection of the Los Angeles Kitten Rescue cats available for adoption with the Adopt Don&rsquo;t Shop app.</p><p>If you&rsquo;re in a data sciency mood, you can examine the distribution of species and names in their appropriate tabs, but most people will want to go straight to the <em>Kitten Tinder</em> page and browse the kitties. Best of all, should you fall in love with one (and you probably will), hitting the &ldquo;I want to adopt!&rdquo; button will take you straight to that cat&rsquo;s adoption page.</p><p>Check out the <a href="https://community.rstudio.com/t/adopt-dont-shop-2020-shiny-contest-submission/59166" target="_blank" rel="noopener noreferrer">submission entry</a> for details on the app and its code but don&rsquo;t feel you have to. Browsing adoptable cats is a perfectly acceptable Labor Day activity.</p><h2 id="cant-decide-practice-making-decisions">Can&rsquo;t Decide? 
Practice Making Decisions</h2><a href="https://sparktuga.shinyapps.io/ShinyDecisions/" target="_blank" rel="noopener noreferrer"><img align="center" style="padding: 35px;" src="shiny-decisions.png"></a><p>I know I said I&rsquo;d only highlight 3 apps, but I&rsquo;m adding a 4th because it&rsquo;s one of those apps that captivates users. I&rsquo;m speaking of one of the winners of the Shiny Contest, <a href="https://sparktuga.shinyapps.io/ShinyDecisions/" target="_blank" rel="noopener noreferrer">Shiny Decisions</a>. This app is about making the best decisions in bad situations while you try to save the world. And while I&rsquo;m sure everyone remembers the <em>WarGames</em> computer Joshua claiming &ldquo;The only winning move is not to play,&rdquo; you&rsquo;ll want to play this one anyway. Read the <a href="https://community.rstudio.com/t/shiny-decisions-card-swiping-game-2020-shiny-contest-submission/58723" target="_blank" rel="noopener noreferrer">submission entry</a> for more details or just click through on the image above to play the game immediately.</p><h3 id="for-more-information">For More Information</h3><p>We thank all of the 183 developers who submitted the 220 apps in the 2nd Annual Shiny Contest for their hard work. We&rsquo;ll be highlighting more topical selections from the Second Annual Shiny Contest applications during September 2020. 
In the meantime, you can <a href="https://blog.rstudio.com/2020/07/13/winners-of-the-2nd-shiny-contest/" target="_blank" rel="noopener noreferrer">read about the featured winners of that contest</a> in our July blog post.</p></description></item><item><title>3 Ways to Expand Your Data Science Compute Resources</title><link>https://www.rstudio.com/blog/expand-your-data-science-resources/</link><pubDate>Thu, 27 Aug 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/expand-your-data-science-resources/</guid><description><div style="padding: 35px 0 0 0;"><img src="launch.jpeg" style="display:none;margin-left:auto;margin-right:auto;"></div><p style="text-align: right !important;margin-top: 0px;margin-bottom: 30px;"><i>Photo by <a style="color: #000000;" href="https://unsplash.com/@uncle_rickie">Richard Gatley</a> on <a style="color: #000000;" href="https://unsplash.com/photos/La5V_Qr6h3A">Unsplash</a></i></p><p>Data science leaders have embraced the work-from-home era created by COVID-19. Most data science teams have continued their work using either their company laptops or server-based IDEs such as RStudio Server. 
However, these home workers often run into the limitations of their laptops when they:</p><ul><li><strong>Run long-lived programs:</strong> Machine learning models and simulations frequently run for hours or days on laptops.</li><li><strong>Demand lots of memory:</strong> Training models, parameter tuning, or working with complex datasets (such as genomic data) often require more RAM than even the most tricked-out laptop has.</li><li><strong>Need specialized architectures:</strong> Some machine learning libraries perform best on GPUs or with optimized system architectures that are not available on most laptops.</li></ul><h2 id="embrace-server-based-data-science-development">Embrace Server-Based Data Science Development</h2><p>The key to freeing data scientists from laptop limitations is to embrace server-based development, as we noted in a prior post, <a href="https://blog.rstudio.com/2020/05/12/equipping-wfh-data-science-teams/" target="_blank" rel="noopener noreferrer">Equipping Work From Home Data Science Teams</a>. Providing data scientists with access to a server-based IDE like RStudio Server can give them more processors, cores, memory, and architecture options than would be available on their laptops. Additionally, with <a href="https://rstudio.com/products/rstudio-server-pro/" target="_blank" rel="noopener noreferrer">RStudio Server Pro</a>, data scientists can go even further by launching interactive or batch sessions on SLURM and Kubernetes clusters.</p><p><img src="background-jobs.png" alt="Figure 1: 3 ways RStudio Server allows data scientists to use server resources for their jobs." title="Diagram showing 3 different types of background jobs"></p><p>As shown in Figure 1, RStudio offers three ways for data scientists to take advantage of centralized resources and escape the limitations of their laptops:</p><ul><li><strong>Local background jobs:</strong> In any version of RStudio, data scientists can run an R script in the background. 
This is especially helpful in RStudio Server, where the task has access to more resources, and you don&rsquo;t have to worry about shutting off the laptop or a Windows update interrupting the process.</li></ul><p><img src="background-menu.png" alt="Menu options allow you to run jobs in the background or using Launcher." title="Background job menu"></p><ul><li><strong>Interactive Launcher sessions on RStudio Server Pro:</strong> RStudio Server Pro adds the ability for a data scientist to start an interactive session on a Kubernetes or SLURM cluster, giving them the full power of RStudio, but with code executing in these unique and powerful environments. These interactive sessions are useful for exploratory data analysis and debugging.</li><li><strong>RStudio Server Pro Launcher jobs:</strong> Finally, data scientists can execute ad-hoc, long-running scripts and programs on clusters using Launcher and let them run without any further console interaction. This approach can be particularly useful for model training, ETL jobs, and other workloads that may run for hours or days. 
Running these workloads in a batch-oriented mode allows the data scientist to work on other projects without being blocked waiting for results to arrive.</li></ul><div style="overflow-x:auto;"><table><tr><th scope="col"></th><th scope="col">RStudio Server</th><th scope="col">Interactive Launcher Sessions on RStudio Server Pro</th><th scope="col">Launcher Jobs on RStudio Server Pro</th></tr><tr><th scope="row">Typical RAM</th><td>Tens to hundreds of gigabytes</td><td>Multiple terabytes</td><td>Multiple terabytes</td></tr><tr><th scope="row">Typical Processor Cores</th><td>Tens</td><td>Hundreds to thousands</td><td>Hundreds to thousands</td></tr><tr><th scope="row">Typical Jobs</th><td>Routine analyses</td><td>Interactive tasks requiring large compute, GPUs, or RAM, such as exploratory data analysis</td><td>Batch tasks like parameter tuning, ETL, or model training and scoring</td></tr><tr><th scope="row">Setup required</th><td>RStudio Server install</td><td>RStudio Server Pro + Cluster add-in</td><td>RStudio Server Pro + Cluster add-in</td></tr><tr><th scope="row">Limitations</th><td>Server resources</td><td>Best for interactive work, not parallel tasks</td><td>Jobs kicked off manually, limited job feedback</td></tr></table></div><p>Figure 3: Three Ways to Expand Data Science Computational Resources Using RStudio Server Pro and Launcher.</p><h2 id="central-servers-improve-data-scientist-productivity">Central Servers Improve Data Scientist Productivity</h2><p>Data scientists benefit from using RStudio Server and RStudio Server Pro for their analysis because these platforms:</p><ul><li><strong>Unblock the data scientist from waiting for long-lived jobs:</strong> Instead of going out for a cup of coffee while waiting to fit their model to a large training set, data scientists can run the model fitting in the background and work on other code while waiting for it to complete.</li><li><strong>Free the data scientist from having to shoehorn their analysis onto a small platform:</strong> Laptop memory and 
processor limitations often force data scientists to sample their data or recode their models to run in a smaller footprint. By providing access to servers that have many times the resources of their laptops, data scientists can use their full data sets to fit complete models.</li><li><strong>Allow data scientists more flexibility and make IT happy:</strong> Data scientists are able to use more flexible resources and server architectures such as access to GPUs. Server-based development is also a great benefit for IT professionals who are able to see expanded use of the platforms they&rsquo;ve built and reduced costs through elastic compute.</li></ul><h2 id="for-more-information-about-background-and-cluster-jobs">For More Information About Background and Cluster Jobs</h2><p>To learn more about the new Launcher capabilities built into RStudio:</p><ul><li><a href="https://rstudio.com/resources/rstudioconf-2019/rstudio-job-launcher-changing-where-we-run-r-stuff/" target="_blank" rel="noopener noreferrer">Darby Hadley&rsquo;s rstudio::conf video</a> where he demonstrates using both background and cluster jobs using Launcher from his IDE.</li><li><a href="https://solutions.rstudio.com/examples/jobs-overview/" target="_blank" rel="noopener noreferrer">A text overview of the launching system</a> showing example workloads and how they are launched from the IDE.</li><li><a href="https://blog.rstudio.com/2019/03/14/rstudio-1-2-jobs/" target="_blank" rel="noopener noreferrer">The RStudio Server 1.2 preview blog post by Jonathan McPherson</a> where he provides details on how the launching system works.</li></ul><p>If you&rsquo;d like to try out RStudio Server Pro for your team, you can learn how to download an evaluation copy from the <a href="https://rstudio.com/products/rstudio-server-pro/" target="_blank" rel="noopener noreferrer">RStudio Server Pro</a> product page.</p></description></item><item><title>Why Package & Environment Management is Critical for Serious Data 
Science</title><link>https://www.rstudio.com/blog/why-package-environment-management-is-critical-for-serious-data-science/</link><pubDate>Thu, 20 Aug 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/why-package-environment-management-is-critical-for-serious-data-science/</guid><description><sup><p style="text-align: right !important;margin-top: 0px;margin-bottom: 30px;"><i>Photo by <a style="color: #000000;" href="https://unsplash.com/@markusspiske">Markus Spiske</a> on <a style="color: #000000;" href="https://unsplash.com/photos/RWTUrJf7I5w">Unsplash</a></i></p></sup><div class="lt-gray-box"><p><em>This is a guest post from RStudio&rsquo;s partner, <a href="https://www.procogia.com/" target="_blank" rel="noopener noreferrer">ProCogia</a></em></p></div><h3 id="the-rapid-advancement-of-r-presents-a-challenge-to-reproducibility">The rapid advancement of R presents a challenge to reproducibility</h3><p>Thanks to our vibrant and engaged community, R is continually evolving as successful open source software. It’s exciting to have frequent releases and refinements to our favorite tools, but this also can present challenges to maintaining the <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">integrity and reproducibility</a> of our work. When new tools and packages are released, useRs like to tinker and stay on the cutting edge, but we don’t want our experimental playground to break our important workflows. We like to collaborate, but when package versions collide, this can lead to problems ranging from error messages and frustration to silent bugs and unexpected code behavior.</p><p>For other stakeholders in the wider organization, these frequent updates present related challenges. 
Data science leaders may struggle with how to make sure their team has access to the latest methods, while still consistently delivering reproducible results to the rest of the organization. IT and DevOps teams may feel inundated with requests to constantly update, validate, and maintain production systems delivering data science applications.</p><h3 id="the-renv-package-helps-create-reproducible-project-environments">The <code>renv</code> package helps create reproducible project environments</h3><p>To address these sorts of challenges, users of other programming languages turn to virtual environments and project management tools, but analogous best practices have not seen widespread adoption within the R community. Enter <code>renv</code>, a new package for reproducible environments in R that:</p><ul><li>Is simple to use in new or existing projects, and</li><li>Doesn’t interrupt existing workflows</li></ul><p>I recently co-hosted a <a href="https://garciamikep.github.io/useR-webinar/#1" target="_blank" rel="noopener noreferrer">webinar</a> on upgrading to R 4.0 and package management with <code>renv</code>. In preparation, my co-host and I worked on the same set of RMarkdown-based <code>xaringan</code> slides, and shared our code on GitHub. Ironically, we hadn’t checked to make sure we were using the same version of R, nor did we use any package management tool to ensure consistent package versions. Surely we didn’t need any fancy tools for such a simple set of slides? Wrong! The night before our presentation, I compiled the slides and discovered the formatting was completely mangled. The next morning we decided to practice what we were about to preach, incorporated <code>renv</code> into the project, and switched to using R 4.0. Presto, the slides compiled perfectly.</p><p>This formatting issue was easy to detect, and although the mangled slides were not exactly professional looking, it was a relatively harmless bug. 
Not all bugs are. An environment management tool such as <code>renv</code> is essential to keeping exploratory and side projects isolated from sensitive or business-critical work, and to ensuring reproducibility and accuracy.</p><p>Incorporating <code>renv</code> into either a new or existing project is straightforward:</p><ol><li>Initialize the project environment with a single function call, <code>renv::init()</code>. <code>renv</code> will automatically detect your package dependencies, or you can choose to start with a blank slate.</li><li>Continue your workflow as normal, occasionally taking a snapshot with <code>renv::snapshot()</code> to update the project environment to reflect any packages that have been added or removed.</li><li>If something goes wrong, you can revert to an earlier state of the project with a single call to <code>renv::restore()</code>.</li></ol><h3 id="advantages-of-using-the-renv-package">Advantages of using the <code>renv</code> package</h3><p>The <code>renv</code> package is compatible with almost anywhere your team gets packages (CRAN, <a href="https://rstudio.com/products/package-manager/" target="_blank" rel="noopener noreferrer">RStudio Package Manager</a>, the <a href="https://blog.rstudio.com/2020/07/01/announcing-public-package-manager/" target="_blank" rel="noopener noreferrer">recently introduced RStudio Public Package Manager</a>, GitHub, Bioconductor, GitLab, Bitbucket, custom local packages…). For teams coming from Python, the workflow will feel familiar, and <code>renv</code> also integrates with <code>pipenv</code> and <code>reticulate</code> for multilingual projects.</p><p>Ultimately, why would I recommend <code>renv</code> over other options?</p><ul><li>Disk space. 
<code>renv</code> doesn’t re-install the same version of a package if it is already installed for another project.</li><li><code>renv</code> improves upon deficiencies in Packrat, an earlier package manager for R.</li><li><code>renv</code> is highly compatible with various ways to source and manage your packages.</li></ul><hr><p><strong>About ProCogia:</strong></p><p><a href="https://www.procogia.com/" target="_blank" rel="noopener noreferrer"><img src="./procogia-logo.png" alt="ProCogia logo" align="left"></a>An RStudio <a href="https://rstudio.com/certified-partners/" target="_blank" rel="noopener noreferrer">Full Service Partner</a>, <a href="https://procogia.com/" target="_blank" rel="noopener noreferrer">ProCogia</a> is based out of Seattle, Washington. Our consulting capability extends to building, deploying, and supporting scalable data science solutions for our clients. We are passionate about developing data-driven solutions that provide highly informed answers to your most critical questions.</p></description></item><item><title>R &amp; RStudio - The Interoperability Environment for Data Analytics</title><link>https://www.rstudio.com/blog/r-and-rstudio-the-interoperability-environment-for-data-analytics/</link><pubDate>Mon, 17 Aug 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-and-rstudio-the-interoperability-environment-for-data-analytics/</guid><description><p>On the RStudio Developer Blog we’ve recently written a series on <a href="https://blog.rstudio.com/2020/07/07/interoperability-july/" target="_blank">interoperability and R</a>, including why <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/" target="_blank">enterprises should embrace workflows that are open to diverse toolsets</a>.</p><p>The designers of R, from its very beginnings, have dealt directly with how best to tap into other tools. Statisticians, analysts, and data scientists have long been challenged to bring together all the 
statistical methods and technologies required to perform the analysis the situation calls for, and this challenge has grown as more tools, libraries, and frameworks become available.</p><p>John Chambers, writing on the design philosophy behind the S programming language (the predecessor to R), put it this way:</p><blockquote>“[W]e wanted to be able to begin in an interactive environment, where they did not consciously think of themselves as programming. Then as their needs become clearer and their sophistication increased, they should be able to slide gradually into programming, when the language and system aspects would become more important.”</blockquote><p>Part of this design philosophy is to minimize the amount of effort and overhead required to get your analytics work done. Not every data scientist is programming all day or coming from a computer science background, but they still need to use some of the most sophisticated tools programmers rely on.</p><p>The ecosystem around R has striven to strike the right balance between a domain-specific environment optimized for data science workflows and output, and a general programming environment. For example, CRAN, Bioconductor, rOpenSci, and GitHub provide collections of packages written with data science in mind, which extend core R’s functionality, letting you tap into (and share) statistical methods and field-specific tools when and only when you need them.</p><p>Many of the most popular packages offer interfaces to tools in other languages. For example, most tidyverse packages include compiled (C/C++) code. Interestingly, core R itself connects you to tooling mostly written in other programming languages. 
As of R 4.0.2, over 75% of the lines in core R’s codebase are written in C or Fortran (C 43%, Fortran 33%, &amp; R 23.9%).</p><h2 id="rstudio---design-philosophy-and-development-priorities">RStudio - design philosophy and development priorities</h2><p>Our <a href="https://rstudio.com/about/" target="_blank">mission</a> at RStudio is to create free and open source software for data science, scientific research, and technical communication. R is a wonderful environment for data analysis, and we’ve focused on making it easier to use. We do this through our IDE and <a href="https://rstudio.com/about/pbc-report/" target="_blank">open source packages</a>, such as the tidyverse. We also do this by making data science easier to learn through <a href="https://rstudio.com/products/cloud/" target="_blank">RStudio Cloud</a> and our support for <a href="https://education.rstudio.com/" target="_blank">data science education</a>. And we help make R easier to manage and scale out across an organization through <a href="https://rstudio.com/products/team/" target="_blank">our professional products</a>, supporting best practices for data science in the enterprise through our <a href="https://solutions.rstudio.com/" target="_blank">solutions team</a>.</p><p>As part of this effort, we have focused heavily on enabling and supporting interoperability between R and other tools. We outlined in a <a href="https://blog.rstudio.com/2020/07/07/interoperability-july/" target="_blank">recent blog post</a> how the RStudio IDE allows you to embed many different languages in RMarkdown documents, including:</p><ul><li><strong>Using R &amp; Python together</strong> through the <code>reticulate</code> package</li><li><strong>SQL code</strong> for accessing databases,</li><li><strong>Bash code</strong> for shell scripts,</li><li><strong>C and C++ code</strong> using the <code>Rcpp</code> package,</li><li><strong>Stan code</strong> with <code>rstan</code> for Bayesian 
modeling,</li><li><strong>JavaScript</strong> for web programming,</li><li><strong>and many more languages</strong>. You can find a complete list of the many platforms supported in the language engines chapter of the book, <a href="https://bookdown.org/yihui/rmarkdown/language-engines.html" target="_blank">R Markdown: The Definitive Guide</a>.</li></ul><p>And we work with the community to support:</p><ul><li>Bilingual data science teams, by providing a single platform for data scientists to develop in R or Python (<a href="https://rstudio.com/products/rstudio-server-pro/" target="_blank">RStudio Server Pro</a>) and to deploy applications built with either (through <a href="https://rstudio.com/products/connect/" target="_blank">RStudio Connect</a>)</li><li>Making it easy to create web applications with Shiny or to put models into production via plumber APIs</li><li>Supporting easy access to data sources, with packages such as <code>odbc</code>, <code>DBI</code>, and <code>dbplyr</code> for <a href="https://db.rstudio.com/" target="_blank">database access and wrangling</a>.</li><li>Incubating <a href="https://ursalabs.org/" target="_blank">Ursa Labs</a>, which is focused on building the next generation of cross-language tools, leveraging the Apache Arrow project.</li><li>Integrating R with other modeling frameworks, including <a href="https://tensorflow.rstudio.com/" target="_blank">TensorFlow</a> and <a href="https://spark.rstudio.com/mlib/" target="_blank">Spark MLlib</a></li><li>Using <a href="https://spark.rstudio.com/" target="_blank">sparklyr</a> and <a href="https://docs.rstudio.com/rsp/integration/launcher-kubernetes/" target="_blank">Launcher with Kubernetes</a> to distribute your calculations or modeling operations over many machines, <em>which we will discuss in more depth in an upcoming blog post</em>.</li></ul><p>This list goes on and on and grows by the week.</p><p>R with RStudio is a wonderful environment for anyone who seeks understanding through the 
analysis of data. It does this by finding a balance between a domain-specific environment and a general programming language, without losing its focus on data scientists. That is, it strives to be an environment optimized for analytics workflows and output. At the fulcrum of this balance is extensive interoperability: the ability to pull in interfaces to other technologies as they are needed, and a vibrant community sustaining them. This has been the goal for R since its initial design, through the extensive work shared by the R community, and through significant continued investment by RStudio.</p></description></item><item><title>How to Deliver Maximum Value Using R &amp; Python</title><link>https://www.rstudio.com/blog/how-to-deliver-maximum-value-using-r-python/</link><pubDate>Thu, 13 Aug 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/how-to-deliver-maximum-value-using-r-python/</guid><description><sup><p style="text-align: right !important;margin-top: 0px;margin-bottom: 30px;"><i>Photo by <a style="color: #000000;" href="https://unsplash.com/@vladhilitanu">Vlad Hilitanu</a> on <a style="color: #000000;" href="https://unsplash.com/photos/1FI2QAYPa-Y">Unsplash</a></i></p></sup><div class="lt-gray-box"><p><em>This is a guest post from RStudio&rsquo;s partner, <a href="https://www.landeranalytics.com/" target="_blank" rel="noopener noreferrer">Lander Analytics</a></em></p></div><p>R and Python are two of the more prominent data science languages. These languages didn&rsquo;t become popular by accident; they grew by making their tools easier and more productive. Python&rsquo;s Pandas allowed Python to create heterogeneous <a href="https://twitter.com/chendaniely/status/1279142678656155654" target="_blank" rel="noopener noreferrer">panel data</a> inspired by R&rsquo;s <code>data.frame</code> object. 
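To make that inheritance concrete, here is the kind of heterogeneous table both ecosystems are built around, sketched in base R (the column and object names are purely illustrative):

```r
# A data.frame mixes column types in a single table -- the structure
# that inspired Pandas' DataFrame.
readings <- data.frame(
  sensor = c("a", "b", "c"),     # character column
  value  = c(1.2, 3.4, 5.6),     # numeric column
  passed = c(TRUE, FALSE, TRUE)  # logical column
)
str(readings)  # one type per column, much like df.dtypes in pandas
```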
R&rsquo;s <em>caret</em> and <em>tidymodels</em> libraries unified the machine learning API just like Python&rsquo;s <em>scikit-learn</em>.</p><p>Look under the hood of most applications and libraries and you&rsquo;ll eventually find a different language from the one you&rsquo;re using. This isn&rsquo;t a new concept; each language has its own strengths. Generally, though, the more interoperable languages are, the easier it is for an end user to pick the tool best suited for their work.</p><p>This is why language wars never made sense and many core developers never participate in the &ldquo;mine is better&rdquo; debate. Groups such as <a href="https://ursalabs.org/" target="_blank" rel="noopener noreferrer">Ursa Labs</a>, using projects such as <a href="https://arrow.apache.org/" target="_blank" rel="noopener noreferrer">Apache Arrow</a>, are even trying to expand interoperability across more languages.</p><p>As the RStudio team discussed in a <a href="https://blog.rstudio.com/2020/07/15/interoperability-maximize-analytic-investments/" target="_blank" rel="noopener noreferrer">recent blog post</a>, the more interoperable these languages become, the easier it is for us as data scientists to pick the tool we know for the task at hand. The next question for newcomers naturally becomes: which language should I learn?</p><p>I&rsquo;ve talked about this in the <a href="https://chendaniely.github.io/2019/08/28/r-or-python-which-one-to-learn-first/" target="_blank" rel="noopener noreferrer">past</a>. Essentially, it does not matter. The language used in the place you want to work, friends who know a language and can help you, or even the first online tutorial that resonated with you are all subjective ways to help you pick your &ldquo;first&rdquo; language. The data skills around manipulating data into tidy format are transferable across languages. 
Learning how to think sequentially and breaking down problems are all skills you learn by doing; you will rarely meet another data scientist or programmer who doesn&rsquo;t know how to at least &ldquo;read&rdquo; another language.</p><p>In general, I recommend choosing your language based on its support of:</p><ul><li>Tidy data principles</li><li>Interactive interfaces</li><li>Powerful libraries</li><li>Interoperability with everything else you use</li></ul><p>From a data science perspective, I recommend using <a href="https://r4ds.had.co.nz/tidy-data.html" target="_blank" rel="noopener noreferrer">tidy data principles</a> as the central point to learn a new language. Over time, you&rsquo;ll certainly find something &ldquo;on the other side&rdquo; that you will want to try out. For example, I love how I can <a href="https://speakerdeck.com/chendaniely/using-python-with-r" target="_blank" rel="noopener noreferrer">program microcontrollers with Python using MicroPython and CircuitPython</a>, and how easy the R ecosystem has made communicating results and findings.</p><p>This means I can work on a Python analysis (or Python-using team) and easily deploy Shiny applications around it using <em>reticulate</em>. Conversely, we could convert R output into a plumber REST API for our Python Django, Flask, or Pyramid application, or even directly run R using <em>rpy2</em>. These interfacing layers allow maintainers to only maintain a wrapper and not a full re-implementation of a library. As an R user, this also means R libraries can be created around Python libraries so the R community does not need to re-implement Python tools. The R Keras package is a great example of taking a fully maintained Python package and wrapping it for R users using <em>reticulate</em>.
With tools like <em>reticulate</em>, you have a simple way to call Python natively within R.</p><p>The most common Python data types are also seamlessly accessible as R objects, which means you can incorporate Python into all of the R publication and communication tools like Shiny and RMarkdown. The converse is also true from the Python perspective. R code can be run within Python using <em>rpy2</em>, which means popular Python web frameworks can also benefit from R pipelines. If the language itself does not matter, why not learn both and leverage the best of both worlds simultaneously?</p><p>Data science teams should be multilingual, and the need for dual Python and R training is evident in data science programs (e.g., <a href="https://ubc-mds.github.io/2020-02-03-teach-python-and-r/" target="_blank" rel="noopener noreferrer">University of British Columbia</a>) that aim to teach both simultaneously. This isn&rsquo;t without its challenges, but it acknowledges and addresses the need for eventually knowing both Python and R. New data science tools inspired by another language benefit us all as data scientists. The ease of interoperability gives the user the flexibility to fill in any tool gaps for their own needs.
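To make the rpy2 direction concrete, here is a minimal sketch of calling an R function from Python. It assumes rpy2 and an R runtime are installed, and falls back to a pure-Python computation when they are not (the fallback is our illustrative addition, not part of rpy2):

```python
# Minimal sketch: call R's mean() from Python via rpy2, assuming rpy2 and
# an R runtime are available; otherwise fall back to plain Python.
values = [1.0, 2.0, 3.0]
try:
    import rpy2.robjects as robjects
    r_mean = robjects.r["mean"]          # look up the R function by name
    result = float(r_mean(robjects.FloatVector(values))[0])
except ImportError:
    result = sum(values) / len(values)   # pure-Python stand-in
# Either path yields 2.0 for this input
```

The same pattern underlies the reticulate direction: one language looks up a function defined in the other and receives the result back as a native object.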
Instead of asking &ldquo;what language should I use?&rdquo;, you now think about the whole team and consider which programming interface best resonates with the users, or which infrastructure stack the SysAdmins feel most comfortable deploying and maintaining.</p><hr><p><strong>About Lander Analytics:</strong></p><p><a href="https://www.landeranalytics.com/" target="_blank" rel="noopener noreferrer"><img src="./lander-logo.jpg" alt="Lander Analytics logo" align="left"></a>An RStudio <a href="https://rstudio.com/certified-partners/" target="_blank" rel="noopener noreferrer">Full Service Partner</a>, Lander Analytics is a New York-based data science firm, whose staff specializes in statistical consulting and infrastructure, running the full gamut of RStudio product assistance from procurement, implementation and installation to ongoing maintenance and support. They also provide open source training services for R, Python, Stan, Deep Learning, SQL and numerous other languages.</p></description></item><item><title>rstudio::global() talk deadline extension</title><link>https://www.rstudio.com/blog/rstudio-global-talk-deadline-extension/</link><pubDate>Thu, 13 Aug 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-global-talk-deadline-extension/</guid><description><p>We&rsquo;ve received requests from a number of you to <a href="https://blog.rstudio.com/2020/07/17/rstudio-global-call-for-talks/">submit talks for rstudio::global()</a> after the deadline (tomorrow, August 14). We know things are particularly tough at the moment, so we&rsquo;re extending the deadline by two weeks for everyone; <strong>the new submission deadline is August 28</strong> at 11:59PM PDT.
We&rsquo;ll aim to get decisions back by late September.</p><p><a href="https://forms.gle/5mzcMKd75xaDf2zi9"><strong>APPLY NOW!</strong></a></p></description></item><item><title>Do, Share, Teach, and Learn Data Science with RStudio Cloud</title><link>https://www.rstudio.com/blog/rstudio-cloud-announcement/</link><pubDate>Wed, 05 Aug 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-cloud-announcement/</guid><description><p>RStudio is proud to announce the general availability of <a href="https://rstudio.cloud/" target="_blank" rel="noopener noreferrer">RStudio Cloud</a>, its cloud-based platform for doing, teaching, and learning data science using only a browser. This general release incorporates feedback from thousands of users, based on more than 3.5 million hours of compute time.</p><h2 id="what-is-rstudio-cloud">What is RStudio Cloud?</h2><p>RStudio Cloud is a lightweight, cloud-based solution that allows anyone to do, share, teach, and learn data science online. RStudio Cloud makes it easy to:</p><ul><li><strong>Analyze your data</strong> using the RStudio IDE, directly from your browser.</li><li><strong>Share projects</strong> with your team, class, workshop, or the world.</li><li><strong>Teach data science</strong> with R to your students or colleagues.</li><li><strong>Learn data science</strong> in an instructor-led environment or with interactive tutorials.</li></ul><p>With RStudio Cloud, there&rsquo;s nothing to configure, and no dedicated hardware or installation is required. Individual users, instructors, and students only need a browser.</p><p>We will always offer a free plan for casual use, and we now offer paid premium plans for professionals, instructors, researchers, and organizations as well. Learn more about what plans are <a href="https://rstudio.cloud/plans/free" target="_blank" rel="noopener noreferrer">available here</a>.</p><p>RStudio Cloud is a great platform for both casual and professional data scientists. 
The ability to share projects makes it easy for researchers from different groups or institutions to collaborate and is useful for any data science team that wants to build and share data science work without having to maintain their own IT infrastructure. With this release, our new premium offerings also give users the option of scaling their environments up to more cores and memory if needed.</p><p>We have many exciting features planned over the coming months. This first release focuses on helping users teach and learn Data Science. These capabilities have been refined through extensive alpha/beta testing by over a thousand academic institutions and other organizations.</p><figure><img align="center" style="padding: 35px;" src="tutorial-screen.jpg"><figcaption><em>Figure 1: RStudio Cloud Primers help you learn the basics of data science via interactive tutorials. See <a href="https://rstudio.cloud/learn/primers" target="_blank" rel="noopener noreferrer">https://rstudio.cloud/learn/primers</a> for more information.</em></figcaption></figure><p>RStudio Cloud simplifies the process of teaching R, whether to students or colleagues, by letting the instructor focus on the content, not the infrastructure. Students learn directly from their web browsers, with nothing to install locally, and with no infrastructure for the instructor to maintain.</p><figure><img align="center" style="padding: 35px;" src="cloud-components.jpg"><figcaption><em>Figure 2: Create projects within your personal workspace to teach and share with others.</em></figcaption></figure><p>As shown in Figure 2, RStudio Cloud is designed from the ground up to make data science teaching easier for instructors and students, including new features such as:</p><ul><li><strong>Projects:</strong> The fundamental unit of work on RStudio Cloud, projects encapsulate R code, packages and data files and provide isolation from other analyses. 
Projects can be public or private.</li><li><strong>Spaces:</strong> Every RStudio Cloud user gets a personal workspace in which to create projects. You can also create private, shared spaces that function as virtual classrooms for courses and workshops.</li><li><strong>Members:</strong> Users who can access a space. Members can be assigned different roles, giving them capabilities appropriate for instructors, TAs and students.</li><li><strong>Assignments:</strong> When teaching a class or workshop, projects can be made into assignments. Students can make copies of projects created by the instructor, with the necessary environment automatically replicated. Instructors can peek into student projects and check their progress.</li></ul><p>To learn more or sign up for free, visit <a href="https://rstudio.cloud/" target="_blank" rel="noopener noreferrer">RStudio Cloud</a> or check out our recent webinar on <a href="https://rstudio.com/resources/webinars/teaching-r-online-with-rstudio-cloud/" target="_blank" rel="noopener noreferrer">Teaching R Online with RStudio Cloud</a>.</p></description></item><item><title>RStudio Adds New R Features in Qubole's Open Data Lake</title><link>https://www.rstudio.com/blog/rstudio-adds-new-r-features-in-qubole-s-open-data-lake/</link><pubDate>Mon, 03 Aug 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-adds-new-r-features-in-qubole-s-open-data-lake/</guid><description><sup><p class="text-right">Launch RStudio Server Pro from inside the Qubole platform</p></sup><p>We are excited to team up with Qubole to offer data science teams the ability to <a href="https://spark.rstudio.com/examples/qubole-cluster/" target="_blank" rel="noopener noreferrer">use RStudio Server Pro from directly within the Qubole Open Data Lake Platform</a>. Qubole is an open, simple, and secure data lake platform for machine learning, streaming and ad-hoc analytics. 
RStudio and Qubole customers now have access to RStudio’s out-of-the-box features and Qubole’s unique managed services that supercharge data science and data exploration workflows for R users, while optimizing costs for R-based projects. Within the Qubole platform, data scientists are able to easily access and analyze large datasets using the RStudio IDE, securely within their enterprise running in their public cloud environment of choice (AWS, Azure, or Google).</p><p>With massive amounts of data becoming more accessible, data scientists increasingly need more computational power. Cluster frameworks such as Apache Spark, and their integration with R using the SparkR and sparklyr libraries, help these users quickly make sense of their big data and derive actionable insights for their businesses. However, high CPU costs, long setup times, and complex management processes often prevent data scientists from taking advantage of these powerful frameworks.</p><p>Now that Qubole has added RStudio Server Pro to its offering, it provides its users:</p><ul><li><strong>Single click access to Spark clusters</strong>.
With Qubole’s authentication mechanisms, no additional sign-in is required.</li><li><strong>Automatic persistence</strong> of users’ files and data sets when clusters are restarted.</li><li><strong>Pre-installed packages</strong> such as sparklyr, tidyverse, and other popular R packages.</li><li><strong>Cluster Package Manager</strong> allows users to define cluster-wide R &amp; Python dependencies for Spark applications.</li><li><strong>Performance optimizations</strong> such as Qubole’s optimized Spark distribution, which allows the cluster to automatically scale up when the sparklyr application needs more resources and downscale when cluster resources are not in use.</li><li><strong>Spark UI, Logs, and Resource Manager links</strong> available in the RStudio Connections pane for seamlessly managing applications.</li></ul><div style="text-align: center;"><iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/vfmdaIwxbMw" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></div><p>Enterprise users benefit from this new integration because this new upgraded platform:</p><ul><li><strong>Limits CPU expenses to what users need.</strong> The Qubole cluster automatically scales up when the sparklyr application needs more resources, and downscales when cluster resources are not in use.</li><li><strong>Allows on-demand cluster use.</strong> With single-click integration, users can seamlessly access large datasets that can persist automatically.</li><li><strong>Simplifies cluster management.</strong> Qubole’s Cluster Package Manager, with pre-installed R packages, lets users define R and Python dependencies across their clusters.</li></ul><h3 id="how-do-i-enable-this-integration">How do I enable this integration?</h3><p>If you already are a Qubole customer, and would like to enable RStudio Server Pro in your environment, please <a href="https://www.qubole.com/company/contact-us/" target="_blank" rel="noopener noreferrer">contact</a> your Qubole support team.</p><h3 id="want-to-learn-more-about-rstudio-server-pro">Want to learn more about RStudio Server Pro?</h3><p><a href="https://rstudio.com/products/rstudio-server-pro/" target="_blank" rel="noopener noreferrer">RStudio Server Pro</a> is the preferred data analysis and integrated development experience for professional R users and data science teams who use R and Python. RStudio Server Pro enables the collaboration, centralized management, metrics, security, and commercial support that professional data science teams need to operate at scale.</p><p><strong><a href="https://rstudio.com/products/rstudio-server-pro/evaluation/" target="_blank" rel="noopener noreferrer">Try a Free 45 Day Evaluation</a></strong> or <strong><a href="https://rstudio.chilipiper.com/book/rsp-demo" target="_blank" rel="noopener noreferrer">See It in Action</a></strong></p></description></item><item><title>3 Wild-Caught R and Python Applications</title><link>https://www.rstudio.com/blog/practical-interoperability/</link><pubDate>Tue, 28 Jul 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/practical-interoperability/</guid><description><p>A colleague recently asked me an intriguing question:</p><p>&ldquo;Do you know of any good examples of R and Python interoperability being used to solve real-world problems instead of just toy examples?&rdquo;</p><p>I had to admit that I hadn&rsquo;t gone looking for such examples but that I&rsquo;d see if I could find any.
So I put the following query out on Twitter:</p><blockquote><p>Does anyone have a cool mixed R and Python app (probably in R Markdown but not required) that they&rsquo;d like to share? Ideally it would be one that shows both languages to their best advantage.</p></blockquote><p>I received several responses to my plea, so I thought I&rsquo;d share them here to illustrate some of the characteristics of R interoperability &ldquo;in the wild.&rdquo; At the end, I&rsquo;ll use these examples to discuss three broad motivations for interoperability that readers may find useful.</p><p><br /></p><h3 id="example-1-wrapping-a-user-interface-around-a-simulation">Example 1: Wrapping a User Interface Around A Simulation</h3><figure><a href="https://tomicapretto.shinyapps.io/simulation-app" target="_blank" rel="noopener noreferrer"><div style="padding: 35px 0 0 0;"><img align="center" src="simulation.jpg"></div></a><figcaption>Figure 1: A Shiny App that estimates densities using Python</figcaption></figure><p>The first example I&rsquo;d like to share is a simulation built by Tomas Capretto (@CaprettoTomas), who described it as follows:</p><blockquote>I have an application built on Shiny that uses functions to estimate densities<br>written in Python.
It works both in http://shinyapps.io and locally.<br>Online app: <a href="https://tomicapretto.shinyapps.io/simulation-app" target="_blank" rel="noopener noreferrer"> https://tomicapretto.shinyapps.io/simulation-app/</a><br>More info: <a href="https://github.com/tomicapretto/density_estimation" target="_blank" rel="noopener noreferrer">https://github.com/tomicapretto/density_estimation</a><br></blockquote><p>As I&rsquo;ll discuss in the next section, this application shows a common division of labor in interoperable applications: it uses an interactive user interface based on Shiny that then calls a collection of Python routines that do the primary computation.</p><p>By the way, another of my respondents, Nikolay Dudaev (@nikdudaev), built his application to work the other way around from this first one:</p><blockquote><p>There is something I cannot share but I do all the analysis, tidying, transformations etc in R and then display the results in the app written in Python and Dash.</p></blockquote><p><br /></p><h3 id="example-2-processing-lake-catchment-data-using-a-gis">Example 2: Processing Lake Catchment Data Using a GIS</h3><figure><a href="https://fishandwhistle.net/post/2020/calling-qgis-from-r/" target="_blank" rel="noopener noreferrer"><div style="padding: 35px 0 0 0;"><img align="center" src="catchment.png"></div></a><figcaption>Figure 2: A lake watershed computed with the help of SAGA and GRASS GIS algorithms.</figcaption></figure><p>Another common interoperability use case is using another language to gain access to a unique code base.
In this case, Dewey Dunnington (@paleolimbot) needed a Geographic Information System (GIS) called QGIS, which gives him access to SAGA and GRASS GIS systems.</p><blockquote>Not sure if this totally fits the bill, but it's Python + R working together in the same blog post!<br><a href="https://fishandwhistle.net/post/2020/calling-qgis-from-r/" target="_blank" rel="noopener noreferrer">https://fishandwhistle.net/post/2020/calling-qgis-from-r/</a></blockquote><p>While Dewey notes that he could have done this using R libraries that access the SAGA and GRASS systems, he ultimately decided that it was easier just to call the Python versions that QGIS already had installed.</p><p><br /></p><h3 id="example-3-multi-omics-factor-analysis">Example 3: Multi-Omics Factor Analysis</h3><figure><a href="https://github.com/bioFAM/MOFA2" target="_blank" rel="noopener noreferrer"><div style="padding: 35px 0 0 0;"><img align="center" src="mofa.png"></div></a><figcaption>Figure 3: MOFA infers an interpretable low-dimensional representation in terms of a few latent factors.</figcaption></figure><p>OK, I admit it: I only have the vaguest idea of what this program does. Fortunately the repository pointed to by Ryan Thompson&rsquo;s (@DarwinAwdWinner) tweet provides a pretty good description, provided you know about multi-omic data sets.</p><blockquote>MOFA is written partly in Python and partly in R: <br><a href="https://github.com/bioFAM/MOFA2" target="_blank" rel="noopener noreferrer">https://github.com/bioFAM/MOFA2</a></blockquote><p>See Ryan&rsquo;s Github repo for more details. Interoperability takes place during the processing workflow as follows:</p><ol><li>The user loads the source data and trains the model using either a Python notebook or R code (which calls Python code using the <code>reticulate</code> package).
Both versions process the training data and output a model.</li><li>The model data is then processed downstream for viewing and interactive exploration using Shiny. At present, the downstream process only runs in R.</li></ol><p><br /></p><h2 id="interoperability-helps-data-scientists-avoid-reinventing-the-wheel">Interoperability Helps Data Scientists Avoid Reinventing the Wheel</h2><figure><div style="padding: 35px 0 0 0;"><img align="center" src="interoperability-hexes.jpg"></div><figcaption>Figure 4: Three motivations for writing interoperable code.</figcaption></figure><blockquote><p>&ldquo;If I have seen further, it is by standing on the shoulders of giants.&rdquo;&ndash; Isaac Newton, 1675</p></blockquote><p>When faced with a difficult challenge in their jobs, few data scientists say to themselves, &ldquo;I think I&rsquo;ll include new languages in my analysis just for fun.&rdquo; Instead, data scientists typically write interoperable code to solve problems and to build on the work of others, just as Isaac Newton said 345 years ago.</p><p>Our examples above illustrate three motivations that underlie many interoperable applications. These motivations frequently overlap and in some situations, all three might apply. Nonetheless, many interoperable applications come about because data scientists want to:</p><ul><li><strong>Accelerate results.</strong> Tomas Capretto, the author of our first example, wrapped some existing Python simulation code in a Shiny application to allow interactive exploration of the simulation results. While he could have achieved the same goal by rewriting his interactive simulation entirely in R, he instead used existing code to create a shorter path to the result he was trying to achieve. 
This approach meant his users could interact with his simulation right away instead of waiting for him to rewrite the simulator (our June blog post &ldquo;<a href="https://blog.rstudio.com/2020/06/09/is-your-data-science-team-agile/" target="_blank" rel="noopener noreferrer">Is Your Data Science Team Agile?</a>&rdquo; provides more details of why this type of agility is important).</li><li><strong>Access specialized knowledge.</strong> Dewey Dunnington needed access to SAGA and GRASS GIS systems because those geographical information systems contain algorithms that represent the gold standard for GIS work. While R and Python boast tens of thousands of libraries and modules, we shouldn&rsquo;t expect them each to contain every key algorithm for every field of research, especially as fields of study have become increasingly specialized. Interoperable code allows us to build on that prior work that&rsquo;s been done, regardless of what language it was written in.</li><li><strong>Participate in an ecosystem.</strong> Ryan Thompson&rsquo;s application demonstrates another type of mixed R and Python workflow. His front-end R and Python notebooks allow data scientists to collect data using whichever tool they are more comfortable with, and the model created by those front-end notebooks then drives an R-only downstream processing step. His MOFA model is one of many used in the open source <a href="https://bioconductor.org" target="_blank" rel="noopener noreferrer"><em>bioconductor.org</em></a> community, which includes hundreds of developers writing primarily in R.
Similar large communities have built up around Python-based machine learning packages such as <a href = "https://tensorflow.rstudio.com" target="_blank" rel="noopener noreferrer"><em>tensorflow</em></a> and <a href = "https://keras.rstudio.com" target="_blank" rel="noopener noreferrer"><em>keras</em></a>, and around cluster-based computing via the <a href = "https://spark.rstudio.com" target="_blank" rel="noopener noreferrer"><em>sparklyr</em></a> package for R. Writing interoperable code allows data scientists to both participate in and contribute to those communities, even if they aren&rsquo;t fluent in the native programming language of each group. Better yet, these cross-language projects expand the population of the communities and encourage the sustainability of their code bases for the greater world.</li></ul><p>The examples I&rsquo;ve given here are only a small subset of many interoperability efforts taking place in the R and Python communities. If you have other interoperability examples you&rsquo;d like to showcase, please send me an email at <a href="mailto:carl@rstudio.com" target="_blank" rel="noopener noreferrer">carl@rstudio.com</a> or tag me on Twitter at @cdhowe.
I&rsquo;m particularly interested in how you arrived at your interoperability approach and what benefits you gained from interoperability.</p></description></item><item><title>4 Tips to Make Your Shiny Dashboard Faster</title><link>https://www.rstudio.com/blog/4-tips-to-make-your-shiny-dashboard-faster/</link><pubDate>Tue, 21 Jul 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/4-tips-to-make-your-shiny-dashboard-faster/</guid><description><figure><img src="./shiny-comparisons.gif" alt="Fast versus slow Shiny app" /><figcaption>A slow-running Shiny application (left) and an optimized one (right)</figcaption></figure><p><em>This is a guest post from RStudio&rsquo;s partner, <a href="https://appsilon.com/" target="_blank" rel="noopener noreferrer">Appsilon Data Science</a></em></p><p>When developing Shiny applications, we at Appsilon strive to implement functionality, enhance appearance, and optimize the user&rsquo;s experience. However, we often forget about one of the most important elements of UX: the speed of the application. Nobody wants to use a slow application that takes seconds (or minutes) to load or navigate. In this article, I will share four tips and best practices that will help your Shiny applications run much faster. Those tips are:</p><ol><li>Figure out why your Shiny app is running slowly</li><li>Use faster functions</li><li>Pay attention to scoping rules for Shiny apps</li><li>Use caching operations</li></ol><p>The theme underlying these tips can be summed up by this quote:</p><blockquote><p>"The reason for Shiny's slow action [is] usually not Shiny." - Winston Chang</p></blockquote><h3 id="1-measure-where-your-shiny-app-is-spending-its-time">1. Measure Where Your Shiny App Is Spending Its Time</h3><p>With R, we can find some very useful tools for identifying which parts of our code are less efficient.
One of my favorite tools is the <em>profvis</em> package, whose output is shown below:</p><figure><img align="center" style="padding: 35px;" src="profvis.jpg"><br />A timing measurement created by the <em>profvis</em> package</figure><p>Profvis allows you to measure the execution time and memory consumption of R code. The package itself can generate a readable report that helps us identify inefficient parts of the code, and it can be used to test Shiny applications. You can see profvis in action <a href="https://rstudio.com/resources/shiny-dev-con/profiling/" target="_blank" rel="noopener noreferrer">here</a>.</p><p>If we are only interested in measuring a code fragment rather than a complete application, we may want to consider simpler tools such as the <em>tictoc</em> package, which measures the time elapsed to run a particular code fragment.</p><h3 id="2-use-faster-functions">2. Use Faster Functions</h3><p>Once you&rsquo;ve profiled your application, take a hard look at the functions consuming the most time. You may achieve significant performance gains by replacing the functions you routinely use with faster alternatives.</p><p>For example, a Shiny app might search a large vector of strings for ones starting with the characters &ldquo;an&rdquo;. Most R programmers would use a function such as <code>grepl</code> as shown below:</p><pre><code>grepl(&quot;^an&quot;, cnames)</code></pre><p>However, we don&rsquo;t need the regular expression capabilities of grepl to find strings starting with a fixed pattern. We can tell grepl not to bother with regular expressions by adding the parameter <code>fixed = TRUE</code> (and dropping the <code>^</code> anchor, since the pattern is then matched literally). Even better, though, is to use the base R function <code>startsWith</code>.
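The same trade-off exists beyond R: most languages provide a purpose-built prefix test that beats a general regular-expression scan. A small Python sketch of the comparison (the vector of names here is illustrative):

```python
import re
import timeit

# An illustrative vector of names to search for the prefix "an"
cnames = ["analysis", "banana", "answer", "plot", "anchor"] * 2000

# Regex scan anchored at the start of each string...
regex_hits = [bool(re.match("an", s)) for s in cnames]
# ...versus the purpose-built prefix test, with no regex machinery
prefix_hits = [s.startswith("an") for s in cnames]
assert regex_hits == prefix_hits  # identical answers

# Timing both shows the simpler function wins, as with startsWith in R
t_regex = timeit.timeit(lambda: [bool(re.match("an", s)) for s in cnames], number=10)
t_prefix = timeit.timeit(lambda: [s.startswith("an") for s in cnames], number=10)
```

Here `re.match` plays the role of `grepl` and `str.startswith` the role of `startsWith`; the speed gap has the same cause in both languages.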
As you can see from the benchmarks below, both options are faster than the original grepl, but the simpler startsWith function performs the search more than 30 times faster.</p><pre><code>microbenchmark::microbenchmark(
  grepl(&quot;^an&quot;, cnames),
  grepl(&quot;an&quot;, cnames, fixed = TRUE),
  startsWith(cnames, &quot;an&quot;)
)
Unit: microseconds
                              expr      min        lq       mean   median       uq      max neval
              grepl(&quot;^an&quot;, cnames) 2046.846 2057.7725 2082.44583 2067.474 2089.499 2449.035   100
 grepl(&quot;an&quot;, cnames, fixed = TRUE) 1127.246 1130.7440 1146.35229 1132.597 1136.032 1474.634   100
          startsWith(cnames, &quot;an&quot;)   62.982   63.2485   64.47847   63.548   64.155   79.528   100</code></pre><p>Similarly, consider the following expressions:</p><pre><code>sum_value &lt;- 0
for (i in 1:100) {
  sum_value &lt;- sum_value + i ^ 2
}</code></pre><p>versus</p><pre><code>sum_value &lt;- sum((1:100) ^ 2)</code></pre><p>Even a novice R programmer would likely use the second version because it takes advantage of the vectorized function <code>sum</code>.</p><p>When we create more complex functions for our Shiny apps, we should similarly look for vectorized operations to use instead of loops whenever possible. For example, the following code does a simple computation on two columns in a long data frame:</p><pre><code>frame &lt;- data.frame(
  col1 = runif(10000, 0, 2),
  col2 = rnorm(10000, 0, 2)
)
output &lt;- character(nrow(frame))
for (i in 1:nrow(frame)) {
  if (frame[i, 'col1'] + frame[i, 'col2'] &gt; 1) {
    output[i] &lt;- &quot;big&quot;
  } else {
    output[i] &lt;- &quot;small&quot;
  }
}</code></pre><p>However, an equivalent output can be obtained much faster by using <code>ifelse</code>, which is a vectorized function:</p><pre><code>output &lt;- ifelse(frame$col1 + frame$col2 &gt; 1, &quot;big&quot;, &quot;small&quot;)</code></pre><p>This vectorized version is easier to read and computes the same result about 100 times faster.</p><h3 id="3-pay-attention-to-object-scoping-rules-in-shiny-apps">3.
Pay Attention to Object Scoping Rules in Shiny Apps</h3><p>Shiny applications have three levels of object scope:</p><ol><li><strong>Global</strong>: Objects in global.R are loaded into R&rsquo;s global environment. They persist even after an app stops. This matters in a normal R session, but not when the app is deployed to Shiny Server or Connect. To learn more about how to scale Shiny applications to thousands of users on RStudio Connect, <a href="https://support.rstudio.com/hc/en-us/articles/231874748-Scaling-and-Performance-Tuning-in-RStudio-Connect" target="_blank" rel="noopener noreferrer">this recent article</a> has some excellent tips.</li><li><strong>Application-level:</strong> Objects defined in app.R outside of the <code>server</code> function are similar to global objects, except that their lifetime is the same as the app; when the app stops, they go away. These objects can be shared across all Shiny sessions served by a single R process and may serve multiple users.</li><li><strong>Session-level:</strong> Objects defined within the <code>server</code> function are accessible only to one user session.</li></ol><p>In general, the best practice is:</p><ul><li>Create objects that you wish to be shared among all users of the Shiny application in the global or app-level scopes (e.g., loading data that users will share).</li><li>Create objects that you wish to be private to each user as session-level objects (e.g., generating a user avatar or displaying session settings).</li></ul><h3 id="4-use-caching-operations">4. Use Caching Operations</h3><p>If you&rsquo;ve used all of the previous tips and your application still runs slowly, it&rsquo;s worth considering implementing caching operations. In 2018, RStudio introduced the ability to <a href="https://blog.rstudio.com/2018/11/13/shiny-1-2-0/" target="_blank" rel="noopener noreferrer">cache charts</a> in the Shiny package.
However, if you want to speed up repeated operations other than generating graphs, it is worth using a custom caching solution.</p><p>One of my favorite packages that I use for this case is <a href="https://cran.r-project.org/web/packages/memoise/" target="_blank" rel="noopener noreferrer">memoise</a>. Memoise saves the results of new invocations of functions while reusing the answers from previous invocations of those functions.</p><p>The <code>memoise</code> package currently offers 3 methods for storing cached objects:</p><ol><li><code>cache_mem</code> - storing cache in RAM (default)</li><li><code>cache_filesystem(path)</code> - storing cache on the local disk</li><li><code>cache_s3(s3_bucket)</code> - storage in the AWS S3 file database</li></ol><p>The selected caching type is defined by the <code>cache</code> parameter in the <code>memoise</code> function.</p><p>If our Shiny application is served by a single R process and its RAM consumption is low, the simplest method is to use the first option, cache_mem, where the target function is defined and its answers cached in the global environment in RAM. All users will then use shared cache results, and the actions of one user will speed up the calculations of others. 
You can see a simple example below:</p><pre><code>library(memoise)

# Define an example expensive-to-calculate function
expensive_function &lt;- function(x) {
  Sys.sleep(5) # make it seem to take even longer
  sum((1:x) ^ 2)
}

system.time(expensive_function(1000)) # Takes at least 5 seconds
   user  system elapsed
  0.013   0.016   5.002

system.time(expensive_function(1000)) # Still takes at least 5 seconds
   user  system elapsed
  0.016   0.015   5.005

# Now, let's cache results using memoise and its default cache_mem
memoised_expensive_function &lt;- memoise(expensive_function)

system.time(memoised_expensive_function(1000)) # Takes at least 5 seconds
   user  system elapsed
  0.016   0.015   5.001

system.time(memoised_expensive_function(1000)) # Returns much faster
   user  system elapsed
  0.015   0.000   0.015</code></pre><p>The danger associated with in-memory caching, however, is that if you don&rsquo;t manage the cached results, the cache will grow without bound and your Shiny application will eventually run out of memory. You can manage the cached results using the <code>timeout</code> and <code>forget</code> functions.</p><p>If the application is served by many processes running on one server, the best option to ensure cache sharing among all users is to use <code>cache_filesystem</code> and store objects locally on the disk. 
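A minimal sketch of how those management tools fit together; this example is not from the original post, and the cache directory path below is purely hypothetical:</p><pre><code>library(memoise)

slow_square &lt;- function(x) { Sys.sleep(1); x ^ 2 }

# Cache on the local disk so every R process on the server shares results
fs_cache &lt;- cache_filesystem(&quot;/tmp/app-cache&quot;)
fast_square &lt;- memoise(slow_square, cache = fs_cache)

# Alternatively, invalidate cached answers after an hour
timed_square &lt;- memoise(slow_square, ~timeout(3600))

fast_square(4)      # slow the first time, near-instant afterwards
forget(fast_square) # drop all cached results for this function</code></pre><p>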
Again, you will want to manage the cache, but you will be limited only by your available disk space.</p><p>In the case of an extensive infrastructure using many servers, the easiest method will be to use <code>cache_s3</code> which will store its cached values on a shared external file system – in this case, AWS S3.</p><hr><p><strong>About Appsilon Data Science:</strong></p><p><a href="https://appsilon.com/" target="_blank" rel="noopener noreferrer"><img src="./appsilon-logo.png" alt="Appsilon logo" align="left"></a>One of the winners of the <a href="https://blog.rstudio.com/2020/07/13/winners-of-the-2nd-shiny-contest/" target="_blank" rel="noopener noreferrer">2020 Shiny Contest</a> and a <a href="https://rstudio.com/certified-partners/" target="_blank" rel="noopener noreferrer">Full Service RStudio Partner</a>, <a href="https://appsilon.com/" target="_blank" rel="noopener noreferrer">Appsilon</a> delivers enterprise Shiny apps, data science and machine learning consulting, and support with R and Python for customers all around the world.</p></description></item><item><title>rstudio::global() call for talks</title><link>https://www.rstudio.com/blog/rstudio-global-call-for-talks/</link><pubDate>Fri, 17 Jul 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-global-call-for-talks/</guid><description><p>We&rsquo;re excited to announce that the call for talks for <a href="https://www.rstudio.com/blog/rstudio-global-2021/">rstudio::global(2021)</a> is now open! Since we&rsquo;re rethinking the conference to make the most of the new venue, the talks are going to be a little different to usual.</p><p>This year we are particularly interested in talks from people who can&rsquo;t usually make it in person, or are newer to conference speaking. 
We&rsquo;re excited to partner with <a href="https://www.articulationinc.com/">Articulation Inc</a> to offer free speaker coaching: as long as you have an interesting idea and are willing to put in some work, we&rsquo;ll help you develop a great talk. (And if you&rsquo;re an old hand at conference presentations, we&rsquo;re confident that Articulation can help you get even better!)</p><p>Talks will be 20 minutes long, with recordings due in early December; you&rsquo;ll also be part of the live program in January (details TBD). We&rsquo;ll provide support to make sure that everyone can produce a high-quality video regardless of circumstances.</p><p>To apply, as well as the usual title and abstract, you&rsquo;ll need to create a 60-second video that introduces you and your proposed topic. In the video, you should tell us who you are, why your topic is important, and what attendees will take away from it. We&rsquo;re particularly interested in hearing about:</p><ul><li><p>How you&rsquo;ve used R (by itself or with other technologies) to solve a challenging problem.</p></li><li><p>Your favourite R package (whether you wrote it or not) and how it significantly eases an entire class of problems or extends R into new domains.</p></li><li><p>Your techniques for teaching R to help it reach new domains and new audiences.</p></li><li><p>Broad reflections on the R community, R packages, or R code.</p></li></ul><p>Applications close August 28, and you&rsquo;ll hear back from us in late September.</p><p><a href="https://forms.gle/5mzcMKd75xaDf2zi9"><strong>APPLY NOW!</strong></a></p></description></item><item><title>rstudio::global(2021)</title><link>https://www.rstudio.com/blog/rstudio-global-2021/</link><pubDate>Fri, 17 Jul 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-global-2021/</guid><description><p>We&rsquo;ve made the difficult decision to cancel rstudio::conf(2021) for the health and safety of our attendees and the broader community 😢. 
Instead, we&rsquo;re excited to announce rstudio::global(2021): our first ever virtual event focused on all things R and RStudio!</p><p>We have never done a virtual event before and we&rsquo;re feeling both nervous and excited. We will make rstudio::global() our most inclusive and global event, making the most of the freedom from geographical and economic constraints that comes with an online event. That means that the conference will be free, designed around participation from every time zone, and have <a href="https://www.rstudio.com/blog/rstudio-global-call-for-talks/">speakers from around the world</a>.</p><p>We&rsquo;re still working through the details, but as of today we&rsquo;re thinking that most talks will be pre-recorded (so you can watch at your leisure), accompanied by a 24 hour live event filled with keynotes, interviews, opportunities to share knowledge, and as much fun as we can possibly squeeze into a virtual event! We don&rsquo;t know the precise dates yet, but it&rsquo;s likely to be late January 2021.</p><p>We&rsquo;ll share more over the next few weeks: if you would like to receive notifications about the details, please subscribe below.</p><script src="//pages.rstudio.net/js/forms2/js/forms2.min.js"></script><form id="mktoForm_3297"></form><script>MktoForms2.loadForm("//pages.rstudio.net", "709-NXN-706", 3297);</script><p>(If you already registered for rstudio::conf() as a superfan, we&rsquo;ll be in touch shortly to find out if you&rsquo;d prefer a refund or to transfer your registration to 2022. 
If you have any questions in the mean time, please feel free to reach out to <a href="mailto:conf@rstudio.com">conf@rstudio.com</a>)</p></description></item><item><title>sparklyr 1.3: Higher-order Functions, Avro and Custom Serializers</title><link>https://www.rstudio.com/blog/sparklyr-1-3/</link><pubDate>Thu, 16 Jul 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/sparklyr-1-3/</guid><description><img src="sparklyr.png"/><p><a href="https://sparklyr.ai"><code>sparklyr</code></a> 1.3 is now available on <a href="https://cran.r-project.org/web/packages/sparklyr/index.html">CRAN</a>, with the following major new features:</p><ul><li><a href="#higher-order-functions">Higher-order Functions</a> to easily manipulate arrays and structs</li><li>Support for Apache <a href="#avro">Avro</a>, a row-oriented data serialization framework</li><li><a href="#custom-serialization">Custom Serialization</a> using R functions to read and write any data format</li><li><a href="#other-improvements">Other Improvements</a> such as compatibility with EMR 6.0 &amp; Spark 3.0, and initial support for Flint time series library</li></ul><p>To install <code>sparklyr</code> 1.3 from CRAN, run</p><pre><code class="language-{r" data-lang="{r">install.packages(&quot;sparklyr&quot;)</code></pre><p>In this post, we shall highlight some major new features introduced in sparklyr 1.3, and showcase scenarios where such features come in handy. 
While a number of enhancements and bug fixes (especially those related to <code>spark_apply()</code>, <a href="https://arrow.apache.org/">Apache Arrow</a>, and secondary Spark connections) were also an important part of this release, they will not be the topic of this post, and it will be an easy exercise for the reader to find out more about them from the sparklyr <a href="https://github.com/sparklyr/sparklyr/blob/master/NEWS.md">NEWS</a> file.</p><h2 id="higher-order-functions">Higher-order Functions</h2><p><a href="https://issues.apache.org/jira/browse/SPARK-19480">Higher-order functions</a> are built-in Spark SQL constructs that allow user-defined lambda expressions to be applied efficiently to complex data types such as arrays and structs. As a quick demo to see why higher-order functions are useful, let&rsquo;s say one day Scrooge McDuck dove into his huge vault of money and found large quantities of pennies, nickels, dimes, and quarters. Having an impeccable taste in data structures, he decided to store the quantities and face values of everything into two Spark SQL array columns:</p><pre><code class="language-{r" data-lang="{r">library(sparklyr)

sc &lt;- spark_connect(master = &quot;local&quot;, version = &quot;2.4.5&quot;)

coins_tbl &lt;- copy_to(
  sc,
  tibble::tibble(
    quantities = list(c(4000, 3000, 2000, 1000)),
    values = list(c(1, 5, 10, 25))
  )
)</code></pre><p>Thus declaring his net worth of 4k pennies, 3k nickels, 2k dimes, and 1k quarters. To help Scrooge McDuck calculate the total value of each type of coin in sparklyr 1.3 or above, we can apply <code>hof_zip_with()</code>, the sparklyr equivalent of <a href="https://spark.apache.org/docs/latest/api/sql/index.html#zip_with">ZIP_WITH</a>, to the <code>quantities</code> and <code>values</code> columns, combining pairs of elements from arrays in both columns. 
As you might have guessed, we also need to specify how to combine those elements, and what better way to accomplish that than a concise one-sided formula <code>~ .x * .y</code> in R, which says we want (quantity * value) for each type of coin? So, we have the following:</p><pre><code class="language-{r" data-lang="{r">result_tbl &lt;- coins_tbl %&gt;%
  hof_zip_with(~ .x * .y, dest_col = total_values) %&gt;%
  dplyr::select(total_values)

result_tbl %&gt;% dplyr::pull(total_values)</code></pre><pre><code>[1] 4000 15000 20000 25000</code></pre><p>With the result <code>4000 15000 20000 25000</code> telling us there is in total $40 worth of pennies, $150 worth of nickels, $200 worth of dimes, and $250 worth of quarters, as expected.</p><p>Using another sparklyr function named <code>hof_aggregate()</code>, which performs an <a href="https://spark.apache.org/docs/latest/api/sql/index.html#aggregate">AGGREGATE</a> operation in Spark, we can then compute the net worth of Scrooge McDuck based on <code>result_tbl</code>, storing the result in a new column named <code>total</code>. 
Notice that for this aggregate operation to work, we need to ensure the starting value of the aggregation has a data type (namely, <code>BIGINT</code>) that is consistent with the data type of <code>total_values</code> (which is <code>ARRAY&lt;BIGINT&gt;</code>), as shown below:</p><pre><code class="language-{r" data-lang="{r">result_tbl %&gt;%
  dplyr::mutate(zero = dplyr::sql(&quot;CAST (0 AS BIGINT)&quot;)) %&gt;%
  hof_aggregate(start = zero, ~ .x + .y, expr = total_values, dest_col = total) %&gt;%
  dplyr::select(total) %&gt;%
  dplyr::pull(total)</code></pre><pre><code>[1] 64000</code></pre><p>So Scrooge McDuck&rsquo;s net worth is $640.</p><p>Other higher-order functions supported by Spark SQL so far include <code>transform</code>, <code>filter</code>, and <code>exists</code>, as documented <a href="https://spark.apache.org/docs/latest/api/sql/index.html">here</a>, and similar to the example above, their counterparts (namely, <code>hof_transform()</code>, <code>hof_filter()</code>, and <code>hof_exists()</code>) all exist in sparklyr 1.3, so that they can be integrated with other <code>dplyr</code> verbs in an idiomatic manner in R.</p><h2 id="avro">Avro</h2><p>Another highlight of the sparklyr 1.3 release is its built-in support for Avro data sources. Apache Avro is a widely used data serialization protocol that combines the efficiency of a binary data format with the flexibility of JSON schema definitions. To make working with Avro data sources simpler, in sparklyr 1.3, as soon as a Spark connection is instantiated with <code>spark_connect(..., package = &quot;avro&quot;)</code>, sparklyr will automatically figure out which version of the <code>spark-avro</code> package to use with that connection, saving a lot of potential headaches for sparklyr users trying to determine the correct version of <code>spark-avro</code> by themselves. 
Similar to how <code>spark_read_csv()</code> and <code>spark_write_csv()</code> are in place to work with CSV data, <code>spark_read_avro()</code> and <code>spark_write_avro()</code> methods were implemented in sparklyr 1.3 to facilitate reading and writing Avro files through an Avro-capable Spark connection, as illustrated in the example below:</p><pre><code class="language-{r" data-lang="{r">library(sparklyr)

# The `package = &quot;avro&quot;` option is only supported in Spark 2.4 or higher
sc &lt;- spark_connect(master = &quot;local&quot;, version = &quot;2.4.5&quot;, package = &quot;avro&quot;)

sdf &lt;- sdf_copy_to(
  sc,
  tibble::tibble(
    a = c(1, NaN, 3, 4, NaN),
    b = c(-2L, 0L, 1L, 3L, 2L),
    c = c(&quot;a&quot;, &quot;b&quot;, &quot;c&quot;, &quot;&quot;, &quot;d&quot;)
  )
)

# This example Avro schema is a JSON string that essentially says all columns
# (&quot;a&quot;, &quot;b&quot;, &quot;c&quot;) of `sdf` are nullable.
avro_schema &lt;- jsonlite::toJSON(
  list(
    type = &quot;record&quot;,
    name = &quot;topLevelRecord&quot;,
    fields = list(
      list(name = &quot;a&quot;, type = list(&quot;double&quot;, &quot;null&quot;)),
      list(name = &quot;b&quot;, type = list(&quot;int&quot;, &quot;null&quot;)),
      list(name = &quot;c&quot;, type = list(&quot;string&quot;, &quot;null&quot;))
    )
  ),
  auto_unbox = TRUE
)

# persist the Spark data frame from above in Avro format
spark_write_avro(sdf, &quot;/tmp/data.avro&quot;, as.character(avro_schema))

# and then read the same data frame back
spark_read_avro(sc, &quot;/tmp/data.avro&quot;)</code></pre><pre><code># Source: spark&lt;data&gt; [?? x 3]
      a     b c
  &lt;dbl&gt; &lt;int&gt; &lt;chr&gt;
1     1    -2 &quot;a&quot;
2   NaN     0 &quot;b&quot;
3     3     1 &quot;c&quot;
4     4     3 &quot;&quot;
5   NaN     2 &quot;d&quot;</code></pre><h2 id="custom-serialization">Custom Serialization</h2><p>In addition to commonly used data serialization formats such as CSV, JSON, Parquet, and Avro, starting from sparklyr 1.3, customized data frame serialization and deserialization procedures implemented in R can also be run on Spark workers via the newly implemented <code>spark_read()</code> and <code>spark_write()</code> methods. We can see both of them in action through a quick example below, where <code>saveRDS()</code> is called from a user-defined writer function to save all rows within a Spark data frame into 2 RDS files on disk, and <code>readRDS()</code> is called from a user-defined reader function to read the data from the RDS files back to Spark:</p><pre><code class="language-{r" data-lang="{r">library(sparklyr)

sc &lt;- spark_connect(master = &quot;local&quot;)
sdf &lt;- sdf_len(sc, 7)
paths &lt;- c(&quot;/tmp/file1.RDS&quot;, &quot;/tmp/file2.RDS&quot;)

spark_write(sdf, writer = function(df, path) saveRDS(df, path), paths = paths)

spark_read(sc, paths, reader = function(path) readRDS(path), columns = c(id = &quot;integer&quot;))</code></pre><pre><code># Source: spark&lt;?&gt; [?? x 1]
     id
  &lt;int&gt;
1     1
2     2
3     3
4     4
5     5
6     6
7     7</code></pre><h2 id="other-improvements">Other Improvements</h2><h3 id="sparklyrflint">Sparklyr.flint</h3><p><a href="https://github.com/r-spark/sparklyr.flint">Sparklyr.flint</a> is a sparklyr extension that aims to make functionalities from the <a href="https://github.com/twosigma/flint">Flint</a> time-series library easily accessible from R. It is currently under active development. 
One piece of good news is that, while the original <a href="https://github.com/twosigma/flint">Flint</a> library was designed to work with Spark 2.x, a slightly modified <a href="https://github.com/yl790/flint">fork</a> of it will work well with Spark 3.0, and within the existing sparklyr extension framework. <code>sparklyr.flint</code> can automatically determine which version of the Flint library to load based on the version of Spark it&rsquo;s connected to. Another bit of good news is, as previously mentioned, <code>sparklyr.flint</code> doesn&rsquo;t know too much about its own destiny yet. Maybe you can play an active part in shaping its future!</p><h3 id="emr-60">EMR 6.0</h3><p>This release also features a small but important change that allows sparklyr to correctly connect to the version of Spark 2.4 that is included in Amazon EMR 6.0.</p><p>Previously, sparklyr automatically assumed any Spark 2.x it was connecting to was built with Scala 2.11 and attempted to load any required Scala artifacts built with Scala 2.11 as well. This became problematic when connecting to Spark 2.4 from Amazon EMR 6.0, which is built with Scala 2.12. Starting from sparklyr 1.3, this problem can be fixed by simply specifying <code>scala_version = &quot;2.12&quot;</code> when calling <code>spark_connect()</code> (e.g., <code>spark_connect(master = &quot;yarn-client&quot;, scala_version = &quot;2.12&quot;)</code>).</p><h3 id="spark-30">Spark 3.0</h3><p>Last but not least, it is worthwhile to mention that sparklyr 1.3.0 is known to be fully compatible with the recently released Spark 3.0. 
We highly recommend upgrading your copy of sparklyr to 1.3.0 if you plan to have Spark 3.0 as part of your data workflow in future.</p><h2 id="acknowledgement">Acknowledgement</h2><p>In chronological order, we want to thank the following individuals for submitting pull requests towards sparklyr 1.3:</p><ul><li><a href="https://github.com/jozefhajnala">Jozef Hajnala</a></li><li><a href="https://github.com/falaki">Hossein Falaki</a></li><li><a href="https://github.com/samuelmacedo83">Samuel Macêdo</a></li><li><a href="https://github.com/yl790">Yitao Li</a></li><li><a href="https://github.com/Loquats">Andy Zhang</a></li><li><a href="https://github.com/javierluraschi">Javier Luraschi</a></li><li><a href="https://github.com/nealrichardson">Neal Richardson</a></li></ul><p>We are also grateful for valuable input on the sparklyr 1.3 roadmap, <a href="https://github.com/sparklyr/sparklyr/pull/2434">#2434</a>, and <a href="https://github.com/sparklyr/sparklyr/pull/2551">#2551</a> from <a href="https://github.com/javierluraschi">@javierluraschi</a>, and insightful advice on <a href="https://github.com/sparklyr/sparklyr/issues/1773">#1773</a> and <a href="https://github.com/sparklyr/sparklyr/issues/2514">#2514</a> from <a href="https://github.com/mattpollock">@mattpollock</a> and <a href="https://github.com/benmwhite">@benmwhite</a>.</p><p>Please note if you believe you are missing from the acknowledgement above, it may be because your contribution has been considered part of the next sparklyr release rather than part of the current release. We do make every effort to ensure all contributors are mentioned in this section. 
In case you believe there is a mistake, please feel free to contact the author of this blog post via e-mail (yitao at rstudio dot com) and request a correction.</p><p>If you wish to learn more about <code>sparklyr</code>, we recommend visiting <a href="https://sparklyr.ai">sparklyr.ai</a>, <a href="https://spark.rstudio.com">spark.rstudio.com</a>, and some of the previous release posts such as <a href="https://blogs.rstudio.com/ai/posts/2020-04-21-sparklyr-1.2.0-released/">sparklyr 1.2</a> and <a href="https://blog.rstudio.com/2020/01/29/sparklyr-1-1/">sparklyr 1.1</a>.</p><p>Thanks for reading!</p><p>This post was originally posted on <a href="https://blogs.rstudio.com/ai/">blogs.rstudio.com/ai/</a></p></description></item><item><title>Interoperability: Getting the Most Out of Your Analytic Investments</title><link>https://www.rstudio.com/blog/interoperability-maximize-analytic-investments/</link><pubDate>Wed, 15 Jul 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/interoperability-maximize-analytic-investments/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@federize?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Federico Beccari</a> on <a href="https://unsplash.com/s/photos/connection?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Unsplash</a></sup></p><h3 id="the-challenges-of-complexity-and-underutilization">The Challenges of Complexity and Underutilization</h3><p>Organizations typically have multiple different environments and frameworks to support their analytic work, with each tool providing specialized capabilities or serving different audiences. 
These usually include:</p><ul><li><strong>Spreadsheets</strong> created in Excel or Google Sheets,</li><li><strong>Data science tools</strong> including R, Python, SPSS, SAS, and many others,</li><li><strong>BI tools</strong> such as Tableau or PowerBI,</li><li><strong>Data storage and management frameworks</strong> including databases and Spark clusters,</li><li><strong>Job management clusters</strong> such as Kubernetes and Slurm.</li></ul><p>For example, in our most recent R Community Survey, we asked what tools and languages respondents used besides R. The results shown in Figure 1 illustrate the wide variety of tools that may be present in an organization.</p><figure><img align="center" style="padding: 35px;" src="tools-chart.jpg"><figcaption>Figure 1: Respondents we surveyed use a wide variety of tools in addition to R.</figcaption></figure><p>These tools and frameworks provide flexibility and power but can also have two unpleasant, unintended consequences: <strong>productivity-undermining complexity</strong> for various stakeholders and <strong>underutilization of expensive analytic frameworks</strong>.</p><p>The stakeholders in the organization experience these consequences because:</p><ul><li><strong>Data scientists require multiple environments to get their work done.</strong> If data scientists have to leave their native tools to access other things they need such as a Spark cluster or a database, they have to switch contexts and remember how to use systems they might only rarely touch. Often, this means they won’t fully exploit the data and other resources available, or they waste time learning and relearning various systems, APIs, languages, and interfaces.</li><li><strong>Data science leaders worry about productivity.</strong> When their teams struggle in this way, these leaders worry that their teams aren’t delivering the full value that they could. 
This inefficiency can make it more difficult to defend budgets and hire additional team members when needed. These leaders may also face criticism from other departments demanding to know why the data science team isn’t fully utilizing expensive BI deployments or powerful computing resources.</li><li><strong>IT spends time and money supporting underutilized resources.</strong> Analytic infrastructures such as Spark or Kubernetes require considerable resources to set up and maintain. If these resources are being underutilized, IT will question their lack of ROI and whether they should continue to maintain them. These questions can lead to uncomfortable internal friction between departments, particularly depending on who requested the investments in the first place and what expectations were used to justify them.</li></ul><figure><img align="center" style="padding: 35px;" src="interoperability.png"><figcaption> Figure 2: Interoperability is a key strength of the R ecosystem.</figcaption></figure><h3 id="teams-need-interoperable-tools">Teams Need Interoperable Tools</h3><p>Interoperable systems that give a data scientist direct access to different platforms from their native tools can help address these challenges. Everyone benefits from this approach because:</p><ul><li><strong>Data scientists keep working in their preferred environment.</strong> Rather than constantly switching between different tools and IDEs and interrupting their flow, data scientists can continue to work in the tools and languages they prefer. This makes the data scientist more productive and reduces the need to keep switching contexts.</li><li><strong>Data science leaders get more productivity from their teams.</strong> When teams are more productive, they deliver more value to their organization. Delivered value helps them justify more training, tools, and team members. Easier collaboration and reuse of each other’s work further increases productivity. 
For example, if a data scientist who prefers R can easily call the Python script developed by a colleague from their preferred language, they avoid reimplementing the same work twice.</li><li><strong>Teams make better use of IT resources.</strong> Since it is easier for data scientists to use the frameworks and other infrastructure IT has put in place, they use those resources more consistently. This higher utilization helps the organization achieve the expected ROI from these analytic investments.</li></ul><h2 id="encouraging-interoperability">Encouraging Interoperability</h2><p>Interoperability is a mindset more than technology. You can encourage interoperability throughout your data science team with four initiatives:</p><ol><li><strong>Embrace open source software.</strong> One of the advantages of open source software is the wide community providing specialized packages to connect to data sources, modeling frameworks, and other resources. If you need to connect to something, there is an excellent chance someone in the community has already built a solution. For example, as shown in Figure 2, the R ecosystem already provides interoperability with many different environments.</li><li><strong>Make the data natively accessible.</strong> Good data science needs access to good up-to-date data. Direct access to data in the data scientist’s preferred tool, instead of requiring the data scientist to use specialized software, helps the data scientist be more productive and makes it easier to automate a data pipeline as part of a data product. 
Extensive resources exist to help, whether your data is in <a href="https://db.rstudio.com/" target="_blank" rel="noopener noreferrer">databases</a>, <a href="https://spark.rstudio.com/" target="_blank" rel="noopener noreferrer">Spark clusters</a>, or elsewhere.</li><li><strong>Provide connections to other data science or ML tools.</strong> Every data scientist has a preferred language or tool, and every data science tool has its unique strengths. By providing easy connections to other tools, you expand the reach of your team and make it easier to collaborate and benefit from the work of others. For example, the <a href="https://rstudio.github.io/reticulate/" target="_blank" rel="noopener noreferrer">reticulate</a> package allows an R user to call Python in a variety of ways, and the <a href="https://tensorflow.rstudio.com/" target="_blank" rel="noopener noreferrer">Tensorflow package</a> provides an interface to large-scale TensorFlow machine learning applications.</li><li><strong>Make your compute environments natively accessible.</strong> Most data scientists aren’t familiar with job management clusters such as Kubernetes and Slurm and often struggle to use them. By making these environments available directly from their native tools, your data scientists are far more likely to use them. 
For example, <a href="https://rstudio.com/products/rstudio-server-pro/" target="_blank" rel="noopener noreferrer">RStudio Server Pro</a> allows a data scientist to run a script on a Kubernetes or Slurm cluster directly from within their familiar IDE.</li></ol><p>Eric Nantz, a Research Scientist at Eli Lilly and Company, spoke at rstudio::conf 2020 about the importance of interoperability in R:</p><script src="https://fast.wistia.com/embed/medias/jyk6q0svdy.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_jyk6q0svdy videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/jyk6q0svdy/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><h2 id="learn-more-about-interoperability">Learn more about Interoperability</h2><p>In future posts, we will expand on this idea of Interoperability, with a particular focus on teams using R and Python, and how open source data science can complement BI tools.</p><p>If you’d like to learn more about Interoperability, we recommend these resources:</p><ul><li><a href="https://blog.rstudio.com/2020/07/07/interoperability-july/" target="_blank" rel="noopener noreferrer">In this recent blog post</a>, we introduced the idea of interoperability with an example of calling multiple different languages from the RStudio IDE.</li><li><a href="https://rstudio.com/about/customer-stories/brown-forman/" target="_blank" rel="noopener noreferrer">In this recent customer 
spotlight</a>, Paul Ditterline, Manager Data Science at Brown-Forman, describes how RStudio products helped their data science team “turn into application developers and data engineers without learning any new languages or computer science skills.”</li><li><a href="https://solutions.rstudio.com/production/integrations/" target="_blank" rel="noopener noreferrer">This article</a> describes how RStudio products integrate with many different frameworks, including databases, Spark, Kubernetes, Slurm, Git, etc.</li><li><a href="https://rstudio.com/solutions/r-and-python/" target="_blank" rel="noopener noreferrer">R and Python, a Love Story</a> shows how RStudio products helped bilingual data science teams collaborate more productively, and have a greater impact on their organization.</li><li>At rstudio::conf 2020, George Kastrinakis from Financial Times <a href="https://rstudio.com/resources/rstudioconf-2020/building-a-new-data-science-pipeline-for-the-ft-with-rstudio-connect/" target="_blank" rel="noopener noreferrer">presented a case study</a> on building a new data science pipeline, using R and RStudio Connect.</li></ul></description></item><item><title>RStudio Connect 1.8.4</title><link>https://www.rstudio.com/blog/rstudio-connect-1-8-4/</link><pubDate>Tue, 14 Jul 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-8-4/</guid><description><h2 id="a-place-for-python-applications">A place for Python applications</h2><p>For data science teams looking to promote data-driven decision making within an organization, interactive applications built in R and Python are often the gold standard. These interactive tools, built by data science teams, are powerful when put in the hands of the right people, but they don’t do much good at all if they only run locally. R users who build Shiny applications hit this roadblock, and so do Python users working with tools like Dash, Bokeh, and Streamlit. 
IT departments want to help but often aren&rsquo;t sure how. RStudio Connect solves this problem, for both R and Python applications.</p><p>RStudio Connect 1.8.4 focuses on helping Python users by including support for a full suite of interactive application types. Support for publishing Dash applications is now generally available, and this release introduces new Beta offerings for <a href="https://bokeh.org/" target="_blank">Bokeh</a> and <a href="https://www.streamlit.io/" target="_blank">Streamlit</a> application deployment.</p><h3 align="center"><a href="https://rstudio.chilipiper.com/book/rsc-demo">See RStudio Connect in Action</a></h3><h2 id="interactive-python-applications">Interactive Python Applications</h2><h3 id="get-started-with-the-rstudio-connect-jump-start">Get started with the RStudio Connect Jump Start</h3><p>For a hands-on approach to learning about Python content in RStudio Connect, try exploring the Jump Start Examples. This resource contains lightweight data science project examples built with various R and Python frameworks. The Jump Start Examples appear in the RStudio Connect dashboard when you first log in. You can download each project, run it locally, and follow the provided instructions to publish it back to the RStudio Connect server; or you could simply browse the examples and deployment steps to get a sense for how you might publish your own project.</p><p><img src="python-jump-start-184.png" alt="Python Jump Start Screenshot"></p><h3 id="develop-and-deploy-python-applications-from-your-favorite-python-ide">Develop and deploy Python applications from your favorite Python IDE</h3><p>New users often ask, <em>Do I have to develop Python applications in the RStudio IDE in order to publish them in RStudio Connect?</em> The answer is no! 
You do not need to touch the RStudio IDE for Python content development or publishing.</p><p>Publishing Python applications to RStudio Connect requires the <a href="https://pypi.org/project/rsconnect-python/" target="_blank"><code>rsconnect-python</code></a> package. This package is available to install with pip from PyPI and enables a command-line interface that can be used to publish from any Python IDE including PyCharm, VS Code, JupyterLab, Spyder, and others.</p><p>Once you have the <code>rsconnect-python</code> package, the only additional information you need to supply is the RStudio Connect server address and a <a href="https://docs.rstudio.com/connect/user/api-keys/">publisher API key</a>.</p><p>The application shown here is the Stock Pricing Dashboard built with Dash and available for download from the Jump Start Examples in RStudio Connect. The example comes packaged with everything needed to run it locally in the Python IDE of your choice. When you’re ready to try publishing, the Jump Start will guide you through that process, including all the required commands from <code>rsconnect-python</code>.</p><p><img src="dash-jumpstart-example.gif" alt="GIF of the Dash Jump Start example in RStudio Connect"></p><h3 id="streamlit-and-bokeh-applications">Streamlit and Bokeh Applications</h3><p>Data scientists who develop Streamlit or Bokeh applications can also use the <a href="https://pypi.org/project/rsconnect-python/" target="_blank"><code>rsconnect-python</code></a> package to publish to RStudio Connect. If you&rsquo;ve previously used the <code>rsconnect-python</code> package for other types of Python content deployment, make sure you upgrade to the latest version before attempting to use the Beta features with RStudio Connect 1.8.4.</p><p>This release ships with an example for Streamlit, located in the Jump Start Examples Python tab. 
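</p><p>As a concrete sketch of this workflow, the commands below show the general shape of a <code>rsconnect-python</code> deployment; the server URL, server nickname, API key, and application directory are placeholders rather than values from this post:</p><pre><code># Install the publishing CLI from PyPI
pip install rsconnect-python

# Register the RStudio Connect server once, using a publisher API key
rsconnect add --server https://connect.example.com --name myserver --api-key $CONNECT_API_KEY

# Deploy a Dash application; --entrypoint is the module:object of the Dash app
rsconnect deploy dash ./my-dash-app --name myserver --entrypoint app:app</code></pre><p>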
For Bokeh, we recommend starting with examples from the <a href="https://docs.bokeh.org/en/latest/docs/gallery.html#server-app-examples" target="_blank">Bokeh App Gallery</a>. Source code for the Bokeh gallery applications is available from the <a href="https://github.com/bokeh/bokeh/tree/master/examples/app/" target="_blank">Bokeh GitHub repository</a>.</p><p>Visit the User Guide to learn more about our beta support for Streamlit and Bokeh:</p><ul><li>Learn more about <a href="https://docs.rstudio.com/connect/user/streamlit/">deploying Streamlit applications</a></li><li>Learn more about <a href="https://docs.rstudio.com/connect/user/bokeh/">deploying Bokeh applications</a></li></ul><p><strong>What does &ldquo;Beta&rdquo; Mean?</strong> <em>Bokeh and Streamlit app deployment are beta features. This means they are still undergoing final testing before official release. Should you encounter any bugs, glitches, lack of functionality or other problems, please let us know so we can improve before public release.</em></p><h3 align="center">Learn how data science teams use RStudio products<br/><a href="https://rstudio.com/solutions/r-and-python/">Visit R & Python - A Love Story</a></h3><h2 id="new--notable">New &amp; Notable</h2><h3 id="scheduling-across-time-zones">Scheduling Across Time Zones</h3><p>A new time zone option for scheduled reports can be used to prevent schedules from breaking during daylight savings time changes. Publishers can now configure a report to run in a specific time zone by modifying the settings available in the <a href="https://docs.rstudio.com/connect/user/scheduling/">Schedule panel</a>.</p><h3 id="content-usage-tracking">Content Usage Tracking</h3><p>In previous versions of RStudio Connect, content usage was only available for static and rendered content, as well as Shiny applications. 
With this release, Python content and Plumber API usage data is available via the <a href="https://docs.rstudio.com/connect/api/#instrumentation">Instrumentation API</a>. Learn more about tracking content usage on RStudio Connect in the <a href="https://docs.rstudio.com/connect/cookbook/user-activity/">Server API Cookbook</a>.</p><img src="content-usage-past30.png" alt="Screenshot of the Content Usage Info Panel in RStudio Connect" width="40%"/><h2 id="authentication-changes">Authentication Changes</h2><h3 id="openid-connect">OpenID Connect</h3><p>RStudio Connect 1.8.4 introduces OpenID Connect as an authentication provider for single sign-on (SSO). This new functionality is built on top of the existing support for OAuth2, which was previously limited to Google authentication. For backwards compatibility, Google is the default configuration, so no action is necessary for existing installations. See the <a href="https://docs.rstudio.com/connect/admin/authentication/oauth2/">OAuth2</a> section of the Admin Guide for details.</p><h3 id="automatic-user-role-mapping">Automatic User Role Mapping</h3><p>RStudio Connect now supports assigning user roles from authentication systems that support remote groups. Roles can be assigned in a custom attribute or automatically mapped from an attribute or group name. See <a href="https://docs.rstudio.com/connect/admin/user-management/#user-role-mapping">Automatic User Role Mapping</a> for more details.</p><h3 id="custom-login--logout-for-proxied-authentication">Custom Login &amp; Logout for Proxied Authentication</h3><p>Proxied authentication now supports more customizable login and logout flows with the settings ProxyAuth.LoginURL and ProxyAuth.LogoutURL. 
See the <a href="https://docs.rstudio.com/connect/admin/authentication/proxied/">Proxied Authentication</a> section of the Admin Guide for details.</p><h3 align="center"><a href="https://rstudio.com/products/connect/evaluation/">Try the free 45-day evaluation of RStudio Connect 1.8.4</a></h3><h2 id="deprecations--breaking-changes">Deprecations &amp; Breaking Changes</h2><ul><li><strong>Breaking Change</strong> SSLv3 is no longer supported, since it is considered cryptographically broken.</li><li><strong>Deprecation</strong> The setting <code>Python.LibraryCheckIsFatal</code> has been deprecated. Python library version checks are now non-fatal and result in a warning in the RStudio Connect log at startup.</li></ul><p>Please review the <a href="http://docs.rstudio.com/connect/news">full release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>For RStudio Connect installations that make use of Python, note that the latest version of the virtualenv package (version 20) is now supported. This is a reversal of the previous RStudio Connect 1.8.2 requirement on virtualenv. This release also provides support for Ubuntu 20.04 LTS.</p></blockquote><p>To perform an upgrade, download and run the installation script. 
The script installs a new version of RStudio Connect on top of the earlier one, and existing configuration settings are respected.</p><pre><code># Download the installation script
curl -Lo rsc-installer.sh https://cdn.rstudio.com/connect/installer/installer-v1.1.0.sh

# Run the installation script
sudo bash ./rsc-installer.sh 1.8.4-11</code></pre><h3 align="center"><a href="https://rstudio.com/products/connect/">Click through to learn more about RStudio Connect</a></h3></description></item><item><title>Winners of the 2nd Annual Shiny Contest</title><link>https://www.rstudio.com/blog/winners-of-the-2nd-shiny-contest/</link><pubDate>Mon, 13 Jul 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/winners-of-the-2nd-shiny-contest/</guid><description><p>At rstudio::conf(2020) we <a href="https://rstudio.com/resources/rstudioconf-2020/making-the-shiny-contest/" target="_blank" rel="noopener noreferrer">announced</a> the Shiny Contest 2020. Since then, a lot has changed in the world, and we decided to hold off on announcing the results of the contest for a bit. Today we would like to take some time to acknowledge the winners, honourable mentions, and runners up for Shiny Contest 2020.</p><p>But before that, let’s start with some stats!</p><p>We had 220 submissions from 183 unique Shiny developers to the contest this year. The number of submissions this year was 62% higher than last year, which, frankly, contributed to the lengthy review period.</p><p>This year we announced a prize specifically for novice Shiny developers, and we are thrilled that 32% of the submissions were from developers with less than 1 year of experience with Shiny.</p><p>We were also incredibly impressed by the wide variety of application areas of the submissions. The figures below show the distributions of categories and keywords for the app submissions. Perhaps unsurprisingly, there were lots of submissions involving COVID-19 data! 
Note that the categories plot in particular likely underestimates the diversity of application areas, since it can be quite difficult to classify some apps into a single category.</p><p><img src="categories-keywords-1.png" /></p><p>Apps were evaluated based on technical merit and artistic achievement. Some apps excelled in one of these categories and some in the other, and some in both. Evaluation also took into account the narrative on the contest submission post on RStudio Community.</p><p>All winners of the Shiny Contest 2020 will get one year of the shinyapps.io Basic Plan, a bunch of hex stickers of RStudio packages, and a spot on the <a href="https://shiny.rstudio.com/gallery/#user-showcase" target="_blank" rel="noopener noreferrer">Shiny User Showcase</a>. Runners up will additionally get any number of RStudio t-shirts, books, and mugs (worth up to $200) where mailing is possible. And, finally, grand prize winners will additionally receive special and persistent recognition by RStudio in the form of a winners page and a badge that will be publicly visible on their <a href="http://rstudio.community/" target="_blank" rel="noopener noreferrer">RStudio Community</a> profile, as well as a half-hour one-on-one with a representative from the RStudio Shiny team for Q&amp;A and feedback!</p><p>Alright, without further ado, here are the winners! 
Note that winners are listed in no specific order within each category.</p><div id="grand-prizes" class="section level2"><h2>Grand prizes</h2><div id="blog-explorer" class="section level3"><h3>🏆 <a href="https://nz-stefan.shinyapps.io/blog-explorer/" target="_blank" rel="noopener noreferrer">Blog Explorer</a></h3><p><a href="https://nz-stefan.shinyapps.io/blog-explorer/"><img src="images/topic-model-results.png" align="right" width="100%"></a></p><p>A Shiny app to browse the results of a topic model trained on 30,000+ blog articles about the statistical programming language R.</p><p>We loved the beautiful UI of this app that is developed using HTML templates. Topic modeling results are presented using network graphs that leverage JavaScript. The app code is clear and makes fantastic use of modules. We also enjoyed the thorough narrative on the submission. <a href="https://community.rstudio.com/t/blog-explorer-2020-shiny-contest-submission/58803" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p><br></p></div><div id="git-discoverer" class="section level3"><h3>🏆 <a href="https://rajkstats.shinyapps.io/git_discoverer_app/" target="_blank" rel="noopener noreferrer">Git Discoverer</a></h3><p><a href="https://rajkstats.shinyapps.io/git_discoverer_app"><img src="images/git-discoverer.png" align="right" width="100%"></a></p><p>This project is a re-work of the app author’s submission for the RStudio Shiny Contest 2019. The re-worked app features popular machine learning and deep learning projects on GitHub, dynamic rendering, functionality to sort by trend, stars, and forks, as well as a disconnect screen for Shiny Server.</p><p>We loved that the app author re-built this app to try out skills they developed over the past year. The improvement in the UI is very striking! And the submission narrative is incredibly detailed as well! 
<a href="https://community.rstudio.com/t/re-work-of-gitdiscoverer-2020-shiny-contest-submission/58325" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p><br></p></div><div id="shiny-decisions" class="section level3"><h3>🏆 <a href="https://sparktuga.shinyapps.io/ShinyDecisions/" target="_blank" rel="noopener noreferrer">Shiny Decisions</a></h3><p><a href="https://sparktuga.shinyapps.io/ShinyDecisions/"><img src="images/shiny-decisions.png" align="right" width="100%"></a></p><p>A game about making the best of terrible choices. In Shiny Decisions your goal is to last as long as possible while making decisions that affect the wealth, population and environment quality in the world.</p><p>The app is quite complex, and hard to describe with words. We strongly recommend giving the game a try to get a sense of it! The code for the app is equally complex, but very well organised.</p><p><br></p></div><div id="datify" class="section level3"><h3>🏆 <a href="https://kneijenhuijs.shinyapps.io/Datify" target="_blank" rel="noopener noreferrer">Datify</a></h3><p><a href="https://kneijenhuijs.shinyapps.io/Datify"><img src="images/datify.png" align="right" width="100%"></a></p><p>Curious about the sentiment of your favourite artists? And how do they compare to other artists? Are specific artists changing their musical style over time? And do they vary more in their musical creativity than others? Answers to these types of questions can be found in this app that is inspired, in part, by one of last year’s winning submissions, the <a href="https://community.rstudio.com/t/shiny-contest-submission-sentify-spotify-musical-sentiment-visualization/25207" target="_blank" rel="noopener noreferrer">Sentify</a> app.</p><p>We loved the clean UI of this app and the interactivity when selecting artists. The various data visualisations and the consistent use of color in them are quite striking as well! 
We should also note that this app won one of the Grand Prizes in the novice category, for app developers with less than 1 year of experience with Shiny!</p><p><br></p></div><div id="hexmake" class="section level3"><h3>🏆 <a href="https://connect.thinkr.fr/hexmake/" target="_blank" rel="noopener noreferrer">Hexmake</a></h3><p><a href="https://connect.thinkr.fr/hexmake/"><img src="images/hexmake.png" align="right" width="100%"></a></p><p>An application to build your own hex sticker. It allows you to customise the name, font, and colours, manipulate the image, export the hex, and save it in an open hex database.</p><p>The application area is quite straightforward, but the technical details of this app are what set it apart from other apps with a similar goal. The app comes with a series of tools built on top of the magick package that allows the user to modify the image they upload to the app. It also comes with its own file format for the resulting hex and it’s plugged into a Mongo database where users can save their own hexes and share them with others. It also has a nice walk-through to help users get started. <a href="https://community.rstudio.com/t/hexmake-2020-shiny-contest-submission/59122" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p><br><br></p></div></div><div id="runners-up" class="section level2"><h2>Runners up</h2><p><a href="https://parmsam.shinyapps.io/one_source_indy/"><img src="images/one-source-indy.png" align="right" height="170"></a></p><div id="one-source-indy" class="section level3"><h3>🏅 <a href="https://parmsam.shinyapps.io/one_source_indy/" target="_blank" rel="noopener noreferrer">One Source Indy</a></h3><p>This app uses Indianapolis community resource data to create an open source app to better inform in-need homeless or unstably-housed individuals living in Indianapolis on resources available in their community. 
The prototype was created to show how publicly available resource data can be used with R and Shiny, to potentially collaborate with Indianapolis homeless outreach organizations, and to encourage others to develop similar applications for social good.</p><p>We loved the application area of this app and that the app authors also made the webscraping code available. We were also very impressed by the complexity of the app given that the app authors had less than one year of experience with Shiny. <a href="https://community.rstudio.com/t/one-source-indy-2020-shiny-contest-submission/55391" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p><br></p><p><a href="https://sebastianwolf.shinyapps.io/Corona-Shiny/"><img src="images/corona-shiny.png" align="right" height="170"></a></p></div><div id="material-design-covid-19-dashboard" class="section level3"><h3>🏅 <a href="https://sebastianwolf.shinyapps.io/Corona-Shiny/" target="_blank" rel="noopener noreferrer">Material Design COVID-19 Dashboard</a></h3><p>Governments and COVID-19: Which one stops it faster, better, has fewer people dying? These questions get answered in this visually appealing and mobile-friendly dashboard that uses plotly and shinymaterial. <a href="https://community.rstudio.com/t/material-design-corona-covid-19-dashboard-2020-shiny-contest-submission/59690" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p><br><br><br><br></p><p><a href="https://dgranjon.shinyapps.io/deminR"><img src="images/deminR.png" align="right" height="170"></a></p></div><div id="deminr" class="section level3"><h3>🏅 <a href="https://dgranjon.shinyapps.io/deminR" target="_blank" rel="noopener noreferrer">deminR</a></h3><p>This is the R version of Minesweeper. The goal is simple: flag all the mines as quickly as possible by clicking on the grid. While this app is optimized for mobile use, it also works on desktop. 
Note that since the right click on desktop platforms is replaced by a long press on mobile, which takes more time, scores are categorized by device. As soon as you click on a mine, the game is immediately lost. You may reset the game at any time when the timer is on by clicking on the option button in the navigation bar. After a success, the score may be shared on Twitter (as long as you have a Twitter account). <a href="https://community.rstudio.com/t/deminr-a-minesweeper-for-r-2020-shiny-contest-submission/56356" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p><br></p><p><a href="https://johncoene.shinyapps.io/fopi-contest/#home"><img src="images/freedom-of-press.png" align="right" height="170"></a></p></div><div id="freedom-of-press-index" class="section level3"><h3>🏅 <a href="https://johncoene.shinyapps.io/fopi-contest/#home" target="_blank" rel="noopener noreferrer">Freedom of Press Index</a></h3><p>This app visualises the Freedom of Press Index. It is built using the fullPage package and presents essentially two “views”: (1) one to explore the progress of the index through time and (2) another to compare indices across countries. The application is packaged with golem so it can be easily shared and also comes with a Docker image. <a href="https://community.rstudio.com/t/freedom-of-press-index-2020-shiny-contest-submission/55775" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p><br><br><br><br></p><p><a href="https://nicohahn.shinyapps.io/covid19/"><img src="images/covid-storyboard.png" align="right" height="170"></a></p></div><div id="visualization-of-covid-19-cases" class="section level3"><h3>🏅 <a href="https://nicohahn.shinyapps.io/covid19/" target="_blank" rel="noopener noreferrer">Visualization of Covid-19 Cases</a></h3><p>This app takes a slightly different approach to visualizing the outbreak of the coronavirus. 
While the confirmed and deceased cases can still be viewed on a world map, the main goal was to tell the story of the virus: where it came from, how it spread, and what consequences it had. It is designed as a storyboard that is supplemented by various plots to underline the significance of different events. <a href="https://community.rstudio.com/t/visualization-of-covid-19-cases-2020-shiny-contest-submission/57211" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p><br><br></p></div></div><div id="honorable-mentions" class="section level2"><h2>Honorable mentions</h2><p>✨ <a href="https://scotland.shinyapps.io/sg-equality-evidence-finder/" target="_blank" rel="noopener noreferrer">Equality evidence finder</a>: The Equality Evidence Finder provides a summary of the range of available equality research and statistics for Scotland. The app currently contains over 250 interactive charts and 500 equality evidence summaries, covering a wide range of policy areas. Data can be read directly from the Scottish Government open data platform, allowing charts to be updated automatically as soon as new data is published. <a href="https://community.rstudio.com/t/scottish-government-equality-evidence-finder-2020-shiny-contest-submission/53699" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p>✨ <a href="https://dgkf.shinyapps.io/riddlr-challenge-catalog/" target="_blank" rel="noopener noreferrer">riddlr: Test-case-driven R Programming Challenges</a>: A set of Shiny tools for creating coding challenges. Questions are added via a simple Rmd template, making it easy for contributors to expand the variety of questions. Metadata about the question is captured in the YAML header, and named code chunks are used to capture the pre-populated code block, solution and test inputs. 
<a href="https://community.rstudio.com/t/riddlr-r-programming-challenges-2020-shiny-contest-submission/54078" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p>✨ <a href="https://smirnovayu.shinyapps.io/hangman_en/" target="_blank" rel="noopener noreferrer">Hangman</a>: Classic hangman in Shiny! Press a letter; if it is in the word, it is added to it; if not, the hangman picture is extended by one line. There is a Russian version in the app repo as well! <a href="https://community.rstudio.com/t/hangman-2020-shiny-contest-submission/54937" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p>✨ <a href="https://shahreyar-abeer.shinyapps.io/life_of_pi/" target="_blank" rel="noopener noreferrer">Life of pi: A Monte Carlo simulation</a>: A Shiny app that demonstrates the use of Monte Carlo simulation to estimate the value of <span class="math inline">\(\pi\)</span>. <a href="https://community.rstudio.com/t/life-of-pi-a-monte-carlo-simulation-2020-shiny-contest-submission/59748" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p>✨ <a href="https://hssgenomics.shinyapps.io/RNAseq_DRaMA/" target="_blank" rel="noopener noreferrer">rnaseqDRaMA - RNAseq data visualization and mining</a>: RNAseq has been widely adopted as the method of choice for large-scale gene expression profiling. Data under-utilization, however, remains a major challenge due to the specific skill set required for data processing, interpretation, and analysis. To simplify end-user RNA-seq data interpretation, we created RNA-seq DRaMA (RNAseq Data Retrieval and Mining Analytical platform) - an R/Shiny interactive reporting system with a user-friendly web interface for data exploration and visualization (<a href="https://hssgenomics.shinyapps.io/RNAseq_DRaMA/" class="uri">https://hssgenomics.shinyapps.io/RNAseq_DRaMA/</a>). 
The app supports many methods for data exploration, including: sample PCA and multidimensional scaling, gene- and sample-correlation analyses, Venn diagram and UpSet set visualizations, gene expression group barplots and heatmaps with hierarchical clustering, volcano plots, pathway analysis with QuSAGE, and Transcription Factor network analysis. All plots are highly customized in terms of sample, feature, threshold, and color selections and create publication-ready PDF and tabular outputs. All features are well-documented with an in-app manual. RNAseq DRaMA has been extensively tested at the HSS Genomics Center with more than 100 projects delivered and several projects currently deployed in the public domain. The app comes with a manual written in bookdown! <a href="https://community.rstudio.com/t/rnaseqdrama-rnaseq-data-visualization-and-mining-2020-shiny-contest-submission/57244" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p>✨ <a href="https://rafa-pereira-br.shinyapps.io/accessibilityatlas/" target="_blank" rel="noopener noreferrer">Accessibility Atlas</a>: The Accessibility Atlas is a Shiny App that allows people to interactively explore the results of the Access to Opportunities Project. It contains maps and charts that allow users to visualize estimates of people’s access to employment, education and health services at a high spatial resolution and disaggregated by socio-economic groups according to income level and color/race. In English and Portuguese. 
<a href="https://community.rstudio.com/t/accessibility-atlas-2020-shiny-contest-submission/57337" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p>✨ <a href="https://mklienz.shinyapps.io/dude-wmb/" target="_blank" rel="noopener noreferrer">Dude Where’s my Bus</a>: This Shiny application provides the user with a series of tools to inform them about the location and due times of buses and trains at multiple stops and positions in Auckland, New Zealand, helping to answer the question posed by the app’s title - Dude, Where’s My Bus? The app features multiple real-time boards, live bus locations, and functionality to find an ideal bus stop. <a href="https://community.rstudio.com/t/dude-wheres-my-bus-2020-shiny-contest-submission/56634" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p>✨ <a href="https://pachamaltese.shinyapps.io/tradestatistics" target="_blank" rel="noopener noreferrer">Trade Statistics</a>: Open Trade Statistics is a project that includes a public API, a dashboard, and an R package for data retrieval. In particular, the dashboard was conceived as a graphical tool for people from economics and humanities who, most of the time, are used to Excel rather than APIs. The dashboard allows users to explore the data visually and then export it to xlsx and other formats. The app was reviewed by rOpenSci as well! <a href="https://community.rstudio.com/t/tradestatistics-2020-shiny-contest-submission/53917" target="_blank" rel="noopener noreferrer">[Read more]</a></p><p><br></p></div><div id="all-submissions-to-shiny-contest-2020" class="section level2"><h2>All submissions to Shiny Contest 2020</h2><p>Feel free to peruse <a href="https://rpubs.com/minebocek/shiny-contest-2020-submissions" target="_blank" rel="noopener noreferrer">the full list of all submissions to the contest</a> with links to the apps along with the submission narratives on RStudio Community. 
Note that data and code used in the apps are all publicly available and/or openly licensed. We hope that they will serve as inspiration for your next Shiny app!</p></div></description></item><item><title>Why You Need a World Class IDE to Do Serious Data Science</title><link>https://www.rstudio.com/blog/2020-07-09-why-you-need-a-world-class-ide-to-do-serious-data-science/</link><pubDate>Thu, 09 Jul 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2020-07-09-why-you-need-a-world-class-ide-to-do-serious-data-science/</guid><description><p><sup> <p style="text-align: right !important;margin-top: 0px;margin-bottom: 30px;"><i>Photo by <a style="color: #000000;" href="https://unsplash.com/@danieljerez">Daniel Jerez</a> on <a style="color: #000000;" href="https://unsplash.com/photos/CD4WHrWio6Q">Unsplash</a></i></p> </sup></p><p>Data science can feel like an endless tunnel, with tremendous investment at the beginning and little visibility into where or when your results will emerge. Data science teams wrestle with many challenges, such as rapid iteration of new ideas, business alignment, productivity, transparency, and delivering durable value.</p><p>As we’ve discussed in <a href="https://blog.rstudio.com/2020/06/24/delivering-durable-value/" target="_blank" rel="noopener noreferrer">recent blog posts</a>, there are many advantages to using code for your data science work. However, to make your data science teams as productive as possible, they need the best environment in which to efficiently write that code. In this post, I explain why serious data science requires a world-class IDE.</p><br><h2 id="1-data-science-by-its-nature-is-iterative">1. Data Science by Its Nature Is Iterative</h2><p>There is no better way to shine a light on a solution than the back-and-forth testing of new ideas with code and a fail-fast attitude. 
You ultimately need to try an idea, search for new functions, review the syntax, visualize the results, make updates, and repeat.</p><p>The RStudio IDE lets you run statistical commands just this way, with the option to visualize results early on as you are building. This engaging means of executing code continuously from the console and directly inline from saved scripts lends itself to increased experimentation and ultimately faster results. With one environment, you can view, debug, and track the history of results, making it substantially easier to build off existing work.</p><p>Rich features, such as automated code validation, syntax highlighting, and smart indentation, make coding and iterating new work even faster. Don’t remember the format of a function? Just type a question mark before it, and the full syntax opens up in a separate pane. Check out our latest addition to these features with spell-check and guidance for conventional data science terms in the newest RStudio 1.3 release <a href="https://blog.rstudio.com/2020/05/27/rstudio-1-3-release/" target="_blank" rel="noopener noreferrer">here</a>.</p><br><h2 id="2-your-business-has-unique-challenges">2. Your Business Has Unique Challenges</h2><p>Software on its own is useless for the specific questions and answers that your business will uniquely demand. With the high rate of failure in data science projects (learn more <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">here</a> in our recent post), success will ultimately rely on data scientists understanding the business and its data and using code to extract insights from that data. 
While most development environments are designed around programming tasks, the RStudio IDE is built for authoring particular questions and answers.<br><br></p><blockquote><p>While most development environments are designed around programming tasks, the RStudio IDE is built for authoring particular questions and answers.</p></blockquote><br><p>Dedicated panes in the IDE for connecting to a database, defining variables, and previewing and managing data give you the control needed to reach a final solution.</p><br><h2 id="3-multiple-tools-and-technologies-leads-to-inefficiency">3. Multiple Tools and Technologies Lead to Inefficiency</h2><p>For the data scientist, the sheer number of environments and technologies involved in finding solutions is a continual challenge. Every switch between tools or windows, and every import from one to another, means lost time and mental energy.</p><p>While the RStudio IDE integrates the R console, source code, output plots, database connections, and code execution environment all in one place, interoperability with other tools and technologies makes it even more potent for development. You might need to grab some data from a SQL database with a query, open Python to analyze it, visualize in D3, and model in a language like Stan. With the RStudio IDE, all of this is possible in one place, as we recently demonstrated in <a href="https://blog.rstudio.com/2020/07/07/interoperability-july/" target="_blank" rel="noopener noreferrer">this blog post</a>. You can also very fluidly and naturally author Python inside R Markdown, allowing for a language-agnostic approach with your team. Learn more about this with <a href="https://rstudio.com/solutions/r-and-python/" target="_blank" rel="noopener noreferrer">“R &amp; Python: A Love Story.”</a></p><br><h2 id="4-teams-require-accountability-and-transparency">4. 
Teams Require Accountability and Transparency</h2><p>Git support inside the RStudio IDE makes sharing code and collaborating over a versioned environment possible and easy. When your team inspects how a problem was first solved, they need to see how the solution evolved. Using tools like R Markdown from within the IDE allows you to create a rich variety of content: PDFs, Word documents, slides, and HTML files. Integration with <a href="https://rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a> then allows visually appealing and digestible results from the RStudio IDE to be made available to stakeholders across the organization with secure publishing of code and scheduled reports.</p><p>All of these content types are reproducible because they are generated from code, allowing your team to peer-review your work and making your work auditable by third parties.</p><br><h2 id="5-solutions-must-last">5. Solutions Must Last</h2><p>Vendors often lock developers into proprietary products that incur rising costs and discourage reuse. Building with an IDE based on community and open source has tremendous potential for avoiding these pitfalls and making your data science work more durable. The RStudio IDE can import over 16,000 open source packages from the R community that avoid this type of lock-in. When we ask R users in our annual survey what tools they use for their applications, 86% say they use the RStudio Desktop IDE (see the chart below). 
With this large community of IDE users and RStudio’s commitment to open source, data scientists can worry less about being locked into license fees and focus on solving problems.</p><figure><img align="center" style="padding: 35px;" src="survey-chart.jpg"><figcaption>Figure 1: 86% of respondents interested in R use the RStudio IDE.</figcaption></figure><p>To learn more about the RStudio IDE, and explore how you can use it in your environment, <a href="https://rstudio.com/products/rstudio/" target="_blank" rel="noopener noreferrer">download</a> a free copy from the RStudio web site today.</p></description></item><item><title>Interoperability in July</title><link>https://www.rstudio.com/blog/interoperability-july/</link><pubDate>Tue, 07 Jul 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/interoperability-july/</guid><description><p><sup>Photo by <a href="https://unsplash.com/@mark_crz?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Mark Cruz</a> on <a href="https://unsplash.com/s/photos/ice-cream-cones?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></sup></p><p>The TIOBE Company just published the July edition of its <a href="https://www.tiobe.com/tiobe-index/" target="_blank" rel="noopener noreferrer">TIOBE Programming Community Index</a> of programming language popularity. 
R users will be pleased to see that R is now ranked as the 8th most popular programming language as shown in the screenshot below, having risen 12 positions since July of last year.</p><figure><img src="./tiobe-july.jpg" alt="Figure 1: TIOBE Language Rankings showing R as the 8th Most Popular Language" /><figcaption>Figure 1: TIOBE Language Rankings showing R as the 8th Most Popular Language</figcaption></figure><p>While we at RStudio are pleased to see R climbing the TIOBE charts, what we&rsquo;re going to focus on this month is all the other languages, both on this list and not, that data science teams also use to do their jobs. We&rsquo;re going to focus on <strong>interoperability</strong> with R, and how it helps data science teams get more value out of all their organization&rsquo;s analytic investments.</p><p>If you&rsquo;re a regular reader of this blog, you may already know that the RStudio IDE supports Python (you can read more at <a href="https://rstudio.com/solutions/r-and-python/" target="_blank" rel="noopener noreferrer">R &amp; Python: A Love Story</a>). What&rsquo;s less well-known, however, is that when you write code in R Markdown within the IDE, you may also embed:</p><ul><li><strong>SQL code</strong> for accessing databases,</li><li><strong>BASH code</strong> for shell scripts,</li><li><strong>C and C++ code</strong> using the <code>Rcpp</code> package,</li><li><strong>STAN code</strong> for doing statistical modeling,</li><li><strong>Javascript</strong> for doing web programming,</li><li><strong>and many more languages</strong>. You can find a complete list of the many platforms supported in the language engines chapter of the book, <a href="https://bookdown.org/yihui/rmarkdown/language-engines.html" target="_blank" rel="noopener noreferrer">R Markdown: The Definitive Guide</a>.</li></ul><p>If you&rsquo;re wondering how this could work, I&rsquo;ve created a very simple example R Markdown document that demonstrates how languages can work together. 
It creates an in-memory database of <code>gapminder</code> data, queries it using SQL, prints the result of the query in R, plots the result using <code>matplotlib</code> in Python, saves the plot as an image, and then prints the size of the image in BASH.</p><pre class="markdown"><code>---
title: "Multilingual R Markdown"
authors: "Carl Howe, RStudio"
date: "7/6/2020"
output: html_document
---

```{r setup, include=FALSE, echo = TRUE}
knitr::opts_chunk$set(echo = TRUE, collapse = TRUE)
library(tidyverse)
library(rlang)
library(reticulate)
library(RSQLite)
library(DBI)
library(gapminder)
reticulate::use_python("/usr/local/bin/python3", required = TRUE)
```

```{r gm_db_setup}
gapminder_sqllite_db <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(conn = gapminder_sqllite_db, "gapminder", gapminder)
country <- "Switzerland"
```

## use R variable `country` in SQL query

```{sql connection = gapminder_sqllite_db, output.var="gmdata"}
SELECT * FROM gapminder WHERE country = ?country
```

## Access results of SQL query in R

```{r}
head(gmdata, 5)
##       country continent year lifeExp     pop gdpPercap
## 1 Switzerland    Europe 1952   69.62 4815000  14734.23
## 2 Switzerland    Europe 1957   70.56 5126000  17909.49
## 3 Switzerland    Europe 1962   71.32 5666000  20431.09
## 4 Switzerland    Europe 1967   72.77 6063000  22966.14
## 5 Switzerland    Europe 1972   73.78 6401400  27195.11
```

## Plot in Python and save result as .png

```{python}
import matplotlib.pyplot as plt
plt.plot(r.gmdata.year, r.gmdata.lifeExp)
plt.grid(True)
plt.title("Switzerland Life Expectancy (years)")
plt.savefig("./SwitzerlandLifeExp.png")
```

## Show size of Python plot using BASH

```{bash}
ls -l SwitzerlandLifeExp.png
## -rw-r--r--  1 chowe  staff  26185 Jul  7 17:26 SwitzerlandLifeExp.png
```</code></pre><figure><img src="./SwitzerlandLifeExp.png" alt="Python Plot of Switzerland Life Expectancy" /><figcaption>Figure 2: Resulting Python Plot of Switzerland Life Expectancy</figcaption></figure><p>Throughout the month of July, we&rsquo;ll be devoting several articles to how 
RStudio supports interoperability and the benefits interoperability brings to data science teams. We encourage you to look for those subsequent posts this month. Meanwhile, to learn more about how interoperability improves the productivity of data science teams and some of the many platforms that RStudio supports, we recommend the following resources:</p><ul><li><a href="https://rstudio.com/resources/rstudioconf-2019/new-language-features-in-rstudio/" target="_blank" rel="noopener noreferrer"><strong>New language features in RStudio</strong></a>: This rstudio::conf 2019 video by developer Jonathan McPherson talks about how the RStudio IDE dramatically improves support for many languages frequently used alongside R in data science projects, including SQL, D3, Stan, and Python.</li><li><a href="https://rstudio.com/resources/webinars/r-python-a-data-science-love-story/" target="_blank" rel="noopener noreferrer"><strong>R &amp; Python: A Data Science Love Story</strong></a>: This webinar with RStudio&rsquo;s Lou Bajuk and Sean Lopp discusses how RStudio&rsquo;s toolchain supports the use of both R and Python, including support for Jupyter notebooks.</li><li><a href="https://rstudio.com/resources/rstudioconf-2019/ursa-labs-and-apache-arrow-in-2019/" target="_blank" rel="noopener noreferrer"><strong>Ursa Labs and Apache Arrow</strong></a>. 
In this rstudio::conf 2019 video, Wes McKinney talks about how Ursa Labs&rsquo; work with Apache Arrow is dramatically speeding up data sharing between R, Python, and other data science environments.</li></ul></description></item><item><title>Announcing Public Package Manager and v1.1.6</title><link>https://www.rstudio.com/blog/announcing-public-package-manager/</link><pubDate>Wed, 01 Jul 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-public-package-manager/</guid><description><p>Today we are excited to release version 1.1.6 of RStudio Package Manager and announce <a href="https://packagemanager.rstudio.com">https://packagemanager.rstudio.com</a>. This service builds on top of the work done by CRAN to offer the R community:</p><ul><li>Access to <strong>pre-compiled packages on Linux</strong> via <code>install.packages</code>, resulting in <a href="https://blog.rstudio.com/2019/11/07/package-manager-v1-1-no-interruptions/">significantly faster</a> package install times on Linux systems including cloud servers, CI/CD systems, and Docker containers.</li><li><strong>Historical checkpoints for CRAN</strong> enabling reproducible work, and even <a href="https://blog.rstudio.com/2019/01/30/time-travel-with-rstudio-package-manager-1-0-4/">time travel</a>, by freezing package dependencies with a one-line repository option.</li><li>Expanded <strong>Windows support for older versions of R</strong>, allowing you to access the latest versions of packages on older versions of R without compiling from source.</li></ul><p>We invite everyone to try this service, but please note we do not currently support package binaries for Mac OS, though we are considering adding support in the future. The easiest way to get started is by visiting the <a href="https://packagemanager.rstudio.com/client/#/repos/1/overview">Package Manager Setup Page</a>. 
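As a quick illustration, the one-line repository option amounts to setting the <code>repos</code> option in R. This is a minimal sketch; the URL paths below are assumptions, and the Setup Page generates the exact URL for your operating system and R version.

```r
# Point install.packages() at Public Package Manager.
# The URL paths are assumptions -- copy the real one from the Setup Page.

# Pre-built Linux binaries (example path for Ubuntu 20.04 "focal"):
options(repos = c(RSPM = "https://packagemanager.rstudio.com/all/__linux__/focal/latest"))

# Or freeze every dependency to a dated CRAN checkpoint for reproducibility:
options(repos = c(CRAN = "https://packagemanager.rstudio.com/cran/2020-07-01"))

# Subsequent installs now resolve against the repository configured above:
# install.packages("dplyr")
getOption("repos")
```

Because the second `options()` call replaces the first, pick whichever repository suits the project; teams often set this once in a project-level `.Rprofile`.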
You can also view <a href="https://support.rstudio.com/hc/en-us/articles/360046703913">frequently asked questions</a> or <a href="https://rstudio.com/products/package-manager">learn more about RStudio Package Manager</a>.</p><h3 id="relationship-to-cran">Relationship to CRAN</h3><p>This service builds off of the work done by CRAN and is a supplement to RStudio&rsquo;s <a href="https://cran.rstudio.com">popular CRAN mirror</a>. If CRAN were a brewery, Package Manager would be your local liquor store; Package Manager wouldn&rsquo;t be possible without CRAN, but we hope it makes it a little easier to install packages without having to go to the (literal) source each time.</p><p>For <strong>package authors</strong>, before a package is available on Package Manager it must be accepted, tested, and distributed on CRAN. Package Manager watches for those updates and then carefully builds updated or new packages on additional operating systems and R versions, finally adding them as a versioned checkpoint.</p><p>For <strong>R users</strong>, Package Manager acts like a regular CRAN mirror, ensuring all the code you know how to write automatically works. Note that Package Manager can lag behind CRAN by a few days, so if you need the latest packages you can add both Package Manager and CRAN to your repo option.</p><h3 id="community-integrations">Community Integrations</h3><p>In addition to using the Public Package Manager directly, R users can benefit from community integrations that access the service automatically:</p><ul><li><p>The <a href="https://github.com/rstudio/renv">renv</a> package helps R users manage package environments over time, and is able to use the service to provide faster install times and increase cross-platform project portability.</p></li><li><p>The <a href="https://github.com/r-lib/actions">actions</a> package provides GitHub Actions for package authors taking advantage of CI/CD workflows such as automated testing. 
The package uses Public Package Manager to speed up actions and eliminate redundant package compilation.</p></li><li><p>The popular <a href="https://www.rocker-project.org/">rocker</a> project gives R users a convenient way to work with Docker. This ecosystem increasingly takes advantage of Public Package Manager to provide faster package installs within a container as well as versioned installs for reproducible research.</p></li></ul><p>If your community project would benefit from Public Package Manager, please <a href="https://community.rstudio.com/c/r-admin/package-manager?tags=package-manager%2Cpublic-rspm">create a topic</a> on the RStudio Community.</p><h3 id="support-legal-terms-and-feedback">Support, Legal Terms, and Feedback</h3><p>Please consult the <a href="https://rstudio.com/about/rstudio-service-terms-of-use/">RStudio Terms of Use</a> prior to use. If you use R in an organization, we recommend <a href="https://rstudio.com/products/package-manager">evaluating RStudio Package Manager</a>, which includes all the benefits of the public Package Manager plus additional controls and features for professional data science teams: the ability to serve packages in offline environments, access to curated subsets of CRAN, and the ability to share private R packages.</p><p>RStudio does not provide direct support for this service, but you can get help through RStudio Community. The best way to ask a question is to <a href="https://community.rstudio.com/c/r-admin/package-manager?tags=package-manager%2Cpublic-rspm">create a topic</a> on RStudio Community after reviewing the <a href="https://support.rstudio.com/hc/en-us/articles/360046703913">FAQ</a>. 
This forum is also the best place to leave suggestions or feedback; we&rsquo;re eager to learn how we can better support your needs!</p><p>If you are interested in learning more about how Package Manager works, these open source repositories provide information on how we <a href="https://github.com/rstudio/r-builds">build and distribute R</a>, handle <a href="https://github.com/rstudio/r-system-requirements">system requirements</a>, and <a href="https://github.com/rstudio/r-docker">manage our build environment</a>. Package Manager relies on the tireless work of a team of engineers, thousands of compute hours, and TBs of storage and network IO.</p><p>The RStudio Package Manager <a href="https://docs.rstudio.com/rspm/admin">admin guide</a> also provides details on how Package Manager <a href="https://docs.rstudio.com/rspm/admin/repositories/#repo-syncing">interacts with CRAN</a>, how it <a href="https://docs.rstudio.com/rspm/admin/serving-binaries/">serves binary packages</a>, and the <a href="https://docs.rstudio.com/rspm/admin/getting-started/configuration/">additional options</a> available for on-premise use.</p><h3 id="new-updates-in-rstudio-package-manager-v116">New Updates in RStudio Package Manager v1.1.6</h3><p>In addition to <a href="https://packagemanager.rstudio.com">https://packagemanager.rstudio.com</a>, the 1.1.6 release offers the community and customers incremental updates including:</p><ul><li>Access to an <a href="https://packagemanager.rstudio.com/__api__/swagger/index.html">API to easily integrate Package Manager</a> with other systems and services</li><li>Support for R 4.0 and Ubuntu 20</li><li>More robust access and debugging when distributing packages from Git (applicable to on-premise customers only)</li></ul><p>Please review the full <a href="https://docs.rstudio.com/rspm/news/">release notes</a> and consider <a href="https://docs.rstudio.com/rpm/installation/">upgrading to the latest version</a>.</p></description></item><item><title>Future-Proofing 
Your Data Science Team</title><link>https://www.rstudio.com/blog/future-proofing-your-data-science-team/</link><pubDate>Tue, 30 Jun 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/future-proofing-your-data-science-team/</guid><description><p><sup>Photo by <a style="color: #000000;" href="https://unsplash.com/@sushioutlaw?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Brian McGowan</a> on <a style="color: #000000;" href="https://www.rstudio.com/s/photos/brian-mcgowan-tomorrowland?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></sup></p><p><em>This is a guest post from RStudio&rsquo;s partner, Mango Solutions</em></p><p>As RStudio’s Carl Howe recently discussed in his blog post on <a href="https://blog.rstudio.com/2020/05/12/equipping-wfh-data-science-teams/" target="_blank" rel="noopener noreferrer">equipping remote data science teams</a>, with the rapidly evolving COVID-19 crisis, companies have been increasingly forced to adopt work-from-home policies. Our technology and digital infrastructure have never been more important. Newly formed remote data science teams need to maintain productivity and continue to drive effective stakeholder communication and business value, and the only way to achieve this is through appropriate infrastructure and well-defined ways of working.</p><p>Whether your workforce works remotely or otherwise, centralizing platforms and enabling a cloud-based infrastructure for data science will lead to more opportunities for collaboration. 
It may even reduce IT spend in terms of equipment and maintenance overhead, thus future-proofing your data science infrastructure for the long run.</p><p>So when it comes to implementing a long-lived platform, here are some things to keep in mind:</p><h2 id="collaboration-through-a-centralized-data-and-analytics-platform">Collaboration Through a Centralized Data and Analytics Platform</h2><p>A centralized platform, such as RStudio Server Pro, means all your data scientists will have access to an appropriate platform and be working within the same environment. Working in this way means that a package written by one developer can work with a minimum of effort in all your developers’ environments, allowing simpler collaboration. There are other ways of achieving this with technologies such as <em>virtualenv</em> for Python, but this requires that each project set up its own environment, thereby increasing overhead. Centralizing this effort ensures that there is a well-understood way of creating projects, and each developer is working in the same way.</p><p>When using a centralized platform, some significant best practices are:</p><ul><li><strong>Version control</strong>. If you are writing code of any kind, even just scripts, it should be versioned religiously and have clear commit messages. This ensures that users can see each change made in scripts if anything breaks and can reproduce your results on their own.</li><li><strong>Packages</strong>. Whether you are working in Python or R, code should be packaged and treated like the valuable commodity it is. At Mango Solutions, a frequent challenge we address with our clients is to debug legacy code where a single ‘expert’ in a particular technology has written some piece of process which has become mission-critical and then left the business. There is then no way to support, develop, or otherwise change this process without the whole business grinding to a halt. 
Packaging code and workflows helps to document and enforce dependencies, which can make legacy code easier to manage. These packages can then be maintained by RStudio Package Manager or Artifactory.</li><li><strong>Reusability.</strong> By putting your code in packages and managing your environments with <em>renv</em>, you’re able to make your data science reusable. Creating this institutional knowledge means that you can avoid a Data Scientist becoming a single point of failure, and, when a data scientist does leave, you won’t be left with a model that nobody understands or can run. As Lou Bajuk explained in his blog post, <a href="https://blog.rstudio.com/2020/06/24/delivering-durable-value/" target="_blank" rel="noopener noreferrer">Does your Data Science Team Deliver Durable Value?</a>, durable code is a significant criterion for future-proofing your data science organization.</li></ul><h2 id="enabling-a-cloud-based-environment">Enabling a Cloud-based Environment</h2><p>In addition to this institutional knowledge benefit, running this data science platform on a cloud instance allows us to scale up the platform easily. With the ability to deploy to Kubernetes, scaling your deployment as your data science team grows is a huge benefit, while requiring you to pay only for what you need, when you need it.</p><p>This move to cloud comes with some tangential benefits which are often overlooked. Providing your data science team with a cloud-based environment has a number of benefits:</p><ol><li>The cost of hardware for your data science staff can be reduced to low-cost laptops rather than costly high-end on-premise hardware.</li><li>By providing a centralized development platform, you allow remote and mobile work, which is a key discriminator for hiring the best talent.</li><li>By enhancing flexibility, you are better positioned to remain productive in unforeseen circumstances.</li></ol><p>This last point cannot be overstated. 
At the beginning of the Covid-19 lockdown, a nationwide company whose data team was tied to desktops found themselves struggling to provide enough equipment to continue working through the lockdown. As a result, their data science team could not function and were unable to provide insights that would have been invaluable through these changing times. By contrast, here at Mango, our data science platform strategy allowed us to switch seamlessly to remote working, add value to our partners, and deliver insights when they were needed most.</p><p>Building agility into your basic ways of working means that you are well placed to adapt to unexpected events and adopt new platforms which are easier to update as technology moves on.</p><p>Once you have a centralized analytics platform and cloud-based infrastructure in place, how are you going to convince the business to use it? This is where the worlds of Business Intelligence and software dev-ops come to the rescue.</p><p>Analytics-backed dashboards using technologies like Shiny or Dash for Python with RStudio Connect means you can quickly and easily create front ends for business users to access results from your models. You can also easily expose APIs that allow your websites to be backed by scalable models, potentially creating new ways for customers to engage with your business.</p><p>A word of caution here: Doing this without considering how you are going to maintain and update what have now become software products can be dangerous. Models may go out of date, functionality can become irrelevant, and the business can become disillusioned. Fortunately, these are solved problems in the web world, and solutions such as containers and Kubernetes alongside CI/CD tools make this a simpler challenge. 
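As a concrete illustration of exposing a model through an API, here is a minimal sketch using the plumber package. The model, route, and port are hypothetical examples for illustration, not a description of any particular production setup.

```r
# plumber.R -- hypothetical sketch: serve predictions from an R model over
# HTTP (assumes the plumber package is installed)

model <- lm(mpg ~ hp, data = mtcars)  # stand-in for a real trained model

#* Predict fuel efficiency (mpg) from horsepower
#* @param hp numeric horsepower value
#* @get /predict
function(hp) {
  predict(model, newdata = data.frame(hp = as.numeric(hp)))
}

# Launch the API from the console:
#   plumber::plumb("plumber.R")$run(port = 8000)
# A request to /predict?hp=110 then returns the model's prediction.
```

Wrapping a service like this in a container and deploying it behind CI/CD is what keeps the resulting software product maintainable as models and requirements change.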
As a consultancy, we have tried and tested solutions that expose APIs from R or Python to back high-throughput websites across a number of sectors for our customers.</p><h2 id="collaborative-forms-of-communications">Collaborative Forms of Communications</h2><p>The last piece of the puzzle for your data science team to be productive has nothing to do with data science but is instead about communication. Your data science team may create insights from your data, but they are like a rudderless ship without input from the business. Understanding business problems and what has value to the wider enterprise requires good communication. This means that your data scientists have to partner with people who understand the sales and marketing strategy. And if you are to embrace the ethos of flexibility as protection against the future, then good video-conferencing and other technological communications are essential.</p><hr style="width:100%;border:1px solid rgba(0,0,0,.1);margin:50px 0"><h3 id="about-dean-wood-and-mango-solutions">About Dean Wood and Mango Solutions</h3><p>Dean Wood is a Data Science Leader at <a href="https://www.mango-solutions.com" target="_blank" rel="noopener noreferrer">Mango Solutions</a>. Mango Solutions provides complex analysis solutions, consulting, training, and application development for some of the largest companies in the world. 
Founded and based in the UK in 2002, the company offers a number of bespoke services for data analysis including validation of open-source software for regulated industries.</p></description></item><item><title>Does your Data Science Team Deliver Durable Value?</title><link>https://www.rstudio.com/blog/delivering-durable-value/</link><pubDate>Wed, 24 Jun 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/delivering-durable-value/</guid><description><style type="text/css">table {border-top: 1px solid rgba(117,170,219,.6);border-bottom: 1px solid rgba(117,170,219,.6);margin: 25px 0 15px 0;padding: 40px 35px 35px 35px;width: 100%;}tr:nth-child(even) {background: #ffffff;}tr {vertical-align: top;}td {text-align: left;padding: 2px 5px;}th {font-size: 24px;font-weight: 400;padding-bottom: 15px;text-align: left;}td li {font-size: 15px;}.quote-spacing {padding:0 80px;}.quote-size {font-size: 140%;line-height: 34px;}@media only screen and (max-width: 600px) {.quote-spacing {padding:0;}.quote-size {font-size: 120%;line-height: 28px;}table {padding: 40px 0px 35px 0px;}}</style><p><sup>Photo by <a href="https://unsplash.com/@zoltantasi?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Zoltan Tasi</a> on <a href="https://unsplash.com/s/photos/boulder-rock?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Unsplash</a></sup></p><p>In a recent series of blog posts, we introduced the idea of <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">Serious Data Science</a> to help tackle the challenges of effectively implementing data science in an organization. 
We then focused on the importance of <a href="https://blog.rstudio.com/2020/06/02/is-your-data-science-credible-enough/" target="_blank" rel="noopener noreferrer">delivering insights that are credible with your stakeholders</a> and <a href="https://blog.rstudio.com/2020/06/09/is-your-data-science-team-agile/" target="_blank" rel="noopener noreferrer">approaching data science projects in an agile way</a>. However, once you’ve created valuable insights, the challenge is then how to continue to deliver those insights in a repeatable and sustainable way. Otherwise, your initial impact may be short-lived.</p><h2 id="obstacles-to-providing-ongoing-value-with-data-science">Obstacles to providing ongoing value with data science</h2><p>Once a valuable insight or tool has reached a decision maker, organizations struggle with maintaining and growing the value of these data science investments over time. Far too often, they need to start from scratch when solving a new problem or are forced to painfully reimplement old analyses when they are unexpectedly needed. Some of the key areas data science teams struggle with include:</p><ul><li><strong>Lack of reuse:</strong> Especially when your data science insights are locked up in a spreadsheet or a point-and-click tool, it can be nearly impossible to reuse that analysis in new ways. This forces data scientists to start from scratch when a new problem comes along and makes it difficult to build an analytical toolbox of valuable intellectual property over time.</li><li><strong>Lack of reproducibility:</strong> When you share your analysis with someone else, they may find it difficult to reproduce it if they don’t have identical versions of tools and libraries. As these tools evolve, you may find it impossible to recreate your analyses. 
Both of these situations are frustrating, leading to unnecessary work and anxiety as you attempt to figure out what element of the environment has changed.</li><li><strong>Stale insights and repetitive work:</strong> While a stakeholder may value your analysis today, they&rsquo;ll likely want to run it again with updated data in the future. If your analysis is static, it quickly becomes stale which forces the decision maker to either make a decision on old data or to ask you for an update. These out-of-date analyses lead to frustration on both sides, as the stakeholder waits for the update, and the data scientist is forced to repeat work instead of working on new analyses.</li></ul><div style="overflow-x:auto;"><table><tr><th>Obstacles</th><th>Solutions</th></tr><tr><td>Lack of reuse</td><td>Build your analyses with code, not clicks</td></tr><tr><td>Lack of reproducibility</td><td>Manage data science environments for repeatability</td></tr><tr><td>Stale insights and repetitive work</td><td>Deploy tools to keep insights up to date</td></tr><tr><td>Unsustainable data science platforms</td><td>Embrace platforms that support open source software</td></tr></table><div style="font-size:85%;padding-bottom: 20px;"><i>Figure 1: Common obstacles to delivering durable value with your data science and approaches to mitigate them.</i></div><h2 id="a-durable-approach-to-data-science">A Durable Approach to Data Science</h2><p>To make the benefits of your data science insights durable over the long term, we recommend applying <em>Serious Data Science</em> principles as outlined in Figure 1. We suggest that your data science teams:</p><ul><li><strong>Build your analyses with code, not clicks.</strong> Data science teams should use a code-oriented approach because code can be developed, applied, and adapted to solve similar problems in the future. 
This reusable and extensible code then becomes core intellectual property for your organization which will make it easier to solve new problems in the future and increase the aggregate value of your data science work.</li><li><strong>Manage data science environments for repeatability.</strong> Organizations need ways to reproduce reports and dashboards as projects, tools, and dependencies change. Otherwise, your team may spend far too much time attempting to recreate old results, or worse, it may give different answers to the same questions at different points in time, thereby undermining your team’s credibility. Use packages such as <a href="https://rstudio.github.io/renv/articles/renv.html" target="_blank" rel="noopener noreferrer">renv</a> for individual projects and use products such as <a href="https://rstudio.com/products/package-manager/" target="_blank" rel="noopener noreferrer">RStudio Package Manager</a> to improve reproducibility across a larger organization.</li><li><strong>Deploy tools to keep insights up to date.</strong> No one wants to make a decision based on old data. Publish your insights on web-based tools such as <a href="https://rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a> to keep your business stakeholders up to date with on-demand access and scheduled updates. 
Deploying insights this way also frees the data scientist to spend their time solving new problems rather than solving the same problem again and again.</li></ul><p>Sharla Gelfand recently spoke at rstudio::conf 2020 about the benefits of reproducible reports for the College of Nurses of Ontario:</p><div style="padding: 20px 0 35px 0;"><script src="https://fast.wistia.com/embed/medias/cj68m8on14.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_cj68m8on14 videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/cj68m8on14/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></div><h2 id="building-on-a-sustainable-foundation">Building on a Sustainable Foundation</h2><p>To this point, our serious data science approach has largely been independent of the underlying data science platform. However, your choice of data science platform can itself pose a risk to the durability of the work you do. Your data platform can become unsustainable over time due to:</p><ul><li><strong>High license costs:</strong> Expensive software and variable budgets often force teams to restrict platform access to a select few data scientists. 
Worse, those teams may have to hold off on tackling new data science projects or deploying to more stakeholders until Finance approves money for more seats.</li><li><strong>Dwindling communities:</strong> If the platform or language decreases in its popularity with developers, it may become difficult to find new data scientists who are familiar with it.</li><li><strong>Vendor acquisitions or shifts in business models:</strong> If the platform maker is acquired by a larger company or shifts its business model, it may abandon or scale back investment in its previous product. Alternatively, sometimes vendors move from an innovation to a value extraction model, where locked-in customers are forced to pay higher license fees over time.</li></ul><p>Regardless of the underlying reason, an unsustainable platform can drive up costs and potentially even force an organization to start from scratch with a new platform. To reduce these threats, we recommend embracing platforms that support open source software. Doing so improves the sustainability of your data science because these platforms are:</p><ul><li><strong>Cost effective:</strong> Open source software can deliver tremendous value at minimal cost, which mitigates the risk of losing your data science platform due to future budget cuts. It also makes it much easier to expand to more users as your data science team grows.</li><li><strong>Widely supported:</strong> The R and Python open source communities are large and growing, so you can be confident these tools, and the expertise to use them, will be available for many years to come. 
These communities are further bolstered by <a href="https://rstudio.com/about/what-makes-rstudio-different/" target="_blank" rel="noopener noreferrer">RStudio’s mission</a>, which is dedicated to sustainable investment in free and open-source software for data science.</li><li><strong>Vendor independent:</strong> RStudio’s founder JJ Allaire wrote the following in a <a href="https://blog.rstudio.com/2020/01/29/rstudio-pbc/" target="_blank" rel="noopener noreferrer">recent blog post</a>:</li></ul><div style="background-color: #f8f8f8;padding:50px 30px 30px 30px;margin:50px 0;"><div style="text-align:center;padding-bottom:10px;"><img src="logo-lockup.svg" width="400px"></div><div class="quote-spacing"><p class="quote-size"><i>"Users should be wary of the underlying motivations and goals of software companies, especially ones that provide the essential tools required to carry out their work."</i></p><p style="text-align: right;">JJ Allaire, CEO, RStudio<br><a href="https://rstudio.com/pbc-keynote" target="_blank">rstudio.com/pbc-keynote</a></p></div></div><p>With this caution in mind, consider building your data science investments on a platform with an open source core. 
Should the vendor change its business or licensing model, everything you need to do your core data science work will still be freely available, and you can freely choose whether you want to pay the vendor’s price.</p><h2 id="learn-more-about-serious-data-science">Learn more about Serious Data Science</h2><p>For more information, check out our previous posts <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">introducing the concepts of Serious Data Science</a>, <a href="https://blog.rstudio.com/2020/06/02/is-your-data-science-credible-enough/" target="_blank" rel="noopener noreferrer">drilling into the importance of credibility</a>, and exploring <a href="https://blog.rstudio.com/2020/06/09/is-your-data-science-team-agile/" target="_blank" rel="noopener noreferrer">how to apply agile principles</a> to your data science work.</p><p>If you’d like to learn more, we also recommend:</p><ul><li>In this upcoming webinar, <a href="https://pages.rstudio.net/BeyondDashboardFatigueWebinar.html" target="_blank" rel="noopener noreferrer">Beyond Dashboard Fatigue</a>, we&rsquo;ll discuss how to repeatably deliver up-to-date analyses to your stakeholders using proactive email notifications through the blastula and gt packages, and how RStudio pro products can be used to scale out those solutions for enterprise applications.</li><li>In this <a href="https://rstudio.com/about/customer-stories/astra_zeneca/" target="_blank" rel="noopener noreferrer">customer spotlight</a>, Paul Metcalf, Head, Machine Learning and AI, Oncology R&amp;D at AstraZeneca, describes how his team “created a robust toolchain for routine tasks and enabled reproducible research” with R, RStudio, and Shiny.</li><li>To learn more about how RStudio Connect makes it simple to deliver repeatable, up-to-date data products to your stakeholders, check out the <a href="https://rstudio.com/products/connect/" target="_blank" rel="noopener 
noreferrer">RStudio Connect product page</a>.</li><li>RStudio’s Sean Lopp explores the importance of Reproducible Environments in this <a href="https://rviews.rstudio.com/2019/04/22/reproducible-environments/" target="_blank" rel="noopener noreferrer">RViews Blog post</a>.</li><li>Garrett Grolemund <a href="https://rstudio.com/resources/webinars/reproducibility-in-production/" target="_blank" rel="noopener noreferrer">presented a webinar</a> on the role that computational documents like RMarkdown play in supporting reproducibility in production.</li><li><a href="https://rstudio.com/about/what-makes-rstudio-different/" target="_blank" rel="noopener noreferrer">What Makes RStudio Different</a> explains that RStudio’s mission is to sustainably create free and open-source software for data science and allow anyone with access to a computer to participate freely in a data-centric global economy.</li></ul></description></item><item><title>Is Your Data Science Team Agile?</title><link>https://www.rstudio.com/blog/is-your-data-science-team-agile/</link><pubDate>Tue, 09 Jun 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/is-your-data-science-team-agile/</guid><description><style type="text/css">table {border-top: 1px solid rgba(117,170,219,.6);border-bottom: 1px solid rgba(117,170,219,.6);margin: 45px 0 45px 0;padding: 40px 0 20px 0;}tr:nth-child(even) {background: #ffffff;}tr {vertical-align: top;}th {font-size: 24px;font-weight: 400;}td li {font-size: 15px;}</style><p><sup>Photo by <a href="https://unsplash.com/@vespir?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">James Forbes</a> on <a href="https://unsplash.com/s/photos/winding-path-through-woods?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText" target="_blank" rel="noopener noreferrer">Unsplash</a></sup></p><p>As we recently wrote in our first post on <a 
href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">Serious Data Science</a>, there are numerous challenges to effectively implementing data science in an organization. Many industry surveys warn that <strong>most analytics and data science projects fail</strong>, and <strong>most companies don’t achieve the revenue and profit growth that they hoped</strong> their data science investments would deliver. In this post, we’ll examine some underlying causes of why this happens.</p><p>In our previous post in this series, we discussed how to tackle <a href="https://blog.rstudio.com/2020/06/02/is-your-data-science-credible-enough/" target="_blank" rel="noopener noreferrer">building credibility for your data science team</a>. Here, we will focus on the challenge of quickly delivering real value from your data science team using a platform that supports an agile approach.</p><h2 id="data-science-development-is-often-a-long-and-winding-path-to-value">Data Science Development is Often a Long and Winding Path to Value</h2><p>In talking with many different data science teams, we’ve heard that it takes far too long for a team to ramp up, perform analyses, and then share those analyses in an impactful way with their organization. 
This makes it challenging for data science leaders to deliver value to the rest of their organization, which in turn makes it difficult to justify buying new tools, hiring new team members, and investing in their training.</p><p>Several common obstacles make it difficult for a data science team to quickly ramp up and be productive:</p><ul><li><strong>Training on new tools:</strong> When an organization invests in a new data science platform or brings in new team members unfamiliar with that tool, teams often require extensive training before they can start reliably delivering analyses.</li><li><strong>Monolithic applications:</strong> Too many data science development projects try to solve all problems at once by building or buying a single grand solution. Frequently, these massive new tools take months to implement, require major changes to processes, and demand significant configuration and professional services before they can be used effectively. And, if the new platforms don’t integrate well with existing tools, data scientists are forced to use multiple environments to get their work done, which impedes their productivity and often leads to those analytic investments being under-utilized.</li><li><strong>Slow exploratory development:</strong> Good data science requires the ability to rapidly explore many approaches to solving a problem and to revise proposed solutions with colleagues and stakeholders. However, many firms adopt point-and-click model development tools in their quest for ease of use, not realizing that those interactive systems lock data scientists into hours of manual labor to create a new version when stakeholders request changes.</li><li><strong>Difficult to share and access results:</strong> Finally, if your stakeholders cannot find your data products, they won’t use them. 
Too many insights get delivered using ad hoc emails or isolated web links because existing tools make it difficult to deploy data science insights without extensive help from IT. This leads to stakeholders frequently struggling to find results, using analyses based on old data, or waiting too long to get an updated version.</li></ul><h2 id="finding-the-shortest-path-to-value">Finding the Shortest Path to Value</h2><p>Delivering data science value in an organization requires that your team be agile. While “Agile” usually refers to a very specific development methodology, here we use “agile” to simply describe a process that has four principles:</p><ol><li><strong>Use what you have.</strong> To quickly ramp up and deliver value, take advantage of the existing knowledge of your team and your previous investments.</li><li><strong>Collaborate regularly.</strong> The users of the product continuously meet with and influence developers.</li><li><strong>Iterate on deliverables rapidly.</strong> Developers incorporate feedback into the product in short development cycles until it delivers what the users want.</li><li><strong>Deliver results frequently.</strong> The process routinely delivers new products for users to critique and improve.</li></ol><p>Applying these principles to the data science development process allows your team to deliver value more quickly and efficiently. 
They help your team overcome the obstacles noted previously, and they demonstrate commitment to a Serious Data Science approach (see Figure 1).</p><div style="overflow-x:auto;"><table><tr><th>Obstacles</th><th>Solutions</th></tr><tr><td>Training required on new tools</td><td>Use tools many data scientists already know</td></tr><tr><td>Monolithic applications take too long to set up and don’t use existing analytic investments</td><td>Focus on small prototypes to prove value, using tools that integrate with your existing frameworks</td></tr><tr><td>Slow exploratory development</td><td>Rapid, code-based development</td></tr><tr><td>Difficult to access and share results</td><td>Deliver live results directly to stakeholders</td></tr></table></div><p>Figure 1: Use agile principles in a Serious Data Science approach to address common development obstacles.</p><p>To make your team more agile in your data science development process, we recommend that you:</p><ul><li><strong>Apply your existing knowledge.</strong> Many data scientists already know how to use R and Python because of the huge open source communities around these languages. This means your team will not require extensive training on a new platform. Using R and Python as the foundation of your data science platform also helps you quickly recruit and retain the best Data Science talent by letting your data scientists work in the environments they already know and love.</li><li><strong>Focus on small prototypes that prove value.</strong> Instead of trying to build or buy a single data science platform, focus your development on small easy-to-test modules that can then be combined with code or scripts. As an example, instead of trying to reinvent the entire customer buying experience, break up that concept into independent prototypes that improve recommendations, streamline purchases, and improve pricing for profitability. 
And choose a platform that integrates well with the other analytic frameworks you’ve already built, so that you can exploit those investments rather than building everything from scratch. (We will be talking about this principle of Interoperability in greater detail in the future.)</li><li><strong>Rapidly evolve your solution using code-based development.</strong> In our <a href="https://blog.rstudio.com/2020/06/02/is-your-data-science-credible-enough/" target="_blank" rel="noopener noreferrer">prior blog post about credibility</a>, we recommended using a code-based data science foundation such as R or Python that allows easy auditing and peer review. Code also allows you to rapidly evolve your solution to explore new approaches and accommodate stakeholder feedback. One of the features of R and Python that users love most is how easy it is to explore different analytic approaches to solving any given problem. For example, this <a href="https://rstudio.com/about/customer-stories/liberty-mutual/" target="_blank" rel="noopener noreferrer">recent customer spotlight with Liberty Mutual</a> highlights the power and flexibility of R in their data science environment.</li><li><strong>Deliver live results directly to stakeholders.</strong> Stakeholder feedback is critical to agile development, but it won’t help if they don’t have your latest results. You can eliminate that concern if you publish data science results on a web-based platform such as <a href="https://rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">RStudio Connect</a>. You can even automate this publication process using continuous integration techniques such as GitHub Actions. You can also notify stakeholders with automated emails from packages like <em>blastula</em>, which we will be covering in more detail in an upcoming webinar. 
Speeding up this delivery and feedback mechanism ensures stakeholders can give input and see the value your data science team is delivering in real time.</li></ul><h3 id="astellas-aymen-waqar-discusses-the-analytics-communications-gap">Astellas’ Aymen Waqar discusses the analytics communications gap:</h3><div style="padding: 15px 40px 35px 40px;text-align: center;"><script src="https://fast.wistia.com/embed/medias/iwmemji2xh.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><span class="wistia_embed wistia_async_iwmemji2xh popover=true popoverAnimateThumbnail=true videoFoam=true" style="display:inline-block;height:100%;position:relative;width:100%">&nbsp;</span></div></div></div><h2 id="learn-more-about-serious-data-science">Learn more about Serious Data Science</h2><p>For more information, see our previous posts <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">introducing the concepts of Serious Data Science</a>, and <a href="https://blog.rstudio.com/2020/06/02/is-your-data-science-credible-enough/" target="_blank" rel="noopener noreferrer">drilling into the importance of credibility</a>. 
In the coming weeks, we will round out this series with a post on how to make sure the value your data science team provides is durable.</p><p>If you’d like to learn more, we also recommend:</p><ul><li><a href="https://rstudio.com/about/customer-stories/brown-forman/" target="_blank" rel="noopener noreferrer">Paul Ditterline describes a Serious Data Science approach adopted by Brown-Forman</a>, in which building on familiar open-source languages allowed their data scientists to ramp up quickly, and “turn into application developers and data engineers without learning any new languages or computer science skills.”</li><li><a href="https://rstudio.com/solutions/r-and-python/" target="_blank" rel="noopener noreferrer">R and Python: A Love Story</a> shows how RStudio enables collaboration between the R and Python users on your data science team and helps all of them easily share their data science insights with your stakeholders.</li><li>Visit the <a href="https://rstudio.com/products/team/" target="_blank" rel="noopener noreferrer">RStudio Team product page</a> to learn how the RStudio Team platform for data science allows you to capitalize on your existing analytic investments and rapidly deliver value to your organization.</li><li>For more information on using RStudio Connect and the blastula package to send custom emails to your stakeholders with plots, tables, and results inline, see <a href="https://blog.rstudio.com/2020/01/22/rstudio-connect-1-8-0/" target="_blank" rel="noopener noreferrer">this recent blog post</a> on the Connect 1.8 release.</li><li>See <a href="https://rstudio.com/about/what-makes-rstudio-different/" target="_blank" rel="noopener noreferrer">What Makes RStudio Different</a> to learn about how RStudio helps support open source data science.</li></ul></description></item><item><title>Is Your Data Science Credible Enough?</title><link>https://www.rstudio.com/blog/is-your-data-science-credible-enough/</link><pubDate>Tue, 02 Jun 2020 00:00:00 
+0000</pubDate><guid>https://www.rstudio.com/blog/is-your-data-science-credible-enough/</guid><description><style type="text/css">table {border-top: 1px solid rgba(117,170,219,.6);border-bottom: 1px solid rgba(117,170,219,.6);margin: 45px 0 45px 0;padding: 40px 0 20px 0;}tr:nth-child(even) {background: #ffffff;}tr {vertical-align: top;}th {font-size: 24px;font-weight: 400;}td li {font-size: 15px;}</style><h2 id="does-your-data-science-lack-credibility">Does Your Data Science Lack Credibility?</h2><p>In <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">a recent post</a>, we defined three key attributes of a concept we call Serious Data Science: Credibility, Agility and Durability. In this post, we’ll drill into the challenge of delivering credible insights to your stakeholders, and how to address that challenge.</p><p>Ultimately, organizations use data science to discover valuable insights and then apply those insights intelligently. Such applications might include making a better decision, improving a process, or otherwise changing how things are usually done. However, to make this happen, the organization must do at least two things:</p><ul><li>Communicate these insights to the right decision-maker, stakeholder, or system (we’ll talk more about that in our next Serious Data Science post on being Agile).</li><li>Convince decision makers to trust the insight and accept its implications. If decision makers lack this trust, then they will likely ignore the recommendation, and fall back on “the way we’ve always done things.”</li></ul><p>Typically, a host of unasked questions underlie a decision-maker’s seeming resistance to data-driven insights. They might not act on the conclusions of a data science team because they:</p><ul><li><strong>Don’t know the skills of the data scientist:</strong> Does the person who created this insight know what they are doing? 
Do they understand business risks as well as they understand their models?</li><li><strong>Don’t trust data science tools:</strong> Did the data scientist depend too much on software in creating this result? Did the data science team just use black box tools that auto-magically produced an answer without an understanding of the business?</li><li><strong>Don’t have confidence in the development process:</strong> Did the data scientist consider all reasonable approaches to the problem? Was there any way for someone else to review what was done, and know how things changed over time?</li><li><strong>Don’t understand what the results mean:</strong> What is this insight actually telling me? How does it apply to what I do? What factors does it reflect? Is it really better than what we have done before? Could I get fired for acting on this result?</li></ul><p>All these questions and doubts contribute to stakeholder hesitation, especially when they feel that they, not the data scientist, will ultimately be held responsible for the result. Fortunately, there are ways to overcome these obstacles.</p><h2 id="how-can-you-deliver-credible-insights">How Can You Deliver Credible Insights?</h2><p>To deliver insights that your decision makers and other stakeholders trust and actually use, we recommend adopting a Serious Data Science approach. To do this, your team must have the training and tools to find insights that are relevant and valuable. And, your team must communicate these insights to other stakeholders in your organization in a way that builds trust and understanding.</p><p>Here are the key elements which will help your team meet these challenges:</p><ul><li><strong>Widely-used open source software:</strong> The best way to make sure your team has the training to use a data science tool properly is to <strong>use the tools they already know</strong>. Millions of data scientists around the world learn data science using open source languages, such as R and Python. 
While some may argue which language is best (see <a href="https://blog.rstudio.com/2019/12/17/r-vs-python-what-s-the-best-for-language-for-data-science/" target="_blank" rel="noopener noreferrer">this blog post</a> for our take on that question), both have tremendous strengths and are trusted platforms.</li><li><strong>Comprehensive data science capabilities:</strong> To be confident your team will find the best approach to any particular question, they need a wide range of analytic approaches readily available to apply and compare. Powered and validated by huge, ever-expanding communities and package libraries, the R and Python ecosystems ensure your team will always have the broadest range of tools for their analyses.</li><li><strong>Process transparency via code:</strong> Code allows others to inspect how a problem was first solved, and how that solution matured over time. Unlike point-and-click solutions where the history of how the analysis evolved is hidden beneath a pretty (inter)face or a spreadsheet where the logic is strewn across countless different cells, code explicitly describes which steps led to the results. Further, code can be peer-reviewed and audited by third parties for additional assurance of correct behavior.</li><li><strong>Understanding through visualizations:</strong> Just as a picture is worth a thousand words, a great visualization can explain a thousand lines of code. Visualizations help stakeholders understand complex data science insights and build confidence in the results. 
Interactive tools such as <a href="https://shiny.rstudio.com/" target="_blank" rel="noopener noreferrer">Shiny</a> allow data scientists to create visualizations that improve understanding of their work while spurring engagement from stakeholders.</li></ul><p>Heather Nolis, Machine Learning Engineer at T-Mobile, and Jacqueline Nolis, Principal Data Scientist at Nolis, LLC, recently spoke at rstudio::conf 2020 about how using Shiny to share their machine learning models drove engagement and built trust with their business stakeholders.</p><div align="center" style="padding: 35px 0 35px 0;"><script src="https://fast.wistia.com/embed/medias/58qjn34mxy.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_58qjn34mxy videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/58qjn34mxy/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></div><h2 id="serious-data-science-credible-agile-and-durable">Serious Data Science: Credible, Agile, and Durable</h2><p>These elements of Serious Data Science—trusted tools, comprehensive capabilities, flexibility, and transparency—will all help your team deliver insights that are more likely to be accepted by decision makers and actually have an impact. 
Next week, we will focus on Agility, and how your team can not only develop apps quickly but also regularly share those results with stakeholders to create a consensus, so you can make sure you are <a href="https://blog.rstudio.com/2020/04/22/getting-to-the-right-question/" target="_blank" rel="noopener noreferrer">Getting to the Right Question</a>.</p><p><strong>Serious Data Science is:</strong></p><div style="overflow-x:auto;"><table><tr><th> Credible</th><th> Agile </th><th> Durable </th></tr><tr><td><ul><li>Uses widely deployed and trusted tools</li><li>Includes comprehensive data science capabilities</li><li>Offers flexibility through the use of code</li><li>Provides transparency through visualizations and code</li></ul></td><td><ul><li>Employs existing knowledge and analytic investments</li><li>Allows rapid development and iteration</li><li>Scales well for enterprise and production use</li><li>Empowers your business stakeholders</li></ul></td><td><ul><li>Provides reusable, reproducible code and results</li><li>Delivers relevant, up-to-date insights</li><li>Supports and is supported by a vital open source community</li><li>Avoids vendor lock-in</li></ul></td></tr></table></div><h4 id="figure-1-being-credible-is-one-of-the-crucial-elements-of-a-serious-data-science-platform">Figure 1: Being credible is one of the crucial elements of a Serious Data Science platform.</h4><h2 id="learn-more-about-serious-data-science">Learn More about Serious Data Science</h2><p>If you’d like to learn more about Serious Data Science, we recommend the following in addition to our previous posts in this series:</p><ul><li>In <a href="https://rstudio.com/about/customer-stories/redfin/" target="_blank" rel="noopener noreferrer">a recent customer spotlight</a>, Jared Goulart, Director - Operations Analytics at Redfin, described how a serious data science approach helped his team engage with stakeholders, allowing them to quickly evaluate different scenarios and plan their budgets for 
the next year.</li><li><a href="https://rstudio.com/solutions/r-and-python/" target="_blank" rel="noopener noreferrer">R &amp; Python: A Love Story</a> shows how RStudio helps make the full breadth and power of R and Python available to data science teams and helps them make an impact on their organizations.</li><li>The <a href="https://shiny.rstudio.com/gallery/" target="_blank" rel="noopener noreferrer">Shiny Gallery</a> highlights some of the amazing interactive visualizations that Shiny developers have created with R to convey insights and help their stakeholders make better, more informed decisions.</li></ul></description></item><item><title>RStudio 1.3 Released</title><link>https://www.rstudio.com/blog/rstudio-1-3-release/</link><pubDate>Wed, 27 May 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-3-release/</guid><description><p>Today we&rsquo;re excited to announce the general release of RStudio 1.3. This release features many major improvements to the IDE, including:</p><ul><li>Dramatically improved <strong>accessibility</strong> for sight-impaired users, which also upgrades keyboard navigation, contrast ratios, and visibility for everyone.</li><li>A real-time <strong>spell-checking engine</strong>, with suggestions, customizable dictionaries, and a built-in whitelist for common data science terms.</li><li>Extensible, in-IDE <strong>tutorials</strong> powered by the <a href="https://rstudio.github.io/learnr/"><code>learnr</code> package</a>.</li><li><strong>Settings and preferences</strong> are now stored in plain text files you can back up or manage with other tools; they can also be applied globally to all users on an RStudio Server.</li><li>Improved compatibility with <strong>R 4.0</strong> and <strong>iPad OS 13.1</strong>.</li><li>Many improvements to <strong>RStudio Server security</strong>, including idle timeouts and hardening against common types of attacks.</li><li>Dozens of small productivity improvements, including 
<strong>autosave</strong>, <strong>global replace</strong>, <strong>customizable file templates</strong>, <strong>Shiny jobs</strong>, and more.</li></ul><p>If you&rsquo;ve purchased the Professional version of RStudio, this release also has some new capabilities for you:</p><ul><li><strong>RStudio Desktop Pro</strong> can now function as a client for RStudio Server Pro; run your R session on your server with the convenience of native desktop windows and menus.</li><li>A new <strong>user manager</strong> on the Admin Dashboard makes it easy to manage licensed users on RStudio Server Pro.</li><li>Many small improvements to the <strong>Kubernetes</strong> and <strong>Slurm</strong> Job Launcher plugins.</li></ul><p>See our <a href="https://www.rstudio.com/categories/rstudio-ide">blog series on RStudio 1.3</a> for articles describing a selection of the new capabilities in detail, and the <a href="https://rstudio.com/products/rstudio/release-notes/">RStudio 1.3 Release Notes</a> for a comprehensive list of features and bugfixes in this release.</p><p>A special thanks to <a href="https://www.massey.ac.nz/massey/expertise/profile.cfm?stref=416430">Dr Jonathan Godfrey</a> and <a href="https://www.jooyoungseo.com/">JooYoung Seo</a> for their insight into the new accessibility features, and to the hundreds of community members who helped us shape this release with their ideas, bugfixes, and contributions. We couldn&rsquo;t do this without you! 
Please <a href="https://rstudio.com/products/rstudio/download/">download the new release</a> and let us know what you think on our <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>.</p></description></item><item><title>The Role of the Data Scientist</title><link>https://www.rstudio.com/blog/role-of-the-data-scientist/</link><pubDate>Wed, 27 May 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/role-of-the-data-scientist/</guid><description><h2 id="data-scientists-face-an-existential-crisis">Data Scientists Face an Existential Crisis</h2><p>The term data scientist has always been a bit controversial. William Cleveland coined the term in 2001 to advocate the practical use of statistics in other technical fields and believed that use warranted a new name. Nowadays, professionals sporting a data science title typically hold a Ph.D., possess some detailed domain knowledge, and are either computer science majors who learned statistics or statisticians who learned to program. And while most of us have seen <a href="http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram" target="_blank" rel="noopener noreferrer">Drew Conway&rsquo;s diagram showing that combination of skills</a>, I think Joel Grus&rsquo; addition of evil intent allows us to better recognize other interesting combinations (see Figure 1 below).</p><img align="center" style="padding: 35px;" src="venn-diagrams.jpg"><p>However, new technologies and a difficult economic environment caused by COVID-19 restrictions have raised new questions about the data scientist role, including:</p><ul><li><strong>Will their jobs be replaced by automated tools?</strong> A slew of vendors, from DataRobot to Oracle, are offering no-code or low-code analytic tools that promise to replace expensive data scientists with point-and-click web pages. 
While most buyers understand it&rsquo;s not that simple, those vendor promotions create fear and doubt in executive minds.</li><li><strong>Will executives resort to intuition?</strong> Most seasoned leaders were comfortable running their organizations by gut feel before data scientists arrived. When confronted with a high-risk economic environment, some may feel that &ldquo;trust me&rdquo; is a safer and easier-to-defend strategy than trying to explain complex models.</li><li><strong>Is their career simply a hot trend that will go away?</strong> Data scientist careers have been on a roll for the last ten years, with <a href="https://www.glassdoor.com/Salaries/data-scientist-salary-SRCH_KO0,14.htm" target="_blank" rel="noopener noreferrer">Glassdoor reporting mean starting salaries exceeding US$100,000</a>. With the economy on a downtrend, this hot trend may cool.</li></ul><h2 id="the-reality-organizations-need-data-scientists-more-than-ever">The Reality: Organizations Need Data Scientists More Than Ever</h2><p>The COVID-19 crisis has created much fear, uncertainty, and doubt—commonly abbreviated as FUD—in all of our lives. However, based on what we see from organizations using our packages and products, RStudio believes that this FUD is unwarranted because:</p><ul><li><strong>Identifying and solving hard problems can&rsquo;t be automated.</strong> Automated tools work well for well-understood problems, such as extraction and visualization of well-structured data. However, to compete in today&rsquo;s environment, organizations must attack the truly hard problems that we don&rsquo;t understand yet. Such problems exist in nearly every realm of human activity, ranging from mundane topics such as natural language understanding for customer service up to the Grand Challenges in Global Health. 
These problems can&rsquo;t be encapsulated into automated systems until a data scientist first solves them.</li><li><strong>Data-driven decision-making has proven to create better results.</strong> While intuitive management may have led to success in the past, data suggests that such an approach may not be competitive in today&rsquo;s markets. A <a href="https://www.mckinsey.com/business-functions/marketing-and-sales/our-insights/five-facts-how-customer-analytics-boosts-corporate-performance" target="_blank" rel="noopener noreferrer">study done by management consulting firm McKinsey &amp; Company</a> reports that data-driven companies were 23 times more likely to outperform competitors in acquiring new users and 19 times more likely to achieve above-average profitability than their non-data-driven competitors. With such case studies being taught in schools today, most leaders recognize that they need data scientists to be competitive.</li><li><strong>Demand for data science tools has never been higher.</strong> During these difficult times, many organizations are doubling down on open source tooling because they know it is the best path to reproducible, durable analytics. Downloads of open source software and demand for courses that teach using these tools have only increased in the last quarter.</li></ul><h2 id="serious-data-science-helps-data-scientists-demonstrate-their-value">Serious Data Science Helps Data Scientists Demonstrate Their Value</h2><p>From RStudio&rsquo;s point of view, the most convincing arguments we see for a bright data science future come from the work being done by the R and Python data science communities. We hear such stories regularly, and we&rsquo;ve been collecting examples of this work as part of <a href="https://rstudio.com/about/customer-stories/" target="_blank" rel="noopener noreferrer">our Customer Stories program</a>. 
We&rsquo;ll add more of those stories in the months to come as we continue to talk to the tens of thousands of data scientists in our community who use our tools every day. It&rsquo;s their work that inspires us to build and distribute the software we create.</p><p>These stories have helped us envision what we believe to be the new role of the successful data scientist. Specifically, we believe that the role of a data scientist is to deliver what we&rsquo;ve called <a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer">Serious Data Science</a>. Just as was implied in Conway&rsquo;s Venn diagram in Figure 1a, Serious Data Science draws on some of the best practices found in software development, statistics, and domain expertise to deliver results that are:</p><ul><li><strong>Credible.</strong> It&rsquo;s no longer enough for data scientists to create models that only they understand. Today&rsquo;s best data scientists create data products that are not only statistically correct, but that can be visualized, explained, and withstand scrutiny by a large community of peers. Credibility within a large organization demands communication and collaboration skills beyond computation.</li><li><strong>Agile.</strong> Serious data science practitioners expect that they&rsquo;ll have to perform many iterations on their analysis to eventually address real-world business problems. Agility demands that they not only develop apps quickly but that they also regularly share those results with stakeholders to create consensus. 
(see our prior post <a href="https://blog.rstudio.com/2020/04/22/getting-to-the-right-question/" target="_blank" rel="noopener noreferrer">Getting to the Right Question</a> for more details of this challenge).</li><li><strong>Durable.</strong> Today&rsquo;s data scientists understand that cutting-edge analysis has no value if it can&rsquo;t still deliver value after they have moved on. Durable results require that they use processes that can survive changes in computing environments and infrastructure into the future.</li></ul><p>Look for a deeper discussion of the ways data scientists can enhance their role using Serious Data Science during the month of June.</p><h2 id="learn-more-about-the-data-scientists-role-and-serious-data-science">Learn More about the Data Scientist&rsquo;s Role and Serious Data Science</h2><p>For a real-world view of how data scientists work to solve hard, complex, and valuable problems, watch Pim Bongaerts from the California Academy of Sciences speak about using data science to help save coral reefs.</p><div style="padding: 35px 0 35px 0;"><script src="https://fast.wistia.com/embed/medias/zhpb795rre.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_zhpb795rre videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/zhpb795rre/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></div><p>Eduardo Ariño de la Rubia, Data Science Manager at Facebook, <a 
href="https://rstudio.com/resources/rstudioconf-2020/value-in-data-science-beyond-models-in-production/" target="_blank" rel="noopener noreferrer">spoke at rstudio::conf 2020 on the role of a data scientist</a>, with an emphasis on how they bring value beyond putting models in production.</p><p>We also recommend our prior blog posts in this series:</p><ul><li><a href="https://blog.rstudio.com/2020/05/19/driving-real-lasting-value-with-serious-data-science/" target="_blank" rel="noopener noreferrer"><strong>Driving Real, Lasting Value with Serious Data Science</strong></a> defines the components and need for serious data science.</li><li><a href="https://blog.rstudio.com/2020/05/12/equipping-wfh-data-science-teams/" target="_blank" rel="noopener noreferrer"><strong>Equipping Your Data Science Team to Work from Home</strong></a> outlines server infrastructure that can make work-from-home data scientists more effective.</li><li><a href="https://blog.rstudio.com/2020/05/05/wrangling-unruly-data/" target="_blank" rel="noopener noreferrer"><strong>Wrangling Unruly Data: The Bane of Every Data Science Team</strong></a> explains why data wrangling is an integral (and unavoidably lengthy) part of data science.</li><li><a href="https://blog.rstudio.com/2020/04/28/avoid-irrelevancy-and-fire-drills-in-data-science-teams/" target="_blank" rel="noopener noreferrer"><strong>Avoid Irrelevancy and Fire Drills in Data Science Teams</strong></a> explains how data science teams can avoid being pigeon-holed into roles of ivory tower researchers or always-on-call data firefighters.</li><li><a href="https://blog.rstudio.com/2020/04/22/getting-to-the-right-question/" target="_blank" rel="noopener noreferrer"><strong>Getting to the Right Question</strong></a> discusses the communications gap between business people and data science and how to bridge that divide.</li></ul></description></item><item><title>Driving Real, Lasting Value with Serious Data 
Science</title><link>https://www.rstudio.com/blog/2020-05-19-driving-real-lasting-value-with-serious-data-science/</link><pubDate>Tue, 19 May 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2020-05-19-driving-real-lasting-value-with-serious-data-science/</guid><description><p>Data science is now a hot area of investment for many organizations. Countless blogs, articles, and analyst reports emphasize that effective data science is critical for competitive advantage, and many business leaders believe that data science is vital for an organization to survive, much less thrive, over the next several years.</p><p>However, many data science leaders grapple with an existential crisis for their teams. On the one hand, many vendors and analyst reports emphasize the rise of Citizen Data Scientists, empowered by tools that promise to augment and automate the hard work of data science to automagically answer vital questions, no Data Scientist required. On the other hand, machine learning and deep learning methods in the hands of software engineers, fueled by lots of computational power, answer more and more questions (as long as the problem is well-defined, and there is sufficient data available). 
Squeezed in between these trends, what is the role of a data scientist?</p><p>Even worse, nearly as many blogs and analyst reports emphasize the challenges of effectively implementing data science in an organization, and emphatically state that <strong>most analytics and data science projects fail</strong>, and <strong>most companies don’t achieve the revenue and profit growth that they hoped</strong> their data science investments would deliver.</p><p>We will dive into the role of a data scientist in more detail in the coming weeks, but here we will focus on this question: Why is getting real, lasting value from data science investments so difficult?</p><h2 id="many-data-science-projects-lack-credibility-and-impact-over-time">Many data science projects lack credibility and impact over time</h2><p>In talking to many different organizations implementing data science projects, we have seen many challenges that prevent data science investments from delivering the value they should. These typically fall into three categories:</p><ul><li><p><strong>Lack of credibility:</strong> Data science leaders grapple with whether their team has the necessary training and the right tools to discover relevant and valuable insights in their data. Once the team has found something interesting, how can others in the organization understand and trust those insights enough to actually change their behavior, and make decisions based on them? This problem is compounded if the approach is a difficult-to-explain, black box model.</p></li><li><p><strong>Slow path to value:</strong> Seemingly simple questions like &ldquo;Which customers will be our most profitable next quarter?&rdquo; often turn into month-long research projects as data scientists scour the firm for data and struggle to wrangle it into shape (a topic we discussed in a recent blog post, <a href="https://blog.rstudio.com/2020/05/05/wrangling-unruly-data/" target="_blank" rel="noopener noreferrer">Wrangling Unruly Data</a>). 
Then once the data scientists start to develop an analysis, they find iterating and refining their results with stakeholders slow and unwieldy (something we covered in another blog post, <a href="https://blog.rstudio.com/2020/04/22/getting-to-the-right-question/" target="_blank" rel="noopener noreferrer"> Getting to the Right Question</a>). These slow response times frustrate business sponsors and often stymie putting data insights into action. Worse, they encourage decision makers to go with their gut intuition instead of data.</p></li><li><p><strong>Ephemeral benefits:</strong> Once a valuable insight or tool has reached a decision maker, organizations struggle with maintaining and growing the value of these data science investments over time. They find it difficult to implement repeatable and reproducible processes as their systems and data science tools evolve, which often forces them to start from scratch when solving a new problem, or to reimplement old analyses when needed. Furthermore, data science practice at an organization often becomes dependent on a single software vendor, and that vendor may try to extract more of the value the customer receives as software license revenue.</p></li></ul><p>Andrew Mangano, Data Intelligence Lead at Salesforce, spoke at rstudio::conf 2020 about the importance of delivering useful insights to your stakeholders.</p><div style="padding: 35px 0 35px 0;"><script src="https://fast.wistia.com/embed/medias/67q4k9196d.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_67q4k9196d videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" 
style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/67q4k9196d/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></div><h2 id="real-world-problems-need-serious-data-science">Real-world problems need serious data science</h2><p>So what’s the answer? And how do you cut through all the hype and confusion?</p><p>The reality is that the world is full of hard, vaguely defined problems that are valuable to solve. Commodity approaches (whether via augmented analytics for citizen data scientists, or standard machine learning approaches for software engineers) yield commodity answers. <strong>Real-world business problems require smart, agile data science teams</strong> empowered with the flexibility and breadth of open source languages like R and Python. We know this because <strong>tens of thousands of you use our software every day to do amazing things.</strong></p><p>To deliver real, lasting value, organizations need to set aside the hype and build on a strong foundation. We recommend adopting a strategy we call <em>Serious Data Science</em>. 
As shown in Figure 1, Serious Data Science is an approach to data science designed to deliver insights that are:</p><ul><li><strong>Credible:</strong> The first step is to ensure that your team has the training and tools to find insights that are relevant and valuable, and that your team can communicate these insights to other stakeholders in your organization in a way that builds trust and understanding.</li><li><strong>Agile:</strong> Next, the platform you use must enable data scientists to quickly develop and iterate those valuable insights, and get them to your decision makers, where they can have an impact.</li><li><strong>Durable:</strong> Finally, to deliver lasting value, the platform must also make it easy to reuse and reproduce your team’s data science work, to deliver up-to-date insights, and do so in a sustainable way for the long term.</li></ul><h4 id="serious-data-science-is">Serious Data Science is&hellip;.</h4><div style="overflow-x:auto;"><table><tr><th> Credible</th><th> Agile </th><th> Durable </th></tr><tr><td><ul><li>Uses widely deployed and trusted tools</li><li>Includes comprehensive data science capabilities</li><li>Offers flexibility through the use of code</li><li>Provides transparency through visualizations and code</li></ul></td><td><ul><li>Employs existing knowledge and analytic investments</li><li>Allows rapid development and iteration</li><li>Scales well for enterprise and production use</li><li>Empowers your business stakeholders</li></ul></td><td><ul><li>Provides reusable, reproducible code and results</li><li>Delivers relevant, up-to-date insights</li><li>Supports and is supported by a vital open source community</li><li>Avoids vendor lock-in</li></ul></td></tr></table></div><h4 id="figure-1-crucial-elements-of-a-serious-data-science-platform">Figure 1: Crucial elements of a Serious Data Science platform.</h4><h2 id="why-you-should-adopt-serious-data-science">Why you should adopt Serious Data Science</h2><p>We&rsquo;ll be writing 
in detail about these components of Serious Data Science in the weeks to come. But before we get to that, we must address a topic near and dear to every data science leader: the role of the data scientist within the organization. Our post next Tuesday will address how that role is changing in today&rsquo;s organizations, and why they will need the Serious Data Science framework to continue demonstrating their value in the months and years to come.</p><h2 id="learn-more-about-serious-data-science">Learn more about Serious Data Science</h2><p>If you’d like to learn more about Serious Data Science, we recommend:</p><ul><li><a href="https://rstudio.com/about/customer-stories/brown-forman/" target="_blank" rel="noopener noreferrer"><strong>Paul Ditterline describes a Serious Data Science approach adopted by Brown-Forman</strong></a>, reducing dependency on tools like Excel, to solve more, and more complex, data science problems more efficiently.</li><li><a href="https://blog.rstudio.com/2020/04/28/avoid-irrelevancy-and-fire-drills-in-data-science-teams/" target="_blank" rel="noopener noreferrer"><strong>Avoiding Irrelevancy and Fire Drills in Data Science Teams</strong></a> is another view of the challenges facing today’s data science teams, and how to tackle those challenges.</li><li><a href="https://rstudio.com/about/what-makes-rstudio-different/" target="_blank" rel="noopener noreferrer"><strong>What Makes RStudio Different</strong></a> highlights RStudio’s mission to create free and open-source software for data science, in order to allow anyone with access to a computer to participate freely in a data-centric global economy.</li><li><a href="https://rstudio.com/solutions/r-and-python/" target="_blank" rel="noopener noreferrer"><strong>R &amp; Python: A Love Story</strong></a> shows how RStudio helps make the full breadth and power of R and Python available to data science teams, and helps them make an impact on their organizations.</li><li><a 
href="https://rstudio.com/resources/rstudioconf-2020/" target="_blank" rel="noopener noreferrer"><strong>Recorded talks from rstudio::conf 2020</strong></a> highlight the amazing, impactful, creative work that the open source data science community is doing.</li></ul><style type="text/css">table {border-top: 1px solid rgba(117,170,219,.6);border-bottom: 1px solid rgba(117,170,219,.6);margin: 45px 0 45px 0;padding: 40px 0 20px 0;}tr:nth-child(even) {background: #ffffff;}tr {vertical-align: top;}th {font-size: 24px;font-weight: 400;}td li {font-size: 15px;}</style></description></item><item><title>Equipping Your Data Science Team to Work from Home</title><link>https://www.rstudio.com/blog/equipping-wfh-data-science-teams/</link><pubDate>Tue, 12 May 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/equipping-wfh-data-science-teams/</guid><description><p><sup>Photo by Djurdjica Boskovic on Unsplash</sup></p><p>If your data science team experienced an abrupt transition to working at home, it may be a good time to rethink their development tools. In this post, I&rsquo;ll talk about why laptop-centric data science gets in the way of strong data science teams and why you should consider deploying development and publishing servers.</p><h2 id="working-from-home-has-affected-both-people-and-data">Working from Home Has Affected Both People and Data</h2><p>Like tigers and koalas, we data scientists are fairly solitary creatures. We typically eschew meetings, embrace focus time, and block out distractions to focus on our work. And on those rare times when we need help, our typical reaction is to walk over to a colleague&rsquo;s desk and brainstorm an answer.</p><p>Enter COVID-19 and the new work-from-home environment. At first glance, it would appear nothing really has to change for the typical data science workflow; team members armed with laptops appear well-equipped to continue their data science work. 
However, many data science teams are now struggling with:</p><ul><li><strong>Collaborative deprivation.</strong> While we all feel some sense of isolation during this lockdown, working from home deprives data scientists of their most effective collaboration techniques. Without the ability to pop over to someone&rsquo;s desk to ask a question or get help debugging a piece of code, many data scientists find themselves not making the progress they are used to.</li><li><strong>Locked-in development licenses</strong>. Many teams have been rather chagrined to find that their commercial data analysis software licenses are only valid in their enterprise environment, forcing them into a painful dance with VPN software when working from home.</li><li><strong>Firewalled data</strong>. Many organizations prohibit access to sensitive corporate data from outside the firewall to limit security risks and preserve user privacy. That&rsquo;s no big deal when the data science team is working from an office, but it becomes a serious issue when working from home.</li><li><strong>Inconsistent laptop environments</strong>. Data scientists often download their own versions of libraries and development tools. However, that means that code developed on one data scientist&rsquo;s laptop won&rsquo;t necessarily work for another data scientist who has different versions of packages loaded. Working from home without regular contact with other team members allows these inconsistencies to fester and grow, raising roadblocks to reproducible results. 
Worse, data scientists risk losing code and data living in those unique laptop environments should their hard drive fail or their laptop fall victim to a household accident.</li></ul><h2 id="serious-data-science-requires-collaborative-tools">Serious Data Science Requires Collaborative Tools</h2><p>To be able to do their work collaboratively and repeatably, data science teams need infrastructure that encourages collaboration and is supported by the organization and IT. That typically means shared servers for:</p><ul><li><strong>Code Development</strong>. Having a shared development server allows everyone to share a consistent programming environment. Servers can also be configured to provide more computational and memory resources for the team to share, including back end CPU and GPU clusters. For teams working on machine learning models, for example, training a new model on a laptop might take days, while training that same model using a Kubernetes cluster might only take hours or minutes. And since the development server is typically hosted within the organization firewall, it can be configured to have full access to datasets that would not be allowed externally.</li><li><strong>Application publishing</strong>. Data scientists who used to share their insights using a conference room and a projector now need ways to publish results that don&rsquo;t require real-time attendance. While most companies have some types of internal web servers, those rarely support R and Python run-time environments. Data science teams need a publishing platform that is easy to use, letting them share work without opening an IT ticket for every change.</li><li><strong>Package control</strong>. Solitary data scientists tend to do their own package management, frequently installing the latest and greatest packages that they find. However, using the latest and greatest software can often backfire when other team members try to reproduce or build onto their work. 
Storing approved packages on a centralized server and defining that as the standard data science environment makes your data science work more reproducible and long-lasting.</li></ul><h2 id="which-servers-should-you-choose">Which Servers Should You Choose?</h2><p>Which server-based tools you choose obviously depend on factors such as team size, workload, and company software policies. RStudio offers both open source and commercial alternatives, allowing organizations to choose whichever satisfies their needs best. Table 1 summarizes both approaches.</p><p>In addition to providing enhanced security, auditing, and usage monitoring, Pro solutions add other benefits that are less quantifiable. Specifically:</p><ul><li><strong>RStudio Server Pro adds back end computational muscle.</strong> As we noted above, many data science workloads benefit significantly from being run on server platforms with beefy processors and capacious memory. RStudio Server Pro offers a feature called <a href="https://solutions.rstudio.com/documents/Scaling-RStudio-Server-Pro-with-Kubernetes.pdf" target="_blank" rel="noopener noreferrer">Launcher that can offload R and Python job execution onto a back end Kubernetes or SLURM cluster</a>. For groups doing serious machine learning, this one feature can speed up the team&rsquo;s productivity significantly.</li><li><strong>RStudio Connect creates a production environment that your team controls</strong>. Shiny apps, Jupyter Notebooks, and R Markdown documents are great tools for communicating with people outside your data science team, but they need a place to live. RStudio Connect provides that place to live and gives your team a secure, centralized portal for data products, automated emails, and Plumber and Dash APIs that let non-data scientists make use of their insights.</li><li><strong>RStudio Package Manager ensures your team&rsquo;s work is repeatable</strong>. 
With 15,000 R packages on CRAN constantly being updated, an R application that runs with today&rsquo;s versions of those packages won&rsquo;t necessarily work with tomorrow&rsquo;s. Package Manager makes it easier to have stable access to packages, so your whole team can be using the same playbook. It can also restrict package versions to only those that have been certified by a central authority, thereby ensuring approved results.</li></ul><table><tr><th> Open Source Solution</th><th> Value </th><th> Pro Solution </th><th> Added Value in Pro </th></tr><tr><td><a href="https://rstudio.com/products/rstudio/#rstudio-server">RStudio Server</a></td><td><ul><li>Broadens access to development tools</li><li>Boosts compute and memory resources available</li><li>Ensures common development environment</li></ul></td><td><a href="https://rstudio.com/products/rstudio/#rstudio-server">RStudio Server Pro*</a></td><td><ul><li>Adds collaborative editing and projects</li><li>Supports multiple R versions and sessions</li><li>Provides Launcher support for back end execution clusters</li><li>Supports bilingual data science teams with Jupyter</li></ul></td></tr><tr><td><a href="https://rstudio.com/products/shiny/shiny-server/">Shiny Server</a>,<br />Homegrown Web Servers</td><td><ul><li>Eases publishing of Shiny applications</li><li>Allows broad access to data science results</li></ul></td><td><a href="https://rstudio.com/products/connect/">RStudio Connect</a></td><td><ul><li>Consolidates many types of content on one server</li><li>Allows scheduled production and emails</li><li>Hosts R- and Python-based APIs</li></ul></td></tr><tr><td><a href="https://cran.r-project.org/web/packages/miniCRAN/index.html">miniCRAN Mirror</a></td><td><ul><li>Maintains a local copy of packages from approved sources</li></ul></td><td><a href="https://rstudio.com/products/package-manager/">RStudio Package Manager</a></td><td><ul><li>Speeds installs using binaries</li><li>Allows use of multiple package 
versions and checkpoints for roll back</li><li>Provides package use insights for IT</li></ul></td></tr></table><h4 id="table-1-open-source-and-professional-server-options-to-support-data-scientists">Table 1: Open Source and Professional Server Options To Support Data Scientists.</h4><p>*RStudio Server Pro, RStudio Connect, and RStudio Package Manager are also available bundled as RStudio Team.</p><h2 id="dont-be-afraid-to-mix-and-match-servers-as-your-needs-dictate">Don&rsquo;t Be Afraid To Mix and Match Servers As Your Needs Dictate</h2><p>The collaboration processes data science teams have used for years have already been disrupted by COVID-19 and work from home mandates. The question for data science leaders is what they can do to provide new ways of working that are as good or better than what went before. Centralizing your data science development and production processes is a way to do that.</p><p>Emily Riederer, an Analytics Manager at Capital One, <a href="https://vimeo.com/theranchstudios/review/398622411/383d5791b1?sort=lastUserActionEventDate&amp;direction=desc" target="_blank" rel="noopener noreferrer">summarized some of the benefits she’s seen from this centralized approach</a> at rstudio::conf 2020.</p><div style="padding: 35px 0 35px 0;"><script src="https://fast.wistia.com/embed/medias/cac6g1r9gr.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_cac6g1r9gr videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/cac6g1r9gr/swatch" 
style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></div><p>With that said, using servers to make your work-from-home data science team more productive doesn&rsquo;t have to be a Manhattan Project all-or-nothing proposition. If your data scientists are comfortable developing code on their laptops, you may want to begin by installing a publishing platform like RStudio Connect, and put off development and package management servers for another day. Similarly, some teams start by installing RStudio Server for centralized development and defer publishing and package management. But for teams doing serious data science, they have to start somewhere.</p><p>We’ll be posting additional commentary and case studies on equipping data science teams to work from home in the coming weeks. In the meantime, we recommend <a href="https://appsilon.com/rstudio-connect-as-a-solution-for-remote-data-science-teams/" target="_blank" rel="noopener noreferrer">a recent post about how Appsilon has used Connect</a> to create a remote work-friendly culture.</p><h2 id="for-more-information">For More Information</h2><p>If you’d like to learn more about how to better equip your data science team to work from home, we recommend:</p><ul><li><strong><a href="https://rstudio.com/resources/rstudioconf-2019/rstudio-job-launcher-changing-where-we-run-r-stuff/" target="_blank" rel="noopener noreferrer">Changing Where We Run Stuff</a></strong>. This 18-minute video of an RStudio::conf 2019 talk by Darby Hadley describes how Launcher improves workload scaling and isolation.</li><li><strong><a href="https://solutions.rstudio.com/launcher/kubernetes/#want-to-learn-more-about-rstudio-server-pro-and-kubernetes" target="_blank" rel="noopener noreferrer">RStudio Server Pro with Kubernetes Overview</a></strong>. 
This document provides architectural block diagrams and links to frequently asked questions about RStudio Pro and Launcher.</li><li><strong><a href="https://rstudio.com/products/connect/" target="_blank" rel="noopener noreferrer">The RStudio Connect Product Page</a></strong>. This overview of RStudio Connect provides links to several videos going into details of how it can help your data science team communicate better throughout the organization.</li><li><strong><a href="https://rstudio.com/resources/rstudioconf-2020/building-a-new-data-science-pipeline-for-the-ft-with-rstudio-connect/" target="_blank" rel="noopener noreferrer">Building a new data science pipeline for the FT with RStudio Connect</a>.</strong> George Kastrinakis from the Financial Times presented this 16-minute talk at RStudio::conf 2020 about how RStudio Connect significantly sped up its data science work and made it more agile.</li><li><strong><a href="https://rstudio.com/resources/webinars/reproducibility-in-production/" target="_blank" rel="noopener noreferrer">Reproducibility in Production</a>.</strong> This webinar by Garrett Grolemund describes how computational documents (such as RMarkdown and Jupyter Notebooks) help deliver reproducible results for your business stakeholders.</li><li><strong><a href="https://rstudio.com/resources/webinars/introduction-to-the-rstudio-package-manager/" target="_blank" rel="noopener noreferrer">Introduction to RStudio Package Manager</a>.</strong> This recorded webinar provides a detailed description of what RStudio Package Manager is and how it aids reproducibility in R applications.</li></ul></description></item><item><title>sparklyr 1.2: Foreach, Spark 3.0 and Databricks Connect</title><link>https://www.rstudio.com/blog/sparklyr-1-2/</link><pubDate>Wed, 06 May 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/sparklyr-1-2/</guid><description><p>A new version of <a href="https://sparklyr.ai"><code>sparklyr</code></a> is now available on CRAN! 
In this <code>sparklyr 1.2</code> release, the following new improvements come into the spotlight:</p><ul><li>A <code>registerDoSpark()</code> method to create a <a href="#foreach"><code>foreach</code></a> parallel backend powered by Spark that enables hundreds of existing R packages to run in Spark.</li><li>Support for <a href="#databricks-connect">Databricks Connect</a>, allowing <code>sparklyr</code> to connect to remote Databricks clusters.</li><li>Improved support for Spark <a href="#structures">structures</a> when collecting and querying their nested attributes with <code>dplyr</code>.</li></ul><p>A number of inter-op issues observed with <code>sparklyr</code> and the Spark 3.0 preview were also addressed recently, in the hope that by the time Spark 3.0 officially graces us with its presence, <code>sparklyr</code> will be fully ready to work with it. Most notably, key features such as <code>spark_submit()</code>, <code>sdf_bind_rows()</code>, and standalone connections are now finally working with the Spark 3.0 preview.</p><p>To install <code>sparklyr</code> 1.2 from CRAN, run:</p><pre><code class="language-r" data-lang="r">install.packages(&quot;sparklyr&quot;)</code></pre><p>The full list of changes is available in the <code>sparklyr</code> <a href="https://github.com/sparklyr/sparklyr/blob/master/NEWS.md">NEWS</a> file.</p><h2 id="foreach">Foreach</h2><p>The <a href="https://CRAN.R-project.org/package=foreach"><code>foreach</code></a> package provides the <code>%dopar%</code> operator to iterate over elements in a collection in parallel.
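</p><p>If you have not used <code>foreach</code> before, here is a minimal sketch of the pattern (our own illustration, not from the release notes), using the conventional <code>doParallel</code> backend that <code>registerDoSpark()</code> slots in as an alternative to:</p><pre><code class="language-r" data-lang="r">library(foreach)
library(doParallel)

# Register a conventional multi-core backend for comparison
registerDoParallel(cores = 2)

# %dopar% runs each iteration on whichever backend is registered
foreach(i = 1:3, .combine = 'c') %dopar% {
  sqrt(i)
}</code></pre><pre><code>[1] 1.000000 1.414214 1.732051</code></pre><p>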
Using <code>sparklyr</code> 1.2, you can now register Spark as a backend using <code>registerDoSpark()</code> and then easily iterate over R objects using Spark:</p><pre><code class="language-r" data-lang="r">library(sparklyr)
library(foreach)

sc &lt;- spark_connect(master = &quot;local&quot;, version = &quot;2.4&quot;)
registerDoSpark(sc)

foreach(i = 1:3, .combine = 'c') %dopar% {
  sqrt(i)
}</code></pre><pre><code>[1] 1.000000 1.414214 1.732051</code></pre><p>Since many R packages are based on <code>foreach</code> to perform parallel computation, we can now make use of all those great packages in Spark as well!</p><p>For instance, we can use <a href="https://tidymodels.github.io/parsnip/"><code>parsnip</code></a> and the <a href="https://tidymodels.github.io/tune/"><code>tune</code></a> package with data from <a href="https://CRAN.R-project.org/package=mlbench"><code>mlbench</code></a> to perform hyperparameter tuning in Spark with ease:</p><pre><code class="language-r" data-lang="r">library(tune)
library(parsnip)
library(mlbench)

data(Ionosphere)

svm_rbf(cost = tune(), rbf_sigma = tune()) %&gt;%
  set_mode(&quot;classification&quot;) %&gt;%
  set_engine(&quot;kernlab&quot;) %&gt;%
  tune_grid(
    Class ~ .,
    resamples = rsample::bootstraps(dplyr::select(Ionosphere, -V2), times = 30),
    control = control_grid(verbose = FALSE)
  )</code></pre><pre><code># Bootstrap sampling
# A tibble: 30 x 4
   splits            id          .metrics          .notes
 * &lt;list&gt;            &lt;chr&gt;       &lt;list&gt;            &lt;list&gt;
 1 &lt;split [351/124]&gt; Bootstrap01 &lt;tibble [10 × 5]&gt; &lt;tibble [0 × 1]&gt;
 2 &lt;split [351/126]&gt; Bootstrap02 &lt;tibble [10 × 5]&gt; &lt;tibble [0 × 1]&gt;
 3 &lt;split [351/125]&gt; Bootstrap03 &lt;tibble [10 × 5]&gt; &lt;tibble [0 × 1]&gt;
 4 &lt;split [351/135]&gt; Bootstrap04 &lt;tibble [10 × 5]&gt; &lt;tibble [0 × 1]&gt;
 5 &lt;split [351/127]&gt; Bootstrap05 &lt;tibble [10 × 5]&gt; &lt;tibble [0 × 1]&gt;
 6 &lt;split [351/131]&gt; Bootstrap06 &lt;tibble [10 × 5]&gt; &lt;tibble [0 × 1]&gt;
 7 &lt;split [351/141]&gt; Bootstrap07 &lt;tibble [10 × 5]&gt; &lt;tibble [0 × 1]&gt;
 8 &lt;split [351/123]&gt; Bootstrap08 &lt;tibble [10 × 5]&gt; &lt;tibble [0 × 1]&gt;
 9 &lt;split [351/118]&gt; Bootstrap09 &lt;tibble [10 × 5]&gt; &lt;tibble [0 × 1]&gt;
10 &lt;split [351/136]&gt; Bootstrap10 &lt;tibble [10 × 5]&gt; &lt;tibble [0 × 1]&gt;
# … with 20 more rows</code></pre><p>The Spark connection was already registered, so the code ran in Spark without any additional changes. We can verify that this was the case by navigating to the Spark web interface:</p><img src="https://www.rstudio.com/blog/images/2020-05-06-sparklyr-1-2-spark-backend-foreach-package.png" alt="Spark running foreach package using sparklyr"/><h2 id="databricks-connect">Databricks Connect</h2><p><a href="https://docs.databricks.com/dev-tools/databricks-connect.html">Databricks Connect</a> allows you to connect your favorite IDE (like <a href="https://rstudio.com/products/rstudio/download/">RStudio</a>!) to a Spark cluster on <a href="https://databricks.com/">Databricks</a>.</p><p>You will first have to install the <code>databricks-connect</code> Python package as described in our <a href="https://github.com/sparklyr/sparklyr#connecting-through-databricks-connect">README</a> and start a Databricks cluster, but once that&rsquo;s ready, connecting to the remote cluster is as easy as running:</p><pre><code class="language-r" data-lang="r">sc &lt;- spark_connect(
  method = &quot;databricks&quot;,
  spark_home = system2(&quot;databricks-connect&quot;, &quot;get-spark-home&quot;, stdout = TRUE)
)</code></pre><img src="https://www.rstudio.com/blog/images/2020-05-06-sparklyr-1-2-spark-databricks-connect-rstudio.png" alt="Databricks Connect with RStudio Desktop"/><p>That&rsquo;s about it: you are now remotely connected to a Databricks cluster from your local R session.</p><h2 id="structures">Structures</h2><p>If you previously used <code>collect()</code> to deserialize structurally complex Spark data frames into their equivalents in R, you
likely have noticed that Spark SQL struct columns were only mapped into JSON strings in R, which was non-ideal. You might also have run into a much-dreaded <code>java.lang.IllegalArgumentException: Invalid type list</code> error when using <code>dplyr</code> to query nested attributes from any struct column of a Spark data frame in <code>sparklyr</code>.</p><p>Unfortunately, oftentimes in real-world Spark use cases, data describing entities composed of sub-entities (e.g., a product catalog of all hardware components of some computers) needs to be denormalized / shaped in an object-oriented manner in the form of Spark SQL structs to allow efficient read queries. When <code>sparklyr</code> had the limitations mentioned above, users often had to invent their own workarounds when querying Spark struct columns, which explains why there was widespread demand for <code>sparklyr</code> to have better support for such use cases.</p><p>The good news is that with <code>sparklyr</code> 1.2, those limitations no longer exist when running with Spark 2.4 or above.</p><p>As a concrete example, consider the following catalog of computers:</p><pre><code class="language-r" data-lang="r">library(dplyr)

computers &lt;- tibble::tibble(
  id = seq(1, 2),
  attributes = list(
    list(
      processor = list(freq = 2.4, num_cores = 256),
      price = 100
    ),
    list(
      processor = list(freq = 1.6, num_cores = 512),
      price = 133
    )
  )
)

computers &lt;- copy_to(sc, computers, overwrite = TRUE)</code></pre><p>A typical <code>dplyr</code> use case involving <code>computers</code> would be the following:</p><pre><code class="language-r" data-lang="r">high_freq_computers &lt;- computers %&gt;%
  filter(attributes.processor.freq &gt;= 2) %&gt;%
  collect()</code></pre><p>As previously mentioned, before <code>sparklyr</code> 1.2, such a query would fail with <code>Error: java.lang.IllegalArgumentException: Invalid type list</code>.</p><p>Whereas with <code>sparklyr</code> 1.2, the expected result is returned in the following form:</p><pre><code># A tibble: 1 x 2
     id attributes
  &lt;int&gt; &lt;list&gt;
1     1 &lt;named list [2]&gt;</code></pre><p>where <code>high_freq_computers$attributes</code> is what we would expect:</p><pre><code>[[1]]
[[1]]$price
[1] 100

[[1]]$processor
[[1]]$processor$freq
[1] 2.4

[[1]]$processor$num_cores
[1] 256</code></pre><h2 id="and-more">And More!</h2><p>Last but not least, we heard about a number of pain points <code>sparklyr</code> users have run into, and have addressed many of them in this release as well. For example:</p><ul><li>The Date type in R is now correctly serialized into the Spark SQL date type by <code>copy_to()</code></li><li><code>&lt;spark dataframe&gt; %&gt;% print(n = 20)</code> now actually prints 20 rows as expected instead of 10</li><li><code>spark_connect(master = &quot;local&quot;)</code> will emit a more informative error message if it&rsquo;s failing because the loopback interface is not up</li></ul><p>&hellip; to name just a few. We want to thank the open source community for their continuous feedback on <code>sparklyr</code>, and are looking forward to incorporating more of that feedback to make <code>sparklyr</code> even better in the future.</p><p>Finally, in chronological order, we wish to thank the following individuals for contributing to <code>sparklyr</code> 1.2: <a href="https://github.com/zero323">zero323</a>, <a href="https://github.com/Loquats">Andy Zhang</a>, <a href="https://github.com/yl790">Yitao Li</a>, <a href="https://github.com/javierluraschi">Javier Luraschi</a>, <a href="https://github.com/falaki">Hossein Falaki</a>, <a href="https://github.com/lu-wang-dl">Lu Wang</a>, <a href="https://github.com/samuelmacedo83">Samuel Macedo</a> and <a href="https://github.com/jozefhajnala">Jozef Hajnala</a>.
Great job everyone!</p><p>If you need to catch up on <code>sparklyr</code>, please visit <a href="https://sparklyr.ai">sparklyr.ai</a>, <a href="https://spark.rstudio.com">spark.rstudio.com</a>, or some of the previous release posts: <a href="https://blog.rstudio.com/2020/01/29/sparklyr-1-1/">sparklyr 1.1</a> and <a href="https://blog.rstudio.com/2019/03/15/sparklyr-1-0/">sparklyr 1.0</a>.</p><p>Thank you for reading this post.</p><p>This post was originally published on <a href="https://blogs.rstudio.com/ai/">blogs.rstudio.com/ai/</a></p></description></item><item><title>Wrangling Unruly Data: The Bane of Every Data Science Team</title><link>https://www.rstudio.com/blog/wrangling-unruly-data/</link><pubDate>Tue, 05 May 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/wrangling-unruly-data/</guid><description><p>There&rsquo;s an old saying (at least old in data scientist years) that goes, &ldquo;90% of data science is data wrangling.&rdquo; This rings particularly true for data science leaders, who watch their data scientists spend days painstakingly picking apart ossified corporate datasets or arcane Excel spreadsheets. Does data science really have to be this hard? And why can&rsquo;t they just delegate the job to someone else?</p><h2 id="data-is-more-than-just-numbers">Data Is More Than Just Numbers</h2><p>The reason that data wrangling is so difficult is that data is more than text and numbers. As shown in Figure 1, data scientists routinely have to deal with:</p><ul><li><strong>missing entries</strong>. The phrase &ldquo;You can&rsquo;t always get what you want&rdquo; is more than just a rock anthem &ndash; it also applies to data. Not every column or row in a real-world data set will be populated, yet data scientists still have to work with the data.</li><li><strong>ambiguous values</strong>. 
Without further information, a data scientist doesn&rsquo;t know if a value of <em>3/4</em> is a fraction, a month and a day, or just a string.</li><li><strong>mixed data types</strong>. People who enter data will sometimes insert comments along with the data, which the data scientist then has to separate and exclude to work with the actual data values.</li></ul><img align="center" style="padding: 35px;" src="unruly-data-fig1.jpg"><h4 id="figure-1-inconsistent-ways-of-entering-values-impede-data-understanding">Figure 1: Inconsistent ways of entering values impede data understanding.</h4><h2 id="its-not-just-the-data-its-the-context">It&rsquo;s Not Just The Data; It&rsquo;s the Context</h2><p>The data challenges listed above are just the tip of the iceberg. Many datasets originate in Excel, and many Excel creators hide information in their column and row names as shown in Figure 2. In other data sets, no metadata is included within the data set at all. Instead, data publishers provide a completely separate data dictionary that data scientists have to interpret to use the data.</p><img align="center" style="padding: 35px;" src="unruly-data-fig2.jpg"><h4 id="figure-2-data-only-starts-to-make-sense-when-values-have-context">Figure 2: Data only starts to make sense when values have context.</h4><h2 id="the-data-wrangling-challenge-has-no-easy-solutions">The Data Wrangling Challenge Has No Easy Solutions</h2><p>With these challenges facing them, your data scientists are far from wasting time when they are data wrangling. In fact, transforming data is an essential part of the understanding process.</p><p>However, data science leaders can speed up data wrangling within a team by encouraging some simple behaviors:</p><ol><li><p><strong>Write code to allow reproducibility</strong>. Too many data scientists perform data wrangling using drag-and-drop tools like Excel.
That approach may seem faster the first time that data set is ingested, but that manual process will stand in the way of reproducing the analysis later. Instead write functions for ingesting data that can be re-run every time the data changes, and you&rsquo;ll save time in the long run.</p></li><li><p><strong>Embrace tidy data</strong>. The <a href="https://www.tidyverse.org"><em>tidyverse</em> collection of packages in R</a> establishes a standardized way of storing and manipulating data called <em>tidy data</em>, as shown in Figure 3. The tidyverse ensures that all the context needed to understand a data set is made explicit by giving every variable its own column, every observation its own row, and storing only one value per cell.</p></li><li><p><strong>Create a standard data ingestion library</strong>. If your entire team defaults to using tidy data and the tidyverse in all their analyses, then they&rsquo;ll find it easier to read and reuse each other&rsquo;s data wrangling code. You can encourage that behavior by establishing a team Github organization where they can share those code packages and speed up their data understanding in future projects.</p></li></ol><img align="center" style="padding: 35px;" src="unruly-data-fig3.jpg"><h4 id="figure-3-tidy-data-makes-all-context-explicit-and-gives-each-variable-its-own-column">Figure 3: Tidy data makes all context explicit and gives each variable its own column.</h4><p>These behaviors can yield big rewards for data science teams. At rstudio::conf 2020, Dr. 
Travis Gerke of Moffitt Cancer Center in Tampa, Florida noted that reproducible pipelines have proved a game-changer in wrangling and unlocking complex patient data for the Center’s researchers.</p><div align="center" style="padding: 35px 0 35px 0;"><script src="https://fast.wistia.com/embed/medias/1jhwr01cpr.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_1jhwr01cpr videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/1jhwr01cpr/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></div><h2 id="learn-more-about-reproducibility-and-sharing-standard-libraries">Learn More About Reproducibility and Sharing Standard Libraries</h2><p>If you&rsquo;d like to learn more about how to reduce data wrangling hassles, we recommend:</p><ul><li><a href="https://resources.rstudio.com/rstudio-conf-2020/rmarkdown-driven-development-emily-riederer">An rstudio::conf talk by Emily Riederer, an Analytics Manager at Capital One</a> describes how reproducible programming using RMarkdown can help R users develop better software programming practices.</li><li><a href="https://resources.rstudio.com/webinars/reproducibility-in-production">This recent RStudio webinar on Reproducibility in Production</a> shows you how to write executable R Markdown documents for a production environment.</li><li><a href="https://rstudio.com/products/package-manager/">RStudio Package Manager</a> helps you share standard and 
consistent libraries across your data science team. This professional product empowers R users to access packages and reproduce environments while giving IT control and visibility into package use.</li></ul></description></item><item><title>Avoid Irrelevancy and Fire Drills in Data Science Teams</title><link>https://www.rstudio.com/blog/avoid-irrelevancy-and-fire-drills-in-data-science-teams/</link><pubDate>Tue, 28 Apr 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/avoid-irrelevancy-and-fire-drills-in-data-science-teams/</guid><description><h2 id="balancing-the-twin-threats-of-data-science-development">Balancing the twin threats of data science development</h2><p>Data science leaders naturally want to maximize the value their teams deliver to their organization, and that often means helping them navigate between two possible extremes. On the one hand, a team can easily become an expensive R&amp;D department, detached from actual business decisions, slowly chipping away only to end up answering stale questions. On the other hand, teams can be overwhelmed with requests, spending all of their time on labor-intensive, manual fire drills, always creating one more “Just in Time” PowerPoint slide.</p><p>How do you avoid these threats, of either irrelevancy or constant fire drills? As we touched on in a recent blog post, <a href="https://blog.rstudio.com/2020/04/22/getting-to-the-right-question/">Getting to the Right Question</a>, it turns out the answer is pretty straightforward: use iterative, code-based development to share your content early and often, to help overcome the communications gap with your stakeholders.</p><p>Data science ecosystems can be complex and full of jargon, so before we dive into specifics, let&rsquo;s consider a similar balancing act. Imagine you are forming a band that wants to share new music with the world. To do so, it is critical to get music out to your fans quickly, to iterate on ideas rapidly.
You don’t want to get bogged down in the details of a recording studio on day 1. At the same time, you want to be able to capture and repeat what works - perhaps as sheet music, perhaps as a video, or even as a simple recording.</p><h2 id="share-your-data-science-work-early-and-often">Share your data science work early and often</h2><p>For data scientists, the key is creating the right types of outputs so that decision makers can iterate with you on questions and understand your results. Luckily, like a musician, the modern data science team has many ways to share their initial vision:</p><ul><li>They can quickly create notebooks, through tools like R Markdown or Jupyter, that are driven by reproducible code and can be shared, scheduled, and viewed without your audience needing to understand code.</li><li>They can build interactive web applications using tools like Shiny, Flask, or Dash to help non-coders test questions and explore data.</li><li>Sometimes, data science teams even create APIs, which act as a realistic preview of their final work with a much lower cost of creation.</li></ul><p>Sharing early and often enables data science teams to solve impactful problems. For example, perhaps a data scientist is tasked with forecasting sales by county. They might share their initial exploratory analysis with sales leadership and tap into their domain expertise to help explain outlier counties. Or imagine a data scientist working to support biologists doing drug discovery research. Instead of responding to hundreds of requests for statistical analysis, the data scientist could build an interactive application to allow biologists to run their own analysis on different inputs and experiments.
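</p><p>To make the &ldquo;interactive application&rdquo; idea concrete, here is a minimal sketch of such a prototype in Shiny (the input and the analysis are hypothetical placeholders, not a real assay):</p><pre><code class="language-r" data-lang="r">library(shiny)

ui &lt;- fluidPage(
  # Hypothetical experimental parameter the biologist can vary
  sliderInput(&quot;dose&quot;, &quot;Dose (mg):&quot;, min = 1, max = 100, value = 10),
  plotOutput(&quot;response&quot;)
)

server &lt;- function(input, output) {
  output$response &lt;- renderPlot({
    # Placeholder dose-response curve; a real app would call the
    # team's actual analysis code here
    dose &lt;- seq(0, input$dose, length.out = 100)
    plot(dose, dose / (dose + 20), type = &quot;l&quot;,
         xlab = &quot;dose&quot;, ylab = &quot;response&quot;)
  })
}

shinyApp(ui, server)</code></pre><p>An app of this size can be drafted in an afternoon, which is what makes this kind of rapid prototyping practical.</p><p>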
By sharing the application early and often, the biologist and data scientist can empower each other to complete far more experiments.</p><p>These types of outputs all share a few characteristics:</p><ol><li><p><strong>The outputs are easy to create.</strong> The sign that your team has the right set of tools is that a data scientist can create and share an output from scratch in days, not months. They shouldn’t have to learn a new framework or technology stack.</p></li><li><p><strong>The outputs are reproducible.</strong> It can be tempting, in a desire to move quickly, to take shortcuts. However, these shortcuts can undermine your work almost immediately. Data scientists are responsible for informing critical decisions with data. This responsibility is serious, and it means results cannot exist only on one person’s laptop, or require manual tweaking to recreate. A lack of reproducibility can undermine credibility in the minds of your stakeholders, which may lead them to dismiss or ignore your analyses if the answer conflicts with their intuition.</p></li><li><p><strong>Finally, and most importantly: the outputs must be shared.</strong> All of these examples (notebooks, interactive apps and dashboards, and even APIs) are geared towards interacting with decision makers as quickly as possible to be sure the right questions are being answered.</p></li></ol><h2 id="benefits-of-sharing-for-data-science-teams">Benefits of sharing for data science teams</h2><p>Luckily, tools exist to ensure data science teams can create artifacts that share these three characteristics. At RStudio, we’ve built <a href="https://rstudio.com/products/team/">RStudio Team</a> with all three of these goals in mind.</p><p>Great data science teams talk about the happy result of this approach.
For example:</p><blockquote><p><em>“RStudio Connect is critical, the way you can deploy flexdashboards, R Markdown… I use web apps as a way to convey a model in a very succinct fashion&hellip; because I don’t know what the user will do, I can create an app where the user’s interactions with the model can imply it, I don’t have to come up with all the finite outcomes ahead of time”</em> - Moody Hadi at S&amp;P</p></blockquote><blockquote><p><em>“One of the key focuses for us was the method of delivery &hellip; actually taking your insights and getting business impact. How are non analytic people digesting your work.”</em> - Aymen Waqar at Astellas (check out our last blog post, <a href="https://blog.rstudio.com/2020/04/22/getting-to-the-right-question/">Getting to the Right Question</a>, to see Aymen discussing the analytics communication gap)</p></blockquote><div align="center" style="padding: 35px 0 35px 0;"><script src="https://fast.wistia.com/embed/medias/58qjn34mxy.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_58qjn34mxy videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/58qjn34mxy/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></div><h2 id="its-not-just-about-production">It’s not just about production</h2><p>We often see data science teams make a common mistake that prevents them from achieving this delicate balancing act.
A tempting trap is to focus exclusively on complex tooling oriented towards putting models in production. Because data science teams are trying to strike a balance between repeatability, robustness, and speed, and because they are working with code, they often turn to their software engineering counterparts for guidance on adopting “agile” processes. Unfortunately, many teams end up focusing on the wrong parts of the agile playbook. Instead of copying the concept - rapid iterations towards a useful goal - teams get caught up in the technologies, introducing complex workflows instead of focusing on results. This mistake leads to a different version of the expensive R&amp;D department - the band stuck in a recording studio with the wrong song.</p><p>Eduardo Ariño de la Rubia, head of a large data team at Facebook, lays out an important reminder <a href="https://resources.rstudio.com/rstudio-conf-2020/value-in-data-science-beyond-models-in-production-eduardo-arino-de-la-rubia">in his recent talk at rstudio::conf 2020</a>. Data science teams are not machine learning engineers. While the growth of the two is related, ML models will ultimately become commoditized, mastered by engineers and available in off-the-shelf offerings. Data scientists, on the other hand, have a broader mandate: to enable critical business decisions. Often, in the teams we work with at RStudio, many projects are resolved and decisions made based on the rapid iteration of an app or a notebook. Only on occasion does the result need to be codified into a model at scale - and usually engineers are involved at that stage.</p><p>To wrap up, at RStudio we get to interact with hundreds of data science teams of all shapes and sizes from all types of industries.
The best of these teams have all mastered the same balancing act: they use powerful tools to help them share results quickly, earning them a fanbase among their business stakeholders and helping their companies make great decisions.</p><p>We developed RStudio Team with this balancing act in mind, and to make it easy for data science teams to create, reproduce and share their work. To learn more, please visit the <a href="https://rstudio.com/products/team/">RStudio Team page</a>.</p><div style="padding: 25px 0 25px 0;"></div></description></item><item><title>Getting to the Right Question</title><link>https://www.rstudio.com/blog/getting-to-the-right-question/</link><pubDate>Wed, 22 Apr 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/getting-to-the-right-question/</guid><description><h2 id="the-root-problem-we-dont-all-speak-the-same-language">The Root Problem: We Don&rsquo;t All Speak the Same Language</h2><p>Organizations across the modern business world recognize the critical importance of Data Science for competitive advantage. That recognition has driven Glassdoor to rate Data Scientist as <a href="https://www.glassdoor.com/List/Highest-Paying-Jobs-LST_KQ0,19.htm">one of the 25 top paying jobs in America in 2020</a>.</p><p>However, many organizations struggle to put these data scientists’ knowledge to work in their businesses where they can actually have an impact on success. We hear data scientists say, “The business can’t really tell us what they want, so they waste a lot of our time.” And in return, business people often say, “Our data scientists are really smart, but the applications they build too often fall short of what we’re looking for.”</p><p>The problem here is that data scientists and business people speak very different languages. 
Specifically, they struggle to understand each other around:</p><ul><li><strong>data.</strong> When a business person thinks about data associated with the business, they often are thinking about data that they can see in Web pages or spreadsheets. Data scientists, on the other hand, are usually looking for data that they can access using an Application Programming Interface or API.</li><li><strong>process.</strong> When business people think about the process for analysis, they tend to think in people-centric terms along the lines of “Becky takes the order, and then transmits that data to George.” Data scientists, on the other hand, usually think of process as a series of automated programs that works without people.</li><li><strong>results.</strong> Data scientists think of a result as an analysis running and producing correct output. Business people see a result as something that has an effect on the organization’s (usually financial) metrics. These are rarely the same thing, at least in the first version of a data science project.</li></ul><p>Both points of view are valid – they just aren’t the same, which creates a communications gap.</p><h2 id="iterative-development-can-overcome-the-communications-gap">Iterative Development Can Overcome the Communications Gap</h2><p><em>&ldquo;An approximate answer to the right question is worth a great deal more than a precise answer to the wrong question.&quot;</em> &ndash;John Tukey</p><h3 id="astellas-aymen-waqar-discusses-the-analytics-communications-gap">Astellas’ Aymen Waqar discusses the analytics communications gap:</h3><div style="padding: 15px 40px 35px 40px;text-align: center;"><script src="https://fast.wistia.com/embed/medias/iwmemji2xh.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" 
style="height:100%;left:0;position:absolute;top:0;width:100%;"><span class="wistia_embed wistia_async_iwmemji2xh popover=true popoverAnimateThumbnail=true videoFoam=true" style="display:inline-block;height:100%;position:relative;width:100%">&nbsp;</span></div></div></div><p>These communications gaps are part of a larger challenge of defining (and refining) the problem. While your business stakeholder might believe they have a clear definition of the problem they are trying to solve, they may not understand whether the data is available, how complex the modeling might be or how long building a model on large data might take, or what adjacent problems might be potentially more valuable and/or far simpler to solve. So, before starting the development process, the data scientist and the business stakeholder must explore and discuss the problem in enough detail to create a realistic development plan. And while data scientists and business people may struggle to understand each other’s words, they usually can agree if they can just see a working model. The difficult part is getting to that working model.</p><h2 id="a-commonly-used-data-exploration-process-can-help">A Commonly Used Data Exploration Process Can Help</h2><p>One way to get to agreement is to break down the project into simpler pieces and get agreement on each piece before moving on to the next. Garrett Grolemund and Hadley Wickham propose the following process below in their book <a href="https://r4ds.had.co.nz"><em>R for Data Science</em></a>. This process isn’t specific to any technology such as R or Python. Rather it’s a way to get your data scientist and business sponsor to come to consensus on what question they are attacking.</p><img align="center" style="padding: 35px;" src="process2.jpg"><p>The four steps are</p><ol><li><strong>Import</strong>. Identify the data you plan to use, and focus first on importing that data so you can work with it.</li><li><strong>Tidy</strong>. 
Now that you have the data in hand, reshape and manipulate the data into a form that your analysis tools can easily work with.</li><li><strong>Understand</strong>. This step is where your data scientists should be interacting most with sponsors by turning the data into visuals and models, and getting feedback about whether they satisfy the business needs.</li><li><strong>Communicate</strong>. Once you have consensus on what you’re building, this is where you simplify and polish the result so that everyone will understand the result.</li></ol><h2 id="four-recommendations-for-applying-this-process">Four Recommendations For Applying This Process</h2><p>Many data scientists (or at least those who have read <em>R for Data Science</em>) use this type of process for doing analysis. However, fewer think of using it as a communications tool to ensure they are answering the proper business questions. You can help your data scientists apply this approach; encourage them to:</p><ul><li><strong>Schedule check-ins at each step.</strong> Before you begin, set up regular check-ins with your business sponsors. Ideally, these should roughly correspond with the development phases listed above to ensure that everyone is in sync before moving on to the next phase.</li><li><strong>Use rapid prototyping tools and languages.</strong> R and Python are the tools of choice for most data scientists because they are well-suited to the type of iterative development process being described here. Both languages speed development and have excellent visualization tools which will help drive consensus.</li><li><strong>Document progress using public documents.</strong> Use a single Google Docs file to record each meeting and to record decisions. Don’t start a new document with each meeting, but simply prepend the date and the most recent meeting notes at top. 
By the time the project is done, you’ll have a record of the entire process from beginning to end which will help plan future projects.</li><li><strong>Defer performance concerns until you have an agreed result.</strong> Too many projects get bogged down designing for full-scale deployment before they actually know what they are building. Instead, develop a prototype that everyone agrees is the right idea, and then revise it to scale up when you decide to put it into production. This approach simplifies early decision-making and doesn’t waste precious project time on premature optimizations.</li></ul><p>Once the application satisfies both your data scientists and business stakeholders, you’ll want to share the finished application with the wider business community. One of the easiest ways to do this is through <em>RStudio Connect</em>, which can help you rapidly refine your content during the prototyping phase, and share it widely and consistently in the production phase. We will talk more about that in our next blog post. Meanwhile, to learn more about how Connect can add push-button publishing, scheduled execution of reports, and flexible security policies to your team’s data science work, please visit the <a href="https://rstudio.com/products/connect/">RStudio Connect product page</a>.</p><div style="padding: 25px 0 25px 0;"></div></description></item><item><title>RStudio and COVID-19</title><link>https://www.rstudio.com/blog/rstudio-and-covid-19/</link><pubDate>Fri, 17 Apr 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-and-covid-19/</guid><description><p>As we’re all aware, the impact of the current global pandemic has been significant. 
We know that many in the R community are involved in the response, and we want to help out where we can.</p><h2 id="shinyappsio">shinyapps.io</h2><p>If you are using R and Shiny to analyse or visualise COVID-19 data and want an easy way to share your work with others, we’d like to offer you six months of free basic hosting on <a href="http://shinyapps.io/">shinyapps.io</a>. This will enable you to share your app widely without having to worry about the number of compute hours (basic accounts are usually limited to 500 compute hours per month, but we won’t enforce that limit on these apps). Just use the discount code <code>COVID_DISCOUNT_2020_BASIC</code> when you sign up or upgrade (the coupon must be redeemed before June 1).</p><p>Due to a limitation in our system, you will need a credit card on file to take advantage of the discount. We know this is a hassle, but we won&rsquo;t charge your card for the first six months, and we&rsquo;ll send out an email reminder before the coupon expires so you can downgrade back to the free plan. (You can learn more about upgrades and downgrades in the <a href="https://docs.rstudio.com/shinyapps.io/billing-and-account-management.html#invoices-payments">shinyapps.io user’s guide</a>.)</p><p>If you’d like access to the Standard plan for hosting (e.g., you need authentication so you can share data only within your hospital), please get in touch with <a href="https://support.rstudio.com/hc/en-us/requests/new">Support</a> and we’ll do our best to help out.</p><h2 id="professional-products">Professional products</h2><p>We don&rsquo;t want access to our commercial products to gate your COVID-19 academic research. 
If your <a href="https://rstudio.com/pricing/academic-pricing/">qualified academic research group</a> could substantially benefit from <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, <a href="https://rstudio.com/products/rstudio/#rstudio-server">RStudio Server Pro</a>, or <a href="https://rstudio.com/products/package-manager/">RStudio Package Manager</a>, we may be able to offer additional discounts on an annual plan or free access for 6 months. Please contact <a href="mailto:info@rstudio.com">info@rstudio.com</a> for more details.</p><h2 id="other-r-help">Other R help</h2><p>Are you a COVID-19-focussed research group that needs a little help with R programming (whether it be data analysis, app development, or package construction)? We obviously can’t help everyone, but our engineers would love to help out where possible. If you have pressing R development needs, <a href="https://community.rstudio.com/w/covid-help">please fill out this form</a>. The posts created via this form will be posted publicly on RStudio Community and our engineers and sustainers will be monitoring the forum regularly.</p></description></item><item><title>Effective Visualizations for Credible, Data-Driven Decision Making</title><link>https://www.rstudio.com/blog/effective-visualizations-for-credible-data-driven-decision-making/</link><pubDate>Thu, 16 Apr 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/effective-visualizations-for-credible-data-driven-decision-making/</guid><description><p>Recently, we were joined by the smart folks at Roche &amp; Novartis to present a webinar on effective data visualization. You can watch the recording of the <a href="https://resources.rstudio.com/webinars/effective-visualizations-for-data-driven-decisions">full presentation here</a>. 
It was the latest installment in a series of webinars highlighting industry leaders in the Pharmaceutical and Life Science spaces that are doing world-changing data science work.</p><p>They presented many great insights, most of which are relevant to data scientists in every industry, and we wanted to share what we learned.</p><p><strong>“Graphics and visuals are such an important component of the work that we do as quantitative data scientists.”</strong></p><p><a href="https://www.linkedin.com/in/marc-vandemeulebroecke-b394046/">Marc Vandemeulebroecke</a>, biostatistician at Novartis, stated this, then went on to say that effective visualizations help data scientists achieve two important things:</p><ol><li>better insight into data and</li><li>the ability to communicate results and conclusions to stakeholders.</li></ol><p>Imagine you have only 3 minutes to present your hard work to stakeholders. You’re probably going to rely on some sort of visualization. The stakeholders are then going to use the conclusions they draw from that presentation to make important decisions, such as what product to invest in or what initiative to approve. The stakes are set. Marc is clearly right: being able to effectively visualize data is an important part of any data scientist’s job.</p><p>So what’s the issue? Why spend an hour presenting on this topic?</p><p><strong>“Unfortunately we’re not always good at creating effective visualizations.”</strong></p><p>And the consequences can be brutal. “When we get [data visualization] wrong, it can lead to misinformation, confusion, or harm, especially in clinical and medical research”, <a href="https://www.linkedin.com/in/mark-baillie-52486735/">Mark Baillie</a>, Director of Statistical Methodology at Novartis, continues.</p><p>He says this while displaying a graphic recently published by the New York Times that shows the importance of social distancing during COVID-19. 
To be clear: this was an example of a good data visualization, one that potentially contributed to lives being saved.</p><img align="center" style="padding: 35px;" src="curve.jpg"><p>He transitioned to a lighter example, a bad data visualization: a confusing graphic that attempts to relay the results of a pizza topping survey by YouGov, a British market research firm:</p><img align="center" style="padding: 35px;" src="pizza-pie.jpg"><p>Certainly no harm was inflicted by botching a pizza graphic, but the graphic is confusing. The results were posted on <a href="https://twitter.com/yougov/status/838720989991223297?lang=en">Twitter</a> and the most popular reply questioned how YouGov managed to poll 695% <em>of the population</em>. YouGov had to make another post clarifying its analysis.</p><p>So why aren’t we great at data visualizations? Back to Marc’s introduction: the focus in advanced education is primarily on doing the analysis, not communicating and visualizing the results. Rightfully so or not, that in turn means data visualization is a skill frequently learned on the job.</p><p><strong>“So what do we mean by effective? We don’t necessarily mean beautiful.”</strong></p><p>We also make common missteps when creating data visualizations. These missteps include selecting the wrong graph type, misusing color, and not considering scale. The singular error that most of these missteps roll up into: we’re overly concerned with making visualizations that are beautiful, so much so that we’re willing to sacrifice the visualization’s effectiveness.</p><p><strong>“Effective visualization = effective communication” and the 3 laws for improving visual communication.</strong></p><p>The teams at Roche &amp; Novartis distill the goal of effective visualization down to effective communication. The purpose of any data visualization is to communicate with stakeholders in a way that results in true understanding and better decision making.</p><p>So, how do we achieve that? 
This has been such a focus at Novartis that they actually penned the Three Laws for Improving Visual Communication org-wide:</p><h2 id="law-1-have-a-clear-purpose">Law 1: Have a Clear Purpose</h2><p>Mark introduces this law by calling it “just advanced common sense”. Before jumping into your visualization, you should work through a why, what, who, and where mental framework:</p><p><strong>Why</strong> do you need a graph? You should be able to identify a specific purpose for the graph, such as delivering a message or exploring the data.</p><p><strong>What</strong> quantitative evidence do you have to support the purpose?</p><p><strong>Who</strong> is the intended audience? Once you figure that out, you can adjust the design to support their needs.</p><p><strong>Where</strong> is your visualization going to live? Once you know that, you should make design choices that fit formatting constraints.</p><p>Your decisions when creating a visualization should work in harmony to achieve its purpose. You may opt for a tool like RMarkdown when distributing a visualization to senior stakeholders internally, with the purpose of quickly showing clinical trial results without the need for deep exploration. The choices you make for this data visualization are likely different from the choices you’d make for one you’d submit to the FDA or share with the teammates you work with.</p><p>Mark expands on this law with a slide dedicated to a quote by John Tukey, who argues that the quality of an analysis is related to the quality of the question being asked.</p><p><em>“An approximate answer to the right question is worth a great deal more than a precise answer to the wrong question.”</em></p><p>The message here is to spend more time thinking about the questions that set your analysis and visualization building process in motion.</p><h2 id="law-2-show-the-data-clearly">Law 2: Show the Data Clearly</h2><p>Mark’s recommendation for this rule: do not lie or misrepresent with data. 
Your reporting should be open and transparent. When you’re clear, open, and transparent, you build trust with your audience, a necessary ingredient that enables your analysis to drive decision-making.</p><p>Once you make the decision to be open and transparent, you can start thinking about the design and the graph type of your visualizations. Choosing the correct graph type aids in interpretation. “Often, we don’t need to reinvent the wheel here”, Mark adds. If you want to show a deviation, correlation, or ranking, there are common graph types you can usually select from to achieve your visualization’s purpose in a clear way.</p><p>Additionally, you don’t want to make your audience members have to work when looking at a particular graph type. Unnecessary components in data visualizations, or overly stylistic choices, may do more harm than good.</p><img align="center" style="padding: 35px;" src="1waldo.jpg"><p>More specifically, you should carefully consider the scaling and spacing of your data visualizations. “Avoid plotting log-normally distributed variables on a linear scale”, Mark presents, then goes on to add, “measurements displayed close together are perceived to be closer in time.”</p><h2 id="law-3-make-the-message-obvious">Law 3: Make the Message Obvious</h2><p>The key to this law is: do not assume your reader understands what message your data visualization is trying to portray. Instead, really work at your visualization to make the takeaway obvious.</p><p>What are some best practices you can follow to achieve this? Mark highlights several:</p><ul><li><p><strong>Try not to set text at an angle.</strong> Instead, think of alternatives such as transposing the graph.</p></li><li><p><strong>Avoid unnecessary color.</strong> Don’t use color to differentiate between categories of the same variable. 
Doing so can contribute to confusion.</p></li><li><p><strong>Only use color when it adds value.</strong> Use bold, saturated, or contrasting color to emphasize important details.</p></li></ul><img align="center" style="padding: 35px;" src="color2.jpg"><ul><li><strong>Use informative labels and annotations to support the message.</strong> Emphasize the components of your data visualization that will make your message most obvious.</li></ul><p>All of your design choices should be made with the purpose of making your message more obvious to the audience. There are lots of bells and whistles you can add to your data visualizations. What you don’t want to do is add unnecessary or confusing distractions to your visualization.</p><h2 id="closing-thoughts">Closing Thoughts</h2><p>We’re grateful to the teams at Roche &amp; Novartis for giving a fantastic presentation, and to the 1,400 live attendees who chose to spend an hour with us during truly strange times. If you were one of those live attendees, thank you. If you are interested in watching the full recording of the presentation, <a href="https://resources.rstudio.com/webinars/effective-visualizations-for-data-driven-decisions">you can find it here</a>.</p><p>Many of the lessons from this webinar are core to RStudio’s mission: we love helping data scientists maximize the impact of their work. 
We also believe that it shouldn’t be difficult for data scientists to reproduce that impact time and time again as projects change.</p><p>If you’re hungry for more resources that may enhance your data visualization skills and ability to drive value at your organization, you should consider the following resources:</p><ul><li><a href="https://resources.rstudio.com/webinars/reproducibility-in-production">Reproducibility in Production</a> - a webinar presentation by Garrett Grolemund</li><li><a href="https://resources.rstudio.com/webinars/the-tidyverse-and-rstudio-connect">The Tidyverse and RStudio Connect</a> - a webinar presentation by Nathan Stephens</li><li><a href="https://rstudio.com/resources/webinars/introducing-flexdashboards/">Introducing Flexdashboards</a> - a webinar presentation by Garrett Grolemund</li><li><a href="https://rstudio.com/resources/cheatsheets/">Our Data Visualization cheat sheets</a> - located on our cheat sheet page under the resource tab in the navigation bar</li></ul><p>Finally, if you want to watch the other episodes in this webinar series, you can find them here:</p><ul><li><a href="https://resources.rstudio.com/webinars/scaling-data-science-with-r-at-janssen-pharmaceuticals">Scaling Data Science with R at Janssen Pharmaceuticals</a></li><li><a href="https://resources.rstudio.com/webinars/scaling-data-science-at-the-epa">Scaling Data Science at the EPA</a></li><li><a href="https://resources.rstudio.com/webinars/the-role-of-r-in-drug-discovery-research-and-development">The Role of R in Drug Discovery, Research, and Development</a></li></ul></description></item><item><title>Great Looking Tables: gt (v0.2)</title><link>https://www.rstudio.com/blog/great-looking-tables-gt-0-2/</link><pubDate>Wed, 08 Apr 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/great-looking-tables-gt-0-2/</guid><description><script src="https://www.rstudio.com/blog/great-looking-tables-gt-0-2/index_files/header-attrs/header-attrs.js"></script><p 
align="center"><img src="gt_hex.svg" width=100%></p><p>We are extremely excited to have our first release of the <strong>gt</strong> package available on CRAN! The name <strong>gt</strong> is short for “grammar of tables” and the goal of <strong>gt</strong> is similar to that of <strong>ggplot2</strong>, serving not just to make it easy to make specific tables, but to describe a set of underlying components that can be recombined in different ways to solve different problems.</p><p>If you ever need to make beautiful customized <em>display</em> tables, I think you’ll find <strong>gt</strong> is up to the task. You can install <strong>gt</strong> 0.2 from CRAN with:</p><pre class="r"><code>install.packages(&quot;gt&quot;)</code></pre><p>For an initial release, it’s pretty big! There are so many ways to structure a table, apply formatting and annotations, and style it just the way you want. Currently <strong>gt</strong> renders tables to the HTML output format (and has the ability to export to image files). We plan to also support the LaTeX and RTF output formats in the near future.</p><p>The <a href="https://gt.rstudio.com">website for the <strong>gt</strong> package</a> has walkthrough articles for getting started and a <a href="https://gt.rstudio.com/reference/index.html">function reference section</a> with plenty of examples and images to show you how the table output is meant to appear.</p><div id="lets-get-acquainted-with-our-model-of-a-table" class="section level2"><h2>Let’s get acquainted with our model of a table</h2><p>We decided to formalize the parts of a table—and give them names—so that we have some language to act on. The larger components of a table (roughly from top to bottom) include the <em>table header</em>, the <em>column labels</em>, the <em>stub</em> and <em>stub head</em>, the <em>table body</em>, and the <em>table footer</em>. 
Within each of these components, there may be subcomponents (e.g., the <em>table header</em> contains a <em>title</em> and <em>subtitle</em>, the <em>table body</em> contains individual <em>cells</em>, etc.). Understanding how the parts fit together will make more sense with this diagram:</p><p align="center"><img src="gt_parts_of_a_table.svg" width=100%></p><p>Learning new vocabulary is definitely a pain, but we believe it’s worthwhile. Like <strong>ggplot2</strong>, the new words take some getting used to, but learning them will improve your ability to analyze and understand existing tables, and then successfully recreate them in <strong>gt</strong>.</p></div><div id="examples-with-the-exibble-dataset" class="section level2"><h2>Examples with the <code>exibble</code> dataset</h2><p>The <code>exibble</code> dataset is included in <strong>gt</strong> and its <em>raison d’être</em> is to be a small dataset (8 rows and 9 columns) with different column types for experimenting with formatting. It fits easily on a single screen when printed as a tibble and rendered as a <strong>gt</strong> table, making it easy to see the results of our <strong>gt</strong> experimentation.</p><pre class="r"><code>exibble</code></pre><pre><code>## # A tibble: 8 × 9
## num char fctr date time datetime currency row group
## &lt;dbl&gt; &lt;chr&gt; &lt;fct&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt;
## 1 0.111 apricot one 2015-01-15 13:35 2018-01-01… 50.0 row_1 grp_a
## 2 2.22 banana two 2015-02-15 14:40 2018-02-02… 18.0 row_2 grp_a
## 3 33.3 coconut three 2015-03-15 15:45 2018-03-03… 1.39 row_3 grp_a
## 4 444. durian four 2015-04-15 16:50 2018-04-04… 65100 row_4 grp_a
## 5 5550 &lt;NA&gt; five 2015-05-15 17:55 2018-05-05… 1326. row_5 grp_b
## 6 NA fig six 2015-06-15 &lt;NA&gt; 2018-06-06… 13.3 row_6 grp_b
## 7 777000 grapefruit seven &lt;NA&gt; 19:10 2018-07-07… NA row_7 grp_b
## 8 8880000 honeydew eight 2015-08-15 20:20 &lt;NA&gt; 0.44 row_8 grp_b</code></pre><div id="a-simple-table" class="section level3"><h3>A simple table</h3><p>Let’s use that dataset to make the ‘Hello, World!’ of <strong>gt</strong> tables:</p><pre class="r"><code>exibble %&gt;% gt()</code></pre><p><img src="table_1.png" width=100%><br></p><p>Just as the <code>ggplot()</code> function is the entry point to <strong>ggplot2</strong> plots, the <a href="https://gt.rstudio.com/reference/gt.html"><code>gt()</code></a> function serves as the first function to call for making <strong>gt</strong> tables.</p></div><div id="formatting-data-in-columns" class="section level3"><h3>Formatting data in columns</h3><p>The <code>exibble</code> dataset is blessed with an array of column types. This makes it a snap to experiment with <strong>gt</strong>’s collection of <code>fmt_*()</code> functions, which format the input data values.</p><p>Let’s test as many formatter functions as possible. 
Here’s the plan:</p><ul><li>have <code>num</code> display numbers with exactly 2 decimal places using <a href="https://gt.rstudio.com/reference/fmt_number.html"><code>fmt_number()</code></a></li><li>show nicely formatted dates in <code>date</code> using <code>date_style</code> <code>6</code> (the <code>m_day_year</code> style) with <a href="https://gt.rstudio.com/reference/fmt_date.html"><code>fmt_date()</code></a></li><li>format the 24-h time values in <code>time</code> to <code>time_style</code> <code>4</code> (the <code>hm_p</code> style) with <a href="https://gt.rstudio.com/reference/fmt_time.html"><code>fmt_time()</code></a></li><li>make the datetimes in <code>datetime</code> formatted as such with the <a href="https://gt.rstudio.com/reference/fmt_datetime.html"><code>fmt_datetime()</code></a> function</li><li>transform the <code>currency</code> column with <a href="https://gt.rstudio.com/reference/fmt_currency.html"><code>fmt_currency()</code></a> to show us values in the euro currency (<code>currency = "EUR"</code>)</li></ul><p>Phew! Here’s the code and the corresponding <strong>gt</strong> table:</p><pre class="r"><code>exibble %&gt;%
  gt() %&gt;%
  fmt_number(columns = vars(num), decimals = 2) %&gt;%
  fmt_date(columns = vars(date), date_style = 6) %&gt;%
  fmt_time(columns = vars(time), time_style = 4) %&gt;%
  fmt_datetime(columns = vars(datetime), date_style = 6, time_style = 4) %&gt;%
  fmt_currency(columns = vars(currency), currency = &quot;EUR&quot;)</code></pre><p><img src="table_2.png" width=100%><br></p><p>As can be seen, entire columns had formatting applied to them in very specific ways. There is some finer control available as well. 
We can style a subselection of rows in any given column and there are quite a few ways to specify the target rows (e.g., row indices, row names in the stub, conditional statement based on column data, etc.).</p><p>This only scratches the surface of what is possible in formatting the <em>table body</em>; there are more <code>fmt_*()</code> functions. If they don’t exactly suit your needs, you can use the general <a href="https://gt.rstudio.com/reference/fmt.html"><code>fmt()</code></a> function and provide your own transformation function.</p></div><div id="a-table-with-a-header-and-a-footer" class="section level3"><h3>A table with a <em>header</em> and a <em>footer</em></h3><p>We can add components to the table. Let’s include a <em>header</em> with a title and subtitle, and a <em>footer</em> with a source note. These parts are added with the <a href="https://gt.rstudio.com/reference/tab_header.html"><code>tab_header()</code></a> and <a href="https://gt.rstudio.com/reference/tab_source_note.html"><code>tab_source_note()</code></a> functions.</p><pre class="r"><code>exibble %&gt;%
  gt() %&gt;%
  tab_header(
    title = md(&quot;This is the `exibble` dataset in **gt**&quot;),
    subtitle = &quot;It is one of six datasets in the package&quot;
  ) %&gt;%
  tab_source_note(md(&quot;More information is available at `?exibble`.&quot;))</code></pre><p><img src="table_3.png" width=100%><br></p><p>Adding new parts to the table is typically done by using a few <code>tab_*()</code> functions. 
Notice that we could style our text using Markdown with the included <a href="https://gt.rstudio.com/reference/md.html"><code>md()</code></a> function.</p></div><div id="adding-a-stub-and-organizing-rows-into-row-groups" class="section level3"><h3>Adding a <em>stub</em> and organizing rows into <em>row groups</em></h3><p>The <code>exibble</code> dataset has the <code>row</code> and <code>group</code> columns, which were purposefully included for experimentation with the table <em>stub</em> and with <em>row groups</em>. Rather than explaining those components at length, let’s revise the above code so that these columns are used to create those components:</p><pre class="r"><code>exibble %&gt;%
  gt(rowname_col = &quot;row&quot;, groupname_col = &quot;group&quot;) %&gt;%
  tab_header(
    title = md(&quot;This is the `exibble` dataset in **gt**&quot;),
    subtitle = md(&quot;We can use the `row` and `group` columns to structure the table&quot;)
  ) %&gt;%
  tab_source_note(md(&quot;More information is available at `?exibble`.&quot;))</code></pre><p><img src="table_4.png" width=100%><br></p><p>This change effectively gives us row labels in a separate area to the left (the <em>stub</em>), and row group labels above each grouping of rows. This is great for data that naturally falls into groupings. And worry not: if the initial order isn’t what you expected or wanted, the <a href="https://gt.rstudio.com/reference/row_group_order.html"><code>row_group_order()</code></a> function can be used to reorder the groupings.</p></div><div id="using-spanner-column-labels" class="section level3"><h3>Using <em>spanner column labels</em></h3><p>Just as with the <em>stub</em>, we can create groupings of columns with <em>spanner column labels</em> that encompass one or more columns. The <a href="https://gt.rstudio.com/reference/tab_spanner.html"><code>tab_spanner()</code></a> function makes this possible. 
By providing a <code>label</code> and a selection of <code>columns</code>, the new label is placed above those columns and the associated horizontal rule will span across. Should the <code>columns</code> not be adjacent to each other, <a href="https://gt.rstudio.com/reference/tab_spanner.html"><code>tab_spanner()</code></a> will automatically gather them together.</p><pre class="r"><code>exibble %&gt;%
  gt(rowname_col = &quot;row&quot;, groupname_col = &quot;group&quot;) %&gt;%
  tab_spanner(label = &quot;Dates and Times&quot;, columns = matches(&quot;date|time&quot;)) %&gt;%
  tab_header(
    title = md(&quot;This is the `exibble` dataset in **gt**&quot;),
    subtitle = md(&quot;We can use the `tab_spanner()` function to organize and label columns&quot;)
  ) %&gt;%
  tab_source_note(md(&quot;More information is available at `?exibble`.&quot;))</code></pre><p><img src="table_5.png" width=100%><br></p></div><div id="more-so-much-more" class="section level3"><h3>More… so much more</h3><p>It’s really not possible to explore much of what <strong>gt</strong> can do in a short blog post. You can do many more useful things like inserting footnotes, modifying text, borders, and fills, and adding summary rows. Here’s an example of how the <code>pizzaplace</code> dataset can look with a little <strong>gt</strong> code (not shown here but <a href="https://gist.github.com/rich-iannone/1da1ae7a7203958a0c5b1bd1d4b24017">available in this gist</a>):</p><p><img src="table_6.png" width=100%><br></p><p>Getting started with <strong>gt</strong> can be a risk-free experience with the <strong>gt Test Drive</strong>. 
Hit the button below to be transported to an <strong>RStudio Cloud</strong> project with examples galore:</p><p align="center"><a href="https://rstudio.cloud/project/779965"><img src="gt-test-drive.svg" alt="RStudio Cloud Example" height="80px"></a></p><p>To make it easy to experiment with making <strong>gt</strong> tables, we included <a href="https://gt.rstudio.com/articles/gt-datasets.html">six datasets</a> in the package: <code>countrypops</code>, <code>sza</code>, <code>gtcars</code>, <code>sp500</code>, <code>pizzaplace</code> (your favorite), and <code>exibble</code>. Strangely enough, each of these datasets is celebrated with a circular logo.</p><p align="center"><img src="gt_datasets_labeled.svg" width=100%></p><p>(Each of these datasets has a unique story in the world of <strong>gt</strong> so the deluxe graphics are warranted.)</p><p>While we’re only getting started on this package, we feel things are really coming along. Be sure to visit and engage with us at the <a href="https://github.com/rstudio/gt/issues"><strong>gt</strong> issue tracker</a>. We want to hear of any bugs, usage questions, or great ideas you might have to make this package better. Thanks!</p></div></div></description></item><item><title>RStudio Connect 1.8.2</title><link>https://www.rstudio.com/blog/rstudio-connect-1-8-2/</link><pubDate>Thu, 02 Apr 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-8-2/</guid><description><h2 id="a-big-update-for-our-python-community">A big update for our Python community</h2><p>One of the biggest frustrations for a data scientist, whether your primary language is R or Python, is to have your hard work go underutilized. A stream of disposable reports, emails, and presentations that get viewed once and cast aside are not the ideal recipe for how to make an impact. 
To combat this, we have seen data scientists create more interactive content (such as applications, APIs, and dashboards) to engage the divided attention of stakeholders. Unfortunately, delivering interactivity often comes at the cost of learning far more about IT and infrastructure than perhaps you had planned.</p><p>At RStudio, we believe data scientists shouldn’t have to become experts in DevOps just to share their work with the rest of their organization. RStudio Connect was created to handle the burden of deployment and provide a single platform for all the content your team produces in R and Python. Today we are excited to announce RStudio Connect 1.8.2, with new options for data scientists who use Python to share and communicate, including support for Python APIs (with Flask) and beta support for interactive Python applications with Dash.</p><h3 align="center"><a href="https://rstudio.chilipiper.com/book/schedule-time-with-rstudio">Schedule a demo of RStudio Connect</a></h3><h2 id="flask-api-deployment">Flask API Deployment</h2><p>RStudio Connect 1.8.2 introduces support for Python API deployment, including applications built with Flask and other WSGI-compatible frameworks. This functionality lets data science teams make models developed in Python available as REST APIs. Once deployed, an RStudio Connect publisher can give other teams or services access to the API, securely delivering data science insights across their organization.</p><p><img src="flask-example.gif" alt="Python API Example on RStudio Connect 1.8.2"></p><p>RStudio Connect automatically integrates with several Flask extension packages like <code>Flask-RESTX</code>, <code>Flask-API</code>, and <code>Flasgger</code> to provide web-accessible documentation or an API console interface. 
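A Flask API deployed this way is just an ordinary WSGI app; nothing in the code is Connect-specific. As a minimal sketch (the `/predict` route and its toy payload are invented for illustration and are not taken from the Connect examples):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical scoring endpoint; a real deployment would load a trained model.
@app.route("/predict", methods=["POST"])
def predict():
    body = request.get_json(force=True)
    # Toy "model": the score is just the sum of the numeric features.
    score = sum(body.get("features", []))
    return jsonify({"score": score})
```

Because Connect hosts the WSGI callable itself, no `app.run()` call is needed; you deploy the directory containing this file with the `rsconnect-python` tooling described below.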
Examples for each of these extensions can be found in the <a href="https://docs.rstudio.com/connect/user/flask/#examples">User Guide</a>.</p><p>Publishing a Python API to RStudio Connect requires the <a href="https://pypi.org/project/rsconnect-python/"><code>rsconnect-python</code></a> package. This package is available to install with pip from PyPI and enables a command-line interface that can be used to publish from any Python IDE, including PyCharm, VS Code, JupyterLab, Spyder, and others.</p><p>Developers with RStudio Connect publisher accounts can follow along with the new Python API Jump Start Example to learn the basic deployment workflow:</p><img src="jumpstart.gif" alt="Flask Example in the Jump Start on RStudio Connect 1.8.2" style="width:400px;"/><p>Additional getting started information and examples can be found in the <a href="https://docs.rstudio.com/connect/user/flask/">User Guide</a>.</p><h3 align="center"><a href="https://rstudio.com/products/connect/evaluation/">Download RStudio Connect 1.8.2</a></h3><h2 id="beta-support-for-dash-applications">Beta Support for Dash Applications</h2><p>Dash applications provide an easy way for Python users to create interactive applications and dashboards that help decision makers engage with their work.</p><p>Python users can develop Dash applications in the IDE of their choosing. Publishing an application to RStudio Connect is supported using the <a href="https://pypi.org/project/rsconnect-python/"><code>rsconnect-python</code></a> package. 
Refer to the <a href="https://docs.rstudio.com/connect/user/dash/">User Guide</a> for details.</p><p>Once deployed to RStudio Connect, publishers can control access to their application, add viewers or collaborators, and adjust runtime settings to maximize performance or scale to meet audience demand.</p><p><img src="bikeshare-dash.gif" alt="Example Dash Application on RStudio Connect 1.8.2"></p><p>This is a Dash application hosted on RStudio Connect that shows availability predictions for Washington DC’s docked bike-share stations. To see more examples like this, visit our <a href="https://solutions.rstudio.com/python/overview/">Solutions Engineering</a> website.</p><p><strong>What does &ldquo;Beta&rdquo; Mean?</strong> <em>Dash support is a beta feature which is still undergoing final testing before its official release. Should you encounter any bugs, glitches, lack of functionality or other problems, please let us know so we can improve before public release.</em></p><h3 align="center">Learn how data science teams use RStudio products<br/><a href="https://rstudio.com/solutions/r-and-python/">Visit R & Python - A Love Story</a></h3><h2 id="new--notable">New &amp; Notable</h2><ul><li>For <strong>Publishers</strong>, 1.8.2 makes it easy to share filtered content lists:</li></ul><img src="filter-links.gif" alt="Share filtered content links in RStudio Connect 1.8.2" style="width:500px;"/><p>Easily share links to custom views of the content dashboard page, such as specific tags or search results.</p><ul><li><p>For <strong>Administrators and Publishers</strong>, this release includes new default runtime settings that allow APIs and applications to scale more efficiently.</p></li><li><p>The User and Admin Guide documentation sites have been updated:</p><ul><li>Visit the <a href="https://docs.rstudio.com/connect/admin/">Admin Guide</a></li><li>Visit the <a href="https://docs.rstudio.com/connect/user/">User Guide</a></li></ul></li></ul><h2 
id="security-deprecations--breaking-changes">Security, Deprecations &amp; Breaking Changes</h2><ul><li><p><strong>Security</strong> Enforce locked user restrictions for active browser sessions.</p></li><li><p><strong>Breaking Change</strong> The <code>Postgres.URL</code> database connection URL no longer supports the <code>{$}</code> password placeholder. The <code>Postgres.URL</code> automatically uses the <code>Postgres.Password</code> value without a placeholder.</p></li><li><p><strong>Breaking Change</strong> The <code>Postgres.InstrumentationURL</code> database connection URL no longer supports the <code>{$}</code> password placeholder. The <code>Postgres.InstrumentationURL</code> automatically uses the <code>Postgres.InstrumentationPassword</code> value without a placeholder.</p></li><li><p><strong>Breaking Change</strong> Due to breaking changes to the <code>virtualenv</code> package, Python installations must have a version of <code>virtualenv</code> below 20, e.g.: <code>virtualenv&lt;20</code>. The version of <code>setuptools</code> must be 40.8 or higher. Incompatible versions will result in an error at startup.</p></li><li><p><strong>Deprecation</strong> The settings <code>SAML.IdPSigningCertificate</code> and <code>SAML.SPEncryptionCertificate</code> would previously accept the contents of a certificate as a long Base64 inline value. This option is no longer supported and a warning will be issued during startup if used. Now, these settings only support a path to a PEM certificate file.</p></li><li><p><strong>Deprecation</strong> The setting <code>SAML.IdPMetaData</code> has been deprecated. If this setting is currently used, a configuration migration will take place to transfer its value to either <code>SAML.IdPMetaDataURL</code> or <code>SAML.IdPMetaDataPath</code>. The deprecated metadata setting will be removed in a future release. 
The configuration migration for SAML settings is an automatic process that does not require immediate intervention, though a single manual step will be needed to complete it.</p></li></ul><p>Please review the full version of the release notes available <a href="http://docs.rstudio.com/connect/news">here</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>Due to breaking changes to the <code>virtualenv</code> package, Python installations must have a version of <code>virtualenv</code> below 20, e.g.: <code>virtualenv&lt;20</code>. The version of <code>setuptools</code> must be 40.8 or higher. Incompatible versions will result in an error at startup. Review the <a href="https://docs.rstudio.com/rsc/integration/python/">documentation</a> for more details. If you are upgrading from an earlier version, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><h3 align="center"><a href="https://rstudio.com/products/connect/">Click through to learn more about RStudio Connect</a></h3></description></item><item><title>Shiny Contest 2020 deadline extended</title><link>https://www.rstudio.com/blog/shiny-contest-2020-deadline-extended/</link><pubDate>Wed, 18 Mar 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-contest-2020-deadline-extended/</guid><description><p>The original deadline for Shiny Contest 2020 was this week, but given that many of us have had lots of unexpected changes to our schedules over the last week due to the COVID-19 outbreak, we have decided to extend the deadline by two weeks. If you&rsquo;ve been planning to submit an entry for the contest this week (and if history is any indicator, there may be a few of you out there), please feel free to take this additional time. 
The new deadline for the contest is 3 April 2020 at 5pm ET.</p></description></item><item><title>RStudio 1.3 Preview: The Little Things</title><link>https://www.rstudio.com/blog/rstudio-1-3-the-little-things/</link><pubDate>Tue, 17 Mar 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-3-the-little-things/</guid><description><p><em>This blog post is part of a series on new features in RStudio 1.3, currently available as a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a>.</em></p><p>In every RStudio release, we introduce dozens of small quality-of-life improvements alongside bigger headline features. This blog post concludes our series on the upcoming RStudio 1.3 release with a look at some of these little conveniences.</p><h3 id="global-replace">Global Replace</h3><p>RStudio has long had a <em>Find in Files</em> feature, which makes it possible to easily locate text in your project. If you&rsquo;re not familiar with this feature, try it out: press <em>Ctrl+Shift+F</em> (MacOS: <em>Cmd+Shift+F</em>), or choose <em>Find in Files&hellip;</em> from the <em>Edit</em> menu.</p><p>In RStudio 1.3, it&rsquo;s now possible to <strong>replace</strong> the text you found:</p><img align="center" src="global-replace.png"><p>After you&rsquo;ve done a search, switch to <em>Replace</em> view via the toggle, enter your new text, and click <em>Replace All</em>. It works with regular expressions, too.</p><h3 id="resizable-environment-columns">Resizable Environment Columns</h3><p>This really is a little thing, but it drove many of you nuts: the size of the columns in the Environment pane was fixed, so if your variables (or values) were long, it was awkward to try to see the whole thing. Now you can!</p><img align="center" src="resize-cols.png"><h3 id="new-file-templates">New File Templates</h3><p>Do you usually start new files with the same information? 
For example, do you usually include a header comment on your R scripts with metadata you know you&rsquo;ll find useful later? You can now have RStudio inject this header for you when you create a new file.</p><img align="center" src="script-template.png"><p>Create a template in <code>~/.config/rstudio/templates/default.R</code> (macOS/Linux) or <code>AppData/Roaming/RStudio/templates/default.R</code> (Windows) to try it out. It works with other file types, too; for example creating a file named <code>default.cpp</code> will set the content for new C++ files.</p><p>If you&rsquo;re an RStudio Server administrator, you can set templates for all the users on your server, which can be helpful if your organization has standards around file headers and structure. Read more in <a href="https://docs.rstudio.com/ide/server-pro/1.3.898-1/r-sessions.html#default-document-templates">Default Document Templates</a> from the admin guide.</p><h3 id="autosave">Autosave</h3><p>RStudio automatically keeps its own backup copy of files you&rsquo;re editing so that you don&rsquo;t lose changes. We&rsquo;ve improved this in two ways in the 1.3 release:</p><img align="center" src="auto-save.png"><ol><li>When enabled, RStudio will automatically save open files as they are changed. This is useful if you don&rsquo;t want to have to remember to manually save and just want your changes saved at all times.</li><li>You can also disable the auto-backup, or change the interval at which it is performed. 
This is useful if you are storing your projects on a cloud-synchronized folder, which sometimes struggles to keep up with RStudio&rsquo;s frequent writes to the backup copy.</li></ol><h3 id="terminal-ergonomics">Terminal Ergonomics</h3><p>You can now set the initial working directory of new terminals, so it&rsquo;s less likely you&rsquo;ll have to begin each terminal session with the same old <code>cd</code> command.</p><img align="center" src="terminal-options.png"><p>We&rsquo;ve also added a bunch of commands designed to reduce the number of times you need to manually paste cumbersome file and directory paths between the IDE and the terminal.</p><img align="center" src="terminal-commands.png"><p>Specifically, we&rsquo;ve added:</p><ul><li>A command to open a new terminal at the location of the current editor file</li><li>A command to insert the full path and filename of the current editor file into the terminal</li><li>A command in the File pane to open a new terminal at the File pane&rsquo;s current location</li><li>A command to change the terminal to the current RStudio working directory</li></ul><h3 id="shiny-background-jobs">Shiny Background Jobs</h3><p>RStudio can now run Shiny applications as background jobs in the Jobs tab we added in RStudio 1.2.</p><img align="center" src="shiny-background.png"><p>This has a couple of advantages:</p><ul><li>You can continue to use the R console while your Shiny application runs!</li><li>Your Shiny application runs in a fresh R session. 
This makes it easier to keep your application&rsquo;s code reproducible, since any implicit dependencies on your interactive session will prevent the application from running successfully in the background, surfacing them immediately.</li></ul><p>Note, however, that you can&rsquo;t use RStudio&rsquo;s debugging interface with a Shiny application running in the background, since it is part of a separate R session.</p><h3 id="wrapup">Wrapup</h3><p>If you&rsquo;d like to try out any of these features, we welcome you to download the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview</a> and give them a spin!</p><p>We hope these little changes make a big difference in your day-to-day work, and we&rsquo;d love to hear your feedback on the <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>.</p><p>Finally, we&rsquo;re grateful to you, the R community, for the wealth of ideas, support, and bug reports that have helped us build this release. We couldn&rsquo;t have done it without you. Watch this space for an announcement of the stable release soon!</p></description></item><item><title>RStudio 1.3 Preview: Remote Sessions in RStudio Desktop Pro</title><link>https://www.rstudio.com/blog/rstudio-1-3-preview-desktop-pro/</link><pubDate>Tue, 10 Mar 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-3-preview-desktop-pro/</guid><description><p><em>This blog post is part of a series on new features in RStudio 1.3, currently available as a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a>.</em></p><p>Today, we&rsquo;re going to talk about an exciting new feature of RStudio Desktop Pro: the ability to connect to remote sessions running on an existing RStudio Server Pro instance. This new Desktop Pro feature allows you to launch sessions remotely from your desktop to do your work on a more powerful computing system than your local machine, while providing the enhanced look-and-feel of the desktop application. 
Alternatively, you can continue to run your desktop sessions on your local machine and run resource-intensive background jobs on a remote server, allowing your local resources to be used for more focused tasks.</p><p>When connected to a remote server, the IDE looks almost exactly the same as it normally does, but it will indicate which server you are connected to, as well as provide a means for managing sessions on that server.</p><img align="center" src="remote-session-ide.png"><h2 id="prerequisites">Prerequisites</h2><p>In order to use RStudio Desktop Pro to launch remote sessions, you will first need to ensure you have an existing installation of RStudio Server Pro. You will also want to ensure that the version number of RStudio Server Pro matches that of the RStudio Desktop Pro version in use.</p><h2 id="adding-remote-session-servers">Adding Remote Session Servers</h2><p>Once RStudio Server Pro is running, you can add it to Desktop Pro as a new <em>Session Server</em> by clicking on the <em>Session Server Settings Dialog</em> from within Desktop Pro, which you can find underneath the connection status dropdown in the upper right corner of the IDE.</p><img align="center" src="connection-status-dropdown.png"><p>Clicking on <em>Session Server Settings</em> will present you with a dialog where you can add and remove session servers, each one indicating a remote RStudio Server Pro instance that you can connect to.</p><img align="center" src="session-server-settings.png"><p>Upon clicking on the <code>Add</code> button, you will be shown the following dialog to add a session server. Simply give the server the desired name and add the base URL of your RStudio Server Pro instance.</p><img align="center" src="add-session-server.png"><p>For most installations, these settings are sufficient. 
However, some installations may want to add path mappings.</p><h3 id="path-mappings">Path Mappings</h3><p>Path mappings allow you to map local system paths to remote server paths that will automatically be replaced when running Launcher Jobs on that particular session server. For example, if you map a shared drive on your local Windows machine at <code>H:</code> and this maps to <code>/shared/code</code> on your remote session server, you can add a path mapping from <code>H:-&gt;/shared/code</code> to ensure Launcher job paths including local paths are properly rewritten to remote paths.</p><h2 id="switching-between-servers">Switching Between Servers</h2><p>You can switch between remote session servers via the aforementioned <em>Connection Status Dropdown</em> - simply select which server to connect to, and the desktop will reload, connected to the selected server.</p><p>By default, once you have session servers defined, you will be prompted to select where to start your session every time you start Desktop Pro - locally, or on one of the specified servers. You can change this behavior by changing the <code>Start new sessions</code> setting on the <em>Session Server Settings</em> dialog.</p><p>Additionally, if you hold down the <em>Alt</em> key while launching Desktop Pro, you will be prompted to select where to start your session, regardless of any other settings.</p><h2 id="try-it-out">Try it Out!</h2><p>If you&rsquo;d like to try out the new Desktop Pro, you can download the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.3 Preview</a>. 
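To make the path-mapping idea above concrete: conceptually it is a longest-prefix rewrite applied to local paths before a Launcher job runs on the remote host. The sketch below is purely illustrative, not RStudio&rsquo;s implementation; the <code>rewrite_path</code> helper is an invented name:

```python
# Illustrative sketch of a path-mapping rewrite (e.g. H: -> /shared/code).
# Not RStudio's actual code; rewrite_path is a hypothetical helper.
def rewrite_path(local_path, mappings):
    """Rewrite a local path using the longest matching local prefix."""
    for local_prefix, remote_prefix in sorted(mappings, key=lambda m: -len(m[0])):
        if local_path.startswith(local_prefix):
            remainder = local_path[len(local_prefix):]
            # Launcher jobs run on the remote (Linux) host, so normalize
            # Windows separators to POSIX ones.
            return remote_prefix + remainder.replace("\\", "/")
    return local_path  # no mapping applies; leave the path unchanged

mapped = rewrite_path(r"H:\projects\analysis.R", [("H:", "/shared/code")])
```

Under the <code>H:-&gt;/shared/code</code> mapping from the example above, a local script path such as <code>H:\projects\analysis.R</code> would come out as <code>/shared/code/projects/analysis.R</code>.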
For more detailed documentation on RStudio Pro features, see the <a href="http://docs.rstudio.com/ide/desktop-pro/1.3.881-1">admin guide</a>.</p></description></item><item><title>RStudio 1.3 Preview: Accessibility</title><link>https://www.rstudio.com/blog/rstudio-1-3-preview-accessibility/</link><pubDate>Wed, 04 Mar 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-3-preview-accessibility/</guid><description><p><em>This blog post is part of a series on new features in RStudio 1.3, currently available as a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a>.</em></p><h2 id="overview">Overview</h2><p>Screen reader users, and anyone who operates software entirely via keyboard or alternate input devices, have not been able to use the RStudio IDE due to accessibility shortcomings within the software itself. Troublesome areas have also cropped up for those with low vision or color blindness, and for those with auditory impairments (for example, the need for captioning on our online video resources).</p><p>RStudio is happy to announce that we have started identifying and tackling these issues. It will be an ongoing process, and moving forward we will continue making accessibility improvements to all our products, and incorporate accessibility considerations into how we design, build, test, document, and support our products. Please email accessibility-related questions and feedback to <a href="mailto:accessibility@rstudio.com">accessibility@rstudio.com</a>.</p><p>A lot of engineering work has gone into RStudio 1.3 to improve screen reader support, keyboard support, and general accessibility.</p><h2 id="accessibility-menu-and-options">Accessibility Menu and Options</h2><p>New to RStudio 1.3 is an accessibility submenu under the Help menu, and an accessibility panel in the Global Options dialog. These provide options of interest to screen reader and/or keyboard-only users. 
Some of these options will be discussed in relevant sections below.</p><div class="rstudio-showcase-row"><div class="rstudio-showcase-item"><img align="center" src="accessibility-options.png"></div></div><h2 id="keyboard-support">Keyboard Support</h2><p>People with motor disabilities often use computers without a mouse; they use a keyboard or a variety of specialized input devices that behave like keyboards. Thus, accessible keyboard support is crucial for them to be successful using RStudio. The same is true for people who are blind; they rely on the keyboard, in conjunction with a screen reader.</p><p>Important considerations for accessible keyboard support are:</p><ul><li>Can the user identify where keyboard focus is currently located?</li><li>Without using a mouse, can keyboard focus be moved to all parts of the user interface, and can those user interface elements be operated via the keyboard?</li><li>Does focus move around in a consistent and predictable manner?</li></ul><p>If you are a mouse user, this may not seem important. Typing generally happens where you last clicked the mouse or where you see a blinking caret, and if you need to click on a widget, you&rsquo;ll click on the widget. However, being able to reduce use of the mouse and keep your hands on the keyboard can provide productivity and efficiency gains for all users, and possibly even reduce Repetitive Stress Injury (RSI).</p><h3 id="focus-indicator">Focus Indicator</h3><p>For the sighted or partially sighted keyboard user not using a screen reader, a visual indicator of focus location is critical. Focus can be moved either through RStudio-specific keyboard shortcuts (e.g. Ctrl+2 moves focus to the RStudio console), or via Tab and Shift+Tab, and in some cases, the arrow keys.</p><p>When moving focus with the keyboard, RStudio 1.3 now shows a blue focus rectangle around the currently focused control. 
When moving focus by clicking the mouse, the focus rectangles are often kept hidden, but tapping the Shift key will temporarily trigger display of the focus rectangle.</p><p>Some areas of RStudio indicate focus via a caret, such as the blinking cursor in the Console when it has focus, instead of a focus rectangle.</p><p>Here is an animated GIF screen capture showing navigation through the New Project dialog with the Tab key.</p><div class="rstudio-showcase-row"><div class="rstudio-showcase-item"><img src="focus-indicator-loop.gif" alt="Animation showing Tabbing through New Project dialog"></div></div><p>In prior versions of RStudio, this dialog was completely inaccessible via the keyboard. It did not show where focus was located, nor could the controls be activated via the spacebar as they can in 1.3.</p><h3 id="modal-dialog-keyboard-support">Modal Dialog Keyboard Support</h3><p>The above animation demonstrates that when focus is on the last control in the dialog (the Cancel button), hitting the Tab key wraps focus around to the start of the dialog and sets focus on the first control (the New Directory button in the example). In previous versions of RStudio, hitting Tab at the end of a modal dialog would send focus to user interface areas outside the dialog, areas that were intended to be &ldquo;under&rdquo; the dialog, and disabled. For example, it was possible to get focus back into the R console and enter commands while modal dialogs were displayed, breaking the modal nature of the dialogs and putting the RStudio application into undefined and untested states. This also had serious usability implications for screen reader users.</p><p>This has been fixed in 1.3, and focus now correctly stays in a modal dialog when using Tab or Shift+Tab at the end or beginning of the dialog.</p><h3 id="keyboard-operable-controls">Keyboard Operable Controls</h3><p>Once a control has focus, it needs to respond to standardized keyboard interaction patterns. 
For example, buttons should generally respond to spacebar the same way as they would to a mouse click, tab controls should respond to the arrow keys to move between the tabs, and so forth.</p><p>Most controls in the RStudio 1.3 IDE have been updated to respect these conventions, though work remains to improve keyboard support in areas such as the Files pane and the Environment pane, and others. The final RStudio 1.3 accessibility documentation will outline areas where keyboard support is still incomplete, with workarounds whenever possible.</p><h3 id="main-menu-rstudio-server">Main Menu (RStudio Server)</h3><p>With RStudio Desktop the main menu (File, Edit, and so on) is a standard application menu, and responds to standard system keyboard shortcuts. On Microsoft Windows, for example, tapping the Alt key will put focus on the menubar and then arrow keys can be used to move around; or Alt+F will bring up the File menu directly.</p><p>On RStudio Server, the main menu is actually part of the web page, and thus requires slightly different shortcuts. Tapping Alt (on Windows), for example, puts focus on the Web Browser&rsquo;s main menu, not the RStudio menu.</p><p>RStudio Server 1.3 adds customizable keyboard shortcuts to get focus to the main menu, at which point the arrow keys can be used to navigate around the menus in essentially the same manner as standard application menus.</p><p>Each menu has its own shortcut, but a couple of useful shortcuts to get started with are <strong>Alt+Shift+F</strong> to get to the File menu and <strong>Alt+Shift+H</strong> for the Help menu. Those work on Windows and Linux. On Mac, the keys are <strong>Ctrl+Option+F</strong> and <strong>Ctrl+Option+H</strong>. 
A full list of shortcuts is in the keyboard help (Help / Keyboard Shortcuts Help).</p><p>Prior to RStudio Server 1.3, the only way to get focus on the main menu was via the mouse.</p><h2 id="resize-splitters-with-keyboard">Resize Splitters with Keyboard</h2><p>RStudio 1.3 supports resizing the panes using the keyboard; previously the only way to do this was by dragging the splitters with the mouse.</p><p>To activate, use View / Panes / Adjust Left Splitter (or Right Splitter, or Center Splitter), then use the arrow keys to move it in larger increments. Use Shift+Arrow keys to make smaller adjustments. When done, just continue using RStudio (e.g. you can use the Tab key to move focus off the splitters).</p><p>In addition to opening up this ability to keyboard-only users, it has the side effect of making it possible for iPad users to adjust the splitters with their keyboard (the splitters are not currently operable via touch).</p><h3 id="tab-key-traps">Tab Key Traps</h3><p>In the RStudio source editor and console, the Tab key is normally used to indent code and to trigger autocomplete suggestions, and thus cannot be used to move focus out of the text editor or the console. This is known as a Tab trap in accessibility parlance.</p><p>The most common way around this is using other RStudio keyboard shortcuts to get focus elsewhere (Ctrl+1 to move focus to the source editor, Ctrl+2 to the console, etc.), or with screen reader commands.</p><p>Additionally, a new accessibility option is available to change this behavior. The &ldquo;Tab key always moves focus&rdquo; setting is available in the new Accessibility Options pane in the Global Options dialog, and also via the new Help / Accessibility / Focus submenu. 
When enabled, Tab and Shift+Tab will move focus in and out of these areas of the user interface.</p><h3 id="consistent-focus-behavior">Consistent Focus Behavior</h3><p>A number of issues remain in keeping keyboard focus changes consistent and predictable. For example, after displaying and then closing the RStudio Server main menu, or a dialog box such as Global Options, the keyboard focus will often end up in an indeterminate location and the user must then use Ctrl+2 to put focus back on the console (for example). Significant improvements in this area are planned as part of ongoing accessibility work.</p><h2 id="contrast">Contrast</h2><p>User interfaces must have enough contrast between text and its background to be readable by people with moderately low vision. The Web Content Accessibility Guidelines (WCAG) define the measuring technique and recommended minimum values to accommodate a wide range of low-vision scenarios.</p><p>Many areas of the RStudio interface have been reviewed and adjustments made to bring contrast up to meet minimum WCAG 2.1 AA standards. Some areas of the interface still require work, especially some non-textual elements such as toolbar buttons.</p><p>RStudio supports many visual themes, but only the appearance of the default light theme has been evaluated. Future work will include creation of specific high-contrast themes that go well beyond the minimum contrast recommendations.</p><h2 id="color-blindness">Color Blindness</h2><p>RStudio 1.3 does not include themes specifically designed for color blindness. We hope to include such themes in the future. 
In the meantime, some themes have been created by the RStudio community, for example the Pebble-Safe themes by Desi Quintans at <a href="https://github.com/DesiQuintans/Pebble-safe">https://github.com/DesiQuintans/Pebble-safe</a> (linked with their permission).</p><h2 id="reduced-motion">Reduced Motion</h2><p>The new &ldquo;Reduce user interface animations&rdquo; setting does very much what it suggests: when enabled, most of the subtle animations that take place in the RStudio user interface are suppressed. An example is when zooming or unzooming a pane, such as with Ctrl+Shift+2 to zoom the console. Normally zooming and unzooming causes the other panes to slide to their new positions. With animations disabled, these changes are instantaneous.</p><p>The option to reduce motion is provided both for those with vestibular disorders, who might prefer to disable these motions, and also for screen reader users, where these time-delayed changes can sometimes confuse the screen reader software. In fact, the &ldquo;reduce animations&rdquo; setting is enabled automatically when screen reader support is turned on.</p><h2 id="screen-reader-support-server-vs-desktop">Screen Reader Support (Server vs Desktop)</h2><p><strong>RStudio Server</strong> and <strong>RStudio Server Pro</strong> 1.3 are significantly improved over prior versions in their screen reader capabilities. This support is still very much a work-in-progress, and improvements will continue to be made in subsequent releases of RStudio to bring the screen reader experience up to the standards of accessibility and usability that users need and expect to get their work done.</p><p><strong>RStudio Desktop</strong> 1.3 (for Windows, Linux, and macOS) has most of the same improvements seen in RStudio Server, but due to underlying accessibility issues introduced by components used to build RStudio Desktop, we cannot yet recommend its use via a screen reader except in an experimental fashion. 
We are actively working with the developer of these components to get these issues addressed, and will be releasing updates with these fixes as soon as we can.</p><h3 id="enable-screen-reader-support">Enable Screen Reader Support</h3><p>When using RStudio Server 1.3 with a screen reader, it is essential to enable screen reader support. Once set, this setting is persisted for future RStudio sessions. The option is available to toggle via the main menu at <strong>Help / Accessibility / Screen Reader Support</strong>, or via the Global Options dialog, under the Accessibility panel.</p><p>Thanks to the new configuration and settings system in RStudio 1.3, as discussed in a <a href="https://blog.rstudio.com/2020/02/18/rstudio-1-3-preview-configuration/">prior blog post</a>, it is possible for an administrator to pre-enable this setting for an entire server, or for individual users. The important settings are:</p><pre><code>&quot;reduced_motion&quot;: true,
&quot;enable_screen_reader&quot;: true</code></pre><p>Note that the above technique is specific to RStudio Server; preconfiguring screen reader support for RStudio Desktop will be done in a different way and will be documented in the forthcoming accessibility documentation.</p><h3 id="tested-screen-reader-and-browser-combinations">Tested Screen Reader and Browser Combinations</h3><p>RStudio Server screen reader support has primarily been tested with current versions of NVDA on Google Chrome for Windows, and VoiceOver on Safari for macOS. Some testing has also been performed with NVDA and Firefox, and JAWS and Chrome.</p><p>The goal is to support all major screen readers and browsers, so please report issues with any of these.</p><h3 id="screen-reader-focus-location">Screen Reader Focus Location</h3><p>Screen reader software announces (via voice, or a refreshable braille display) details about each control as it receives focus. 
It is critical that each control is properly labeled in terms of identifying text (for example, the text on a button or next to a checkbox), the type of control (checkbox, button, tab control, menu, toolbar, and so forth), and the current state of the control (checked, disabled, selected, value of the text in a text box, etc.).</p><p>Most areas of RStudio 1.3 have been updated to ensure this is the case. In prior versions, many controls were missing some or all of this identifying information, making it impossible to meaningfully navigate via a screen reader.</p><h3 id="landmarks-and-regions">Landmarks and Regions</h3><p>The RStudio 1.3 page has been annotated to divide it up into named landmarks (also known as regions in some screen readers), providing another way to understand and move around the user interface. The regions in the default visual layout of RStudio are:</p><ul><li>Banner (the RStudio logo in the upper-left)</li><li>Navigation (the main menu and toolbar)</li><li>Main Workbench (the area below the main menu and toolbar, containing 4 quadrants)<ul><li>TabSet1 (upper-right quadrant, containing various feature tabs including Environment and History)</li><li>TabSet2 (lower-right quadrant, containing various feature tabs including Files and Help)</li><li>Source (upper-left quadrant, contains files open in text editor)</li><li>Console (lower-left, contains Console and other optional tabs such as Terminal and Jobs)</li></ul></li><li>Content Info Warning Bar (closable message bar occasionally shown below the workbench)</li></ul><h3 id="live-announcements">Live Announcements</h3><p>RStudio makes use of live announcements to notify the screen reader user of certain events. 
All potential automatic announcements are listed in the Announcements tab of the Accessibility Preferences pane in Global Options, and can be individually enabled or disabled.</p><p>For example, when a command is executed in the R console, output from that command is announced by the screen reader (up to a limit of 25 lines per command, configurable in the Accessibility Options). To read additional output, use the screen reader&rsquo;s virtual cursor mechanism to navigate the output area. Focus may also be moved to the output area via the <strong>Help / Accessibility / Focus / Focus Console Output</strong> command; the shortcut key is <strong>Alt+Shift+2</strong> on Windows and Linux, <strong>Shift+Option+2</strong> on Mac.</p><p>Some announcements may only be triggered on-demand by the user. For example, <strong>Help / Accessibility / Speak / Speak Text Editor Location</strong> will read details of the current location in the text editor, including line and column number, context, file type, and file name.</p><h2 id="future">Future</h2><p>Work continues to bring screen reader support to RStudio Desktop, and to expose more of the RStudio IDE features to screen reader and keyboard-only users.</p><h2 id="try-it-out">Try it Out!</h2><p>If you&rsquo;d like to give the new accessibility enhancements a try, we&rsquo;d very much welcome your feedback on our <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>. You can download the RStudio 1.3 preview here:</p><p><a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.3 Preview</a></p><p>For more, check out the <a href="https://support.rstudio.com/hc/en-us/articles/360044226673-RStudio-Accessibility-Features">support article on RStudio accessibility</a>.</p><p>Additionally, we have an alpha program where we provide access to a server running the latest RStudio Server build with accessibility features enabled. 
Apply for an account by emailing <a href="mailto:accessibility@rstudio.com">accessibility@rstudio.com</a> and provide some background on the accessibility functionality you are interested in evaluating.</p></description></item><item><title>RStudio Package Manager 1.1.2 - Windows</title><link>https://www.rstudio.com/blog/rstudio-package-manager-1-1-2-windows/</link><pubDate>Thu, 27 Feb 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-package-manager-1-1-2-windows/</guid><description><p>RStudio Package Manager 1.1.2 introduces beta support for Windows package binaries. These binaries make it easier and faster to install R packages on Windows Desktop. With this release, all the benefits of Package Manager are available to desktop users including versioned repositories, curated subsets of CRAN, centralized access to CRAN, Git, local packages, and usage tracking. Now data scientists on Windows can easily share work, collaborate, and spend more time doing analysis instead of debugging packages.</p><img src="rspm-112-os.png" caption="Pick Windows OS as client" alt="Pick Windows OS as client" class="center" width="50%"><h2 id="other-updates">Other Updates</h2><p>In addition to adding support for Windows package binaries, the 1.1.2 release includes:</p><ul><li><a href="https://docs.rstudio.com/rspm/1.1.2/admin/appendix-configuration.html#appendix-configuration-eviction">Eviction policies</a> that make it easier for administrators to manage the amount of utilized space.</li><li>Package binaries and system dependencies are now available for CentOS/RHEL 8.</li><li>A <a href="https://docs.rstudio.com/rspm/1.1.2/admin/changing-database-provider.html">migration utility</a> to help upgrade from a single node installation to a highly available installation using Postgres.</li><li>A performance improvement that prevents increased CPU usage over time on installations using SQLite.</li></ul><p>Please review the <a href="https://docs.rstudio.com/rspm/news">full 
release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>Upgrading to 1.1.2 from 1.1.0 is a minor upgrade. However, be aware that this upgrade may take up to 30 minutes to complete. If you are upgrading a multi-node installation, allow the first node to update completely before upgrading other nodes. If you are upgrading from an earlier version, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><p>Package management is critical for making your data science reproducible, over time, and across your organization. Wondering where you should start? <a href="mailto:sales@rstudio.com">Email us</a>, our product team is happy to help!</p><h4 id="new-to-rstudio-package-manager">New to RStudio Package Manager?</h4><p><a href="https://rstudio.com/products/package-manager/">Download</a> the 45-day evaluation today to see how RStudio Package Manager can help you, your team, and your entire organization access and organize R packages. Learn more with our <a href="https://demo.rstudiopm.com">online demo server</a> or <a href="https://resources.rstudio.com/webinars/introduction-to-the-rstudio-package-manager-sean-lopp">latest webinar</a>.</p><ul><li><a href="https://docs.rstudio.com/rspm/admin">Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2018/07/RStudio-Package-Manager-Overview.pdf">Overview PDF</a></li><li><a href="https://resources.rstudio.com/webinars/introduction-to-the-rstudio-package-manager-sean-lopp">Introductory Webinar</a></li><li><a href="https://demo.rstudiopm.com">Online Demo</a></li></ul></description></item><item><title>RStudio 1.3 Preview: Integrated Tutorials</title><link>https://www.rstudio.com/blog/rstudio-1-3-integrated-tutorials/</link><pubDate>Tue, 25 Feb 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-3-integrated-tutorials/</guid><description><p><em>This blog post is part of a series on new features in RStudio 1.3, currently available as a 
<a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a>.</em></p><p>We&rsquo;re excited to announce that RStudio v1.3 will gain a newly-minted pane: the <strong>Tutorial</strong> pane, used to host tutorials powered by the <a href="https://rstudio.github.io/learnr/"><strong>learnr</strong></a> package.</p><p>The <strong>learnr</strong> package makes it easy to turn any <a href="https://rmarkdown.rstudio.com/">R Markdown</a> document into an interactive tutorial. Here are some example tutorials from the <strong>learnr</strong> package, hosted on <a href="https://www.shinyapps.io/">shinyapps.io</a>:</p><div class="rstudio-showcase-row"><div class="rstudio-showcase-item"><h3>Summarizing Data</h3><a href="https://learnr-examples.shinyapps.io/ex-data-summarise/"><img src="tutorial-ex-data-summarise.png" alt="Summarizing Data with R"></a></div><div class="rstudio-showcase-item"><h3>Filtering Data</h3><a href="https://learnr-examples.shinyapps.io/ex-data-filter/"><img src="tutorial-ex-data-filter.png" alt="Filtering Data with R"></a></div></div><p>A <em>learnr</em> tutorial can include any of the following:</p><ol><li>Narrative, figures, illustrations and equations,</li><li>Code exercises (R code chunks that users can edit and execute),</li><li>Quiz questions,</li><li>Videos,</li><li>Interactive Shiny components.</li></ol><p>With the <strong>Tutorial</strong> pane, it is now possible to work through a <em>learnr</em> tutorial directly from the comfort of the RStudio IDE. 
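</p><p>To give a flavor of what these documents look like, a <em>learnr</em> tutorial is an ordinary R Markdown document that uses the <code>learnr::tutorial</code> output format and the <code>shiny_prerendered</code> runtime. Here is a minimal sketch (the title, chunk labels, and exercise content are placeholders of our own, not taken from the <strong>learnr</strong> documentation):</p><pre><code>---
title: &quot;My First Tutorial&quot;
output: learnr::tutorial
runtime: shiny_prerendered
---

```{r setup, include = FALSE}
# Load learnr so the exercise chunks below become interactive
library(learnr)
```

## Exercise

Edit the code below to add two and two, then click Run Code.

```{r addition, exercise = TRUE}
1 + 1
```</code></pre><p>Rendering a document like this produces the interactive exercises and narrative described above. 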
You can use the RStudio IDE to learn, reflect, and tinker as you work through your running tutorial.</p><img align="center" src="rstudio-tutorials.png"><h2 id="browsing-tutorials">Browsing Tutorials</h2><p>RStudio will automatically index and display the tutorials provided by the installed R packages in your R library paths:</p><img align="center" src="rstudio-tutorials-list.png"><p>You can use this list to browse and run tutorials at your convenience.</p><h2 id="authoring-tutorials">Authoring Tutorials</h2><p>The <strong>learnr</strong> package bundles a selection of tutorials that will introduce users to R, RStudio and the <a href="https://www.tidyverse.org/">Tidyverse</a>, but we hope the R community at large will find this a useful medium for creating and sharing their own R tutorials. If you&rsquo;re interested in authoring your own <em>learnr</em> tutorials, please see the <a href="https://rstudio.github.io/learnr/publishing.html">Publishing</a> article on the <a href="https://rstudio.github.io/learnr/"><strong>learnr</strong> website</a>.</p><h2 id="try-it-out">Try it Out!</h2><p>The <strong>Tutorial</strong> pane is available in the latest iteration of the RStudio v1.3 preview release. 
You can download the latest preview release here:</p><p><a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.3 Preview</a></p><p>We&rsquo;d also like to take this time to highlight a small selection of R packages developed and shared by members of the R community that provide their own <em>learnr</em> tutorials:</p><ul><li><p><a href="https://vegawidget.github.io/vegawidget/">vegawidget</a> can be used to render charts specified using the <a href="https://vega.github.io/vega/">Vega</a> visualization grammar, and includes a <em>learnr</em> tutorial exploring how the package can be used;</p></li><li><p><a href="https://rstudio.github.io/sortable/">sortable</a> includes a tutorial showing how sortable widgets can be included in your own <em>learnr</em> tutorials;</p></li><li><p><a href="https://cran.r-project.org/package=sur">sur</a> includes a tutorial that (quite comprehensively!) explores the manipulation of R data frames, including how missing data can be handled.</p></li></ul><p>If you&rsquo;d like to try out the tutorials bundled in these packages, you can install the packages from CRAN with:</p><pre><code>install.packages(c(&quot;vegawidget&quot;, &quot;sortable&quot;, &quot;sur&quot;))</code></pre><p>and their associated tutorials will automatically become available in the <strong>Tutorial</strong> pane.</p><hr><p>Questions? Comments? 
Please share your feedback with us on the RStudio <a href="https://community.rstudio.com/c/rstudio-ide">community forums</a>.</p></description></item><item><title>RStudio 1.3 Preview: Configuration and Settings</title><link>https://www.rstudio.com/blog/rstudio-1-3-preview-configuration/</link><pubDate>Tue, 18 Feb 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-3-preview-configuration/</guid><description><p><em>This blog post is part of a series on new features in RStudio 1.3, currently available as a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a>.</em></p><p>Today, we&rsquo;re going to talk about a number of improvements we&rsquo;ve made to RStudio 1.3 around configuration and settings. To set the stage, here&rsquo;s how you configure RStudio today:</p><img align="center" src="global-options.png"><p>This point-and-click dialog makes it easy for users to select the settings they want, but has a couple of limitations:</p><ol><li>For users, there is no way to back up or save settings in e.g., a <a href="https://dotfiles.github.io/">dotfiles repo</a>, nor a way to view or manipulate preferences with external tools.</li><li>For administrators, there is no way to establish defaults for users.</li></ol><p>In RStudio 1.3, we&rsquo;ve overhauled the settings and configuration system to address both of these issues, and along the way we&rsquo;ve made several portions of the IDE more amenable to configuration.</p><h2 id="user-preferences">User Preferences</h2><p>All the preferences in the Global Options dialog (and a number of other preferences that aren&rsquo;t) are now saved in a simple, plain-text JSON file named <code>rstudio-prefs.json</code>. 
Here&rsquo;s an example:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-json" data-lang="json">{
  <span style="color:#062873;font-weight:bold">&#34;posix_terminal_shell&#34;</span>: <span style="color:#4070a0">&#34;bash&#34;</span>,
  <span style="color:#062873;font-weight:bold">&#34;editor_theme&#34;</span>: <span style="color:#4070a0">&#34;Night Owl&#34;</span>,
  <span style="color:#062873;font-weight:bold">&#34;wrap_tab_navigation&#34;</span>: <span style="color:#007020;font-weight:bold">false</span>
}</code></pre></div><p>The example above instructs RStudio to use the <code>bash</code> shell for the <strong>Terminal</strong> tab, to apply the <em>Night Owl</em> theme to the IDE, and to avoid wrapping around when navigating through tabs. All other settings will have their default values.</p><p>By default, this file lives in <code>AppData/Roaming/RStudio</code> on Windows, and <code>~/.config/rstudio</code> on other systems. While RStudio writes this file whenever you change a setting, you can also edit it yourself to change settings. You can back it up, or put it in a version control system. You&rsquo;re in control!</p><p>If you&rsquo;re editing this file by hand, you&rsquo;ll probably want a reference. 
A full list of RStudio&rsquo;s settings, along with their data types, allowable values, etc., can be found in the <a href="https://docs.rstudio.com/ide/server-pro/1.3.820-1/session-user-settings.html">Session User Settings</a> section of the RStudio Server Professional Administration Guide.</p><h2 id="administration-and-the-xdg-standard">Administration and the XDG Standard</h2><p>If you&rsquo;re an administrator of an RStudio Server, you can establish defaults for any setting by using a global set of user preferences, placed here:</p><pre><code>/etc/rstudio/rstudio-prefs.json</code></pre><p>RStudio&rsquo;s new configuration system complies with the <a href="https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html">XDG Base Directory Specification</a>. This means that in addition to using XDG defaults for most directories, it is also possible to customize the location using environment variables. For example, you can set <code>XDG_CONFIG_HOME</code> for your users so that their configuration is loaded from somewhere other than <code>~/.config</code>, or <code>XDG_CONFIG_DIRS</code> to establish a different folder for server-wide configuration.</p><h2 id="the-configuration-folder">The Configuration Folder</h2><p>The user preferences aren&rsquo;t the only thing that lives in the configuration folder. In RStudio 1.3, we&rsquo;ve reorganized a number of user-level files and settings so that they&rsquo;re all in the same place. 
This makes your RStudio configuration much more portable; simply unpacking a backup of this folder will make it possible to apply all of your RStudio customizations at once.</p><p>Here&rsquo;s what&rsquo;s inside:</p><table><thead><tr><th>File/Folder</th><th>Content</th></tr></thead><tbody><tr><td><code>rstudio-prefs.json</code></td><td>User preferences</td></tr><tr><td><code>dictionaries/</code></td><td>Custom spelling dictionaries</td></tr><tr><td><code>keybindings/</code></td><td>Editor and workbench keybindings, in JSON format</td></tr><tr><td><code>snippets/</code></td><td>Console and source snippets (<code>*.snippet)</code></td></tr><tr><td><code>templates/</code></td><td>Default content for new files</td></tr><tr><td><code>themes/</code></td><td>Custom color themes (<code>*.rstheme</code>)</td></tr></tbody></table><p>Every one of these elements can now be configured both globally (in e.g., the <code>/etc/rstudio</code> configuration folder) and per-user (in e.g., the <code>~/.config/rstudio</code> folder).</p><p>So, for example, an administrator could pre-install custom themes for their users by placing them in <code>/etc/rstudio/themes/</code>, and then instruct RStudio to default to one of the custom themes by changing the <code>editor_theme</code> setting in <code>/etc/rstudio/rstudio-prefs.json</code>. Or, they could establish a system-wide default template for <code>.R</code> files in <code>/etc/rstudio/templates/default.R</code>.</p><p>More information is available in the Administration Guide here:</p><p><a href="https://docs.rstudio.com/ide/server-pro/1.3.820-1/r-sessions.html#customizing-session-settings">Customizing Session Settings</a></p><h2 id="try-it-out">Try it Out!</h2><p>If you&rsquo;d like to give the new configuration system a spin, we&rsquo;d very much welcome your feedback on our <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>. 
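</p><p>One thing worth noting while you experiment: since the preferences file is plain JSON, you can also script changes to it. Here&rsquo;s a minimal sketch using the <strong>jsonlite</strong> package (the path assumes the default Linux location, so adjust it for your platform; this is an illustration, not an officially supported API):</p><pre><code>library(jsonlite)

# Location of the user-level preferences file (Linux default)
prefs_path &lt;- &quot;~/.config/rstudio/rstudio-prefs.json&quot;

# Read the current preferences, change one setting, and write them back
prefs &lt;- fromJSON(prefs_path)
prefs$editor_theme &lt;- &quot;Night Owl&quot;
write_json(prefs, prefs_path, auto_unbox = TRUE, pretty = TRUE)</code></pre><p>RStudio will pick up the edited file the next time a session starts. 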
You can download the RStudio 1.3 preview here:</p><p><a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.3 Preview</a></p></description></item><item><title>Shiny Contest 2020 is here!</title><link>https://www.rstudio.com/blog/shiny-contest-2020-is-here/</link><pubDate>Wed, 12 Feb 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-contest-2020-is-here/</guid><description><p>Over the years, we have loved interacting with the Shiny community and loved seeing and sharing all the exciting apps, dashboards, and interactive documents Shiny developers have produced. So last year we launched the Shiny contest and we were overwhelmed (in the best way possible!) by the <a href="https://community.rstudio.com/tag/shiny-contest">136 submissions</a>! Reviewing all these submissions was incredibly inspiring and humbling. We really appreciate the time and effort each contestant put into building these apps, as well as submitting them as fully reproducible artifacts via <a href="https://rstudio.cloud/">RStudio Cloud</a>.</p><p>And now it&rsquo;s time to announce the 2020 Shiny Contest, which will run from 29 January to 20 March 2020. (Actually, we announced it at rstudio::conf(2020), but now it&rsquo;s time to make it blog-official!)</p><p>You can submit your entry for the contest by filling out the form at <a href="http://rstd.io/shiny-contest-2020">rstd.io/shiny-contest-2020</a>. The form will generate a post on <a href="http://community.rstudio.com/">RStudio Community</a>, which you can then edit further if you like. The deadline for submissions is 20 March 2020 at 5pm ET. We strongly recommend getting in your submission a few hours before this time so that you have ample time to resolve any last-minute technical hurdles.</p><p>You are welcome to either submit an existing Shiny app or create a new one during the two-month contest window. And there is no limit on the number of entries one participant can submit. 
Please submit as many as you wish!</p><h2 id="requirements">Requirements</h2><ul><li>Data and code used in the app should be publicly available and/or openly licensed.</li><li>Your app should be <a href="http://shiny.rstudio.com/articles/shinyapps.html">deployed on shinyapps.io</a>.</li><li>Your app should be in an <a href="http://rstudio.cloud">RStudio Cloud</a> project.<ul><li>If you’re new to <a href="http://rstudio.cloud">RStudio Cloud</a> and <a href="http://www.shinyapps.io/">shinyapps.io</a>, you can create an account for free. Additionally, you can find instructions specific to this contest <a href="https://docs.google.com/document/d/1p-5Ls2kEU9TUoUTQfBNqwEMPoAL0eNHceKRXDZ1koXc/edit?usp=sharing">here</a> and find the general RStudio Cloud guide <a href="https://rstudio.cloud/learn/guide">here</a>.</li></ul></li></ul><h2 id="criteria">Criteria</h2><p>Just like last year, apps will be judged based on technical merit and/or on artistic achievement (e.g., UI design). We recognize that some apps may excel in one of these categories and some in the other, and some in both. Evaluation will be done keeping this in mind.</p><p>Evaluation will also take into account the narrative on the contest submission post as well as the feedback/reaction of other users on RStudio Community. 
We recommend crafting your submission post with this in mind.</p><h2 id="awards">Awards</h2><h3 id="honorable-mention-prizes">Honorable Mention Prizes:</h3><ul><li>One year of shinyapps.io Basic plan</li><li>A bunch of hex stickers of RStudio packages</li><li>A spot on the Shiny User Showcase</li></ul><h3 id="runner-up-prizes">Runner Up Prizes:</h3><p>All awards above, plus</p><ul><li>Any number of RStudio t-shirts, books, and mugs (worth up to $200)</li></ul><h3 id="grand-prizes">Grand Prizes:</h3><p>All awards above, and</p><ul><li>Special &amp; persistent recognition by RStudio in the form of a winners page, and a badge that&rsquo;ll be publicly visible on your RStudio Community profile</li><li>Half-an-hour one-on-one with a representative from the RStudio Shiny team for Q&amp;A and feedback</li></ul><p>Please note that we may not be able to send t-shirts, books, or other items larger than stickers to non-US addresses.</p><p>The names and work of all winners will be highlighted in the Shiny User Showcase and we will announce them on RStudio’s social platforms, including community.rstudio.com (unless the winner prefers not to be mentioned). 
This year’s competition will be judged by Winston Chang and Mine Çetinkaya-Rundel.</p><p>We will announce the winners and their submissions on the RStudio blog, RStudio Community, and also on Twitter.</p><h2 id="need-inspiration">Need inspiration?</h2><p>Browse the winning apps and honorable mentions of last year&rsquo;s contest on the <a href="https://shiny.rstudio.com/gallery/">Shiny User Showcase</a> as well as the contest submissions at the <a href="https://community.rstudio.com/tag/shiny-contest">shiny-contest tag on RStudio Community</a>.</p></description></item><item><title>RStudio 1.3 Preview: Real Time Spellchecking</title><link>https://www.rstudio.com/blog/rstudio-1-3-preview-real-time-spellchecking/</link><pubDate>Tue, 11 Feb 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-3-preview-real-time-spellchecking/</guid><description><p>As part of the upcoming 1.3 release of the RStudio IDE we are excited to show you a preview of the real time spellchecking feature that we&rsquo;ve added.</p><p>Prior to RStudio 1.3, spellchecking was an active process requiring the user to step word by word in a dialog. By integrating spellchecking directly into the editor we can check words as you&rsquo;re typing them and are able to give suggestions on demand.</p><p>As with the prior spellcheck implementation (that is still invoked with the toolbar button) the new spellchecking is fully Hunspell compatible and any previous custom dictionaries can be used.</p><h2 id="interface-and-usage">Interface and usage</h2><p>Using the new real time spellchecking is simple with a standard and familiar interface. A short period of time after typing in a saved file of a supported format (R Script, R Markdown, C++, and more) a yellow underline will appear under words that don&rsquo;t pass the spellcheck of the loaded dictionary. Right click the word for suggestions, to ignore it, or to add it to your own local dictionary to be remembered for the future. 
The application will only check comments in code files and raw text in markdown files.</p><img align="center" style="padding: 35px;" src="context_menu_example.png"><p>In consideration of the domain specific language of RStudio users we have collected, and are constantly adding to, a <a href="https://github.com/rstudio/rstudio/blob/master/src/gwt/src/org/rstudio/studio/client/common/spelling/domain_specific_words.csv">whitelisted set of words</a> to reduce the noise of the spellcheck on initial usage. Sadly, <em>hypergeometric</em> and <em>reprex</em> are not yet listed in universal language dictionaries.</p><h2 id="customization">Customization</h2><p>The new spellcheck feature might not be for you, and that&rsquo;s ok. If you don&rsquo;t want your tools constantly questioning your spelling this feature is easy to turn off. Navigate to <strong>Tools -&gt; Global Options -&gt; Spelling</strong> and disable the &ldquo;Use real time spellchecking&rdquo; check box.</p><p>If you are typing in another language or really insist that this section should be spelled <em>customisation</em> you can switch to a different dictionary with the same dropdown interface as RStudio 1.2. Both the manual and real time spellcheck options will load the same dictionary. If none of the dictionaries shipped with RStudio fit your needs you can add any Hunspell compatible dictionary by clicking the <em>Add&hellip;</em> button next to the list of custom dictionaries.</p><img align="center" style="padding: 35px;" src="spellcheck_options.png"><h3 id="just-the-tip-of-the-iceberg">Just the tip of the iceberg</h3><p>This is just a small sample of what&rsquo;s to come in RStudio 1.3 and we&rsquo;re excited to show you more of what we&rsquo;ve been working hard on since 1.2 in the coming weeks. 
Stay tuned for more!</p><p>You can download the new RStudio 1.3 Preview release to try it out yourself:</p><p><a href="https://www.rstudio.com/products/rstudio/download/preview/">Download RStudio 1.3 Preview</a></p><p>Feedback is welcome on the <a href="https://community.rstudio.com/c/rstudio-ide">RStudio IDE Community Forum</a>.</p></description></item><item><title>RStudio, PBC</title><link>https://www.rstudio.com/blog/rstudio-pbc/</link><pubDate>Wed, 29 Jan 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-pbc/</guid><description><p>We started the RStudio project because we were excited and inspired by R. The <a href="http://www.r-project.org/contributors.html">creators of R</a> provided a flexible and powerful foundation for statistical computing; then made it free and open so that it could be improved collaboratively and its benefits could be shared by the widest possible audience.</p><p>It&rsquo;s better for everyone if the tools used for research and science are free and open. Reproducibility, widespread sharing of knowledge and techniques, and the leveling of the playing field by eliminating cost barriers are but a few of the shared benefits of free software in science.</p><p>RStudio&rsquo;s mission is to create free and open-source software for data science, scientific research, and technical communication. To that end, we currently lead contributions to over 250 open-source projects. To support this work, RStudio also sells a variety of commercial software products that enable teams to adopt open-source data science software at scale; along with online services to make it easier to learn and use data science tools over the web.</p><p>Melding the mission of creating open-source software with the imperatives of sustaining a commercial enterprise is a tricky business. 
It’s especially so today, as corporations are frequently forced into doing whatever it takes to sustain growth and provide returns to shareholders, even against the interests of their own customers! Users should be wary of the underlying motivations and goals of software companies, especially ones that provide the essential tools required to carry out their work.</p><p>In order to truly fulfill our open-source mission, RStudio needs to be uncompromisingly run for the benefit of all stakeholders, including employees, customers, and the community at large. Additionally, RStudio needs to earn the trust of its users, not just through its actions, but also its formal corporate charter. Until recently, there was no means under US corporate law for companies to put their mission and other stakeholders on equal footing with shareholders. Fortunately, thanks to the <a href="https://en.wikipedia.org/wiki/Benefit_corporation">B-Corp</a> movement, we now have a tool to do so: the Public Benefit Corporation. <strong>Today, we are thrilled to announce that RStudio has become a Public Benefit Corporation.</strong> RStudio, Inc. is now RStudio, PBC.</p><p>By becoming a PBC, we have codified our open-source mission into our charter, which means that our corporate decisions must both align with this mission, as well as balance the interests of community, customers, employees, and shareholders. As a PBC, RStudio will publish an annual report that describes the public benefit we have created, along with how we seek to provide public benefits in the future. The first of these annual reports is <a href="https://www.rstudio.com/about/pbc-report-2019">available on our website</a> today.</p><p>As part of this transition, we have also been recognized as a <a href="https://bcorporation.net/">Certified B Corporation (B Corp)</a>, joining a group of for-profit companies assessed to meet the highest standards of social and environmental performance, transparency, and accountability. 
These standards are measured by the non-profit B Lab’s “Impact Assessment”, a rigorous assessment of a company’s impact on its workers, customers, community, and environment. Details of this assessment can be found at: <a href="https://bcorporation.net/directory/rstudio">https://bcorporation.net/directory/rstudio</a>.</p><p>More details on our transition to Benefit Corporation are available in my <a href="https://rstudio.com/slides/rstudio-pbc">keynote slides</a> from rstudio::conf 2020 as well as our first annual <a href="https://www.rstudio.com/about/pbc-report-2019">Public Benefit Report</a>.</p><p>With this sustainable foundation and the support of our customers, employees, and the community, we look forward to making many more contributions in the years ahead.</p><div style="padding-top:20px;padding-bottom: 80px;"><h3>J.J. Allaire's Keynote at rstudio::conf 2020, announcing RStudio, PBC</h3><script src="https://fast.wistia.com/embed/medias/i7lqqlo6ng.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_i7lqqlo6ng videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/i7lqqlo6ng/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></div></description></item><item><title>sparklyr 1.1: Foundations, Books, Lakes and Barriers</title><link>https://www.rstudio.com/blog/sparklyr-1-1/</link><pubDate>Wed, 29 Jan 2020 00:00:00 
+0000</pubDate><guid>https://www.rstudio.com/blog/sparklyr-1-1/</guid><description><img src="https://www.rstudio.com/blog-images/2020-01-29-sparklyr-1-1-linux-foundation-roadmap.png" style="display: none;" alt="Linux Foundation roadmap projects and sparklyr"/><p>Today we are excited to share that <a href="https://github.com/sparklyr/sparklyr">sparklyr</a> <code>1.1</code> is now available on <a href="https://CRAN.R-project.org/package=sparklyr">CRAN</a>!</p><p>In a nutshell, you can use sparklyr to scale datasets across computing clusters running <a href="http://spark.apache.org">Apache Spark</a>. For this particular release, we would like to highlight the following new features:</p><ul><li><strong><a href="#delta-lake">Delta Lake</a></strong> enables database-like properties in Spark.</li><li><strong><a href="#spark-3-0">Spark 3.0</a></strong> preview is now available through sparklyr.</li><li><strong><a href="#barrier-execution">Barrier Execution</a></strong> paves the way to use Spark with deep learning frameworks.</li><li><strong><a href="#qubole">Qubole</a></strong> clusters running Spark can be easily used with sparklyr.</li></ul><p>In addition, new community <strong><a href="#extensions">Extensions</a></strong> enable natural language processing and genomics, sparklyr is now being hosted within the <strong><a href="#linux-foundation">Linux Foundation</a></strong>, and the <strong><a href="#mastering-spark-with-r">Mastering Spark with R</a></strong> book is now available and free-to-use online.</p><p>You can install <code>sparklyr 1.1</code> from CRAN as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">sparklyr&#34;</span>)</code></pre></div><h2 id="delta-lake">Delta Lake</h2><p>The <a href="https://delta.io/">Delta Lake</a> project is an 
open-source storage layer that brings <a href="https://en.wikipedia.org/wiki/ACID">ACID transactions</a> to Apache Spark. To use Delta Lake, first connect using the new <code>packages</code> parameter set to <code>&quot;delta&quot;</code>.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)
sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">local&#34;</span>, version <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2.4&#34;</span>, packages <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">delta&#34;</span>)</code></pre></div><p>As a simple example, let&rsquo;s write a small data frame to Delta using <code>spark_write_delta()</code>, overwrite it, and then read it back with <code>spark_read_delta()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">sdf_len</span>(sc, <span style="color:#40a070">5</span>) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">spark_write_delta</span>(path <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">delta-test&#34;</span>)
<span style="color:#06287e">sdf_len</span>(sc, <span style="color:#40a070">3</span>) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">spark_write_delta</span>(path <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">delta-test&#34;</span>, mode <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span
style="color:#4070a0">overwrite&#34;</span>)
<span style="color:#06287e">spark_read_delta</span>(sc, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">delta-test&#34;</span>)</code></pre></div><pre><code># Source: spark&lt;delta1&gt; [?? x 1]
     id
  &lt;int&gt;
1     1
2     2
3     3</code></pre><p>Now, since Delta is capable of tracking all versions of your data, you can easily time travel to retrieve the version that we overwrote:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">spark_read_delta</span>(sc, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">delta-test&#34;</span>, version <span style="color:#666">=</span> <span style="color:#40a070">0L</span>)</code></pre></div><pre><code># Source: spark&lt;delta1&gt; [?? x 1]
     id
  &lt;int&gt;
1     1
2     2
3     3
4     4
5     5</code></pre><h2 id="spark-30">Spark 3.0</h2><p>To install and try out the Spark 3.0 preview, simply run:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)
<span style="color:#06287e">spark_install</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">3.0.0-preview&#34;</span>)
sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">local&#34;</span>, version <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">3.0.0-preview&#34;</span>)</code></pre></div><p>You can then preview upcoming features, like the ability to read binary files. 
To demonstrate this, we can use <a href="https://blog.rstudio.com/2019/09/09/pin-discover-and-share-resources/">pins</a> to download a 237MB subset of <a href="http://www.image-net.org/">ImageNet</a>, and then load them into Spark:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">tiny_imagenet <span style="color:#666">&lt;-</span> pins<span style="color:#666">::</span><span style="color:#06287e">pin</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">http://cs231n.stanford.edu/tiny-imagenet-200.zip&#34;</span>)
<span style="color:#06287e">spark_read_source</span>(sc, <span style="color:#06287e">dirname</span>(tiny_imagenet[1]), source <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">binaryFile&#34;</span>)</code></pre></div><pre><code># Source: spark&lt;images&gt; [?? x 4]
   path                       modificationTime     length content
   &lt;chr&gt;                      &lt;dttm&gt;                &lt;dbl&gt; &lt;list&gt;
 1 file:images/test_2009.JPEG 2020-01-08 20:36:41    3138 &lt; [3,138]&gt;
 2 file:images/test_8245.JPEG 2020-01-08 20:36:43    3066 &lt; [3,066]&gt;
 3 file:images/test_4186.JPEG 2020-01-08 20:36:42    2998 &lt; [2,998]&gt;
 4 file:images/test_403.JPEG  2020-01-08 20:36:39    2980 &lt; [2,980]&gt;
 5 file:images/test_8544.JPEG 2020-01-08 20:36:38    2958 &lt; [2,958]&gt;
 6 file:images/test_5814.JPEG 2020-01-08 20:36:38    2929 &lt; [2,929]&gt;
 7 file:images/test_1063.JPEG 2020-01-08 20:36:41    2920 &lt; [2,920]&gt;
 8 file:images/test_1942.JPEG 2020-01-08 20:36:39    2908 &lt; [2,908]&gt;
 9 file:images/test_5456.JPEG 2020-01-08 20:36:42    2906 &lt; [2,906]&gt;
10 file:images/test_5859.JPEG 2020-01-08 20:36:39    2896 &lt; [2,896]&gt;
# … with more rows</code></pre><p>Please notice that the <a href="https://spark.apache.org/news/spark-3.0.0-preview.html">Spark 3.0.0 preview</a> is not a stable release in terms of either API or functionality.</p><h2 id="barrier-execution">Barrier
Execution</h2><p>Barrier execution is a new feature introduced in <a href="https://spark.apache.org/releases/spark-release-2-4-0.html">Spark 2.4</a>, which enables deep learning on Apache Spark by implementing an all-or-nothing scheduler. This allows Spark not only to process analytic workflows, but also to serve as a high-performance computing cluster where other frameworks, like <a href="https://www.openmp.org/">OpenMP</a> or <a href="https://www.tensorflow.org/guide/distributed_training">TensorFlow Distributed</a>, can reuse cluster machines and have them communicate directly with each other for a given task.</p><p>In general, we don&rsquo;t expect most users to use this feature directly; instead, this is a feature relevant to advanced users interested in creating extensions that support additional modeling frameworks. You can learn more about barrier execution in Reynold Xin&rsquo;s <a href="https://vimeo.com/274267107">keynote</a>.</p><p>To use barrier execution from R, set the <code>barrier = TRUE</code> parameter in <code>spark_apply()</code> and then make use of a new parameter in the R closure to retrieve the network address of the additional nodes available for this task. 
A simple example follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)
sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">local&#34;</span>, version <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2.4&#34;</span>)
<span style="color:#06287e">sdf_len</span>(sc, <span style="color:#40a070">1</span>, repartition <span style="color:#666">=</span> <span style="color:#40a070">1</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">spark_apply</span>(<span style="color:#666">~</span> .y<span style="color:#666">$</span>address, barrier <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>, columns <span style="color:#666">=</span> <span style="color:#06287e">c</span>(address <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">character&#34;</span>)) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">collect</span>()</code></pre></div><pre><code># A tibble: 1 x 1
  address
  &lt;chr&gt;
1 localhost:50693</code></pre><h2 id="qubole">Qubole</h2><p><a href="https://www.qubole.com/product/data-platform/">Qubole</a> is a fully self-service multi-cloud data platform based on enterprise-grade data processing engines including Apache Spark.</p><p>If you are using Qubole clusters, you can now easily connect to Spark through a new <code>&quot;qubole&quot;</code> connection method:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)
sc <span style="color:#666">&lt;-</span> <span
style="color:#06287e">spark_connect</span>(method <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">qubole&#34;</span>)</code></pre></div><p>Once connected, you can use Spark and R as usual. To learn more, visit <a href="https://docs.qubole.com/en/latest/user-guide/engines/spark/rstudio_spark.html">RStudio for Running Distributed R Jobs</a>.</p><h2 id="extensions">Extensions</h2><p>The new <a href="https://github.com/r-spark">github.com/r-spark</a> repo contains new community extensions. To mention a few, <a href="https://CRAN.R-project.org/package=variantspark">variantspark</a> and <a href="https://CRAN.R-project.org/package=sparkhail">sparkhail</a> are two new extensions for genomic research, and <a href="https://github.com/r-spark/sparknlp">sparknlp</a> adds support for natural language processing.</p><p>For those of you with a background in genomics, you can use <code>sparkhail</code> by first installing this extension from CRAN, then connecting to Spark, creating a Hail context, and loading a subset of the <a href="https://www.internationalgenome.org/data/">1000 Genomes</a> dataset using <a href="https://hail.is/">Hail</a>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)
<span style="color:#06287e">library</span>(sparkhail)
sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">local&#34;</span>, version <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2.4&#34;</span>, config <span style="color:#666">=</span> <span style="color:#06287e">hail_config</span>())
hc <span style="color:#666">&lt;-</span> <span style="color:#06287e">hail_context</span>(sc)
hail_data <span
style="color:#666">&lt;-</span> pins<span style="color:#666">::</span><span style="color:#06287e">pin</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">https://github.com/r-spark/sparkhail/blob/master/inst/extdata/1kg.zip?raw=true&#34;</span>)hail_df <span style="color:#666">&lt;-</span> <span style="color:#06287e">hail_read_matrix</span>(hc, <span style="color:#06287e">file.path</span>(<span style="color:#06287e">dirname</span>(hail_data[1]), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1kg.mt&#34;</span>)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">hail_dataframe</span>()</code></pre></div><p>You can then analyze it with packages like <code>dplyr</code>, <code>sparklyr.nested</code>, and <code>dbplot</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(dplyr)<span style="color:#06287e">sdf_separate_column</span>(hail_df, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">alleles&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(alleles_1, alleles_2) <span style="color:#666">%&gt;%</span><span style="color:#06287e">tally</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">arrange</span>(<span style="color:#666">-</span>n)</code></pre></div><pre><code># Source: spark&lt;?&gt; [?? x 3]# Groups: alleles_1# Ordered by: -nalleles_1 alleles_2 n&lt;chr&gt; &lt;chr&gt; &lt;dbl&gt;1 C T 24362 G A 23873 A G 19444 T C 18795 C A 4966 G T 4807 T G 4688 A C 4549 C G 15010 G C 112# … with more rows</code></pre><p>Notice that these frequencies come in pairs, C/T and G/A are actually the same mutation, just viewed from opposite strands. 
You can then create a histogram over the DP field, the depth of the proband, as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">sparklyr.nested<span style="color:#666">::</span><span style="color:#06287e">sdf_select</span>(hail_df, dp <span style="color:#666">=</span> info.DP) <span style="color:#666">%&gt;%</span>
  dbplot<span style="color:#666">::</span><span style="color:#06287e">dbplot_histogram</span>(dp)</code></pre></div><img src="https://www.rstudio.com/blog-images/2020-01-29-sparklyr-1-1-hail-histogram-pd.png" alt="Apache Spark, Hail, R, and sparklyr histogram"/><p>This code was adapted from Hail&rsquo;s <a href="https://hail.is/docs/0.2/tutorials/01-genome-wide-association-study.html">Genome-Wide Association Study</a> tutorial. You can learn more about this Hail community extension from <a href="https://github.com/r-spark/sparkhail">r-spark/sparkhail</a>.</p><h2 id="linux-foundation">Linux Foundation</h2><p>The <a href="https://www.linuxfoundation.org">Linux Foundation</a> is home to projects such as <a href="https://www.linuxfoundation.org/projects/linux/">Linux</a>, <a href="https://kubernetes.io/">Kubernetes</a>, <a href="https://js.foundation/">Node.js</a> and umbrella foundations such as <a href="https://lfai.foundation/">LF AI</a>, <a href="https://www.lfedge.org/">LF Edge</a>, and <a href="https://www.lfnetworking.org/">LF Network</a>. We are very excited to have sparklyr hosted as an incubation project within LF AI alongside <a href="https://www.acumos.org/">Acumos</a>, <a href="https://lfai.foundation/projects/angel-ml/">Angel</a>, <a href="https://lfai.foundation/projects/horovod/">Horovod</a>, <a href="https://pyro.ai/">Pyro</a>, <a href="https://onnx.ai/">ONNX</a> and several others.</p><p>Hosting sparklyr in LF AI within the Linux Foundation provides a neutral entity to hold the project assets and open governance. 
Furthermore, we believe hosting with LF AI will also help bring additional talent, ideas, and shared components from other Linux Foundation projects like <a href="https://delta.io">Delta Lake</a>, <a href="https://eng.uber.com/horovod/">Horovod</a>, <a href="https://onnx.ai">ONNX</a>, and so on into sparklyr as part of cross-project and cross-foundation collaboration.</p><p>This makes it a great time for you to join the sparklyr community, contribute, and help this project grow. You can learn more at <a href="https://sparklyr.org">sparklyr.org</a>.</p><h2 id="mastering-spark-with-r">Mastering Spark with R</h2><p><a href="https://therinspark.com">Mastering Spark with R</a> is a new book to help you learn and master Apache Spark with R from start to finish. It introduces data analysis with well-known tools like <a href="https://dplyr.tidyverse.org/">dplyr</a>, and covers everything else related to processing large-scale datasets, modeling, productionizing pipelines, using extensions, distributing R code, and processing real-time data &ndash; if you are not yet familiar with Spark, this is a great resource to get started!</p><p><a href="https://therinspark.com"><img src="https://www.rstudio.com/blog-images/2020-01-29-sparklyr-1-1-book-cover.jpg" width="200px" alt="Mastering Spark with R book cover"/></a></p><p>This book was published by <a href="http://shop.oreilly.com/product/0636920223764.do">O&rsquo;Reilly</a>, is available on <a href="https://www.amazon.com/gp/product/149204637X">Amazon</a>, and is also free-to-use <a href="https://therinspark.com/">online</a>. 
We hope you find this book useful and easy to read.</p><p>To catch up on previous releases, take a look at the <a href="https://blog.rstudio.com/2019/03/15/sparklyr-1-0/">sparklyr 1.0</a> post or watch various video tutorials in the <a href="https://www.youtube.com/channel/UCAwJMtPx4HgmMXEDTvZBJ4A/playlists">mlverse</a> channel.</p><p>Thank you for reading along!</p></description></item><item><title>RStudio Connect 1.8.0</title><link>https://www.rstudio.com/blog/rstudio-connect-1-8-0/</link><pubDate>Wed, 22 Jan 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-8-0/</guid><description><p>RStudio Connect helps data science teams quickly make an impact by enabling them to share reports, models, dashboards, and applications created in R and Python with business stakeholders. The 1.8.0 release makes it even easier for teams to start sharing.</p><p>For <strong>Data Scientists</strong>, highlights include:</p><ul><li><a href="#python-support">Python Support</a>: Improvements that make it easier to share Jupyter Notebooks and mixed R and Python content.</li><li><a href="#pins">pins</a>: An easy way to share data, models, and other objects.</li><li><a href="#custom-emails">Custom Emails</a>: Use code to create beautiful emails with plots and results inline.</li></ul><p>For <strong>DevOps / IT Administrators</strong>, 1.8.0 makes it easier to support data science teams in production:</p><ul><li><a href="https://docs.rstudio.com/connect/admin/authentication.html#authentication-saml">SAML</a>: Seamless single sign-on integration.</li><li><a href="#scheduled-content-calendar">Scheduled Content Calendar</a>: Audit content schedules in production.</li></ul><p>For <strong>Data Science Leaders</strong>, 1.8.0 simplifies onboarding new team members and eases collaboration:</p><ul><li><a href="#jump-start-examples">Jump Start Examples</a>: Learn new ways to use RStudio Connect and onboard new users.</li><li><a
href="#git-centered-deployments">Git-centered Deployment</a>: Utilize Git-centric deployment workflows.</li></ul><p>To learn more about all the ways RStudio Connect makes it easy to connect your Data Science team with your decision makers, visit <a href="https://rstudio.com/products/connect">our website</a>. An easy way to get started is with the <a href="https://rstudio.com/quickstart">RStudio Team Quickstart</a> to experience all of RStudio&rsquo;s products on an easy-to-use virtual machine, or begin a free <a href="https://rstudio.com/products/connect/">45-day evaluation</a> of RStudio Connect.</p><h3 id="python-support">Python Support</h3><figure><img src="rsc-180-jupyter1.png" alt="Publish Options in Jupyter"/></figure><p>RStudio Connect has supported both R and Python for over a year, and during this time we&rsquo;ve made <a href="https://github.com/rstudio/rsconnect-jupyter/blob/master/CHANGELOG.md">significant improvements</a>. Data scientists can now <a href="https://docs.rstudio.com/rsconnect-jupyter/">publish Jupyter Notebooks with a single click</a> or by using version control. Data scientists who use both R and Python also have more flexibility, helping them deploy mixed content using the <a href="https://rstudio.github.io/reticulate">reticulate R package</a>.</p><figure><img src="rsc-180-jupyter2.png" alt="Publish a Jupyter Notebook to RStudio Connect"/></figure><h3 id="pins">Pins</h3><p>The <a href="https://rstudio.github.io/pins">pins</a> package makes it easy to share data, models, and other objects on RStudio Connect. Pins are especially useful if you have data that regularly updates; simply schedule an R Markdown document to process your data and pin your results or model. Once pinned, your reports, applications, and APIs can automatically pull the updates. 
<a href="https://pins.rstudio.com/articles/boards-rsconnect.html">Learn more</a> or <a href="https://rviews.rstudio.com/2019/10/17/deploying-data-with-pins/">see an example</a>.</p><figure><img src="rsc-178-pins.png" alt="Pins Support in RStudio Connect"/></figure><h3 id="custom-emails">Custom Emails</h3><p>Sending plots, tables, and results inline in an email is a powerful way for data scientists to make an impact. RStudio Connect customers use custom emails to send daily reminders, conditional alerts, and to track key metrics. The latest release of the <a href="https://rich-iannone.github.io/blastula/">blastula package</a> makes it even easier for data scientists to specify these emails programmatically:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">if </span>(demand_forecast <span style="color:#666">&gt;</span> <span style="color:#40a070">1000</span>) {
  <span style="color:#06287e">render_connect_email</span>(input <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">alert-supply-team-email.Rmd&#34;</span>) <span style="color:#666">%&gt;%</span>
    <span style="color:#06287e">attach_connect_email</span>(subject <span style="color:#666">=</span> <span style="color:#06287e">sprintf</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">ALERT: Forecasted increase of %g units&#34;</span>, increase),
      attach_output <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>,
      attachments <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">demand_forecast_data.csv&#34;</span>))
} else {
  <span style="color:#06287e">suppress_scheduled_email</span>()
}</code></pre></div><figure><img src="rsc-180-email.png" alt="Custom Emails from RStudio Connect"/></figure><p>Learn more <a
href="https://solutions.rstudio.com/2019/12/30/rstudio-connect-custom-emails-with-blastula/">here</a>!</p><h3 id="jump-start-examples">Jump Start Examples</h3><p>A common challenge facing data science teams is onboarding new users. Data scientists have to learn new tools, methods, and often a new domain. We&rsquo;ve created a set of examples to help data scientists learn common best practices. A new tutorial in the product helps users publish different data products, which is particularly helpful for data scientists exploring new content types, such as reports, models, or datasets.</p><figure><img src="rsc-180-jumpstart.png" alt="Jump Start Examples"/></figure><h3 id="git-centered-deployments">Git-centered Deployments</h3><p>RStudio Connect&rsquo;s push-button publishing is a convenient and simple way for data scientists to share their work. However, some teams prefer Git-centric workflows, especially when deploying content in production. RStudio Connect <a href="https://docs.rstudio.com/connect/1.8.0/user/git-backed/">supports these workflows</a>, making it effortless for data science teams to adopt version control best practices without maintaining additional infrastructure or remembering complex workflows. Data scientists simply commit to Git, and RStudio Connect will update the content, saving you any extra steps.</p><figure><img src="rsc-176-git-deploy.png" alt="Create New Content from Git Repository in RStudio Connect"/></figure><h3 id="scheduled-content-calendar">Scheduled Content Calendar</h3><p>For system and application administrators, RStudio Connect makes it simple to audit data science work. For data science teams, one powerful application of RStudio Connect is the ability to schedule tasks. These tasks can be everything from simple ETL jobs to daily reports. 
In 1.8.0, we&rsquo;ve made it easier for administrators to track these tasks across all publishers in a single place. This new view makes it possible to identify conflicts or times when the server is being overbooked.</p><figure><img src="rsc-174-schedules.png" alt="View Scheduled Content"/></figure><h2 id="security-updates--deprecations">Security Updates &amp; Deprecations</h2><ul><li>RStudio Connect no longer supports Ubuntu 14.04 LTS (Trusty Tahr).</li><li>The <code>Postgres.URL</code> database connection URL will always use a nonempty <code>Postgres.Password</code> value as its password. Previous releases would use the password only when a <code>{$}</code> placeholder was present. Support for the <code>Postgres.URL</code> placeholder <code>{$}</code> is deprecated and will be removed in a future release.</li><li>Duplicate user names are now accepted in some conditions for LDAP, SAML, and proxied authentication providers. See the release notes for details.</li><li>Support for TLS 1.3. Access to this TLS version is available without additional configuration.</li><li>The setting <code>HTTPS.ExcludedCiphers</code> has been removed and is no longer supported. The <code>HTTPS.MinimumTLS</code> setting should be used to specify a minimum accepted TLS version. We recommend running a secure proxy when your organization has more complex HTTPS requirements.</li><li>The deprecated setting <code>Applications.DisabledProtocols</code> has been removed. Use <code>Applications.DisabledProtocol</code> instead. Multiple values should be placed one per line with this new setting.</li><li>The deprecated settings <code>OAuth2.AllowedDomains</code> and <code>OAuth2.AllowedEmails</code> have been removed. Use <code>OAuth2.AllowedDomain</code> and <code>OAuth2.AllowedEmail</code> instead. Multiple values should be placed one per line with these new settings.</li><li>TensorFlow Model API deployment supports models created with TensorFlow up to version 1.15.0. 
A systems administrator will need to update the version of <code>libtensorflow.so</code> installed on your RStudio Connect system.</li></ul><p>For more information about all the updates available in RStudio Connect 1.8.0, we recommend consulting the <a href="https://blog.rstudio.com/categories/rstudio-connect">release posts for the 1.7 series</a>.</p><p>This release also includes numerous bug fixes; the full <a href="https://doc.rstudio.com/connect/news">release notes</a> document all of the changes. Some of our favorites:</p><ul><li>The “Info” panel for Jupyter notebook content includes a summary of recent usage.</li><li>Python-based content can now use an application RunAs setting.</li><li>Scheduled reports no longer run multiple times if the application and database clocks disagree, or in the case of concurrent rendering attempts in a multi-node installation.</li></ul><blockquote><h2 id="upgrade-planning">Upgrade Planning</h2><p>Aside from the deprecations and breaking changes listed above, there are no other special considerations, and upgrading should require less than five minutes. 
If you are upgrading from a version earlier than 1.7.8, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><h2 id="get-started-with-rstudio-connect">Get Started with RStudio Connect</h2><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R and Python with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below:</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Administration Guide</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/">Pricing</a></li></ul></description></item><item><title>Start 2020 with mad new skills you learned at rstudio::conf 2020. Final Call</title><link>https://www.rstudio.com/blog/rstudio-conf-2020-final-call/</link><pubDate>Sun, 12 Jan 2020 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2020-final-call/</guid><description><p>There will be no better time or place this year to accelerate your knowledge of all things R and RStudio than at rstudio::conf 2020 in San Francisco. While we’re approaching capacity, there’s still room for you! 
Whether you’re a dedicated R user or one of the many people who use R and Python, come join us and more than 2,000 fellow data scientists and data science team leaders in San Francisco January 27 - 30.</p><p align="center"><a href="https://web.cvent.com/event/36ebe042-0113-44f1-8e36-b9bc5d0733bf/websitePage:34f3c2eb-9def-44a7-b324-f2d226e25011?RefId=conference&amp;utm_campaign=Site%20Promo&amp;utm_medium=Ste&amp;utm_source=ConfPage"><img src="learn-more-and-reg.png" style="width:60.0%" alt="Learn more and register for the RStudio 2020 conference here" /></a></p><p>You can still <a href="https://web.cvent.com/event/36ebe042-0113-44f1-8e36-b9bc5d0733bf/websitePage:34f3c2eb-9def-44a7-b324-f2d226e25011">register for a workshop (January 27 - 28)</a>, the conference (January 29 - 30), or both! With a little more than 2 weeks to go, we expect to reach conference capacity soon.</p><p>Here are the career-building workshops with seats still available for you as of January 9, 2020. We apologize in advance if a workshop listed here is sold out before you have the chance to register.</p><ul><li>A Practical Introduction to Data Visualization with ggplot2</li><li>Modern Geospatial Data Analysis with R</li><li>Designing the Data Science Classroom</li><li>Text Mining with Tidy Data Principles</li><li>Big Data with R</li><li>R Markdown and Interactive Dashboards</li><li>R for Excel Users</li><li>What They Forgot to Teach You about R Workshop</li><li>My Organization’s First R Package Workshop</li><li>Shiny from Start to Finish</li></ul><p><strong>Note: Childcare for registered attendees is also still available for children 6 months to 8 years of age from 8am - 6pm daily. 
The cost is $20/day.</strong></p><div id="hear-what-other-data-scientists-have-to-say-about-rstudioconf" class="section level2"><h2>Hear what other data scientists have to say about rstudio::conf!</h2><p align="center"><script src="https://fast.wistia.com/embed/medias/vjc5ow55nv.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="display:block;margin-left:auto;margin-right:auto;width:70%;padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="width:100%;height:100%;left:0;position:absolute;top:0;"><div class="wistia_embed wistia_async_vjc5ow55nv videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/vjc5ow55nv/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:80%;" alt="" aria-hidden="true" onload="this.parentNode.style.opacity=1;" /></div></div></div></div></p></div></description></item><item><title>reticulate 1.14</title><link>https://www.rstudio.com/blog/reticulate-1-14/</link><pubDate>Fri, 20 Dec 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/reticulate-1-14/</guid><description><p>We&rsquo;re excited to announce that <code>reticulate</code> 1.14 is now available on CRAN! 
You can install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">reticulate&#34;</span>)</code></pre></div><p>With this release, we are introducing a major new feature: <code>reticulate</code> can now automatically configure a Python environment for the user, in coordination with any loaded R packages that depend on <code>reticulate</code>. This means that:</p><ul><li><p><em>R package authors</em> can declare their Python dependency requirements to <code>reticulate</code> in a standardized way, and <code>reticulate</code> will automatically prepare the Python environment for the user; and</p></li><li><p><em>R users</em> can use R packages depending on <code>reticulate</code>, without having to worry about managing a Python installation / environment themselves.</p></li></ul><p>Ultimately, the goal is for R packages using <code>reticulate</code> to be able to operate just like any other R package, without forcing the R user to grapple with issues around Python environment management.</p><p>We&rsquo;d also like to give a special thanks to <a href="https://ryanhafen.com/">Ryan Hafen</a> for his work on the <a href="https://github.com/hafen/rminiconda">rminiconda</a> package. The work in this release borrows from many of the ideas he put together as part of the <code>rminiconda</code> package.</p><h2 id="r-packages-and-python----the-problem">R Packages and Python &ndash; The Problem</h2><p>Currently, <em>reticulated</em> R packages typically have to document for users how their Python dependencies should be installed. For example, packages like <a href="https://tensorflow.rstudio.com">tensorflow</a> provide helper functions (e.g. 
<code>tensorflow::install_tensorflow()</code>):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(tensorflow)
<span style="color:#06287e">install_tensorflow</span>()
<span style="color:#60a0b0;font-style:italic"># use tensorflow</span></code></pre></div><p>This approach requires users to manually download, install, and configure an appropriate version of Python themselves. In addition, if the user has <em>not</em> downloaded an appropriate version of Python, then the version discovered on the user&rsquo;s system may not conform with the requirements imposed by the Python TensorFlow package &ndash; leading to more trouble.</p><p>Fixing this often requires instructing the user to install Python, and then use <code>reticulate</code> APIs (e.g. <code>reticulate::use_python()</code> and other tools) to find and use that version of Python. This is, understandably, more cognitive overhead than one normally might want to impose on the users of one&rsquo;s package.</p><h2 id="r-packages-and-python----the-solution">R Packages and Python &ndash; The Solution</h2><p>Our goal in this release, then, is to make it possible for <code>reticulate</code> to <em>automatically</em> prepare a Python environment for the user, without requiring any explicit user intervention. In other words, R packages that wrap Python packages through <code>reticulate</code> should <em>feel</em> just like any other R package.
The R user should only need to write:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(tensorflow)
<span style="color:#60a0b0;font-style:italic"># use tensorflow</span></code></pre></div><p>and <code>reticulate</code> will automatically prepare and install TensorFlow (prompting the user as necessary).</p><p>To that end, we&rsquo;ve made the following changes. If the user has not explicitly instructed <code>reticulate</code> to use a pre-existing Python environment, then:</p><ol><li><p><code>reticulate</code> will prompt the user to download and install <a href="https://docs.conda.io/en/latest/miniconda.html">Miniconda</a>;</p></li><li><p><code>reticulate</code> will prepare a default <code>r-reticulate</code> Conda environment, using (currently) Python 3.6 and <a href="https://numpy.org/">NumPy</a>;</p></li><li><p>When Python is initialized, <code>reticulate</code> will query any loaded R packages for their Python dependencies, and install those dependencies into the aforementioned <code>r-reticulate</code> Conda environment.</p></li></ol><p>Ultimately, this leads to an experience where R packages wrapping Python packages can work just like any other R package &ndash; the user will normally not need to intervene and manually configure their Python environment.</p><p>All that said, all of the pre-existing workflows for configuring Python remain available for users who require them. If you need to manually take control of the Python environment you use in your projects, you can still do so.</p><p>Currently, automatic Python environment configuration will only happen when using the aforementioned <code>reticulate</code> Miniconda installation.
However, you can still call</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">reticulate<span style="color:#666">::</span><span style="color:#06287e">configure_environment</span>()</code></pre></div><p>to manually install any declared Python dependencies into your active Python environment.</p><h2 id="declaring-a-python-dependency">Declaring a Python Dependency</h2><p>R packages which want to declare a Python package dependency to <code>reticulate</code> can do so in their <code>DESCRIPTION</code> file. For example, suppose we were building a package <code>rscipy</code> which wrapped the Python <a href="https://www.scipy.org/">SciPy</a> package. We could declare the dependency on <code>scipy</code> with a field like:</p><pre><code>Config/reticulate:
  list(
    packages = list(
      list(package = &quot;scipy&quot;, pip = TRUE)
    )
  )</code></pre><p>In particular, this will instruct <code>reticulate</code> to install the latest available version of the <code>scipy</code> package from <a href="https://pypi.org/">PyPI</a>, using <code>pip</code>.</p><p><code>reticulate</code> will read and parse the <code>DESCRIPTION</code> file when Python is initialized, and use that information when configuring the Python environment. See:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">vignette</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">python_dependencies&#34;</span>)</code></pre></div><p>for more information.</p><h2 id="limitations">Limitations</h2><p>With automatic configuration, <code>reticulate</code> wants to encourage a world wherein different R packages wrapping Python packages can live together in the same Python environment / R session.
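</p><p>Since multiple reticulated packages may end up sharing one environment, it can be useful to confirm exactly which Python installation <code>reticulate</code> has discovered. A quick diagnostic sketch, using the existing <code>py_config()</code> helper:</p><pre class="r"><code>library(reticulate)

# Print the Python binary, version, and environment reticulate will use
py_config()</code></pre><p>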
In essence, we would like to minimize the number of conflicts that could arise through different R packages having incompatible Python dependencies.</p><p>Unfortunately, Python projects tend to lean quite heavily upon virtual environments, and so Python packages do sometimes declare fairly narrow version requirements. Ultimately, we are relying on R package authors to work together and avoid declaring similarly narrow or incompatible version requirements. To that end, we ask package authors to please prefer using the latest-available packages on <code>pip</code> / the Conda repositories when possible, and to declare version requirements only when necessary.</p><h1 id="pandas-performance">Pandas Performance</h1><p>We&rsquo;ve also invested some time into improving the performance of conversions between R and Python for Pandas DataFrames &ndash; in particular, the conversion performance should be greatly improved for DataFrames with a large number of columns.</p><p>For example, with the following script:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(reticulate)
rdf <span style="color:#666">&lt;-</span> <span style="color:#06287e">as.data.frame</span>(<span style="color:#06287e">matrix</span>(<span style="color:#40a070">0</span>, nrow <span style="color:#666">=</span> <span style="color:#40a070">1000</span>, ncol <span style="color:#666">=</span> <span style="color:#40a070">10000</span>))
pdf <span style="color:#666">&lt;-</span> <span style="color:#06287e">r_to_py</span>(rdf)
<span style="color:#06287e">system.time</span>(<span style="color:#06287e">r_to_py</span>(rdf))
<span style="color:#06287e">system.time</span>(<span style="color:#06287e">py_to_r</span>(pdf))</code></pre></div><p>We see the following timings:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code 
class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># reticulate 1.13 ----</span>
<span style="color:#666">&gt;</span> <span style="color:#06287e">system.time</span>(<span style="color:#06287e">r_to_py</span>(rdf))
   user  system elapsed
  <span style="color:#40a070">7.581</span>   <span style="color:#40a070">0.052</span>   <span style="color:#40a070">7.640</span>
<span style="color:#666">&gt;</span> <span style="color:#06287e">system.time</span>(<span style="color:#06287e">py_to_r</span>(pdf))
   user  system elapsed
 <span style="color:#40a070">15.363</span>   <span style="color:#40a070">0.065</span>  <span style="color:#40a070">15.446</span>

<span style="color:#60a0b0;font-style:italic"># reticulate 1.14 ----</span>
<span style="color:#666">&gt;</span> <span style="color:#06287e">system.time</span>(<span style="color:#06287e">r_to_py</span>(rdf))
   user  system elapsed
  <span style="color:#40a070">0.303</span>   <span style="color:#40a070">0.002</span>   <span style="color:#40a070">0.306</span>
<span style="color:#666">&gt;</span> <span style="color:#06287e">system.time</span>(<span style="color:#06287e">py_to_r</span>(pdf))
   user  system elapsed
  <span style="color:#40a070">1.320</span>   <span style="color:#40a070">0.025</span>   <span style="color:#40a070">1.347</span></code></pre></div><p>Over a 10x improvement!</p><h1 id="python-27">Python 2.7</h1><p>As you may be aware, Python 2.7 is slowly being phased out in favor of Python 3. On January 1st, 2020, Python 2.7 will officially reach end-of-life. To that end, this will be the last <code>reticulate</code> release to officially support Python 2.7 &ndash; all future work will focus on supporting Python 3.x. We strongly encourage users of <code>reticulate</code> to update to Python 3 if they have not already.</p><hr><p>Questions? Comments? Please get in touch with us on the <a href="https://community.rstudio.com">RStudio community forums</a>.</p></description></item><item><title>R vs. 
Python: What's the best language for Data Science?</title><link>https://www.rstudio.com/blog/r-vs-python-what-s-the-best-for-language-for-data-science/</link><pubDate>Tue, 17 Dec 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-vs-python-what-s-the-best-for-language-for-data-science/</guid><description><p>This is a question that we at RStudio hear a lot. With the tremendous growth in both languages, and in the application of data science in general, there is a lot of interest and debate over which is the “best” language for data science.</p><p><img src="https://rstudio.github.io/reticulate/images/reticulated_python.png" alt="R and Python"></p><p>From our founding, RStudio has been dedicated to a couple of key ideas: that it’s better for everyone if the tools used for data science are free and open, and that we love and support coding as the most powerful path to tackle data science. Coding gives current and aspiring data scientists superpowers to tackle the most complex problems, because code is flexible, reusable, inspectable, and reproducible.</p><p>With that in mind, at RStudio we don’t judge which language you prefer. We just care that you feel enabled to do great data science. As RStudio’s Chief Data Scientist Hadley Wickham expressed in a <a href="https://qz.com/1661487/hadley-wickham-on-the-future-of-r-python-and-the-tidyverse/">recent interview with Dan Kopf</a>: <em>Use whatever makes you happy.</em></p><p>We will talk more about the benefits of coding for data science in a future blog post, but in this post we will briefly examine the debates over R vs. Python, and then share why we believe R and Python can, should and do work beautifully together.</p><h2 id="r-or-python-for-data-science">R or Python for Data Science?</h2><p>There is a lot of heated discussion over the topic, but there are some great, thoughtful articles as well. 
Some suggest Python is preferable as a general-purpose programming language, while others suggest data science is better served by a dedicated language and toolchain. The origins and development arcs of the two languages are compared and contrasted, often to support differing conclusions.</p><p>For individual data scientists, some common points to consider:</p><ul><li>Python is a great general programming language, with many libraries dedicated to data science.</li><li>Many (if not most) general introductory programming courses start teaching with Python now.</li><li>Python is the go-to language for many ETL and Machine Learning workflows.</li><li>Many (if not most) introductory courses to statistics and data science teach R now.</li><li>R has become the world&rsquo;s largest repository of statistical knowledge with reference implementations for thousands, if not tens of thousands, of algorithms that have been vetted by experts. The documentation for many R packages includes links to the primary literature on the subject.</li><li>R has a very low barrier to entry for doing exploratory analysis, and converting that work into a great report, dashboard, or API.</li><li>R with RStudio is often considered the best place to do exploratory data analysis.</li></ul><p>For organizations with Data Science teams, some additional points to keep in mind:</p><ul><li>For some organizations, Python is easier to deploy, integrate and scale than R, because Python tooling already exists within the organization. On the other hand, we at RStudio have worked with thousands of data teams successfully solving these problems with our open-source and <a href="https://rstudio.com/products/team/">professional products</a>, including in multi-language environments.</li><li>R has a great community of supportive data scientists from diverse backgrounds. 
For example, <a href="https://rladies.org/about-us/">R-Ladies</a> is a global organization dedicated to promoting gender diversity in the R Community.</li><li>Most interfaces for novel machine learning tools are first written and supported in Python, while many new methods in statistics are first written in R.</li><li>Trying to enforce one language to the exclusion of the other, perhaps out of vague fears of complexity or costs to support both, risks excluding a huge potential pool of Data Scientist candidates either way.</li><li>Advice on building Data Science teams often stresses the importance of having a diverse team bringing a variety of viewpoints and complementary skills to the table, to make it more likely to efficiently find the “best” solution for a given problem. In this vein, R users tend to come from a much more diverse range of domain expertise (ecology, economics, psychology, bioinformatics, policy analysis, etc.).</li></ul><p>Thus, the focus on “R or Python?” risks missing the advantages that having both can bring to individual data scientists and data science teams. Because of this, many of these articles end up with fairly nuanced conclusions, along the lines of “You need both” or “It depends.” A great example of this view can be found in the above-referenced interview with Hadley Wickham:</p><p><em>Generally, there are a lot of people who talk about R versus Python like it’s a war that either R or Python is going to win. I think that is not helpful because it is not actually a battle. These things exist independently and are both awesome in different ways.</em></p><h2 id="r-and-python-for-data-science">R and Python for Data Science!</h2><p>And so the reality is that both languages are valuable, and both are here to stay. This is borne out by our experience. In talking to our customers, we’ve found that many Data Science teams today are bilingual, leveraging both R and Python in their work. 
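</p><p>As a small illustration of what that bilingual workflow can look like, the <a href="https://rstudio.github.io/reticulate/">reticulate</a> package lets an R session call a Python library directly (a minimal sketch, assuming NumPy is installed in the active Python environment):</p><pre class="r"><code>library(reticulate)

# Import a Python module; R vectors are converted to Python objects automatically
np &lt;- import(&quot;numpy&quot;)
np$median(c(2, 5, 8, 11))</code></pre><p>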
In the spirit of Hadley’s <em>Use whatever makes you happy</em>, we’ve worked to make this sometime-rocky relationship a much happier one. We give individual Data Scientists, and the Data Science teams and organizations they are a part of, a smoother path to using both languages side by side, and to address the concerns around complexity or cost that IT teams might have about supporting both.</p><p>For example:</p><ul><li>Our open source <a href="https://rstudio.github.io/reticulate/">reticulate</a> package and the RStudio IDE makes it easy to combine R and Python in a single data science project.</li><li>Our <a href="https://rstudio.com/products/team/">professional products</a> make it easier to manage and collaborate across bilingual Data Science environments. E.g., RStudio Server Pro launches and manages Jupyter Notebooks and JupyterLab, and RStudio Connect makes it easy to share Jupyter Notebooks with stakeholders, alongside your work in R and your mixed R and Python projects.</li><li>As a longer term investment in improving cross-language collaboration, we are incubating Ursa Labs, providing operational support and infrastructure for this industry-funded development group specializing in open source data science tools. Wes McKinney, the author of the pandas package for Python is the Director, and talks a lot with Hadley Wickham. 
The <a href="https://blog.rstudio.com/2018/04/19/arrow-and-beyond/">goal of the Ursa Labs project</a> is nothing less than creating a modern data science runtime environment that takes advantage of the computational advances of the last 20 years, and can be used from many languages, including R and Python.</li></ul><p>To learn more about how RStudio supports using R and Python on the same Data Science teams, check out our <a href="https://rstudio.com/solutions/r-and-python/">R and Python Love Story</a>, where we provide information and resources for Data Scientists, Data Science Leaders, and DevOps/IT Leaders grappling with mixed R &amp; Python environments. Or, you can check out our recent <a href="https://resources.rstudio.com/webinars/r-and-python-a-data-science-love-story">R and Python Love Story Webinar</a>, where you can watch the recording or download the slides. In future blog posts, we will also talk more about what we&rsquo;ve seen in real-life Data Science teams using R and Python side by side.</p></description></item><item><title>Emails from R: Blastula 0.3</title><link>https://www.rstudio.com/blog/emails-from-r-blastula-0-3/</link><pubDate>Thu, 05 Dec 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/emails-from-r-blastula-0-3/</guid><description><p>We’re pleased to announce blastula, a package for creating beautiful custom emails in R. At RStudio, we love interactive dashboards, but some situations call for a different communication mechanism.
Use blastula to:</p><ul><li><strong>Compose</strong> custom email bodies based on code, code output, and markdown<br /></li><li><strong>Send</strong> emails using SMTP servers - even GMail - or integrate with production services like RStudio Connect</li></ul><p>Blastula makes it easy to send notifications for everything from anomaly detection to fantasy basketball results, all without leaving R.</p><p>To get started, install blastula from CRAN:</p><pre class="r"><code>install.packages(&#39;blastula&#39;)</code></pre><div id="creating-email" class="section level1"><h1>Creating Email</h1><p>Blastula’s unique strength is creating custom HTML email bodies that render in a variety of email clients, including mobile. The recommended way to create email is using blastula’s R Markdown output format, <code>blastula::blastula_email</code>. The body of the email will respect the R Markdown output, including markdown syntax and code chunk outputs.</p><pre class="markdown"><code>---
output: blastula::blastula_email
---

Hi Team,

This *important* forecast needs to go out today.

```{r echo=FALSE}
model &lt;- arima(presidents, c(1, 0, 0))
predict(model, 3)
```</code></pre><p>To create the email from the R Markdown document, use <code>render_email</code>:</p><pre class="r"><code>email &lt;- render_email(&#39;email.Rmd&#39;)</code></pre><p>The resulting email object can be previewed in RStudio.</p><p><img src="blastula_rmd.png" caption="Email from R Markdown" alt="Email from R Markdown"></p><p>Alternatively, it is possible to create an email without R Markdown, by using the <code>compose_email</code> function to combine text, images, and even plots:</p><pre class="r"><code>library(blastula)
library(ggplot2)
library(glue)

plot &lt;- qplot(disp, hp, data = mtcars, colour = mpg)
plot_email &lt;- add_ggplot(plot)

email &lt;- compose_email(body = md(c(
  &quot;Team, How would you plot the relationship between these 3 variables?&quot;,
  plot_email
)))</code></pre><p><img src="blastula_preview.png" 
caption="Preview blastula emails in RStudio" alt="Preview blastula emails in RStudio"></p><p>Visit the <a href="https://rich-iannone.github.io/blastula/">documentation</a> to learn how to embed images, set email headers and footers, and even add call-to-action buttons.</p></div><div id="sending-custom-emails-with-smtp" class="section level1"><h1>Sending Custom Emails with SMTP</h1><p>To send email, blastula includes functions to access SMTP servers such as GMail, Outlook, and Office365.</p><p>First, securely tell blastula about your SMTP server:</p><pre class="r"><code>create_smtp_creds_key(
  id = &quot;gmail&quot;,
  user = &quot;user_name@gmail.com&quot;,
  provider = &quot;gmail&quot;
)</code></pre><p>Next, use the SMTP service to send your custom email:</p><pre class="r"><code>email %&gt;%
  smtp_send(
    from = &quot;personal@email.net&quot;,
    to = &quot;personal@email.net&quot;,
    subject = &quot;Testing the `smtp_send()` function&quot;,
    credentials = creds_key(id = &quot;gmail&quot;)
  )</code></pre></div><div id="sending-custom-emails-with-rstudio-connect" class="section level1"><h1>Sending Custom Emails with RStudio Connect</h1><p>Organizations can use blastula in production on RStudio Connect.
For instance, we use blastula to track critical services like our support ticket volume and our staff training schedules.</p><p>An easy way to get started is to access the RStudio Connect examples:</p><pre class="r"><code>blastula::prepare_rsc_example_files()</code></pre><p>Publish the resulting R Markdown document to RStudio Connect, where it can be <a href="https://docs.rstudio.com/connect/1.7.8/user/r-markdown-schedule.html">scheduled for regular execution</a> and distributed to stakeholders.</p><p><img src="blastula_rsc.png" caption="Schedule and Email in RStudio Connect" alt="Schedule and Email in RStudio Connect"></p><p>Blastula offers three additional functions to make it easier to create emails for RStudio Connect.</p><ul><li><code>render_connect_email</code> automatically adds a footer to the email with useful links back to the content on RStudio Connect.<br /></li><li><code>attach_connect_email</code> ensures RStudio Connect sends the custom email, and also makes it easy to customize the subject line, include additional email attachments, and optionally attach the report output.<br /></li><li><code>suppress_scheduled_email()</code> allows you to skip sending the email. This pattern is very powerful. For example, reports can be run once a day, but only distributed if certain conditions are met.</li></ul><p>Together, these three functions can be used to send proactive notifications:</p><pre class="r"><code>if (demand_forecast &gt; 1000) {
  render_connect_email(input = &quot;alert-supply-team-email.Rmd&quot;) %&gt;%
    attach_connect_email(
      subject = sprintf(&quot;We need to prepare %d units!&quot;, demand_forecast),
      attach_output = TRUE,
      attachments = c(&quot;demand_forecast_data.csv&quot;)
    )
} else {
  suppress_scheduled_email()
}</code></pre><p>Please be sure to visit the <a href="https://rich-iannone.github.io/blastula/index.html">blastula website</a> to find additional resources.
After all, who doesn’t want a ggplot in their inbox?</p></div></description></item><item><title>RStudio's Commercial Desktop License is now RStudio Desktop Pro</title><link>https://www.rstudio.com/blog/rstudio-desktop-pro/</link><pubDate>Tue, 03 Dec 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-desktop-pro/</guid><description><p>We have good news for our commercial desktop IDE customers. We are giving the commercial version of our desktop IDE a new name and some great new features, including support for the RStudio Professional Drivers, at no additional cost to you.</p><p>As part of this release, we are renaming RStudio Commercial Desktop License to RStudio Desktop Pro. Existing Commercial Desktop customers will be migrated as part of their next renewal. However, if you would like to migrate to the new release before then, or have any questions, please contact your RStudio Customer Success representative for more information.</p><h2 id="updates-for-users">Updates for Users</h2><p>As with the current Commercial Desktop License offering, RStudio Desktop Pro has all the great features of the RStudio Desktop Open Source Edition, plus a commercial license for organizations not able to use AGPL software, and access to priority support.</p><p>Beyond that, RStudio Desktop Pro also provides:</p><ul><li><strong>RStudio Professional Drivers:</strong> These drivers provide ODBC data connectors for many of the most popular data sources. These drivers can be downloaded and configured directly from within RStudio Desktop Pro.
See <a href="https://docs.rstudio.com/pro-drivers/">RStudio Pro Drivers Documentation</a> for details, and this <a href="https://blog.rstudio.com/2019/10/24/pro-drivers-1-6-0-release/">blog post</a> for the most recent updates.</li><li><strong>License Activation and Management</strong>: To help users ensure compliance with their organization’s policies against AGPL software, RStudio Desktop Pro is a separate download from the RStudio Desktop Open Source Edition, with the AGPL license removed. A commercial license manager is integrated into the software, and the license itself is delivered as part of the purchase process. The usage of RStudio Desktop Pro is governed by a time-limited license tied to the renewal date. The time-limited license prevents users from accidentally reverting to the AGPL license if their subscription lapses.</li></ul><p>RStudio professional products are designed to work together. For example, RStudio Desktop Pro will use the same professional drivers as <a href="https://rstudio.com/products/rstudio-server-pro/">RStudio Server Pro</a> and <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, ensuring a consistent user experience across platforms. We&rsquo;d love to get your input on what you&rsquo;d like to see in the future, to better integrate RStudio Desktop Pro with our other professional products. 
Please email <a href="mailto:sales@rstudio.com">sales@rstudio.com</a> with your feedback.</p><h2 id="for-more-information">For more information</h2><ul><li><a href="https://rstudio.com/products/rstudio/">RStudio Product Page</a></li><li><a href="https://docs.rstudio.com/other/rdp/">RStudio Desktop Pro Documentation</a></li><li><a href="https://docs.rstudio.com/pro-drivers/">RStudio Pro Drivers Documentation</a></li></ul></description></item><item><title>learnr 0.10.0</title><link>https://www.rstudio.com/blog/learnr-0-10-0/</link><pubDate>Mon, 02 Dec 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/learnr-0-10-0/</guid><description><p><a href="https://rstudio.github.io/learnr/"><code>learnr</code></a> 0.10.0 has <a href="https://cran.r-project.org/package=learnr">been released</a>! In this version of <code>learnr</code>, quiz questions have been expanded to allow for more question types. <a href="https://learnr-examples.shinyapps.io/quiz_question/#section-basic-question-types">Text box</a> quiz questions have been implemented natively within <code>learnr</code> and ranking questions have been implemented using the <a href="https://rstudio.github.io/sortable/"><code>sortable</code></a> package.</p><p><a href="https://andrie-de-vries.shinyapps.io/sortable_tutorial_question_rank/"><img src="./learnr-sortable-demo.gif" style="border: 1px solid #ddd" ></a></p><p>The <a href="https://rstudio.github.io/learnr/"><code>learnr</code> R package</a> makes it easy to turn any <a href="http://rmarkdown.rstudio.com">R Markdown</a> document into an interactive tutorial. Tutorials consist of content along with interactive components for checking and reinforcing understanding. 
Tutorials can include any or all of the following:</p><ol><li><p>Narrative, figures, illustrations, and equations.</p></li><li><p>Code exercises (R code chunks that users can edit and execute directly).</p></li><li><p>Quiz questions.</p></li><li><p>Videos (supported services include YouTube and Vimeo).</p></li><li><p>Interactive Shiny components.</p></li></ol><p>Tutorials automatically preserve work done within them, so if a user works on a few exercises or questions and returns to the tutorial later they can pick up right where they left off.</p><h2 id="example">Example</h2><p>Test out the latest <a href="https://andrie-de-vries.shinyapps.io/sortable_tutorial_question_rank/">interactive demo</a> of <code>sortable</code>'s ranking quiz question.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">learnr<span style="color:#666">::</span><span style="color:#06287e">run_tutorial</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">question_rank&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">sortable&#34;</span>)</code></pre></div><iframe width="100%" height="800" src="https://andrie-de-vries.shinyapps.io/sortable_tutorial_question_rank/" frameborder="0" style="border: 1.1px solid black"></iframe><h1 id="highlights">Highlights</h1><h2 id="new-quiz-questions">New quiz questions</h2><p>I am excited to announce that quiz questions are now mini shiny applications. This opens the door to new and extendable question types, such as text box and ranking questions. The <a href="https://rstudio.github.io/sortable/"><code>sortable</code> R package</a> (an <code>htmlwidgets</code> wrapper around the drag-and-drop <a href="https://sortablejs.github.io/Sortable/"><code>Sortable.js</code></a>) has already implemented ranking questions using the new <code>learnr</code> quiz question API. 
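</p><p>For instance, a native text box question can be declared inside a tutorial&rsquo;s quiz chunk like so (a minimal sketch; the question text and answer are illustrative):</p><pre class="r"><code>quiz(
  question_text(
    &quot;Which learnr function launches a tutorial?&quot;,
    answer(&quot;run_tutorial&quot;, correct = TRUE),
    allow_retry = TRUE
  )
)</code></pre><p>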
Thank you <a href="https://twitter.com/timelyportfolio">Kenton Russell</a> for originally pursuing <code>sortable</code> and <a href="https://twitter.com/RevoAndrie">Andrie de Vries</a> for connecting the two packages.</p><p>Please see <a href="https://learnr-examples.shinyapps.io/quiz_question/"><code>learnr::run_tutorial(&quot;quiz_question&quot;, &quot;learnr&quot;)</code></a> for more information.</p><h2 id="available-tutorials">Available tutorials</h2><p>A new function, <code>available_tutorials()</code>, has been added. When called, this function will find all available tutorials in every installed R package. If a <code>package</code> name is provided, only that package will be searched. This functionality has been integrated into <code>run_tutorial</code> if a user provides a wrong tutorial name or forgets the package name.</p><p>Please see <code>?learnr::available_tutorials</code> for more information.</p><h2 id="better-pre-rendering">Better pre-rendering</h2><p>Using the latest <code>rmarkdown</code>, <code>learnr</code> tutorials are now aggressively pre-rendered. For package developers, please do not include the pre-rendered HTML files in your package as users will most likely need to recompile the tutorial. See <a href="https://github.com/rstudio/learnr/blob/1b9ac06d2c4b052a60ce6f24ffc9c7af13294a59/.Rbuildignore#L18"><code>learnr</code>'s <code>.Rbuildignore</code></a> for an example.</p><h2 id="deploying-dependencies-not-found">Deploying dependencies not found</h2><p>If your <code>learnr</code> tutorial contains <em>broken</em> code within exercises for users to fix, the CRAN version of <code>packrat</code> will not find all of your dependencies to install when the tutorial is deployed. To deploy tutorials containing exercise code with syntax errors, install the development version of <code>packrat</code>.
This version of <code>packrat</code> is able to find dependencies per R chunk, allowing for <em>broken</em> R chunks within the tutorial file.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">remotes<span style="color:#666">::</span><span style="color:#06287e">install_github</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rstudio/packrat&#34;</span>)</code></pre></div><h2 id="breaking-changes">Breaking changes</h2><p><code>learnr</code> 0.10.0 includes some non-backward-compatible bug fixes involving the browser&rsquo;s local storage. It is possible that the browser&rsquo;s local storage will have a &ldquo;cache miss&rdquo; and existing users will be treated like new users.</p><h1 id="learnr-change-log"><code>learnr</code> change log</h1><h2 id="new-features">New features</h2><ul><li><p>Quiz questions are implemented using shiny modules (instead of htmlwidgets). (<a href="https://github.com/rstudio/learnr/pull/194">#194</a>)</p></li><li><p>Aggressively rerender prerendered tutorials in favor of a cohesive exercise environment (<a href="https://github.com/rstudio/learnr/issues/169">#169</a>, <a href="https://github.com/rstudio/learnr/pull/179">#179</a>, and <a href="https://github.com/rstudio/rmarkdown/pull/1420">rstudio/rmarkdown#1420</a>)</p></li><li><p>Added a new function, <code>safe</code>, which evaluates code in a new, safe R environment. (<a href="https://github.com/rstudio/learnr/pull/174">#174</a>)</p></li></ul><h2 id="minor-new-features-and-improvements">Minor new features and improvements</h2><ul><li><p>Added the last evaluated exercise submission value, <code>last_value</code>, as an exercise checker function argument. (<a href="https://github.com/rstudio/learnr/pull/228">#228</a>)</p></li><li><p>Added tabset support.
(<a href="https://github.com/rstudio/learnr/pull/219">#219</a> <a href="https://github.com/rstudio/learnr/issues/212">#212</a>)</p></li><li><p>Question width will expand to the container width. (<a href="https://github.com/rstudio/learnr/pull/222">#222</a>)</p></li><li><p>Available tutorial names will be displayed when no <code>name</code> parameter or an incorrect <code>name</code> is provided to <code>run_tutorial()</code>. (<a href="https://github.com/rstudio/learnr/pull/234">#234</a>)</p></li><li><p>The <code>options</code> parameter was added to <code>question</code> to allow custom questions to pass along custom information. See <code>sortable::sortable_question</code> for an example. (<a href="https://github.com/rstudio/learnr/pull/243">#243</a>)</p></li><li><p>Missing package dependencies will ask to be installed at tutorial run time. (<code>@isteves</code>, <a href="https://github.com/rstudio/learnr/issues/253">#253</a>)</p></li><li><p>When questions are tried again, the existing answer will remain, not forcing the user to restart from scratch. (<a href="https://github.com/rstudio/learnr/issues/270">#270</a>)</p></li><li><p>A version number has been added to <code>question_submission</code> events. This will help when using custom storage methods. (<a href="https://github.com/rstudio/learnr/pull/291">#291</a>)</p></li><li><p>Tutorial storage on the browser is now executed directly on <code>indexedDB</code> using <code>idb-keyval</code> (dropping <code>localforage</code>). This change prevents browser tabs from blocking each other when trying to access <code>indexedDB</code> data. (<a href="https://github.com/rstudio/learnr/pull/305">#305</a>)</p></li></ul><h2 id="bug-fixes">Bug fixes</h2><ul><li><p>Fixed a spurious console warning when running exercises using Pandoc 2.0. (<a href="https://github.com/rstudio/learnr/issues/154">#154</a>)</p></li><li><p>Added a fail-safe to try-catch bad student code that would crash the tutorial. 
(<a href="https://github.com/adamblake"><code>@adamblake</code></a>, <a href="https://github.com/rstudio/learnr/issues/229">#229</a>)</p></li><li><p>Replaced references to <code>checkthat</code> and <code>grader</code> in docs with <a href="https://github.com/rstudio-education/gradethis">gradethis</a> (<a href="https://github.com/rstudio/learnr/issues/269">#269</a>)</p></li><li><p>Removed a warning created by pandoc when evaluating exercises where pandoc was wanting a title or pagetitle. <a href="https://github.com/rstudio/learnr/pull/303">#303</a></p></li></ul></description></item><item><title>pins 0.3.0: Azure, GCloud and S3</title><link>https://www.rstudio.com/blog/pins-0-3-0-azure-gcloud-and-s3/</link><pubDate>Thu, 28 Nov 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pins-0-3-0-azure-gcloud-and-s3/</guid><description><p>A new version of <code>pins</code> is available on CRAN! <code>pins 0.3</code> comes with many improvements and the following major features:</p><ul><li>Retrieve <strong>pin information</strong> with <code>pin_info()</code> including properties particular to each board.</li></ul><p>You can install this new version from CRAN as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">pins&#34;</span>)</code></pre></div><p>In addition, there is a new <a href="https://rstudio.github.io/pins/articles/use-cases.html">Use Cases</a> section in our docs, various improvements (see <a href="https://rstudio.github.io/pins/news/index.html">NEWS</a>) and two community extensions being developed to support <a href="https://rstudio.github.io/connections/#pins">databases</a> and <a href="https://gitlab.com/gwmngilfen/nextcloudr">Nextcloud</a> as boards.</p><h2 id="cloud-boards">Cloud Boards</h2><p><code>pins 0.3</code> adds support 
to find, retrieve and store resources in various cloud providers such as <a href="https://azure.microsoft.com/">Microsoft Azure</a>, <a href="https://cloud.google.com/">Google Cloud</a> and <a href="https://aws.amazon.com/">Amazon Web Services</a>.</p><p><img src="images/pins-cloud-boards-azure-gcloud-s3.png" alt=""></p><p>To illustrate how they work, let&rsquo;s first try to find the World Bank indicators dataset in <a href="https://www.kaggle.com/">Kaggle</a>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(pins)
<span style="color:#06287e">pin_find</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">indicators&#34;</span>, board <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">kaggle&#34;</span>)</code></pre></div><pre><code># A tibble: 6 x 4
  name                                            description                             type  board
  &lt;chr&gt;                                           &lt;chr&gt;                                   &lt;chr&gt; &lt;chr&gt;
1 worldbank/world-development-indicators          World Development Indicators            files kaggle
2 theworldbank/world-development-indicators       World Development Indicators            files kaggle
3 cdc/chronic-disease                             Chronic Disease Indicators              files kaggle
4 bigquery/worldbank-wdi                          World Development Indicators (WDI) Data files kaggle
5 rajanand/key-indicators-of-annual-health-survey Health Analytics                        files kaggle
6 loveall/human-happiness-indicators              Human Happiness Indicators              files kaggle</code></pre><p>Which we can then easily download with <code>pin_get()</code>; beware, this is a 2GB download:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">pin_get</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">worldbank/world-development-indicators&#34;</span>)</code></pre></div><pre><code>[1] &quot;/.../worldbank/world-development-indicators/Country.csv&quot;
[2] &quot;/.../worldbank/world-development-indicators/CountryNotes.csv&quot;
[3] &quot;/.../worldbank/world-development-indicators/database.sqlite&quot;
[4] &quot;/.../worldbank/world-development-indicators/Footnotes.csv&quot;
[5] &quot;/.../worldbank/world-development-indicators/hashes.txt&quot;
[6] &quot;/.../worldbank/world-development-indicators/Indicators.csv&quot;
[7] &quot;/.../worldbank/world-development-indicators/Series.csv&quot;
[8] &quot;/.../worldbank/world-development-indicators/SeriesNotes.csv&quot;</code></pre><p>The <code>Indicators.csv</code> file contains all the indicators, so let&rsquo;s load it with <a href="https://readr.tidyverse.org/">readr</a>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">indicators <span style="color:#666">&lt;-</span> <span style="color:#06287e">pin_get</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">worldbank/world-development-indicators&#34;</span>)[6] <span style="color:#666">%&gt;%</span>
  readr<span style="color:#666">::</span><span style="color:#06287e">read_csv</span>()</code></pre></div><p>Analysing this dataset would be quite interesting; however, this post focuses on how to share this dataset in S3, Google Cloud or Azure storage. More specifically, we will learn how to publish to an <a href="https://pins.rstudio.com/articles/boards-s3.html">S3 board</a>. To publish to other cloud providers, take a look at the <a href="https://pins.rstudio.com/articles/boards-gcloud.html">Google Cloud</a> and <a href="https://pins.rstudio.com/articles/boards-azure.html">Azure boards</a> articles.</p><p>As you would expect, the first step is to register the S3 board.
When using RStudio, you can use the <a href="https://pins.rstudio.com/articles/pins-rstudio.html">New Connection</a> action to guide you through this process, or you can specify your <code>key</code> and <code>secret</code> as follows. Please refer to the <a href="https://pins.rstudio.com/articles/boards-s3.html">S3 board</a> article to understand how to store your credentials securely.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">board_register_s3</span>(name <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rpins&#34;</span>,
  bucket <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rpins&#34;</span>,
  key <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">VerySecretKey&#34;</span>,
  secret <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">EvenMoreImportantSecret&#34;</span>)</code></pre></div><p>With the S3 board registered, we can now pin the indicators dataset with <code>pin()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">pin</span>(indicators, name <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">worldbank/indicators&#34;</span>, board <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rpins&#34;</span>)</code></pre></div><p>That&rsquo;s about it!
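</p><p>As a quick check (a minimal sketch, assuming the <code>rpins</code> board registered above; the variable name is just for illustration), the pin can be read straight back with <code>pin_get()</code>:</p><pre><code class="language-r"># Retrieve the pin we just published from the registered S3 board
indicators_check &lt;- pin_get(&quot;worldbank/indicators&quot;, board = &quot;rpins&quot;)
head(indicators_check)</code></pre><p>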
We can now find and retrieve this dataset from S3 using <code>pin_find()</code> and <code>pin_get()</code>, or view the uploaded resources in the S3 management console:</p><p><img src="images/pins-upload-s3-results.png" alt=""></p><p>To make this even easier for others to consume, we can make this S3 bucket public, which means you can now connect to this board without even having to configure S3, making it possible to retrieve this dataset with just one line of R code!</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">pins<span style="color:#666">::</span><span style="color:#06287e">pin_get</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">worldbank/indicators&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">https://rpins.s3.amazonaws.com&#34;</span>)</code></pre></div><pre><code># A tibble: 5,656,458 x 6
   CountryName CountryCode IndicatorName                          IndicatorCode    Year   Value
   &lt;chr&gt;       &lt;chr&gt;       &lt;chr&gt;                                  &lt;chr&gt;           &lt;dbl&gt;   &lt;dbl&gt;
 1 Arab World  ARB         Adolescent fertility rate (births per… SP.ADO.TFRT      1960 1.34e+2
 2 Arab World  ARB         Age dependency ratio (% of working-ag… SP.POP.DPND      1960 8.78e+1
 3 Arab World  ARB         Age dependency ratio, old (% of worki… SP.POP.DPND.OL   1960 6.63e+0
 4 Arab World  ARB         Age dependency ratio, young (% of wor… SP.POP.DPND.YG   1960 8.10e+1
 5 Arab World  ARB         Arms exports (SIPRI trend indicator v… MS.MIL.XPRT.KD   1960 3.00e+6
 6 Arab World  ARB         Arms imports (SIPRI trend indicator v… MS.MIL.MPRT.KD   1960 5.38e+8
 7 Arab World  ARB         Birth rate, crude (per 1,000 people)   SP.DYN.CBRT.IN   1960 4.77e+1
 8 Arab World  ARB         CO2 emissions (kt)                     EN.ATM.CO2E.KT   1960 5.96e+4
 9 Arab World  ARB         CO2 emissions (metric tons per capita) EN.ATM.CO2E.PC   1960 6.44e-1
10 Arab World  ARB         CO2 emissions from gaseous fuel consu… EN.ATM.CO2E.GF…  1960 5.04e+0
# … with 5,656,448 more rows</code></pre><p>This works since <code>pins 0.3</code> automatically
registers URLs as a <a href="https://pins.rstudio.com/articles/boards-websites.html">website board</a> to save you from having to explicitly call <code>board_register_datatxt()</code>.</p><p>It&rsquo;s also worth mentioning that <code>pins</code> stores the dataset using an R native format, which requires only 72MB and loads much faster than the original 2GB dataset.</p><h2 id="pin-information">Pin Information</h2><p>Boards like <a href="https://pins.rstudio.com/articles/boards-kaggle.html">Kaggle</a> and <a href="https://pins.rstudio.com/articles/boards-rsconnect.html">RStudio Connect</a> store additional information for each pin, which you can now easily retrieve with <code>pin_info()</code>.</p><p>For instance, we can retrieve additional properties from the indicators pin from Kaggle as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">pin_info</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">worldbank/world-development-indicators&#34;</span>, board <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">kaggle&#34;</span>)</code></pre></div><pre><code># Source: kaggle&lt;worldbank/world-development-indicators&gt; [files]
# Description: World Development Indicators
# Properties:
# - id: 23
# - subtitle: Explore country development indicators from around the world
# - tags: (ref) business, economics, international relations, business finance...
# - creatorName: Megan Risdal
# - creatorUrl: mrisdal
# - totalBytes: 387054886
# - url: https://www.kaggle.com/worldbank/world-development-indicators
# - lastUpdated: 2017-05-01T17:50:44.863Z
# - downloadCount: 42961
# - isPrivate: FALSE
# - isReviewed: TRUE
# - isFeatured: FALSE
# - licenseName: World Bank Dataset Terms of Use
# - ownerName: World Bank
# - ownerRef: worldbank
# - kernelCount: 422
# - topicCount: 7
# - viewCount: 254379
# - voteCount: 1121
# - currentVersionNumber: 2
# - usabilityRating: 0.7647
# - extension: zip</code></pre><p>And from RStudio Connect boards as well:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">pin_info</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">worldnews&#34;</span>, board <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rsconnect&#34;</span>)</code></pre></div><pre><code># Source: rsconnect&lt;jluraschi/worldnews&gt; [table]
# Properties:
# - id: 6446
# - guid: 1b9f04c5-ddd4-43ca-8352-98f6f01a7034
# - access_type: all
# - url: https://beta.rstudioconnect.com/content/6446/
# - vanity_url: FALSE
# - bundle_id: 16216
# - app_mode: 4
# - content_category: pin
# - has_parameters: FALSE
# - created_time: 2019-09-30T18:20:21.911777Z
# - last_deployed_time: 2019-11-18T16:00:16.919478Z
# - build_status: 2
# - run_as_current_user: FALSE
# - owner_first_name: Javier
# - owner_last_name: Luraschi
# - owner_username: jluraschi
# - owner_guid: ac498f34-174c-408f-8089-a9f10c630a37
# - owner_locked: FALSE
# - is_scheduled: FALSE
# - rows: 44
# - cols: 1</code></pre><p>To retrieve all the extended information when discovering pins, pass <code>extended = TRUE</code> to <code>pin_find()</code>.</p><p>Thank you for reading this post!</p><p>Please refer to <a href="https://rstudio.github.io/pins">rstudio.github.io/pins</a> for detailed documentation and <a href="https://github.com/rstudio/pins/issues/new">GitHub</a> to file issues or feature requests.</p></description></item><item><title>Package Manager 1.1.0.1 Patch Release</title><link>https://www.rstudio.com/blog/package-manager-1-1-0-1-patch-release/</link><pubDate>Tue, 26 Nov 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/package-manager-1-1-0-1-patch-release/</guid><description><p>Earlier this month we <a
href="https://blog.rstudio.com/2019/11/07/package-manager-v1-1-no-interruptions/">announced RStudio Package Manager version 1.1.0</a>, a major update to RStudio Package Manager that added support for:</p><ul><li>Faster package installs on Linux with pre-compiled binaries</li><li>Easier installs through the display of system requirements</li><li>A calendar of repository checkpoints to make it easier to reproduce work</li></ul><p>This patch release updates the 1.1.0 version without introducing any new features. The 1.1.0.1 patch fixes a bug in the prior release that impacts performance when serving binary packages from historical checkpoints. There is also a small chance that an incorrect historical binary could be served under specific repository and source configurations. These problems were discovered internally, and no customers reported any impact.</p><blockquote><h4 id="patch-instructions">Patch Instructions</h4><p>We strongly encourage all customers using version 1.1.0 to apply the patch by simply installing version 1.1.0.1 on top of the existing 1.1.0 release.</p></blockquote><p>If you are upgrading from an earlier version, be sure to consult the <a href="https://doc.rstudio.com/rspm/news">release notes</a> for the intermediate releases, as well.</p><h4 id="new-to-rstudio-package-manager">New to RStudio Package Manager?</h4><p><a href="https://rstudio.com/products/package-manager/">Download</a> the 45-day evaluation today to see how RStudio Package Manager can help you, your team, and your entire organization access and organize R packages.
Learn more with our <a href="https://demo.rstudiopm.com">online demo server</a> or <a href="https://resources.rstudio.com/webinars/introduction-to-the-rstudio-package-manager-sean-lopp">latest webinar</a>.</p><ul><li><a href="https://docs.rstudio.com/rspm/admin">Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2018/07/RStudio-Package-Manager-Overview.pdf">Overview PDF</a></li><li><a href="https://resources.rstudio.com/webinars/introduction-to-the-rstudio-package-manager-sean-lopp">Introductory Webinar</a></li><li><a href="https://demo.rstudiopm.com">Online Demo</a></li></ul></description></item><item><title>Thinking about rstudio::conf 2020? See the full conference schedule!</title><link>https://www.rstudio.com/blog/thinking-about-rstudio-conf-2020-see-the-full-conference-schedule/</link><pubDate>Mon, 25 Nov 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/thinking-about-rstudio-conf-2020-see-the-full-conference-schedule/</guid><description><p>rstudio::conf 2020, the conference on all things R and RStudio, is only two months away and we’ve already surpassed last year’s registration! If you’re a dedicated R user or one of the many people and teams who use RStudio with R and Python, now is the time to claim your spot in San Francisco.</p><p align="center"><a href="https://web.cvent.com/event/36ebe042-0113-44f1-8e36-b9bc5d0733bf/websitePage:34f3c2eb-9def-44a7-b324-f2d226e25011?RefId=conference&amp;utm_campaign=Site%20Promo&amp;utm_medium=Ste&amp;utm_source=ConfPage"><img src="learn-more-and-reg.png" style="width:60.0%" alt="Learn more and register for the RStudio 2020 conference here" /></a></p><p>Today we’re delighted to announce the <a href="https://web.cvent.com/event/36ebe042-0113-44f1-8e36-b9bc5d0733bf/websitePage:34f3c2eb-9def-44a7-b324-f2d226e25011?RefId=conference&amp;utm_campaign=Site%20Promo&amp;utm_medium=Ste&amp;utm_source=ConfPage">full conference schedule</a>, so that you can plan your days.
rstudio::conf 2020 takes place at the Hilton San Francisco Union Square January 29-30, preceded by Training Days on the 27th and 28th. This year we have over 80 talks, 18 lightning talks, and 20 posters.</p><ul><li>Join host Hadley Wickham, Chief Scientist at RStudio, in welcoming our keynote speakers:<ul><li><a href="https://hilaryparker.com/about-hilary-parker/">Hilary Parker</a> (Stitch Fix) and <a href="http://www.biostat.jhsph.edu/~rpeng/">Roger Peng</a> (Johns Hopkins)</li><li><a href="http://www.bewitched.com/about.html">Martin Wattenberg</a> and <a href="http://www.fernandaviegas.com/">Fernanda Viegas</a> (Research Scientists, Google)</li><li><a href="https://jennybryan.org/">Jenny Bryan</a> (Engineer, RStudio)</li><li><a href="https://github.com/jjallaire">JJ Allaire</a> (CEO, RStudio)</li></ul></li><li>16 invited talks from outstanding speakers, innovators, and data scientists.</li><li>Over 50 contributed talks from the R and open-source data science community.</li><li>And 30 talks by RStudio employees on the latest developments in shiny, the tidyverse, R and python integration, and more!</li></ul><p>We also have <a href="https://web.cvent.com/event/36ebe042-0113-44f1-8e36-b9bc5d0733bf/websitePage:34f3c2eb-9def-44a7-b324-f2d226e25011?RefId=conference&amp;utm_campaign=Site%20Promo&amp;utm_medium=Ste&amp;utm_source=ConfPage">18 two-day workshops</a> (for both beginners and experts!) if you want to go deep into a topic. We look forward to seeing you there!</p></description></item><item><title>Our first artist in residence: Allison Horst!</title><link>https://www.rstudio.com/blog/artist-in-residence/</link><pubDate>Mon, 18 Nov 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/artist-in-residence/</guid><description><p>I&rsquo;m very excited to announce that Allison Horst is RStudio&rsquo;s inaugural artist-in-residence. Allison is one of my favourite artists in the R community.
Over the next year, she&rsquo;s going to be working with us to create even more awesome artwork. Here&rsquo;s a little about Allison in her own words.</p><p>&mdash; Hadley</p><hr><p>Hello everyone, I’m Allison.</p><img src="horst_rstudio_air.png" width="400"/><p>Some of you might know me from my <a href="https://github.com/allisonhorst/stats-illustrations">R- and stats-inspired illustrations</a> on <a href="https://twitter.com/allison_horst">twitter</a>. I’m excited to share that, as of October 2019, I am an Artist-in-Residence with <a href="https://rstudio.com/">RStudio</a>. My goal as an RStudio Artist-in-Residence is to create useful and engaging illustrations and designs that welcome users to explore, teach, and learn about new packages and functions in R. In this post, I’ll share what motivates me to create R-related artwork, and what I&rsquo;ll be working on at RStudio.</p><h3 id="why-did-i-start-making-r-related-artwork">Why did I start making R-related artwork?</h3><p>My primary job is teaching data science and statistics courses to ~100 incoming students each year at the <a href="http://bren.ucsb.edu/">Bren School of Environmental Science and Management</a>, UC Santa Barbara.</p><p>When teaching, I’ve frequently found myself struggling to motivate students to try new R packages or functions. As an example, imagine you’re a student in an “Intro to Data Science” class learning to code for the first time. You’re already kind of intimidated by R, and then the really excited (unnamed) instructor exclaims “<code>dplyr::mutate()</code> is sooooo awesome!!!” while displaying code examples and/or R documentation on a slide behind them:</p><img src="horst_teaching_code.png" width="600"/><p>Even if the instructor is positive and encouraging, a screen full of code and documentation behind them might cast a daunting cloud over the introduction.</p><p>That’s the position I found myself in as a teacher.
There was a clear disconnect between my excitement about sharing new things in R, and what I was presenting visually as a “first glimpse” into what a package or function could do. I felt frustrated to not have educational visuals that aligned with my enthusiasm. I also felt that if I could just make a student’s first exposure to a new coding skill something positive &mdash; funny, or happy, or intriguing, or just plain cute &mdash; they would be less resistant to investing in a new [insert thing] in R.</p><h3 id="what-are-my-goals">What are my goals?</h3><p>When I started creating my aRt to lower learning barriers, I kept three things in mind:</p><ul><li>Focus first on the big-picture application/use of the R function or package.</li><li>Make illustrations visually engaging, welcoming, and useful for useRs at all levels.</li><li>Use imagery to make it feel like R is working with you, not against you.</li></ul><p>I tried a few different styles and characters and the friendly, hardworking, colorful monsters were most representative of how I think about work done by packages and functions. 
All of the monsteRs illustrations are driven by the goal of creating a friendlier bridge between learners and R functions / packages that might look intimidating at first glance.</p><p>For example, instead of showing a chunk of code while trying to encourage students to learn <code>dplyr::mutate()</code>, their first sighting of the function would be mutant monsteRs working behind the scenes to add columns to a data frame, while keeping the existing ones:</p><img src="dplyr_mutate.png" width="600"/><p>And here are the R Markdown wizard monsteRs, helping to keep text, code and outputs all together, then knitting to produce a final document:</p><img src="rmarkdown_wizards.png" width="600"/><p>And of course the ggplot2 artist monsteRs are using geoms, themes, and aesthetic mappings to build masterful data visualizations:</p><img src="ggplot2_masterpiece.png" width="600"/><p>Do the monsteRs teach code? Well, no. But I hope that they <strong>do</strong> provide a welcome entry point for learners, and make the use of an R function or package clear and memorable. And while I create the illustrations mostly with teachers and learners in mind, users at any level can learn something new, or remember something old, through art reminders.</p><img src="janitor_tyler_tweet.png" width="400"/><h3 id="what-else-am-i-working-on">What else am I working on?</h3><p>The monsteRs make frequent appearances in my artwork, but I’ve also enjoyed contributing to the R community through other graphic design and illustrations. Here’s an extended cut of the classic schematic from <a href="https://r4ds.had.co.nz/">R for Data Science</a>, updated to include environmental data and science communication bookends, that Dr. 
Julia Lowndes envisioned and presented in her <a href="https://www.youtube.com/watch?v=Z8PqwFPqn6Y&amp;feature=youtu.be&amp;start=2710">useR!2019 keynote</a>:</p><img src="horst-eco-r4ds.png" width="700"/><p>I had a great time creating buttons and banners for the <a href="https://rstudio.com/bof/">“Birds of a Feather” sessions</a> at upcoming <a href="https://web.cvent.com/event/36ebe042-0113-44f1-8e36-b9bc5d0733bf/summary">rstudio::conf(2020)</a> - where I’m looking forward to meeting many of you in person!</p><img src="horst_bof_buttons.png" width="700"/><p>And, I’ve been working on hex designs for R-related groups and packages! Here are a few: the hex sticker for Santa Barbara R-Ladies (our growing local chapter of <a href="https://rladies.org/">R-Ladies Global</a>), the new <a href="https://github.com/r-lib/rray"><code>rray</code> package</a> hex envisioned by Davis Vaughan, and a design for the <a href="https://tidymodels.github.io/butcher/"><code>butcher</code> package</a> from Joyce Cahoon and the <a href="https://github.com/tidymodels/tidymodels"><code>tidymodels</code></a> team:</p><img src="horst_hexes.png" width="700"/><p>I’m inspired by how RStudio and the broader R community have embraced and supported art as a means of reaching more users, improving education materials (see the beautiful <a href="https://education.rstudio.com/">RStudio Education</a> site with artwork by <a href="https://desiree.rbind.io/">Desirée De Leon</a>!), and simply making the R landscape a bit brighter. I am excited to continue producing aRt as an RStudio Artist-in-Residence over the next year.</p><p>&mdash; Allison</p></description></item><item><title>Package Manager 1.1.0 - No Interruptions</title><link>https://www.rstudio.com/blog/package-manager-v1-1-no-interruptions/</link><pubDate>Thu, 07 Nov 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/package-manager-v1-1-no-interruptions/</guid><description><p>No interruptions. 
That was our team&rsquo;s goal for RStudio Package Manager 1.1.0 - we set out to make R package installation fast enough that it wouldn&rsquo;t interrupt your work. More and more data scientists use Linux environments, whether to access extra horsepower during development or to run production code in containers. Unfortunately, the rise in Linux environments has seen a corresponding increase in package installation pain. For Windows and Mac OS, CRAN provides pre-compiled binary packages that install almost instantly, but the same binaries are not available on Linux. As a result, data scientists can lose their train of thought, or put off trying out a new method, all because they have to wait for new packages to compile and install. New users often face a tedious hour-long setup process before they can try out environments. IT/DevOps engineers are forced to wait any time they want to build a new image, deploy to production, or restore an environment.</p><p>RStudio Package Manager already makes it easy for an organization to control and distribute R packages. Now, packages from CRAN can be immediately available for deployment on Linux systems, through Linux package binaries. These binaries install significantly faster and are available to all Package Manager clients wherever your organization uses R. Binaries are supported for a range of R versions and platforms, for more than 80% of CRAN packages, and they are updated every week!
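</p><p>Concretely (a sketch using the public demo server from this post; your own Package Manager URL would differ), pointing an R session at a binary repository is just a matter of setting <code>repos</code> before installing:</p><pre><code class="language-r"># Use the Ubuntu 16.04 (&quot;xenial&quot;) binary repository on the demo server
options(repos = c(CRAN = &quot;https://demo.rstudiopm.com/cran/__linux__/xenial/latest&quot;))
install.packages(&quot;dplyr&quot;)  # on Linux, installs a pre-built binary instead of compiling</code></pre><p>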
Binaries make it easier for users to get started, simpler for admins to manage environments, and make it dramatically easier to implement automation.</p><script src="https://fast.wistia.com/embed/medias/422du9ce0z.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_422du9ce0z videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><p>To see the difference for yourself, try installing a package on your Linux server using our demo server:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># First, pick your operating system</span>
DISTRO <span style="color:#666">&lt;-</span> <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">xenial&#39;</span> <span style="color:#60a0b0;font-style:italic"># choices: xenial, bionic, centos7, opensuse42, opensuse15</span>
<span style="color:#60a0b0;font-style:italic"># Next, install from our demo server</span>
<span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">dplyr&#39;</span>, repos <span style="color:#666">=</span> <span style="color:#06287e">sprintf</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">https://demo.rstudiopm.com/cran/__linux__/%s/latest&#39;</span>, DISTRO), lib <span style="color:#666">=</span> <span style="color:#06287e">tempdir</span>())
<span style="color:#60a0b0;font-style:italic"># Finally, compare to how long it takes to install from CRAN</span>
<span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#39;</span><span
style="color:#4070a0">dplyr&#39;</span>, repos <span style="color:#666">=</span> <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">https://cran.rstudio.com&#39;</span>, lib <span style="color:#666">=</span> <span style="color:#06287e">tempdir</span>())</code></pre></div><p>At RStudio, we use these binaries in production every day. Although this <a href="https://community.rstudio.com/t/faster-package-installs-on-linux-with-package-manager-beta-release/39607">community post</a> contains more information about the previous beta, we are excited to announce that with v1.1.0, the binaries are ready for your production systems. Support is available for offline or air-gapped environments.</p><h2 id="other-updates">Other Updates</h2><p>In addition to adding support for Linux package binaries, the 1.1.0 release concludes more than a year of updates since the 1.0.0 release, adding:</p><ul><li><a href="https://blog.rstudio.com/2019/04/18/rstudio-package-manager-1-0-8-system-requirements/">System dependency</a> information for R packages</li><li><a href="https://blog.rstudio.com/2019/03/13/rstudio-package-manager-1-0-6-readme/">Package READMEs</a> to help discover documentation</li><li>A <a href="https://blog.rstudio.com/2019/01/30/time-travel-with-rstudio-package-manager-1-0-4/">calendar</a> to make version management a breeze</li><li>Significant improvements for IT, including performance improvements, security updates, and new storage options</li></ul><p>Please review the <a href="https://docs.rstudio.com/rspm/news">full release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>Upgrading to 1.1.0 from 1.0.10 or earlier is a major update but will take less than five minutes. If you are upgrading from an earlier version, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><p>Package management is critical for making your data science reproducible, over time, and across your organization.
Wondering where you should start? <a href="mailto:sales@rstudio.com">Email us</a>; our product team is happy to help!</p><h4 id="new-to-rstudio-package-manager">New to RStudio Package Manager?</h4><p><a href="https://rstudio.com/products/package-manager/">Download</a> the 45-day evaluation today to see how RStudio Package Manager can help you, your team, and your entire organization access and organize R packages. Learn more with our <a href="https://demo.rstudiopm.com">online demo server</a> or <a href="https://resources.rstudio.com/webinars/introduction-to-the-rstudio-package-manager-sean-lopp">latest webinar</a>.</p><ul><li><a href="https://docs.rstudio.com/rspm/admin">Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2018/07/RStudio-Package-Manager-Overview.pdf">Overview PDF</a></li><li><a href="https://resources.rstudio.com/webinars/introduction-to-the-rstudio-package-manager-sean-lopp">Introductory Webinar</a></li><li><a href="https://demo.rstudiopm.com">Online Demo</a></li></ul></description></item><item><title>renv: Project Environments for R</title><link>https://www.rstudio.com/blog/renv-project-environments-for-r/</link><pubDate>Wed, 06 Nov 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/renv-project-environments-for-r/</guid><description><p>We&rsquo;re excited to announce that <a href="https://rstudio.github.io/renv/"><code>renv</code></a> is now available on CRAN! You can install <code>renv</code> with:</p><pre><code class="language-r" data-lang="r">install.packages(&quot;renv&quot;)</code></pre><p><code>renv</code> is an R dependency manager.
Use <code>renv</code> to make your projects more:</p><ul><li><p><strong>Isolated</strong>: Each project gets its own library of R packages, so you can feel free to upgrade and change package versions in one project without worrying about breaking your other projects.</p></li><li><p><strong>Portable</strong>: Because <code>renv</code> captures the state of your R packages within a lockfile, you can more easily share and collaborate on projects with others, and ensure that everyone is working from a common base.</p></li><li><p><strong>Reproducible</strong>: Use <code>renv::snapshot()</code> to save the state of your R library to the lockfile <code>renv.lock</code>. You can later use <code>renv::restore()</code> to restore your R library exactly as specified in the lockfile.</p></li></ul><p>If you&rsquo;ve used <a href="http://rstudio.github.io/packrat">Packrat</a> before, this may all feel familiar. User feedback has made it clear that a number of the decisions we made during Packrat&rsquo;s development ultimately made it frustrating to use, and led to a sub-optimal user experience. The goal then is for <code>renv</code> to be a robust, stable replacement for the Packrat package, with fewer surprises and better default behaviors. While we will continue maintaining Packrat, all new development will focus on <code>renv</code>.</p><p>In addition, we&rsquo;ve built <code>renv</code> to work well with R projects using Python through <a href="https://rstudio.github.io/reticulate/"><code>reticulate</code></a>. Using <code>renv</code>, you can also create project-local Python environments, and instruct <code>reticulate</code> to automatically bind to, manage, and use these environments.</p><h2 id="getting-started">Getting Started</h2><p>The core essence of the <code>renv</code> workflow is fairly simple:</p><ol><li><p>Use <code>renv::init()</code> to initialize a project. 
<code>renv</code> will discover the R packages used in your project, and install those packages into a private project library.</p></li><li><p>Work in your project as usual, installing and upgrading R packages as required as your project evolves.</p></li><li><p>Use <code>renv::snapshot()</code> to save the state of your project library. The project state will be serialized into a file called <code>renv.lock</code>.</p></li><li><p>Use <code>renv::restore()</code> to restore your project library from the state of your previously-created lockfile <code>renv.lock</code>.</p></li></ol><p>In short: use <code>renv::init()</code> to initialize your project library, and use <code>renv::snapshot()</code> / <code>renv::restore()</code> to save and load the state of your library.</p><p>After your project has been initialized, you can work within the project as before, but without fear that installing or upgrading packages could affect other projects on your system.</p><h2 id="collaborating">Collaborating</h2><p>When you want to share a project with other collaborators, you may want to ensure everyone is working with the same environment &ndash; otherwise, code in the project may unexpectedly fail to run because of changes in behavior between different versions of the packages in use. You can use <code>renv</code> to help make this possible.</p><p>When using <code>renv</code>, the packages used in your project will be recorded into a lockfile, <code>renv.lock</code>. Because <code>renv.lock</code> records the exact versions of R packages used within a project, if you share that file with your collaborators, they will be able to use <code>renv::restore()</code> to install exactly those packages into their own library. This implies the following workflow for collaboration:</p><ol><li><p>Select a way to share your project sources. The most common way nowadays is to use a version control system with a hosted repository; e.g. 
<a href="https://git-scm.com/">Git</a> with <a href="https://github.com/">GitHub</a>, but many other options are available.</p></li><li><p>Make sure your project is initialized with <code>renv</code> by calling <code>renv::init()</code>.</p></li><li><p>Call <code>renv::snapshot()</code> as needed, to generate and update <code>renv.lock</code>.</p></li><li><p>Share your project sources, alongside the generated lockfile <code>renv.lock</code>.</p></li></ol><p>After your collaborators have received your <code>renv.lock</code> lockfile &ndash; for example, by cloning the project repository &ndash; they can then also execute <code>renv::init()</code> to automatically install the packages declared in that lockfile into their own private project library. By doing this, they will now be able to work within your project using the exact same R packages that you were when <code>renv.lock</code> was generated.</p><h2 id="time-travel">Time Travel</h2><p>On some occasions, you might find that you&rsquo;ve made a change to <code>renv.lock</code> that you&rsquo;d like to roll back. 
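</p><p>As a rough sketch of that rollback workflow (the commit hash below is hypothetical, and <code>renv::revert()</code> may prompt for confirmation in interactive sessions):</p><pre><code class="language-r" data-lang="r"># Inspect the lockfile's history, then restore the library from an old commit
renv::history()                   # lists commits that touched renv.lock
renv::revert(commit = &quot;abc123&quot;)   # check out renv.lock as of that commit
renv::restore()                   # reinstall the packages it records</code></pre><p>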
If you&rsquo;re using <a href="https://git-scm.com/">Git</a> for version control with your project (and we strongly encourage you to!), <code>renv</code> has a couple of helper functions that make it easy to find and use previously-committed versions of the lockfile.</p><ul><li><p>Use <code>renv::history()</code> to view past versions of <code>renv.lock</code> that have been committed to your repository, and find the commit hash associated with that particular revision of <code>renv.lock</code>.</p></li><li><p>Use <code>renv::revert()</code> to pull out an old version of <code>renv.lock</code> based on the previously-discovered commit, and then use <code>renv::restore()</code> to restore your library from that state.</p></li></ul><p>If you have an alternate version control system you&rsquo;d like to see us support, please <a href="https://github.com/rstudio/renv/issues">let us know</a>!</p><h2 id="integration-with-python">Integration with Python</h2><p><code>renv</code> also makes it easy to set up a project-local Python environment to use with your R projects. This can be especially useful if you&rsquo;re using the <a href="https://rstudio.github.io/reticulate/"><code>reticulate</code></a> package, or other packages depending on <code>reticulate</code> such as <a href="https://tensorflow.rstudio.com/"><code>tensorflow</code></a> or <a href="https://keras.rstudio.com/"><code>keras</code></a>. Just call:</p><pre><code class="language-r" data-lang="r">renv::use_python()</code></pre><p>and a project-local Python environment will be set up and used by <code>reticulate</code>.
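</p><p>For example, a combined R and Python workflow might look like this (the Python package name is purely illustrative):</p><pre><code class="language-r" data-lang="r">renv::use_python()                 # create a project-local Python environment
reticulate::py_install(&quot;pandas&quot;)   # installs into that environment
renv::snapshot()                   # records the R library and the Python environment</code></pre><p>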
When <code>renv</code>'s Python integration is active, a couple of extra features are enabled:</p><ol><li><p><code>renv</code> will instruct <code>reticulate</code> to load your project-local version of Python by default, avoiding some of the challenges with finding and selecting an appropriate version of Python on the system.</p></li><li><p>Calling <code>reticulate::py_install()</code> will install packages into the project&rsquo;s Python environment by default.</p></li><li><p>When <code>renv::snapshot()</code> is called, your project&rsquo;s Python library will also be captured into <code>requirements.txt</code> (for virtual environments) / <code>environment.yml</code> (for <a href="https://docs.conda.io/projects/conda/en/latest/user-guide/concepts/environments.html">Conda</a> environments).</p></li><li><p>Similarly, <code>renv::restore()</code> will also attempt to restore your Python environment, as encoded in <code>requirements.txt</code> / <code>environment.yml</code> from a previous snapshot.</p></li></ol><h2 id="packrat">Packrat</h2><p>If you&rsquo;ve used Packrat before, you&rsquo;re likely interested to learn what&rsquo;s changed in <code>renv</code>. We&rsquo;ll try to summarize the most important changes:</p><h3 id="project-initialization">Project Initialization</h3><p><code>packrat::init()</code> would, by default, attempt to retrieve package sources from CRAN under the assumption that you might want to rebuild packages from source in the future (e.g. in an offline environment). This assumption was rarely true, and was often unhelpful, as many packages are difficult to build from source.</p><p>To alleviate this, <code>renv::init()</code> no longer downloads package sources, and also attempts to copy and reuse packages already installed in your R libraries.
This makes initializing new projects a breeze &ndash; you no longer have to sit around and wait as your project&rsquo;s multitude of dependencies get reinstalled; instead, the packages already installed on your system will be copied and re-used.</p><h3 id="snapshots-and-dependencies">Snapshots and Dependencies</h3><p><code>packrat::snapshot()</code> would, in addition to capturing the state of your project library, also attempt to discover the R packages used in your project by crawling your <code>.R</code> and <code>.Rmd</code> files for dependencies. Unfortunately, this system was fairly unreliable and caused a number of issues, especially when the machinery itself emitted warnings or errors that could not be easily diagnosed.</p><p>The dependency discovery machinery in <code>renv</code> has been rewritten from the ground up, and should now be much more reliable. However, if you discover that this still causes issues for you, you can disable this altogether by changing the type of snapshot performed in your project. Use <code>renv::settings$snapshot.type(&quot;simple&quot;)</code> to use &ldquo;simple&rdquo; snapshots in your project, where the state of your library is captured as-is without any extra filtering to limit which of your installed packages enter the lockfile.</p><h2 id="extra-tools">Extra Tools</h2><p>In addition, <code>renv</code> comes with a couple of extra tools out-of-the-box to help with common development workflows:</p><ul><li><p>Install packages from a wide variety of sources with <code>renv::install()</code>. <code>renv::install()</code> understands a subset of the <a href="https://cran.r-project.org/web/packages/remotes/vignettes/dependencies.html">remotes specification</a>, and so can be used for simple, dependency-free package installation in your projects. Currently, you can install packages from CRAN, GitHub, GitLab, and Bitbucket.
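</p><p>For instance (the repository names below are purely illustrative):</p><pre><code class="language-r" data-lang="r">renv::install(&quot;dplyr&quot;)                  # from CRAN
renv::install(&quot;dplyr@0.8.3&quot;)            # a specific CRAN version
renv::install(&quot;user/repo&quot;)              # from GitHub
renv::install(&quot;bitbucket::user/repo&quot;)   # from Bitbucket</code></pre><p>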
In addition, <code>renv</code> is also compatible with other tools commonly used to install packages, such as <a href="https://remotes.r-lib.org/"><code>remotes</code></a> and <a href="https://pak.r-lib.org/"><code>pak</code></a>.</p></li><li><p>Use <code>renv::dependencies()</code> to enumerate the R dependencies in your project. If necessary, use <code>.renvignore</code> files to tell <code>renv</code> which files and folders should not be scanned during dependency discovery.</p></li></ul><p>Finally, if you have a Packrat project that you&rsquo;d like to try porting to <code>renv</code>, you can use <code>renv::migrate()</code> to migrate the project infrastructure over to <code>renv</code>.</p><h2 id="learning-more">Learning More</h2><p>Please check out the <code>renv</code> <a href="https://rstudio.github.io/renv/articles/renv.html">Getting started guide</a> to learn more. If you are looking for strategies to manage reproducible environments, or don&rsquo;t know if <code>renv</code> is the right fit, check out <a href="https://environments.rstudio.com">https://environments.rstudio.com</a>. If you have questions or comments, please get in touch with us on the <a href="https://community.rstudio.com/">RStudio community forums</a>.</p></description></item><item><title>RStudio Professional Drivers 1.6.0</title><link>https://www.rstudio.com/blog/pro-drivers-1-6-0-release/</link><pubDate>Thu, 24 Oct 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pro-drivers-1-6-0-release/</guid><description><p>Access to data is crucial for data science. Unfortunately, servers that run RStudio are often disconnected from databases, especially in organizations that are new to R. In order to help data scientists access their databases, RStudio offers ODBC data connectors that are supported, easy to install, and designed to work everywhere you use RStudio professional products. 
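</p><p>Once a driver is installed and registered, connecting from R typically goes through the <code>DBI</code> and <code>odbc</code> packages; the driver name, host, database, and credentials below are placeholders for your own configuration:</p><pre><code class="language-r" data-lang="r">library(DBI)
# The Driver value must match an entry in odbcinst.ini
con &lt;- dbConnect(odbc::odbc(),
                 Driver   = &quot;PostgreSQL&quot;,
                 Server   = &quot;db.example.com&quot;,
                 Database = &quot;analytics&quot;,
                 UID      = Sys.getenv(&quot;DB_USER&quot;),
                 PWD      = Sys.getenv(&quot;DB_PASS&quot;),
                 Port     = 5432)
dbGetQuery(con, &quot;SELECT COUNT(*) FROM sales&quot;)
dbDisconnect(con)</code></pre><p>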
The 1.6.0 release of <a href="https://rstudio.com/products/drivers/">RStudio Professional Drivers</a> includes a few important updates.</p><h2 id="new-data-sources">New data sources</h2><p><img align="center" src="drivers.jpg" ></p><p>The 1.6.0 release includes new drivers for the following data sources:</p><ul><li>Amazon Athena</li><li>Google BigQuery</li><li>Apache Cassandra</li><li>MongoDB</li><li>MySQL</li><li>IBM Netezza</li></ul><p>These six new drivers complement the eight existing drivers from the prior release: Amazon Redshift, Apache Hive, Apache Impala, Oracle, PostgreSQL, Microsoft SQL Server, Teradata, and Salesforce. The existing drivers have also been updated with new features and improvements in the 1.6.0 release. For example, the SQL Server driver now supports the NTLM security protocol. For a full list of changes, refer to the <a href="https://docs.rstudio.com/drivers/1.6/release-notes/">RStudio Professional Drivers 1.6.0 release notes</a>.</p><h2 id="new-packaging-rpm--deb">New packaging (<code>.rpm</code> / <code>.deb</code>)</h2><p>Installations of drivers from the prior release of RStudio Professional Drivers relied on an installer script. In this release, the installer script has been eliminated and instead the drivers use standard Linux package management tools &ndash; <code>.rpm</code> and <code>.deb</code> packages &ndash; that we provide for RHEL/CentOS 6/7, Debian/Ubuntu, and SUSE 12/15. Standardized packaging makes installations and upgrades easier for administrators. Those needing custom installations (e.g. installations into a non-standard directory), can still download the <code>.tar</code> file. For step-by-step instructions see <a href="https://docs.rstudio.com/pro-drivers/installation/">Installing RStudio Professional Drivers</a>.</p><ul><li><strong>Breaking change</strong>. Installing 1.6.0 drivers on top of existing drivers could cause issues. 
Administrators should uninstall existing drivers and remove driver entries in <code>odbcinst.ini</code> before installing version 1.6.0. See <a href="https://docs.rstudio.com/pro-drivers/installation/">Installing RStudio Professional Drivers</a>.</li><li><strong>Breaking change</strong>. Installing 1.6.0 drivers no longer updates <code>odbcinst.ini</code>. Administrators should manually add entries to <code>odbcinst.ini</code> based on <code>odbcinst.ini.sample</code> which is included in driver packaging. See <a href="https://docs.rstudio.com/pro-drivers/installation/">Installing RStudio Professional Drivers</a>.</li></ul><h2 id="using-with-python">Using with Python</h2><p>RStudio Professional Drivers can be used with both R and Python. You can use the drivers with Jupyter Notebooks and JupyterLab sessions that launch from <a href="https://blog.rstudio.com/2019/09/19/rstudio-1-2-5-release/">RStudio Server Pro 1.2.5</a>. You can also use the drivers with Jupyter Notebooks that are published to <a href="https://blog.rstudio.com/2019/01/17/announcing-rstudio-connect-1-7-0/">RStudio Connect 1.7.0</a>+.</p><h2 id="a-note-about-write-backs">A note about write-backs</h2><p>RStudio Professional Drivers are just one part of a complex ODBC connection chain designed for doing data science. Typical data science tasks involve querying and extracting subsets of data into R. It can be tempting to use the ODBC connection chain for data engineering tasks such as bulk loads, high speed transactions, and general purpose ETL. However, heavy-duty data engineering tasks are better done with specialized third-party tools. We recommend using the ODBC connection chain primarily for querying and analyzing data.</p><p><img align="center" src="odbc-data-connectors-white.png" ></p><p>While doing data science, it is often handy to write data from R into databases. ODBC write-backs can be challenging when creating tables or inserting records. 
Standards vary wildly across data sources, and matching data types to data sources can be exacting. Compared to specialized third-party tools, ODBC write-backs tend to be slow. We recommend ODBC write-backs from R only when appropriate and only for small tables.</p></description></item><item><title>Shiny 1.4.0</title><link>https://www.rstudio.com/blog/shiny-1-4-0/</link><pubDate>Tue, 15 Oct 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-1-4-0/</guid><description><p>Shiny 1.4.0 has been released! This release mostly focuses on under-the-hood fixes, but there are a few user-facing changes as well.</p><p>If you&rsquo;ve written a Shiny app before, you&rsquo;ve probably encountered errors like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"><span style="color:#06287e">div</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Hello&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">world!&#34;</span>, )<span style="color:#60a0b0;font-style:italic">#&gt; Error in tag(&#34;div&#34;, list(...)) : argument is missing, with no default</span></code></pre></div><p>This is due to a trailing comma in <code>div()</code>. It&rsquo;s very easy to accidentally add trailing commas when you&rsquo;re writing and debugging a Shiny application.</p><p>In Shiny 1.4.0, you&rsquo;ll no longer get this error &ndash; it will just work with trailing commas. This is true for <code>div()</code> and all other HTML tag functions, like <code>span()</code>, <code>p()</code>, and so on.</p><p>The new version of Shiny also lets you control the whitespace between HTML tags. 
Previously, if there were two adjacent tags, like the <code>a()</code> and <code>span()</code> in <code>div(a(&quot;Visit this link&quot;, href=&quot;path/&quot;), span(&quot;.&quot;))</code>, whitespace would always be inserted between them, resulting in output that renders as &ldquo;Visit this link .&rdquo;</p><p>Here&rsquo;s what the generated HTML looks like:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"><span style="color:#06287e">div</span>(<span style="color:#06287e">a</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Visit this link&#34;</span>, href <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">path/&#34;</span>), <span style="color:#06287e">span</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">.&#34;</span>))
<span style="color:#60a0b0;font-style:italic">#&gt; &lt;div&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; &lt;a href=&#34;path/&#34;&gt;Visit this link&lt;/a&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; &lt;span&gt;.&lt;/span&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; &lt;/div&gt;</span></code></pre></div><p>Now, you can use the <code>.noWS</code> parameter to remove the spacing between tags, so you can create output that renders as &ldquo;Visit this link.&rdquo;:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"><span style="color:#06287e">div</span>(<span style="color:#06287e">a</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Visit this link&#34;</span>, href <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">path/&#34;</span>, .noWS <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">after&#34;</span>), <span
style="color:#06287e">span</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">.&#34;</span>))
<span style="color:#60a0b0;font-style:italic">#&gt; &lt;div&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; &lt;a href=&#34;path/&#34;&gt;Visit this link&lt;/a&gt;&lt;span&gt;.&lt;/span&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; &lt;/div&gt;</span></code></pre></div><p>The <code>.noWS</code> parameter can take one or more of the following values to control whitespace in other ways:</p><ul><li><code>&quot;before&quot;</code> suppresses whitespace before the opening tag.</li><li><code>&quot;after&quot;</code> suppresses whitespace after the closing tag.</li><li><code>&quot;after-begin&quot;</code> suppresses whitespace between the opening tag and its first child. (In the example above, the <code>&lt;span&gt;</code> tags are children of the <code>&lt;div&gt;</code>.)</li><li><code>&quot;before-end&quot;</code> suppresses whitespace between the last child and the closing tag.</li></ul><p>(These changes actually come from version 0.4.0 of the htmltools package, but most users will encounter these functions via Shiny, and the documentation in Shiny has been updated to reflect the changes.)</p><h2 id="breaking-changes">Breaking changes</h2><p>We&rsquo;ve updated from jQuery 1.12.4 to 3.4.1. There&rsquo;s a small chance that JavaScript code will behave slightly differently with the new version of jQuery, so if you encounter a compatibility issue, you can use the old version of jQuery with <code>options(shiny.jquery.version=1)</code>.
Note that this option will go away some time in the future, so if you find that you need to use it, please make sure to update your JavaScript code to work with jQuery 3.</p><p>For the full set of changes in this version of Shiny, please see <a href="https://shiny.rstudio.com/reference/shiny/1.4.0/upgrade.html">this page</a>.</p></description></item><item><title>Building Data Science Infrastructure at an Enterprise Level with RStudio and ProCogia</title><link>https://www.rstudio.com/blog/building-data-science-infrastructure-at-an-enterprise-level-with-rstudio-and-procogia/</link><pubDate>Wed, 02 Oct 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/building-data-science-infrastructure-at-an-enterprise-level-with-rstudio-and-procogia/</guid><description><p><sup> Photo by <a href="https://unsplash.com/@phoebezzf?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Zhifei Zhou</a> on <a href="https://unsplash.com/s/photos/seattle?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></sup></p><p>We’re hosting a free, half-day event with one of our <a href="https://rstudio.com/certified-partners/">Full Service Certified Partners</a>, ProCogia, in Seattle on Wednesday October 9th.
This event is for data science and IT teams that want to learn more about:</p><ul><li>helpful new RStudio R packages like <code>pins</code></li><li>what RStudio professional products can do for your data science team if you have both Jupyter and RStudio users</li><li>using Kubernetes or Slurm to scale your work</li></ul><p><strong>If you’re interested in learning more, be sure to register on the ProCogia event page: <a href="https://www.eventbrite.com/e/rstudio-and-procogia-roadshow-tickets-72099330037">RStudio &amp; ProCogia Roadshow: R in Enterprise</a>.</strong></p><p>For a taste of what to expect, here’s a discussion on using enterprise RStudio and Python products to build a containerized data science development environment, written by Gagandeep Singh, data scientist and certified RStudio administrator at ProCogia.</p><p>“Modern organizations are spending a considerable amount of resources on data science research and development. Companies are recognizing that the need to have a dedicated data science department in-house has increased exponentially. Companies are looking at external partners with dedicated competence and experience in this field to assist in building a comprehensive solution. As specialty data science consultants, we have established partnerships with popular data science platform providers such as RStudio and are involved in the JupyterHub project. For example, we were brought in by a multinational leading biotechnology company to design, develop and deploy an integrated data science development platform for their team of over 100 international data scientists. The ask was to build a comprehensive solution where users can use either R and/or Python, develop models, and share results through a common platform.</p><p>The biggest concern for us was to provide a solution that could handle a multitude of user sessions, but also provide high-performance computing and resources at the same time.
The safe option was to build a high-availability, load-balanced environment, although it might create troubles in the future as the number of users kept increasing and resources needed to be optimized.</p><p>We instead decided to take a two-pronged approach, in which a Kubernetes-backed, containerized solution would be the primary interface, and a load-balanced product would be backing up any additional load. Users launch their own containers for each processing session and Kubernetes takes care of the backend resource allocations. They can run both Jupyter notebooks and R scripts in the container, and perform multiple assignments concurrently. The publishing platform, RStudio Connect, provides a cohesive product to share results through shiny applications, HTML reports, and even Jupyter Notebooks with Python code. RStudio Server Pro 1.2 now supports running sessions and jobs on Kubernetes.</p><p>Connect has both Python and R enabled. It has also been configured to schedule reports to be sent as emails. We also configured the RStudio Server Pro IDE in a load-balanced, high-availability environment. In this situation, the RStudio IDE’s internal load balancer works with AWS’s load balancer to accommodate backup and smooth operations in case any of the servers go down. The publishing platform is also configured with high availability, which means multiple servers are simultaneously serving the users’ publishing needs using a common database. We integrated a high-availability RStudio Package Manager into the mix, which enabled the administrator to establish control over both package access and downloads. Users could utilize RStudio Package Manager to access different versions of R packages. 
Our instance of Package Manager was also capable of serving internally developed packages by connecting to the original git source, which eliminated the need for additional administration.</p><p>A customized solution that took into account security protocols, data sharing and management, and the needs of individual data scientists and administrators is what was needed. There cannot be a one-size-fits-all solution when it comes to the architecture of data science environments for organizations across various industries. The environment must be tailored to business goals, administrator concerns, and user productivity. A solution that works with regards to security protocols may not be a solution that works for users of the environment. All of these factors are considered when creating a data science environment for organizations based on their specific goals at the time and in the future.”</p><p>Space is limited, so reserve your seat now!</p><p>Agenda:</p><ul><li>8:30 am Registration &amp; Breakfast</li><li>9:00 am Welcome &amp; Introduction</li><li>9:15 am Open Source in Enterprise: a Virtuous Cycle - Lou Bajuk (RStudio)</li><li>9:45 am Automating RStudio Products in the Cloud - ProCogia</li><li>10:30 am Coffee Break</li><li>10:45 am The R Community in Transitioning to Pro from Open Source - Daniella Mark (ProCogia)</li><li>11:00 am Looking Ahead: R in Enterprise - Kevin Bolger (ProCogia)</li><li>11:30 am New Data Workflows in RStudio Connect Using the <code>pins</code> Package - Javier Luraschi (RStudio)</li><li>12:00 pm Closing Remarks &amp; Lunch</li></ul></description></item><item><title>Fall &amp; Winter Workshop Roundup</title><link>https://www.rstudio.com/blog/fall-winter-workshop-roundup/</link><pubDate>Mon, 30 Sep 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/fall-winter-workshop-roundup/</guid><description><p>Join RStudio at one of our Fall and Winter workshops!
We&rsquo;ll be hosting a few different workshops in a variety of cities across the US and UK. Topics range from building tidy tools, to teaching data science, to mastering machine learning. See below for more details on each workshop and how to register.</p><h3 id="building-tidy-tools-with-hadley-wickham">Building Tidy Tools with Hadley Wickham</h3><p><strong>When</strong>: October 14 &amp; 15, 2019</p><p><strong>Where</strong>: Loudermilk Conference Center in Atlanta, GA</p><p><strong>Who</strong>: Hadley Wickham, Chief Scientist at RStudio</p><p>Register here: <a href="https://cvent.me/2YXxr">https://cvent.me/2YXxr</a></p><p>Chief Scientist Hadley Wickham is hosting his popular “Building Tidy Tools” workshop in Atlanta, Georgia this October.</p><p>You should take this workshop if you have experience programming in R and want to learn how to tackle larger scale problems. You’ll get the most from it if you’re already familiar with functions and are comfortable with R’s basic data structures (vectors, matrices, arrays, lists, and data frames). Note: There is ~30% overlap in the material with Hadley’s previous “R Masterclass”. However, the material has been substantially reorganized, so if you’ve taken the R Masterclass in the past, you’ll still learn a lot in this class.</p><p>This course has three primary goals. You will:</p><ol><li><p>Learn efficient workflows for developing high-quality R functions, using the set of conventions codified by a package. You&rsquo;ll also learn workflows for unit testing, which helps ensure that your functions do exactly what you think they do.</p></li><li><p>Master the art of writing functions that do one thing well and can be fluently combined together to solve more complex problems. We&rsquo;ll cover common function writing pitfalls and how to avoid them.</p></li><li><p>Learn how to write collections of functions that work well together and adhere to existing conventions so they&rsquo;re easy to pick up for newcomers.
We&rsquo;ll discuss API design, functional programming tools, the basics of object design in S3, and the tidy eval system for NSE.</p></li></ol><h3 id="welcome-to-the-tidyverse-an-introduction-to-r-for-data-science">Welcome to the Tidyverse: An Introduction to R for Data Science</h3><p><strong>When</strong>: The one-day workshop is hosted on both October 14 &amp; October 15</p><p><strong>Where</strong>: Loudermilk Conference Center in Atlanta, GA</p><p><strong>Who</strong>:</p><ul><li>Carl Howe, Director of Education at RStudio</li><li>Christina Koch, University of Wisconsin</li><li>Teon Brooks, Data Scientist at Mozilla</li></ul><p>Register here: <a href="https://cvent.me/ZlvXL">https://cvent.me/ZlvXL</a></p><p>Join RStudio’s Director of Education, Carl Howe, and special guest instructors for their “Welcome to the Tidyverse: An Introduction to R for Data Science”. This workshop is designed for folks who are new to R and want to learn more.</p><p>Looking for an effective way to learn R? This one-day course will teach you a workflow for doing data science with the R language. It focuses on using R’s Tidyverse, which is a core set of R packages that are known for their impressive performance and ease of use. We will focus on doing data science, not programming.</p><p>In this course, you’ll learn to:</p><ol><li>Visualize data with R&rsquo;s <code>ggplot2</code> package</li><li>Wrangle data with R&rsquo;s <code>dplyr</code> package</li><li>Fit models with base R</li><li>Document your work reproducibly with R Markdown</li></ol><h3 id="machine-learning-workshop-with-max-kuhn">Machine Learning Workshop with Max Kuhn</h3><p><strong>When</strong>: November 18 &amp; 19, 2019</p><p><strong>Where</strong>: Hilton London Paddington in London, UK</p><p><strong>Who</strong>: Max Kuhn, Software Engineer at RStudio</p><p>Register here: <a href="https://cvent.me/bKoXk">https://cvent.me/bKoXk</a></p><p>See Max Kuhn teach his Machine Learning workshop this fall in London.
This is a great chance to hear Max teach and experience this class while he is across the pond.</p><p>This two-day course will provide an overview of using R for supervised learning. The session will step through the process of building, visualizing, testing, and comparing models that are focused on prediction. The goal of the course is to provide a thorough workflow in R that can be used with many different regression or classification techniques. Case studies on real data will be used to illustrate the functionality, and several different predictive models are demonstrated.</p><h3 id="introduction-to-machine-learning-with-the-tidyverse">Introduction to Machine Learning with the Tidyverse</h3><p><strong>When</strong>: December 12 &amp; 13, 2019</p><p><strong>Where</strong>: RStudio’s Boston Office</p><p><strong>Who</strong>:</p><ul><li>Garrett Grolemund, Data Scientist and Professional Educator at RStudio</li><li>Alison Hill, Data Scientist and Professional Educator at RStudio</li></ul><p>Register here: <a href="https://cvent.me/brM1M">https://cvent.me/brM1M</a></p><p>Get a sneak peek at Garrett and Alison’s rstudio::conf(2020) workshop, “Introduction to Machine Learning with the Tidyverse”. If you can’t make it to the conference this year, this is your chance to experience one of the workshops and help them test drive their content.</p><p>This is a test run for a workshop in the final stages of development. The workshop provides a gentle introduction to machine learning and to the tidyverse packages that do machine learning. You’ll learn how to train and assess predictive models with several common machine learning algorithms, as well as how to do feature engineering to improve the predictive accuracy of your models. We will focus on learning the basic theory and best practices that support machine learning, and we will do it with a modern suite of R packages known as <code>tidymodels</code>. 
Tidymodels packages, like <code>parsnip</code>, <code>recipes</code>, and <code>rsample</code>, provide a grammar for modeling and work seamlessly with R’s tidyverse packages.</p><p>Since this is a test run, the workshop is limited to a small number of seats. The low price reflects the experimental nature of the material. Students will be asked to provide constructive feedback in a course survey.</p></description></item><item><title>RStudio Connect 1.7.8 - Put a pin in it!</title><link>https://www.rstudio.com/blog/rstudio-connect-1-7-8-put-a-pin-in-it/</link><pubDate>Mon, 23 Sep 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-7-8-put-a-pin-in-it/</guid><description><p>This release adds new workflows for data scientists and improved production settings for administrators. For data scientists, it used to be hard to use the same data or R objects in different content, and even harder to update those resources regularly. This release enables you to pin objects in Connect to solve these challenges. For administrators, we&rsquo;ve reduced the most common sources of publishing failures and significantly improved error handling. Altogether, version 1.7.8 makes it even easier for data science teams to share and leverage their work with the enterprise.</p><h2 id="updates-for-users">Updates for Users</h2><h3 id="experimental-support-for-pinshttpsrstudiogithubiopins">Experimental Support for <a href="https://rstudio.github.io/pins">Pins</a></h3><figure><img src="rsc-178-pins.png" alt="Pins Support in RStudio Connect"/></figure><p>The <a href="https://rstudio.github.io/pins"><code>pins</code></a> R package provides a way for R users to easily share resources using RStudio Connect. Your resources may be text files (CSV, JSON, etc.), R objects (<code>.Rds</code>, <code>.Rda</code>, etc.), or any other type of file you want to share. 
Sharing these files can be useful in many situations, for example:</p><ol><li><p>Multiple pieces of content require the same data. Rather than copying that data, each piece of content references a single source of truth hosted on RStudio Connect.</p></li><li><p>Content depends on processed datasets or model objects that need to be regularly updated. Rather than redeploying the content each time the information changes, use a pinned resource and update only the dataset or model. The update can be automated using a scheduled R Markdown document. Other deployed content will read the newest data on each run.</p></li><li><p>You need to share resources that aren&rsquo;t structured for traditional tools like databases. For example, models saved as R objects aren&rsquo;t easy to store in a database. Rather than using email or file systems to share these R objects, use RStudio Connect to host these resources as pins.</p></li></ol><p>Refer to the <a href="https://docs.rstudio.com/connect/1.7.8/user/pins.html">RStudio Connect user guide</a> or the <a href="https://rstudio.github.io/pins/articles/boards-rsconnect.html">pins website</a> for more information.</p><h3 id="example-apis-for-model-serving">Example APIs for Model Serving</h3><script src="https://fast.wistia.com/embed/medias/dm8sl8nz5n.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_dm8sl8nz5n videoFoam=true" style="height:100%;position:relative;width:100%">&nbsp;</div></div></div><p>RStudio Connect makes it easy for data scientists to share models written in R with other teams as RESTful APIs. 
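</p><p>As a minimal sketch of what such an API can look like (the model file, endpoint, and parameter here are hypothetical, not taken from the release), a <code>plumber.R</code> file might contain:</p><pre><code>library(plumber)

# Hypothetical model object, trained and saved earlier with saveRDS()
model &lt;- readRDS(&quot;model.rds&quot;)

#* Score a single observation with the saved model
#* @param x A numeric predictor value
#* @post /predict
function(x) {
  predict(model, newdata = data.frame(x = as.numeric(x)))
}</code></pre><p>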
As part of the 1.7.8 release, we&rsquo;ve expanded our documentation to help teams approach model management with the following examples:</p><ol><li><p>The <a href="https://solutions.rstudio.com/model-management/overview/">end-to-end use</a> of RStudio Connect to train, deploy, monitor, and A/B test a model.</p></li><li><p>Add <a href="https://solutions.rstudio.com/examples/rest-apis-overview/#log-details-about-api-requests-and-responses">additional logging</a> to API requests in order to track latency, performance, input parameters, and route popularity.</p></li></ol><h2 id="updates-for-administrators">Updates for Administrators</h2><h3 id="deployment-error-logging">Deployment Error Logging</h3><figure><img src="rsc-178-errors.png" alt="Better Error Logs"/></figure><p>We&rsquo;ve overhauled the RStudio Connect deployment process for R code. RStudio Connect now captures errors and surfaces specific error codes and recommendations. All publishing methods see these improvements, including push-button publishing, Git-backed deployment, and custom workflows using the Connect Server API.</p><p>A glossary of the error codes and recommendations is available <a href="https://docs.rstudio.com/connect/1.7.8/user/publishing.html#error-codes">here</a>.</p><h3 id="r-package-repositories">R Package Repositories</h3><p>By default, RStudio Connect attempts to install R packages from the package repositories that were used in the development environment. 
In some cases, you may wish to specify different behavior and tell RStudio Connect where it should look for R packages:</p><ul><li>Your developers install packages from a public CRAN mirror, but your production server must use an internal CRAN mirror.</li><li>You use an isolated network for your production server, so RStudio Connect cannot access the package repository used on the development network.</li><li>You want to use RStudio Package Manager&rsquo;s package binaries in production.</li></ul><p>The <a href="https://docs.rstudio.com/connect/1.7.8/admin/getting-started.html#getting-started-rspm">Admin Guide</a> describes how to configure the package repositories RStudio Connect should use for R.</p><h3 id="usage-scorecard-and-feedback">Usage Scorecard and Feedback</h3><figure><img src="rsc-178-scorecard.png" alt="Usage Scorecard in Connect"/></figure><p>Like you, we&rsquo;re committed to expanding the influence of data science in the enterprise. A new usage scorecard on the Admin Dashboard&rsquo;s Metrics page helps you understand how your team uses RStudio Connect and what additional capabilities may be available. We have made it easy for you to share this scorecard with RStudio, providing feedback that will help us further improve RStudio Connect.</p><h2 id="deprecations-and-breaking-changes">Deprecations and Breaking Changes</h2><ul><li><p><strong>SECURITY:</strong> New Timeout Defaults - The <code>Authentication.Lifetime</code> and <code>Authentication.Inactivity</code> settings have new default values that are in line with industry best practices. These settings control how frequently a user must refresh their login to RStudio Connect. The new default values are shorter, ensuring users are authenticated more frequently.</p></li><li><p><strong>BREAKING CHANGE:</strong> External Package Check - On startup, RStudio Connect now checks to ensure every R package listed as external in the configuration file is available in every version of R on the system. 
Any missing packages will cause RStudio Connect to fail to start. This check prevents unexpected deployment and runtime failures for content. You can opt out of this check by setting <code>Packages.ExternalCheckIsFatal</code> to <code>false</code>.</p></li><li><p><strong>BREAKING CHANGE:</strong> R Markdown Rendering Errors - R errors that occur during R Markdown rendering now stop the deployment process with an error. Previously, R errors would result in the error message being rendered as part of the document contents. The previous behavior can be restored by setting chunk options for the R code chunk in the Rmd file, e.g. <code>{r error=TRUE warning=TRUE}</code>.</p></li><li><p><strong>PLATFORM SUPPORT:</strong> Trusty - RStudio Connect 1.7.8 is the last release that will support Ubuntu 14.04 Trusty. Please refer to the <a href="https://www.rstudio.com/about/platform-support/">RStudio Platform Support</a> policy for more information.</p></li><li><p><strong>PLATFORM SUPPORT:</strong> RHEL 8 - RStudio Connect 1.7.8 is the first release to support Red Hat Enterprise Linux 8. 
Refer to the <a href="https://docs.rstudio.com/connect/1.7.8/admin/getting-started.html#installation-redhat">RStudio Connect Admin Guide</a> for more information.</p></li><li><p><strong>BREAKING CHANGE:</strong> These previously deprecated settings have been removed; see the <a href="https://docs.rstudio.com/connect/news/">release notes</a> for more details: <code>LoadBalancing.EnforceMinRsconnectVersion</code>, <code>Applications.ExplicitPublishing</code>, <code>Authorization.UsersListingMinRole</code>, and <code>Password.UserInfoEditableBy</code>.</p></li><li><p><strong>DEPRECATIONS:</strong> The following settings have been deprecated and will be removed in the next release; see the <a href="https://docs.rstudio.com/connect/news/">release notes</a> for more details: <code>Applications.DisabledProtocols</code>, <code>OAuth2.AllowedDomains</code>, and <code>OAuth2.AllowedEmails</code>.</p></li></ul><p>Refer to the <a href="https://docs.rstudio.com/connect/news/">full release notes</a> for more information on all of the changes and bug fixes in this release.</p><h2 id="upgrade-planning">Upgrade Planning</h2><blockquote><p>Please take special note of the breaking changes above, especially the new check for external packages. Aside from the deprecations and breaking changes above, there are no other special considerations, and upgrading should require less than five minutes. If you are upgrading from an earlier version, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><h2 id="get-started-with-rstudio-connect">Get Started with RStudio Connect</h2><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) 
with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Administration Guide</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/">Pricing</a></li></ul></description></item><item><title>Data Science in Production: a Joint Event with Yotabites</title><link>https://www.rstudio.com/blog/data-science-in-production-a-joint-event-with-yotabites/</link><pubDate>Sun, 22 Sep 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/data-science-in-production-a-joint-event-with-yotabites/</guid><description><p><img src="yotabites_banner.png" alt=""></p><p>Join us Wednesday, October 23rd, in Austin, Texas as RStudio teams up with <a href="https://yotabites.com/">Yotabites</a> to host a free half-day event on using open source data science languages and RStudio in production. Yotabites is an RStudio <a href="https://www.rstudio.com/certified-partners/">Full Service Certified Partner</a> that provides consulting and professional services for RStudio products.</p><p>This event is for RStudio and Jupyter users and their IT colleagues who enable them. We will show how RStudio products can be incorporated into robust business processes and how Yotabites can help simplify the process. 
We will discuss using R and Python with RStudio, data engineering workflows, scaling container orchestration with Kubernetes and Slurm resource managers, and deploying data products into production with CI/CD pipelines and model management.</p><p>This is a great chance to test drive some of the newest features in RStudio’s professional products and to hear stories of businesses like yours who have incorporated open source data science languages in production successfully.</p><ul><li>11:30 am Arrival &amp; Registration</li><li>12:00 pm Lunch</li><li>12:30 pm Welcome &amp; Introduction by Express Scripts</li><li>1:00 pm Shiny App Framework for Machine Learning &amp; Mass Migration of Shiny Apps</li><li>2:00 pm Coffee Break</li><li>2:15 pm Improving Efficiency in Production Environments with Automation</li><li>3:00 pm Using Python with RStudio</li><li>4:00 pm Closing Remarks &amp; Networking</li></ul><p>The event will take place at the <a href="https://www.google.com/maps/place/Capital+Factory/@30.2689645,-97.7430044,17z/data=!3m1!4b1!4m5!3m4!1s0x8644b5a741a48747:0x3c7ccf742ba7769d!8m2!3d30.2689645!4d-97.7408104">Capital Factory</a> in Austin, Texas. Space is limited, so secure your spot on the Yotabites registration page: <a href="http://bit.ly/2khDaYY">http://bit.ly/2khDaYY</a></p></description></item><item><title>RStudio Server Pro 1.2 Update</title><link>https://www.rstudio.com/blog/rstudio-1-2-5-release/</link><pubDate>Thu, 19 Sep 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-2-5-release/</guid><description><p>Today, we&rsquo;re announcing an important update to RStudio Server Pro 1.2 that introduces two new capabilities.</p><h2 id="slurm-jobs">Slurm Jobs</h2><img align="right" width="100" height="100" src="slurm_logo.png" alt="Slurm Workload Manager"><p><a href="https://slurm.schedmd.com/">Slurm</a> is an open-source workload management system, capable of running distributed jobs across a cluster. 
It&rsquo;s a popular tool for data science teams to run big, resource-intensive jobs on dedicated hardware. In this update, we&rsquo;re introducing a new Slurm back end for RStudio Server Pro&rsquo;s new Job Launcher (itself introduced in the initial release of RStudio Server Pro 1.2).</p><p>This means that it&rsquo;s possible to write R code in RStudio and submit it to a Slurm cluster for execution, using RStudio Server&rsquo;s new Jobs feature. Once started, Slurm jobs exist independently from R sessions and can be monitored from any R session, or from the improved user dashboard. Admins can configure many aspects of the experience, including setting different resource limits (e.g. memory and CPU) for different groups of Slurm users.</p><p>Finally, Slurm can optionally be used to run interactive R sessions as well if your R environment exists solely on your Slurm cluster.</p><h2 id="jupyter-sessions">Jupyter Sessions</h2><img align="right" width="100" height="100" src="jupyter_logo.png" alt="Project Jupyter"><p>Many data science teams use <a href="https://jupyter.org/">Jupyter</a> side by side with RStudio as a tool for reproducible research, and earlier this year we announced <a href="https://www.rstudio.com/2019/01/17/announcing-rstudio-connect-1-7-0/">Jupyter support for RStudio Connect</a>. In this RStudio Server Pro update, we&rsquo;re making it easier to use these tools together; you can now run Jupyter sessions in addition to RStudio sessions inside RStudio Server Pro, making it possible to both author and publish Jupyter notebooks inside <a href="https://www.rstudio.com/products/team/">RStudio Team</a>.</p><p>Just like RStudio sessions, RStudio Server Pro manages all of the authentication, supervision, and lifetime of Jupyter sessions, and gives you a convenient dashboard of running sessions. 
Starting a new Jupyter session is as easy as choosing Jupyter when you start a new session.</p><p><img src="jupyter-session.png" alt="" title="Start a new JupyterLab or Jupyter Notebook session"></p><p>Both Jupyter Notebook and JupyterLab are supported. Note that RStudio does not bundle Jupyter (it must be installed separately) and that Jupyter is only available when RStudio Server Pro is configured with the Job Launcher.</p><h2 id="pro-user-dashboard">Pro User Dashboard</h2><p>In addition to adding these two new capabilities, we&rsquo;ve revamped the RStudio Server Pro user dashboard (homepage), with cleaner visuals and a clearer layout. You&rsquo;ll see a summary of all your active sessions and jobs, quick links to your active projects, and tools for managing ongoing work.</p><p><img src="dashboard.png" alt="" title="RStudio User Dashboard"></p><h2 id="open-source-and-desktop">Open Source and Desktop</h2><p>Alongside this interim release of RStudio Server Pro 1.2, we&rsquo;re releasing an update to the 1.2 desktop and open source server. While this update primarily focuses on bugfixes and stability improvements, it also introduces a number of small features:</p><ul><li>We have restored compatibility with 32 bit R on Windows, which was temporarily dropped in the initial release of RStudio 1.2. 
Note that a 64 bit operating system is still required to run RStudio itself.</li><li>We&rsquo;ve improved compatibility with the upcoming MacOS Catalina, and added support for RedHat Enterprise Linux 8 as well as Fedora 28.</li><li>A new embedded Pandoc version improves performance and stability for R Notebooks and R Markdown on Windows.</li></ul><p>You can download the new RStudio 1.2 update (1.2.5001-3) here:</p><p><a href="https://www.rstudio.com/products/rstudio/download/">Download RStudio</a></p><p>The <a href="https://www.rstudio.com/products/rstudio/release-notes/">release notes</a> contain a full list of all of the bugfixes and features in this release, and of course feedback is welcome on the <a href="https://community.rstudio.com/c/rstudio-ide">RStudio IDE Community Forum</a>.</p><p><em><strong>UPDATE:</strong></em> <em>Nov. 27, 2019</em><br><em>Learn more about <a href="https://rstudio.com/solutions/python-and-r/">how R and Python work together in RStudio</a>.</em></p></description></item><item><title>pins: Pin, Discover and Share Resources</title><link>https://www.rstudio.com/blog/pin-discover-and-share-resources/</link><pubDate>Mon, 09 Sep 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pin-discover-and-share-resources/</guid><description><p>Today we are excited to announce the <a href="https://github.com/rstudio/pins">pins</a> package is available on CRAN! 
<code>pins</code> allows you to <strong>pin</strong>, <strong>discover</strong> and <strong>share</strong> remote <strong>resources</strong>, locally or in remote storage.</p><p>If you find yourself using <code>download.file()</code> or asking others to download files before running your R code, use <code>pin()</code> to achieve fast, simple and reliable reproducible research over remote resources.</p><h2 id="pins">Pins</h2><p>You can use the <code>pins</code> package to:</p><ul><li><strong>Pin</strong> remote resources locally to work offline and cache results with ease, <code>pin()</code> stores resources in boards which you can then retrieve with <code>pin_get()</code>.</li><li><strong>Discover</strong> new resources across different boards using <code>pin_find()</code>.</li><li><strong>Share</strong> resources on GitHub, Kaggle or RStudio Connect by registering new boards with <code>board_register()</code>.</li><li><strong>Resources</strong> can be anything from CSV, JSON, or image files to arbitrary R objects.</li></ul><p>You can install <code>pins</code> from CRAN with:</p><pre><code>install.packages(&quot;pins&quot;)</code></pre><p>You can <strong>pin</strong> remote files with <code>pin(url)</code>. <code>pin(url)</code> downloads and caches the remote <code>url</code>, returning the path to the locally cached file. This gives you the ability to work offline (or continue working even if the remote resource disappears) with minimal changes to your existing code. 
When called again in the future, <code>pin()</code> will automatically check for changes, and only re-download the file if needed.</p><p>For instance, the following example makes use of a remote CSV file, which you can download and cache with <code>pin()</code> before it&rsquo;s loaded with <code>read_csv()</code>:</p><pre><code>library(tidyverse)
library(pins)

url &lt;- &quot;https://raw.githubusercontent.com/facebook/prophet/master/examples/example_retail_sales.csv&quot;
retail_sales &lt;- read_csv(pin(url))</code></pre><pre><code># A tibble: 293 x 2
   ds              y
   &lt;date&gt;      &lt;dbl&gt;
 1 1992-01-01 146376
 2 1992-02-01 147079
 3 1992-03-01 159336
 4 1992-04-01 163669
 5 1992-05-01 170068
 6 1992-06-01 168663
 7 1992-07-01 169890
 8 1992-08-01 170364
 9 1992-09-01 164617
10 1992-10-01 173655
# … with 283 more rows</code></pre><p>This makes reading subsequent remote files orders of magnitude faster; files are only re-downloaded when the remote resource changes.</p><p>The <code>pins</code> package allows you to <strong>discover</strong> remote resources using <code>pin_find()</code>. Currently, it can search resources in CRAN packages, Kaggle, and RStudio Connect. For instance, we can search resources mentioning &ldquo;seattle&rdquo; in CRAN packages as follows:</p><pre><code>pin_find(&quot;seattle&quot;, board = &quot;packages&quot;)</code></pre><pre><code># A tibble: 6 x 4
  name                description                                        type  board
  &lt;chr&gt;               &lt;chr&gt;                                              &lt;chr&gt; &lt;chr&gt;
1 hpiR/ex_sales       Subset of Seattle Home Sales from hpiR package.    table packa…
2 hpiR/seattle_sales  Seattle Home Sales from hpiR package.              table packa…
3 latticeExtra/Seata… Daily Rainfall and Temperature at the Seattle-Tac… table packa…
4 microsynth/seattle… Data for a crime intervention in Seattle, Washing… table packa…
5 vegawidget/data_se… Example dataset: Seattle daily weather from vegaw… table packa…
6 vegawidget/data_se… Example dataset: Seattle hourly temperatures from… table packa…</code></pre><p>Notice that all pins are referenced as <code>&lt;owner&gt;/&lt;name&gt;</code> and even if the <code>&lt;owner&gt;</code> is not provided, each board will assign an appropriate one. While you can ignore <code>&lt;owner&gt;</code> and reference pins by <code>&lt;name&gt;</code>, this can fail in some boards if different owners assign the same name to a pin.</p><p>You can then retrieve a pin as a local path through <code>pin_get()</code>:</p><pre><code>pin_get(&quot;hpiR/seattle_sales&quot;)</code></pre><pre><code># A tibble: 43,313 x 16
   pinx  sale_id sale_price sale_date  use_type  area lot_sf  wfnt bldg_grade tot_sf
   &lt;chr&gt; &lt;chr&gt;        &lt;int&gt; &lt;date&gt;     &lt;chr&gt;    &lt;int&gt;  &lt;int&gt; &lt;dbl&gt;      &lt;int&gt;  &lt;int&gt;
 1 ..00… 2013..…     289000 2013-02-06 sfr         79   9295     0          7   2560
 2 ..00… 2013..…     356000 2013-07-11 sfr         18   6000     0          6   1540
 3 ..00… 2010..…     333500 2010-12-29 sfr         79   7200     0          8   2380
 4 ..00… 2016..…     577200 2016-03-17 sfr         79   7200     0          8   2380
 5 ..00… 2012..…     237000 2012-05-02 sfr         79   5662     0          7   1370
 6 ..00… 2014..…     347500 2014-03-11 sfr         79   5830     0          7    880
 7 ..00… 2012..…     429000 2012-09-20 sfr         18  12700     0          7   1640
 8 ..00… 2015..…     653295 2015-07-21 sfr         79   7000     0          7   1990
 9 ..00… 2014..…     427650 2014-02-19 townhou…    79   3072     0          7   1980
10 ..00… 2015..…     488737 2015-03-19 townhou…    79   3072     0          7   1980
# … with 43,303 more rows, and 6 more variables: beds &lt;int&gt;, baths &lt;dbl&gt;,
#   age &lt;int&gt;, eff_age &lt;int&gt;, longitude &lt;dbl&gt;, latitude &lt;dbl&gt;</code></pre><p>Finally, you can also <strong>share</strong> resources with other R sessions and other users by publishing to a local folder, Kaggle, GitHub, and RStudio 
Connect.</p><p>To publish resources on Kaggle, you would first need to register the Kaggle board by creating a <a href="https://www.kaggle.com/me/account">Kaggle API Token</a>, and then store a pin in the &lsquo;kaggle&rsquo; board:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">board_register_kaggle</span>(token <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">&lt;path-to-kaggle.json&gt;&#34;</span>)
<span style="color:#06287e">pin_get</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">hpiR/seattle_sales&#34;</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">pin</span>(name <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">seattle_sales&#34;</span>, board <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">kaggle&#34;</span>)</code></pre></div><p>There are other boards you can use or even create custom boards as described in the <a href="https://rstudio.github.io/pins/articles/boards-understanding.html">Understanding Boards</a> article; in addition, <code>pins</code> can also be used with RStudio products, which we describe next.</p><h2 id="rstudio">RStudio</h2><p>You can use <a href="https://www.rstudio.com/products/rstudio/">RStudio</a> and <a href="https://www.rstudio.com/products/connect/">RStudio Connect</a> to discover and share content within your organization with ease.</p><p>To enable new boards, like Kaggle and RStudio Connect, you can use <a href="https://blog.rstudio.com/2017/08/16/rstudio-preview-connections/">RStudio&rsquo;s Data Connections</a> to create a new &lsquo;pins&rsquo; connection, which provides you access to many boards:</p><img src="https://www.rstudio.com/blog/images/2019-09-09-rstudio-connect-board.png" height="200px"/><p>Once connected, you can use the connections pane to track the pins you own and preview them with ease. Notice that one connection is created for each board.</p><img src="https://www.rstudio.com/blog/images/2019-09-09-rstudio-explore-pins.png" height="170px" style="box-shadow: 2px 10px 10px #EAEAEA;"/><p>To <strong>discover</strong> remote resources, simply expand the &ldquo;Addins&rdquo; menu and select &ldquo;Find Pin&rdquo; from the dropdown. This addin allows you to search for pins across all boards, or scope your search to particular ones as well:</p><img src="https://www.rstudio.com/blog/images/2019-09-09-rstudio-discover-pins.png" height="280px"/><p>You can then <strong>share</strong> local resources using the RStudio Connect board. Let&rsquo;s use <code>dplyr</code> and the <code>hpiR/seattle_sales</code> pin to analyze this further and then pin our results in RStudio Connect.</p><pre><code>board_register_rsconnect()

pin_get(&quot;hpiR/seattle_sales&quot;) %&gt;%
  group_by(baths = ceiling(baths)) %&gt;%
  summarise(sale = floor(mean(sale_price))) %&gt;%
  pin(&quot;sales-by-baths&quot;, board = &quot;rsconnect&quot;)</code></pre><p>After a pin is published, you can then browse to the pin&rsquo;s content from the RStudio Connect web interface.</p><img src="https://www.rstudio.com/blog/images/2019-09-09-rstudio-share-resources.png" height="300px" style="box-shadow: 2px 10px 10px #EAEAEA;"/><p>You can now set the appropriate permissions in RStudio Connect, and voila! 
From now on, those with access can make use of this remote pin locally!</p><p>For instance, a colleague can reuse the <code>sales-by-baths</code> pin by retrieving it from RStudio Connect and visualizing its contents using <code>ggplot2</code>:</p><pre><code>library(pins)
library(ggplot2)

board_register_rsconnect()

pin_get(&quot;sales-by-baths&quot;) %&gt;%
  ggplot(aes(x = baths, y = sale)) +
  geom_point() +
  geom_smooth(method = 'lm', formula = y ~ exp(x))</code></pre><img src="https://www.rstudio.com/blog/images/2019-09-09-rstudio-reuse-pin-ggplot2.png" height="280px" style="box-shadow: 2px 10px 10px #EAEAEA;"/><p>Pins can also be automated using scheduled R Markdown. This makes it much easier to create Shiny applications that rely on scheduled data updates or to share prepared resources across multiple pieces of content. You no longer have to fuss with file paths on RStudio Connect, mysterious resource URLs, or redeploying application code just to update a dataset!</p><p>Experimental support for <code>pins</code> will be introduced in an upcoming release; stay tuned for RStudio Connect 1.7.8!</p><p>Please also make sure to <del>pin</del> visit <a href="https://rstudio.github.io/pins">rstudio.github.io/pins</a>, where you can find detailed documentation and additional resources. 
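</p><p>As a sketch of the scheduled-update workflow described above (the source URL and pin name here are hypothetical), a code chunk in a scheduled R Markdown document might look like:</p><pre><code>library(pins)
board_register_rsconnect()

# Re-read the source data on every scheduled render
fresh &lt;- read.csv(&quot;https://example.com/sales.csv&quot;)

# Overwrite the pin; deployed content picks up the new data on its next read
pin(fresh, name = &quot;sales-data&quot;, board = &quot;rsconnect&quot;)</code></pre><p>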
Thanks!</p></description></item><item><title>Deadline extended for rstudio::conf(2020) abstract submissions</title><link>https://www.rstudio.com/blog/rstudio-conf-2020-submission-deadline-extended/</link><pubDate>Fri, 06 Sep 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2020-submission-deadline-extended/</guid><description><p><a href="https://www.rstudio.com/conference/">rstudio::conf</a>, the conference on all things R and RStudio, will take place January 29 and 30, 2020 in San Francisco, California, preceded by Training Days on January 27 and 28.</p><p>We&rsquo;ve received requests from a number of you for permission to submit talk/e-poster abstracts after the deadline (today, September 6). In response, we&rsquo;re extending the deadline by a week for everyone; <strong>the new submission deadline is September 13,</strong> a week from today, at 11:59PM PDT. We&rsquo;ll still notify you of our decision in early October.</p><p>See our <a href="https://blog.rstudio.com/2019/08/09/rstudio-conf-2020-call-for-submissions/">earlier post</a> for submission guidelines.</p><p><a href="https://rstd.io/conf-talks" button="" type="button" style="padding: 12px 20px; border: none; font-size: 18px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Apply now!</a></p></description></item><item><title>rstudio::conf(2020) Diversity and international scholarships</title><link>https://www.rstudio.com/blog/diversity-and-international-scholarships/</link><pubDate>Fri, 30 Aug 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/diversity-and-international-scholarships/</guid><description><p>rstudio::conf(2020L) continues our tradition of diversity scholarships, and this year we’re increasing the program size to 44 recipients. 
As a result of thinking about our goals, this year we have two components to the program:</p><ul><li><p>38 domestic diversity scholarships available to anyone living in the US or Canada who is a member of a group that is under-represented at rstudio::conf. At present, these groups would include women/minority genders, people of color, LGBTQ, elders/older adults, and those with disabilities.</p></li><li><p>6 international scholarships available to citizens/permanent residents of Mexico, as well as countries in South America, Central America, and the Caribbean.</p></li></ul><p>In the long run, we hope that the rstudio::conf participants reflect the full diversity of the world around us. We believe in building on-ramps so that people from diverse backgrounds can learn R, build their knowledge, and then contribute back to the community. Through this program, we want to support scholars by covering the costs of conference registration (including workshops) and providing funds for travel and accommodation (up to $1000). At the conference, scholars will also have networking opportunities with past diversity scholarship recipients as well as with leaders in the field.</p><p>We also recognise that there are many parts of the world that do not offer easy access to R events. In addition to South America, Central America, Mexico, and the Caribbean, which are mentioned above, Africa has an active and growing community of R users, but relatively few R events hosted within the region. Due to the persistent challenges African scholars have faced in obtaining visas to attend rstudio::conf in recent years, we are currently looking into alternative ways we might involve our African colleagues in this year’s conference, and will be sharing more about these plans soon. We will continue to re-evaluate the regions where our scholarships can have the greatest impact and will adjust this program as rstudio::conf grows. 
International scholars will receive complimentary conference and workshop registration, funds for travel and accommodation (up to $3000), and additional networking opportunities.</p><p>Scholarship applications will be evaluated on three main criteria:</p><ul><li><p>How will attending the conference impact you? What important problems will you be able to tackle that you couldn’t before?</p></li><li><p>You will learn the most if you already have some experience with R, so show us what you’ve achieved so far (GitHub is the easiest way for us to assess your skills; if you’re new to GitHub, check out <a href="https://guides.github.com/activities/hello-world/">https://guides.github.com/activities/hello-world/</a> to get something informative up in under an hour).</p></li><li><p>How will you share your knowledge with others? We can’t help everyone, so we’re particularly interested in helping those who will go back to their communities and spread the love. Show us how you’ve shared your skills and knowledge in the past, and tell us what you have planned for the future.</p></li></ul><p>The scholarships are competitive, so make sure you highlight what makes you special. 
The application deadline is September 30, 2019.</p><p><a href="https://forms.gle/ASgFTmxq44EKdz286" button="" type="button" style="padding: 12px 20px; border: none; font-size: 18px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Apply now!</a></p></description></item><item><title>rstudio::conf(2020) call for submissions</title><link>https://www.rstudio.com/blog/rstudio-conf-2020-call-for-submissions/</link><pubDate>Fri, 09 Aug 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2020-call-for-submissions/</guid><description><p><a href="https://rstd.io/conf">rstudio::conf</a>, the conference on all things R and RStudio, will take place January 29 and 30, 2020 in San Francisco, California, preceded by Training Days on January 27 and 28.</p><p>The rstudio::conf program includes invited speakers and RStudio employees, but we also want to hear from you, the R community! We are particularly interested in submissions that have one or more of these qualities:</p><ul><li>Showcase the use of R to solve real problems.</li><li>Expand the use of R to reach new domains and audiences.</li><li>Combine R with other world-class tools, like Python, TensorFlow, and Spark.</li><li>Communicate using R, whether it’s building on top of R Markdown, Shiny, ggplot2, or something else altogether.</li><li>Discuss how to teach R effectively.</li></ul><p>This year we’ve expanded the program: in addition to the three general interest parallel tracks, we’ve added a fourth track, aimed at:</p><ul><li>Industry-specific topics (particularly pharma and finance).</li><li>Advanced technical topics aimed at expert R programmers.</li></ul><p>We strive to reflect the full diversity of the R community in our conference program. 
If you have an interesting topic, we encourage you to apply, regardless of your background, experience, or job title.</p><p>Contributions will take one of three forms:</p><ul><li>20-minute contributed talk, given alongside invited talks.</li><li>5-minute lightning talk, hosted in two high-energy lightning talk sessions.</li><li>Electronic poster, shown during the opening reception on Thursday evening. We’ll provide a big screen, power, internet, drinks and snacks; you’ll provide a laptop running an innovative display or demo.</li></ul><p>If accepted, you’ll receive complimentary registration for the conference. (If you have already registered, we’ll refund your registration.)</p><p>If you’re interested, please create an account and submit a proposal at: <a href="https://rstd.io/conf-talks">https://rstd.io/conf-talks</a>. Submission closes September 7, and we’ll make decisions in early October.</p><p><a href="https://rstd.io/conf-talks" button="" type="button" style="padding: 12px 20px; border: none; font-size: 18px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Apply now!</a></p></description></item><item><title>The Shiny Developer Series</title><link>https://www.rstudio.com/blog/the-shiny-developer-series/</link><pubDate>Mon, 05 Aug 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/the-shiny-developer-series/</guid><description><p><a href="https://pages.rstudio.net/shiny_dev_series.html"><img src="https://www.rstudio.com/assets/img/Shiny-developer-series-flat.png" alt="The Shiny Developer Series"></a></p><p>Shiny is one of the best ways to build interactive documents, dashboards, and data science applications. But advancing your skills with Shiny does not come without challenges.</p><p>Shiny developers often have a stronger background in applied statistics than in areas useful for optimizing an application, like programming, web development, and user-interface design. 
Though there are many packages and tools that make developing advanced Shiny apps easier, new developers may not know these tools exist or how to find them. Shiny developers are also often siloed: though the Shiny developer community is huge, there is rarely someone sitting next to you to sound out ideas about your app.</p><p>With these challenges in mind, the RStudio Community has partnered with Eric Nantz of the R-Podcast to create the Shiny Developer Series.</p><h1 id="what-is-the-shiny-developer-series">What is the Shiny Developer Series?</h1><p>Our goal is to:</p><ul><li>Review great tools that serve Shiny developers</li><li>Meet the people behind these tools and learn from their experiences</li><li>Foster the Shiny community</li></ul><p>Each episode of the series includes:</p><ul><li>A webinar, where Eric hosts a live interview with the author of a tool or package that helps make Shiny developers’ lives a bit easier.</li><li>An open Q&amp;A and follow-up discussion on <a href="https://community.rstudio.com/c/shiny">community.rstudio.com/c/shiny</a></li><li>Recorded live demos: when it makes sense, Eric and/or his guest will record a demo of the tools they talked about.</li></ul><p><a href="https://pages.rstudio.net/shiny_dev_series.html"><strong>Register now for Shiny Developer Series updates and scheduling reminders</strong></a></p><h1 id="past-episodes">Past episodes</h1><p>We have already had three great episodes!</p><h3 id="winston-chang-on-shinys-development-history-and-future">Winston Chang on Shiny’s Development History and Future</h3><p>In Episode 1, Winston Chang talked about the key events that triggered RStudio’s efforts to make Shiny a production-ready framework, how principles of software design are invaluable for creating complex applications, and exciting plans for revamping the user interface and new integrations. <a href="https://shinydevseries.com/post/episode-1-shiny-development-past-and-future/">Watch the episode</a> - <a 
href="https://shinydevseries.com/post/episode-1-shiny-development-past-and-future/">Show notes</a> - <a href="https://community.rstudio.com/t/shiny-developer-series-episode-1-follow-up-thread/29491/">Follow-up community discussion</a></p><h3 id="colin-fay-on-golem-and-effective-shiny-development-methods">Colin Fay on <code>golem</code> and Effective Shiny Development Methods</h3><p>In Episode 2, Colin Fay from ThinkR shared insights and practical advice for building production-grade Shiny applications. He talked about the new <code>golem</code> package as the <code>usethis</code> for Shiny app development, why keeping the perspective of your app customers can keep you on the right development path, and much more. <a href="https://shinydevseries.com/post/episode-2-golem/">Watch the episode</a> - <a href="https://shinydevseries.com/post/episode-2-golem/">Show notes</a> - <a href="https://community.rstudio.com/t/shiny-developer-series-episode-2-follow-up-thread-colin-fay-on-golem-and-effective-shiny-development-methods/32618">Follow-up community discussion</a></p><p><a href="https://shinydevseries.com/post/golem-demo/"><strong>Video demo of the <code>golem</code> Shiny app development workflow</strong></a></p><h3 id="mark-edmondson-on-googleanalyticsr-and-building-an-r-package-optimized-for-shiny">Mark Edmondson on <code>googleAnalyticsR</code> and Building an R Package Optimized for Shiny</h3><p>In Episode 3, Mark Edmondson from IIH Nordic talked about how he incorporates Shiny components such as modules with <code>googleAnalyticsR</code> and his other excellent packages. 
He dived into some of the technical challenges he had to overcome to provide a clean interface to many Google APIs, the value of open-source contributions to both his work and personal projects, and much more. <a href="https://shinydevseries.com/post/episode-3-googleanalyticsr/">Watch the episode</a> - <a href="https://shinydevseries.com/post/episode-3-googleanalyticsr/">Show notes</a> - <a href="https://community.rstudio.com/t/shiny-developer-series-webinar-discussion-episode-3-mark-edmondson-on-googleanalyticsr-and-linking-shiny-to-complex-apis/33669">Follow-up community discussion</a></p><h1 id="upcoming-episodes">Upcoming Episodes</h1><p>We have episodes scheduled for the rest of the year.</p><h3 id="david-granjon-on-the-rinterface-collection-of-production-ready-shiny-ui-packages">David Granjon on the RinteRface Collection of Production-Ready Shiny UI Packages</h3><p>Episode 4 - Friday, August 9, Noon-1PM Eastern</p><p>If you&rsquo;ve ever wanted to build an elegant and powerful Shiny UI that takes advantage of modern web frameworks, this episode is for you! David Granjon of the <code>RinteRface</code> project joins us to highlight the ways you can quickly create eye-catching dashboards, mobile-friendly views, and much more with the <code>RinteRface</code> suite of Shiny packages.</p><h3 id="nick-strayer-on-novel-uses-of-javascript-in-shiny-applications">Nick Strayer on Novel Uses of JavaScript in Shiny Applications</h3><p>Episode 5 - Friday, September 13, 11am-Noon Eastern</p><p>Shiny has paved the way for R users to build interactive applications based in JavaScript, all through R code. But the world of JavaScript can bring new possibilities for visualizations and interactivity. 
Nick Strayer joins us in episode 5 of the Shiny Developer Series to discuss the ways he&rsquo;s been able to harness the power of JavaScript in his projects, such as his <code>shinysense</code> package.</p><h3 id="yang-tang-on-advanced-ui-the-motivation-and-use-cases-of-shinyjqui">Yang Tang on Advanced UI, the Motivation and Use Cases of <code>shinyjqui</code></h3><p>Episode 6 - Friday, October 25, 11am-Noon Eastern</p><p>Sometimes your Shiny app&rsquo;s UI needs a little extra interactivity to give users more flexibility and highlight key interactions. For example, one user might not like the initial placement of a plot or data table and would like to move it around themselves. In episode 6 of the Shiny Developer Series, we will be joined by Yang Tang to discuss the development and capabilities of the powerful <code>shinyjqui</code> package that provides Shiny developers clean and intuitive wrappers to the immensely popular jQuery JavaScript library.</p><h3 id="victor-perrier--fanny-meyer-on-dreamrs-and-tools-to-customize-the-look--feel-of-your-app">Victor Perrier &amp; Fanny Meyer on dreamRs, and Tools to Customize the Look &amp; Feel of Your App</h3><p>Episode 7 - Friday, November 8, 11am-Noon Eastern</p><p>Have you ever wanted to ease the effort in customizing the look and feel of your Shiny app? Victor and Fanny are behind dreamRs, a large collection of R packages dedicated to Shiny developers, many of which are designed to help you make your Shiny app as professional looking as possible. They will talk about how moving beyond Shiny’s default options can improve your users’ experience.</p><h3 id="nathan-teetor-on-the-approach-and-philosophy-of-yonder">Nathan Teetor on the Approach and Philosophy of <code>yonder</code></h3><p>Episode 8 - Friday, December 6, 1-2pm Eastern</p><p>We have seen the Shiny community grow immensely with excellent packages built to extend the existing functionality provided by Shiny itself. 
But what would a re-imagination of Shiny’s user interface and server-side components entail? Nathan’s <code>yonder</code> package is built on Shiny, but gives developers an alternative framework for building applications with R. Eric will talk with Nathan about what Shiny developers can learn from this approach and how he’s approached such an ambitious undertaking!</p><hr><p>If you would like to be kept up to date on this material and developments in the community, please <a href="https://pages.rstudio.net/shiny_dev_series.html"><strong>register now for the Shiny Developer Series to receive updates and scheduling reminders</strong></a>, and check out the <a href="https://shinydevseries.com/">Shiny Developer Series website</a>.</p></description></item><item><title>RStudio Trainer Directory Launches</title><link>https://www.rstudio.com/blog/2019-07-18-instructor-directory/</link><pubDate>Thu, 18 Jul 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2019-07-18-instructor-directory/</guid><description><p>Several dozen people have taken part in RStudio&rsquo;s instructor training and certification program since it was <a href="https://www.rstudio.com/blog/2019-02-28-instructor-training/">announced</a> earlier this year. Since <a href="https://www.rstudio.com/blog/2019-05-21-instructor-training-updates/">our last update</a>, many of them have completed certification, so we are pleased to announce a preview of <a href="https://rstd.io/trainers/">our trainers&rsquo; directory</a>. 
Each of the people listed there has completed an exam on <a href="https://drive.google.com/drive/folders/13ohFt3D0EJ5PDbMaWTxnHH-hwA7G0IvY">modern evidence-based teaching practices</a>, as well as technical exams on the <a href="https://r4ds.had.co.nz/">Tidyverse</a> or <a href="https://shiny.rstudio.com/">Shiny</a>, and they would welcome inquiries from anyone who is looking for high-quality training on these topics.</p><p>We plan to fold this directory into the main RStudio website in the near future, and would be grateful for comments about how to make it more useful. We are also now taking applications for instructor training in September and October 2019; if you are interested, you can find details <a href="https://rstudio-trainers.netlify.com/#info">here</a> or register <a href="https://docs.google.com/forms/d/e/1FAIpQLSdnybZ-Zs64QE1h7bk67uRs1UCUi1Tibi3noefyStrTHplSDA/viewform">here</a>. If you have questions or suggestions, please <a href="mailto:greg.wilson@rstudio.com">email Greg Wilson</a>.</p></description></item><item><title>rstudio::conf(2020) is open for registration!</title><link>https://www.rstudio.com/blog/rstudio-conf-2020/</link><pubDate>Mon, 15 Jul 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2020/</guid><description><p><a href="https://cvent.me/1DdKa?RefId=dev-blog&amp;utm_source=DevBlog&amp;utm_medium=Site&amp;utm_campaign=Site%20Promo">rstudio::conf</a>, the conference for all things R and RStudio, will take place January 29 and 30, 2020 in San Francisco, California. It will be preceded by Training Days on January 27 and 28. 
Early Bird registration is now open!</p><p><a href="http://rstd.io/conf"><img src="header.jpg" alt="Register now"></a></p><h2 id="conference-wednesday-thursday-jan-29-30">Conference: Wednesday-Thursday, Jan 29-30</h2><p>Join me, your host and Chief Scientist of RStudio, for our keynote speakers:</p><ul><li><p><a href="https://hilaryparker.com/about-hilary-parker/">Hilary Parker</a> (Stitch Fix) and <a href="http://www.biostat.jhsph.edu/~rpeng/">Roger Peng</a> (Johns Hopkins)</p></li><li><p><a href="http://www.bewitched.com/about.html">Martin Wattenberg</a> and <a href="http://www.fernandaviegas.com">Fernanda Viegas</a> (Research Scientists, Google).</p></li><li><p><a href="https://jennybryan.org">Jenny Bryan</a> (Engineer, RStudio).</p></li><li><p><a href="https://github.com/jjallaire">JJ Allaire</a> (CEO, RStudio).</p></li></ul><p>Along with 80 other talks in four parallel tracks.</p><p>As well as RStudio data scientists and engineers, you&rsquo;ll also hear from and interact with outstanding speakers drawn from the wider R and data science communities. Stay tuned for announcements about our invited speakers, and our call for contributed talks.</p><p>Check out our <a href="https://resources.rstudio.com/rstudio-conf-2019">videos from the last conference</a> to get a sense of the depth and breadth of the typical content.</p><h2 id="training-days-monday-tuesday-jan-27-28">Training Days: Monday-Tuesday, Jan 27-28</h2><p>Preceding the conference on Monday and Tuesday, January 27-28, RStudio will offer two days of optional in-person training. This year, we&rsquo;ve expanded our training program to include 19 workshops taught by experts throughout the R community. 
These workshops span 3 categories of learning:</p><ul><li>Introductory: Workshops that require minimal or no R experience</li><li>Intermediate/Advanced: Workshops that expand your existing R knowledge and focus on a specialized area of R</li><li>Professional Development: Workshops focusing on professional skills that complement your R knowledge.</li></ul><p>Your workshop choices this year include:</p><table><thead><tr><th align="left"><strong>6 Introductory Workshops</strong></th><th align="left"><strong>Instructor(s)</strong></th></tr></thead><tbody><tr><td align="left">Designing The Data Science Classroom</td><td align="left"><a href="https://www2.stat.duke.edu/~mc301/">Mine Cetinkaya-Rundel</a></td></tr><tr><td align="left">Introduction to Machine Learning with the Tidyverse</td><td align="left"><a href="https://alison.rbind.io">Alison Hill</a> &amp; <a href="https://www.linkedin.com/in/garrett-grolemund-49328411/">Garrett Grolemund</a></td></tr><tr><td align="left">Communicating with R Markdown and Interactive Dashboards</td><td align="left"><a href="http://carlhowe.com">Carl Howe</a> &amp; <a href="https://yihui.name/en/vitae/">Yihui Xie</a></td></tr><tr><td align="left">R for Excel Users</td><td align="left"><a href="https://jules32.github.io/">Julia Lowndes</a> &amp; <a href="https://www.bren.ucsb.edu/people/Faculty/allison_horst.htm">Allison Horst</a></td></tr><tr><td align="left">Shiny From Start To Finish</td><td align="left">Danny Kaplan</td></tr><tr><td align="left">Introduction to Data Science in the Tidyverse</td><td align="left"><a href="http://www.amelia.mn">Amelia McNamara</a> &amp; <a href="http://hadley.nz">Hadley Wickham</a></td></tr></tbody></table><table><thead><tr><th align="left"><strong>11 Intermediate and Advanced Workshops</strong></th><th align="left"><strong>Instructor(s)</strong></th></tr></thead><tbody><tr><td align="left">Text Mining with Tidy Data Principles</td><td align="left"><a href="https://juliasilge.com/about/">Julia 
Silge</a></td></tr><tr><td align="left">Modern Geospatial Data Analysis with R</td><td align="left"><a href="http://zevross.com/information/about-zevross/">Zev Ross</a></td></tr><tr><td align="left">Data Visualization with R</td><td align="left">Kieran Healy</td></tr><tr><td align="left">Time Series and Forecasting in R</td><td align="left"><a href="https://robjhyndman.com">Rob Hyndman</a></td></tr><tr><td align="left">My Organization&rsquo;s First R Package</td><td align="left">Rich Iannone &amp; Malcolm Barrett</td></tr><tr><td align="left">Applied Machine Learning</td><td align="left">Max Kuhn</td></tr><tr><td align="left">Building Tidy Tools</td><td align="left">Charlotte Wickham &amp; <a href="http://hadley.nz">Hadley Wickham</a></td></tr><tr><td align="left">What They Forgot to Teach You About R</td><td align="left">Kara Woo, Jenny Bryan, &amp; Jim Hester</td></tr><tr><td align="left">Big Data with R</td><td align="left">Edgar Ruiz</td></tr><tr><td align="left">JavaScript for Shiny Users</td><td align="left"><a href="https://www.garrickadenbuie.com/">Garrick Aden-Buie</a></td></tr><tr><td align="left">Deep Learning with Keras and TensorFlow in R</td><td align="left"><a href="http://bradleyboehmke.github.io">Bradley Boehmke</a></td></tr></tbody></table><table><thead><tr><th align="left"><strong>2 Workshops for Partners, Professionals and Administrators</strong></th><th align="left"><strong>Instructor</strong></th></tr></thead><tbody><tr><td align="left">RStudio Professional Products Administration</td><td align="left">Andrie de Vries</td></tr><tr><td align="left">Instructor Training</td><td align="left"><a href="http://third-bit.com/cv/">Greg Wilson</a></td></tr></tbody></table><p>While we&rsquo;ve expanded the number of workshop attendees we can accommodate by nearly 50% in 2020, rstudio::conf workshops are very popular and do sell out quickly. 
We urge attendees who would like to attend workshops to register well in advance of the conference registration deadline.</p><h2 id="who-should-go">Who should go?</h2><p><a href="https://cvent.me/1DdKa?RefId=dev-blog&amp;utm_source=DevBlog&amp;utm_medium=Site&amp;utm_campaign=Site%20Promo">rstudio::conf</a> is for RStudio users, R administrators, and RStudio partners who want to learn how to write better Shiny applications, explore all the capabilities of R Markdown, work effectively with Spark or TensorFlow, build predictive models, understand the tidyverse of tools for data science, build tidy tools themselves, discover production-ready development &amp; deployment practices, earn certification as a trainer for Shiny or the Tidyverse, or become a certified administrator of RStudio professional products.</p><h2 id="why-do-people-go-to-rstudioconf">Why do people go to rstudio::conf?</h2><p>Because there is simply no better way to learn about all things R &amp; RStudio.</p><blockquote><p>After #rstudioconf, I shut my laptop for four days to decompress, marinate and spend some time with family. It&rsquo;s back to work tomorrow and I couldn&rsquo;t be more excited to start coding again and <em>this feeling</em> is why I love the R community so much.&mdash; <a href="https://twitter.com/beeonaposy/status/1087914863194710016">Caitlyn Hudon</a></p></blockquote><blockquote><p>Thing I learned at #Rstudioconf that sticks out to me the most: it turns out that this group of people who are so kind and welcoming online are also kind and welcoming in real life. More than any library or api that’s what makes #rstats great.&mdash; <a href="https://twitter.com/NicholasStrayer/status/1086681100787822592">Nick Strayer</a></p></blockquote><blockquote><p><a href="https://drmowinckels.io/blog/why-rstudio-conf-is-the-best-conference-experience-i-have-had/">Wrote a little post</a> on why #rstudioconf is the best conference experience i have had. 
No kidding. I am on a high!&mdash; <a href="https://twitter.com/DrMowinckels/status/1086469524772306945">Athanasia Mowinckel</a></p></blockquote><script src="https://fast.wistia.com/embed/medias/rz2ehf04zi.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_rz2ehf04zi videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/rz2ehf04zi/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><h2 id="what-should-i-do-now">What should I do now?</h2><p><img src='early-bird.png' width='200' align='right' />Be an early bird! Attendance is limited. All seats are available on a first-come, first-served basis. 
Early Bird registration discounts are available (Conference only), and a capped number of Academic discounts are also available for eligible students and faculty.</p><p>Stay tuned for information about diversity scholarships, which will be announced in mid-August.</p><p>If all tickets for a particular workshop are sold out before you are able to purchase, we apologize in advance!</p><p><a href="https://cvent.me/1DdKa?RefId=dev-blog&amp;utm_source=DevBlog&amp;utm_medium=Site&amp;utm_campaign=Site%20Promo" button="" type="button" style="padding: 12px 20px; border: none; font-size: 18px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Register now!</a></p><p>If you have any questions or issues registering, please email <a href="mailto:conf@rstudio.com">conf@rstudio.com</a>.</p></description></item><item><title>RStudio Connect 1.7.6 - Publish Git-backed Content</title><link>https://www.rstudio.com/blog/rstudio-connect-1-7-6/</link><pubDate>Mon, 24 Jun 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-7-6/</guid><description><p>RStudio Connect 1.7.6 has been released and is now available for download. 
This release includes a new publishing method for Git-backed content, the ability for publishers to manage vanity URLs for applications, full support for all SAML authentication providers, and other improvements and bug fixes.</p><figure><img src="https://www.rstudio.com/blog-images/rsc-176-git-header.png" alt="New Content Dropdown Menu in RStudio Connect"/></figure><h2 id="updates">Updates</h2><h3 id="new-publishing-method-for-git-backed-content">New publishing method for Git-backed content</h3><p>This release adds the ability for data scientists to deploy content from Git repositories for individual applications within RStudio Connect.</p><figure><img src="https://www.rstudio.com/blog-images/rsc-176-git-content.png" alt="Publish Content from Git to RStudio Connect"/></figure><p>This publishing method is designed to allow data scientists to publish directly from Git repositories to Connect, and have that content get updated at regular intervals without the need for external CI/CD systems like Jenkins or Travis CI.</p><figure><img src="https://www.rstudio.com/blog-images/rsc-176-git-deploy.png" alt="Create New Content from Git Repository in RStudio Connect"/></figure><p>This publishing method complements the existing methods of:</p><ul><li>Push-button publishing from the RStudio IDE</li><li>Push-button publishing from Jupyter Notebooks</li><li>Programmatic deployment with CI/CD pipelines using the RStudio Connect Server APIs</li></ul><p>Refer to the documentation on Git-Backed Content in the <a href="https://docs.rstudio.com/connect/1.7.6/user/git-backed.html">User Guide</a> and <a href="https://docs.rstudio.com/connect/1.7.6/admin/content-management.html#git-backed">Administration Guide</a> for additional information on the configuration and usage of this new functionality.</p><h3 id="publishers-can-modify-vanity-urls-for-content">Publishers can modify vanity URLs for content</h3><p>Vanity URLs allow administrators to create &ldquo;vanity paths&rdquo; for published content in 
RStudio Connect, which makes the content available at a customized URL path rather than a URL path that uses the numeric app ID as an identifier.</p><figure><img src="https://www.rstudio.com/blog-images/rsc-176-vanity-urls.png" alt="Modify the Vanity URL/Address for Content in RStudio Connect"/></figure><p>This release adds the ability for publishers to modify vanity URLs for published content without the need for administrators to perform this configuration.</p><p>Refer to the documentation on custom vanity URLs in the <a href="https://docs.rstudio.com/connect/1.7.6/user/settings-panel.html#vanity-url">User Guide</a> and <a href="https://docs.rstudio.com/connect/1.7.6/admin/appendix-configuration.html#appendix-configuration-authorization">Administration Guide</a> for additional information on the configuration and usage of this new <code>Authorization.PublishersCanManageVanities</code> setting.</p><h3 id="ability-to-isolate-viewer-permissions-and-discoverability">Ability to isolate viewer permissions and discoverability</h3><p>This release adds the ability for administrators to configure a global setting to prevent viewers from seeing any other registered users, groups, or publishers in RStudio Connect.</p><figure><img src="https://www.rstudio.com/blog-images/rsc-176-viewer-permissions.png" alt="Isolate Viewer Permissions in RStudio Connect"/></figure><p>This setting is useful for RStudio Connect environments where users with the viewer role should not be able to discover the existence or identities of other users, groups, or publishers on the server. 
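</p><p>As a rough sketch (assuming a default installation; the configuration file location can vary), enabling this isolation in the server configuration might look like:</p><pre><code># /etc/rstudio-connect/rstudio-connect.gcfg
[Authorization]
ViewersCanOnlySeeThemselves = true</code></pre><p>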
With this setting enabled, users with the viewer role will only be able to discover and view published content that they have been explicitly granted access to.</p><p>Refer to the Administration Guide Configuration section on <a href="https://docs.rstudio.com/connect/1.7.6/admin/appendix-configuration.html#appendix-configuration-authorization">Authorization</a> for more information on the <code>Authorization.ViewersCanOnlySeeThemselves</code> setting.</p><h3 id="full-support-for-all-saml-providers">Full support for all SAML providers</h3><p>RStudio Connect 1.7.4 added <a href="https://blog.rstudio.com/2019/05/14/introducing-saml-in-rstudio-connect/">support for SAML-based authentication</a> and a subset of identity providers. The 1.7.6 release adds support for all SAML-based identity providers. Refer to the support article on <a href="https://support.rstudio.com/hc/en-us/articles/360022321494-Getting-Started-with-SAML-in-RStudio-Connect">Getting Started with SAML in RStudio Connect</a> for more information.</p><h3 id="documentation-for-server-api-cookbook">Documentation for Server API Cookbook</h3><p>The RStudio Connect <a href="https://docs.rstudio.com/connect/1.7.6/cookbook/">Server API Cookbook</a> has been made available as a separate guide and is no longer part of the User Guide.</p><h2 id="security--authentication-changes">Security &amp; Authentication Changes</h2><ul><li><p><strong>Forgot Password Behavior</strong> - When using built-in Password authentication, requesting a password reset via the &ldquo;forgot password&rdquo; link no longer fails for non-existing users, to prevent malicious user enumeration.</p></li><li><p><strong>Email Address Changes</strong> - Changes made to the email addresses in user profiles, done manually or via the Connect Server API, will cause an email to be sent to the old email address, so the user is notified about the new email address in use.</p></li><li><p><strong>Brute-Force Protection</strong> - A protection against brute-force attacks has 
been implemented for all authentication attempts against API calls to Connect using either API keys or tokens. After a failed authentication attempt, the user may have to wait longer before trying again.</p></li><li><p><strong>Enforced Password Complexity</strong> - Use <code>Password.MinimumScore</code> to control how complex (secure) new passwords must be when using the password authentication provider. See the <a href="https://docs.rstudio.com/connect/1.7.6/admin/authentication.html#authenticaton-password">Password Authentication section</a> of the Administration Guide for details.</p></li></ul><h2 id="deprecations--breaking-changes">Deprecations &amp; Breaking Changes</h2><ul><li><p><strong>Breaking Change</strong> - <code>Authorization.UsersListingMinRole</code> has been deprecated and it should be removed from the configuration file. A warning will be issued during startup in the <code>rstudio-connect.log</code> if the setting is in use. In the next release, the presence of this setting will prevent RStudio Connect from starting up. Customers using this setting with any value other than the default (viewer) should use <code>Authorization.ViewersCanOnlySeeThemselves = true</code> instead.</p></li><li><p><strong>Breaking Change</strong> - The <code>needs_config</code> field has been removed from the Content entity of the experimental Server API. All Content fields and endpoints to interact with content are provided in the Server API Reference.</p></li></ul><p>Refer to the <a href="https://docs.rstudio.com/connect/news/">full release notes</a> for more information on all of the changes and bug fixes in this release.</p><h2 id="upgrade-planning">Upgrade Planning</h2><blockquote><p>If you use the <code>Authorization.UsersListingMinRole</code> setting, please take note of the changes described above and in the release notes. Aside from the deprecations and breaking changes above, there are no other special considerations and upgrading should require less than five minutes.
If you are upgrading from an earlier version, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><h2 id="get-started-with-rstudio-connect">Get Started with RStudio Connect</h2><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Administration Guide</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/">Pricing</a></li></ul></description></item><item><title>RStudio Connect 1.7.4.2 - Important Security Patch</title><link>https://www.rstudio.com/blog/rstudio-connect-1-7-4-2-important-security-patch/</link><pubDate>Thu, 13 Jun 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-7-4-2-important-security-patch/</guid><description><p>This RStudio Connect patch release addresses an urgent security update and an important bug fix.</p><ul><li><p><strong>Security Update: Password Authentication</strong><br>A vulnerability has been identified for customers using RStudio Connect&rsquo;s built-in <a href="https://docs.rstudio.com/connect/1.7.4.2/admin/authentication.html#authentication-password">password authentication</a>.
Due to the risks involved, <strong>password authentication will now require the configuration setting <a href="https://docs.rstudio.com/connect/1.7.4.2/admin/appendix-configuration.html#appendix-configuration-server"><code>Server.Address</code></a> for operations that will send emails</strong>. If this setting is not configured, an error will occur for these operations. More detailed information about the vulnerability will be released in a future post. We have received no reports of impacted customers, but we recommend customers using password authentication upgrade immediately.</p></li><li><p><strong>Bug Fix: SLES Installer</strong><br>Prior versions of RStudio Connect used a shared RPM installer for SLES and Red Hat. This installer was found to be incompatible for customers upgrading to SUSE Linux Enterprise Server 12 SP4 and SUSE Linux Enterprise Server 15. This version introduces a separate installer for SUSE systems. See the <a href="https://rstudio.com/products/connect/download-commercial">download page</a> for details.</p></li></ul><blockquote><p><strong>Upgrade Planning</strong><br>There are no special considerations when upgrading from RStudio Connect v1.7.4 to this patch release. Upgrades should take less than five minutes. If you are upgrading from an older release, please consult the intermediate <a href="https://docs.rstudio.com/connect/news">release notes</a>.</p></blockquote><p>More information on the new features added in RStudio Connect 1.7.4 is available in a <a href="https://blog.rstudio.com/2019/05/14/introducing-saml-in-rstudio-connect/">prior post</a>.</p></description></item><item><title>Work Week at a Glance</title><link>https://www.rstudio.com/blog/2019-06-06-work-week-at-a-glance/</link><pubDate>Thu, 06 Jun 2019 14:30:11 +0000</pubDate><guid>https://www.rstudio.com/blog/2019-06-06-work-week-at-a-glance/</guid><description><p>Last month our team assembled from 26 states and seven countries in Columbus, OH for our company Work Week.
Though plain in name, this week is anything but. It is filled with stories, homemade goodies, big laughs, new experiences, and lots of in-depth conversations. The layout varies a bit from year to year, but one thing stays the same: each and every time, we connect, communicate, and get things done!</p><p>Some highlights this year included hex cookies made by <a href="https://soonersugar.com/">Sooner Sugar</a>, jam sessions in the lobby, hearing from J.J. and Tareef, and five-minute lightning talks given by our employees! It is fun to see our team members showcase their talents, whether it is wrapping or rapping.</p><p><img src="https://www.rstudio.com/blog-images/uploads/cookies.jpg" alt=""></p><p><img src="https://www.rstudio.com/blog-images/uploads/i-zfxw9sc-x51.jpg" alt="" title="Alison wrapping"></p><p><img src="https://www.rstudio.com/blog-images/uploads/i-zzzzsrq-x51.jpg" alt="" title="Toni rapping"></p><p>Thanks to <a href="https://www.marriott.com/hotels/travel/cmhwi-the-westin-great-southern-columbus/">The Westin Great Southern Columbus</a> for making our stay so warm and welcoming. Elsie and her team were attentive, kind, and available for any requests that came their way. It was beyond exceptional customer service! Our specific food requests along with our unusual IT needs were handled with ease and smiles (Thanks Robert &amp; Cory!). Our own Kaitlyn Horwitz also did a fantastic job planning end to end and coordinating all the details. We are thankful for all the effort she put in to make it special!</p><p>We are grateful for another successful Work Week, and the chance to spend time with each other.
We are looking forward to next year and who knows, maybe you could <a href="https://www.rstudio.com/about/careers/">join us</a>!</p><p><img src="https://www.rstudio.com/blog-images/uploads/jamsession.png" alt=""></p><p><img src="https://www.rstudio.com/blog-images/uploads/tareefandjj.png" alt=""></p><p>Photo credit: Wes McKinney and Clay Walker</p></description></item><item><title>[R]eady for Production: a Joint Event with RStudio and EODA</title><link>https://www.rstudio.com/blog/r-eady-for-production-a-joint-event-with-rstudio-and-eoda/</link><pubDate>Fri, 24 May 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-eady-for-production-a-joint-event-with-rstudio-and-eoda/</guid><description><p>We’re excited to team up with <a href="https://www.eoda.de/en/">EODA</a>, an RStudio <a href="https://www.rstudio.com/certified-partners/">Full Service Certified Partner</a>, to host a free data science in production event in Frankfurt, Germany, on June 13. This one-day event will be geared for data science and IT teams that want to learn how to integrate their analysis solutions with the optimal IT infrastructure.</p><p>This is a great chance to work in smaller groups with experts from EODA and RStudio on best-practice approaches to productive data-science architectures, and to see real-world solutions to deployment problems.</p><p>With sessions in English and German, the conference will start with a high-level overview of the right interaction between data science and IT, and then focus on more hands-on solutions to engineering problems such as building APIs with Plumber and deploying to RStudio Connect, using Python and SQL in the RStudio IDE, and Shiny load testing.</p><p>For more information, and to secure your spot, head to the registration page!</p><p><a href="https://www.eoda.de/de/data-science-event-eoda-RStudio.html">[R]eady for Production with EODA and RStudio</a></p></description></item><item><title>RStudio Instructor Training 
Updates</title><link>https://www.rstudio.com/blog/2019-05-21-instructor-training-updates/</link><pubDate>Tue, 21 May 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2019-05-21-instructor-training-updates/</guid><description><p>There has been a lot of interest in RStudio&rsquo;s instructor training and certification program since <a href="https://blog.rstudio.com/2019/02/28/rstudio-instructor-training/">it was announced in February</a>. People are now going through the training course and the subsequent exams, so we&rsquo;d like to share a bit more information about them.</p><h2 id="whats-the-big-picture">What&rsquo;s the big picture?</h2><p>In order to be certified, candidates must do a one-day training course on modern evidence-based teaching techniques, write a 90-minute exam on that material, and complete a technical exam on each of the topics they are certifying for. Right now we are offering separate certifications for the tidyverse and for Shiny; we plan to expand the list this fall. The course and each exam cost US$500, and a small number of waivers are available to ensure that people who might find this a barrier are able to take part.</p><p>Once someone has been certified, we will add them to our website and recommend them to anyone looking for training in R, RStudio products, and related topics. Certified instructors will also have an opportunity to help shape the future development of the certification program.
You can tell us you&rsquo;re interested by filling in <a href="https://docs.google.com/forms/d/e/1FAIpQLSdnybZ-Zs64QE1h7bk67uRs1UCUi1Tibi3noefyStrTHplSDA/viewform">this form</a>; we will then let you know about upcoming training courses.</p><h2 id="what-is-the-relationship-with-the-carpentries-instructor-training">What is the relationship with the Carpentries&rsquo; instructor training?</h2><p>Our instructor training program draws inspiration and material from the one developed by <a href="https://carpentries.org/">the Carpentries</a> (the umbrella organization that includes <a href="https://software-carpentry.org/">Software Carpentry</a>, <a href="https://datacarpentry.org/">Data Carpentry</a>, and <a href="https://librarycarpentry.org/">Library Carpentry</a>). In recognition of this, anyone who is a certified Carpentries instructor and has taught at least one R workshop for them does not need to take RStudio&rsquo;s instructor training course. However, we still require people to complete the teaching and technical exams.</p><h2 id="what-does-the-teaching-exam-cover">What does the teaching exam cover?</h2><p>The <a href="https://drive.google.com/drive/folders/13ohFt3D0EJ5PDbMaWTxnHH-hwA7G0IvY">slides for the teaching exam</a> are available online under a Creative Commons license, and <a href="http://teachtogether.tech/">this free online book</a> has more material for those who want a deeper dive. The instructions for the teaching exam are given below.</p><p><strong>Overall Instructions</strong></p><p>Thank you again for your interest in certifying as an RStudio Instructor. This examination tests your knowledge of teaching.</p><ol><li>You must complete this exam within 90 minutes. 
Please take a moment to read over the entire exam, and then share your screen with your examiner and think aloud as you work through the questions.</li><li>You may use any digital resources you want during this examination, but may not communicate with any person other than your examiner, and may not share information with other people about the content of this examination. Failure to abide by this rule will result in immediate disqualification.</li><li>You may present your sample lesson at the start of the examination or at the end; please let the examiner know which you prefer when the examination starts.</li></ol><p><strong>Instructions for Sample Lesson</strong></p><p>Prepare a 15-minute lesson on a topic related to R, RStudio products, or data science, and submit it to the examiner at least two days before your scheduled examination. Your submission should include (but is not restricted to):</p><ol><li>a learner persona characterizing the audience for the lesson;</li><li>a concept map showing the mental model you intend to convey;</li><li>two formative assessments (such as multiple choice questions or Parsons Problems); and</li><li>any notebooks or slides you would use to support delivery of the lesson.</li></ol><p>You will have 20 minutes during the examination to deliver the lesson and its formative assessments. Live coding is strongly encouraged; if you make any mistakes (deliberate or otherwise), try to incorporate them into your teaching as you would in front of a class. You will be able to present your lesson at the start or the end of the exam as you prefer; please let the examiner know your choice when the exam starts.</p><p>Note that your examiner will attempt one of your formative assessments during the lesson. 
Please allow 3–5 minutes for this in your planning.</p><h2 id="what-does-the-tidyverse-exam-cover">What does the tidyverse exam cover?</h2><p>The short answer is, &ldquo;Everything in <em>R for Data Science</em>.&rdquo; The full instructions are given below.</p><p>Thank you again for your interest in certifying as an RStudio Instructor. This examination tests your knowledge of the material in <a href="https://r4ds.had.co.nz/"><em>R for Data Science</em></a> and your ability to explain it. You must complete the exam within 90 minutes. Please take a moment to read over the entire exam, and then share your screen with your examiner and think aloud as you work through the questions.</p><ol><li>You may use any digital resources you want during this examination, but you may not communicate with any person other than your examiner. Failure to abide by this rule will result in immediate disqualification.</li><li>You are required to use the RStudio IDE for the practical portions of this exam. You may use either the desktop edition or rstudio.cloud.</li><li>Narrate your work as you go along. 
When you make mistakes or go down blind alleys, diagnose and correct problems out loud as you would in front of a class.</li></ol><h2 id="what-does-the-shiny-exam-cover">What does the Shiny exam cover?</h2><p>We are still developing and testing the technical exam for <a href="https://shiny.rstudio.com/">Shiny</a>, and hope to have it ready by June.</p><h2 id="i-have-questions">I have questions&hellip;</h2><p>If you have questions or suggestions, we&rsquo;d love to hear from you: please <a href="mailto:greg.wilson@rstudio.com">email Greg Wilson</a>.</p></description></item><item><title>Introducing SAML in RStudio Connect</title><link>https://www.rstudio.com/blog/introducing-saml-in-rstudio-connect/</link><pubDate>Tue, 14 May 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/introducing-saml-in-rstudio-connect/</guid><description><p>RStudio Connect 1.7.4 builds off of the prior release with significant improvements for RStudio Connect administrators. SAML support is available for production, a new admin view of scheduled reports helps identify potential conflicts, additional server APIs support deployment workflows, and our email configuration has been completely re-vamped.</p><figure><img src="https://www.rstudio.com/blog/images/rsc-174-schedules.png" alt="View Scheduled Content"/> <figcaption><p>View Scheduled Content</p></figcaption></figure><h2 id="scheduled-content-calendar">Scheduled Content Calendar</h2><p>This release includes a new calendar view, pictured above, that helps administrators review all scheduled content in a single place. This view can help administrators identify reports that are running too frequently, or times when multiple reports have overlapping schedules; e.g., Monday at 9am.</p><h2 id="saml-support">SAML Support</h2><p>We are very excited to announce the release of SAML 2.0 as a production authentication method for RStudio Connect.
This opens the door for integration with common identity providers in the enterprise, along with single sign-on, multi-factor authentication, and other important security conventions.</p><p>As a part of this release, we prepared SAML integration templates to simplify your integration with common cloud identity providers. RStudio Connect also supports the SAML 2.0 protocol for integrations with many other authentication providers or homegrown SAML tools.</p><table><thead><tr><th>Identity Provider</th><th>Status</th><th>More Information</th></tr></thead><tbody><tr><td>Azure Active Directory</td><td>Tested &amp; Integrated</td><td><a href="https://azuremarketplace.microsoft.com/en-us/marketplace/apps/aad.rstudioconnect?tab=Overview">Azure Portal</a></td></tr><tr><td>Okta</td><td>Tested &amp; Integrated</td><td><a href="https://saml-doc.okta.com/SAML_Docs/How-to-Configure-SAML-2.0-for-RStudio-Connect.html">Okta Integration Guide</a></td></tr><tr><td>OneLogin</td><td>Tested &amp; Integrated</td><td>Search <a href="https://www.onelogin.com/product/app-catalog">OneLogin Portal</a> After Login</td></tr><tr><td>Google, JumpCloud, ADFS, WSO2, PingIdentity, Shibboleth</td><td>Tested</td><td><a href="https://support.rstudio.com/hc/en-us/articles/360022321494">Configuration Guide</a></td></tr><tr><td>General SAML 2.0</td><td>Supported</td><td><a href="https://support.rstudio.com/hc/en-us/articles/360022321494">Configuration Guide</a></td></tr><tr><td>Duo, Centrify, Auth0</td><td>Supported</td><td><a href="https://support.rstudio.com/hc/en-us/articles/360022321494">Available in RStudio Connect 1.7.6+</a></td></tr></tbody></table><p>RStudio Connect’s SAML 2.0 authentication method supports Just-In-Time provisioning, either local or remote group management, Identity Provider metadata, and a handful of other configuration options that can be explored in the RStudio Connect Admin Guide.</p><h2 id="server-apis">Server APIs</h2><p>The <a 
href="https://docs.rstudio.com/connect/1.7.4/api/">Connect Server API</a> allows teams to interact with RStudio Connect from code. This release lets you programmatically <a href="https://docs.rstudio.com/connect/1.7.4/user/cookbook.html#cookbook-content">update content settings</a> and <a href="https://docs.rstudio.com/connect/1.7.4/user/cookbook.html#cookbook-promotion">manage content bundles</a>. Build a custom workflow, such as promoting content from a staging server to production.</p><figure><img src="assets-ci.png" alt="CI/CD Toolchains"/> <figcaption><p>Integrate Connect into CI/CD Toolchains</p></figcaption></figure><p>Learn more about <a href="https://solutions.rstudio.com/deploy/promote/">different approaches to asset deployment</a>, including how to <a href="https://solutions.rstudio.com/deploy/overview/">integrate RStudio Connect into CI/CD toolchains</a>.</p><h2 id="email-overhaul">Email Overhaul</h2><p>RStudio Connect uses email to distribute content, manage user accounts, notify publishers of errors, and more. In order to send emails, administrators must configure Connect to use sendmail or an SMTP client. In prior versions of RStudio Connect, this configuration was done in the RStudio Connect dashboard. Version 1.7.4 and above removes this support in favor of <a href="https://docs.rstudio.com/connect/1.7.4/admin/email-setup.html">managing email settings</a> with the RStudio Connect configuration file. This change makes setup easier and more consistent for administrators.</p><blockquote><p>When upgrading to RStudio Connect 1.7.4, administrators should <a href="https://support.rstudio.com/hc/en-us/articles/360022554513">follow these instructions</a> to migrate email settings to the configuration file.</p></blockquote><p>In addition, RStudio Connect no longer requires email setup.
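</p><p>As a sketch of the configuration-file approach (assuming the default file location; the exact field names for your version are documented in the Admin Guide's email setup section), an SMTP configuration might look like:</p><pre><code>; /etc/rstudio-connect/rstudio-connect.gcfg
[Server]
EmailProvider = SMTP
SenderEmail = connect@example.com

[SMTP]
Host = smtp.example.com
Port = 587
</code></pre><p>Leaving <code>EmailProvider</code> unset (or setting it to <code>none</code>) keeps email disabled.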
Disabling email can be useful for groups starting a proof-of-concept, or teams running Connect in certain locked-down environments. In these cases, Connect will gracefully disable settings that require email. For full functionality, we strongly recommend an email integration.</p><h2 id="breaking-changes-and-deprecations">Breaking Changes and Deprecations</h2><ul><li>Network information is no longer collected and stored at <code>{Server.DataDir}/metrics/rrd/network-*.rrd</code>. This information was never used by RStudio Connect and is removed to save storage space.</li><li>The experimental content API has changed; see the <a href="https://docs.rstudio.com/connect/1.7.4/api/#updateContent">new API documentation</a> for details.</li><li>The <code>[LoadBalancing].EnforceMinRsconnectVersion</code> setting now defaults to true and has been deprecated. RStudio Connect will now require <code>rsconnect</code> version 0.8.3 or above.</li><li>The next version of RStudio Connect will not support discovering R versions through the file <code>/etc/rstudio/r-versions</code>. Administrators using this file should migrate this information to the Connect configuration file&rsquo;s <a href="https://docs.rstudio.com/connect/1.7.4/admin/r.html#r-versions"><code>[Server].RVersion</code> field</a>.</li></ul><p>Please consult the full <a href="https://docs.rstudio.com/connect/news">release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>RStudio Connect 1.7.4 introduces a new upgrade process to facilitate changes to the RStudio Connect configuration file. This process requires administrators to upgrade RStudio Connect and then update the configuration file, as <a href="https://docs.rstudio.com/connect/1.7.4/admin/appendix-configuration.html#appendix-configuration-migration">described here</a>. Specific <a href="https://support.rstudio.com/hc/en-us/articles/360022554513">instructions</a> for the 1.7.4 upgrade are also available.
If you are upgrading from a release earlier than 1.7.2, please consult the intermediate release notes.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R and Python (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, Jupyter, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>Introducing RStudio Server Pro 1.2 - New Features, Packaging, and Pricing</title><link>https://www.rstudio.com/blog/introducing-rstudio-server-pro-1-2/</link><pubDate>Thu, 09 May 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/introducing-rstudio-server-pro-1-2/</guid><description><p><img src="https://www.rstudio.com/blog-images/2019-05-09-RSP-1-2.png" style="width: 40%; float: right"/></p>We are excited to announce the general availability of RStudio Server Pro 1.2, and to introduce RStudio Server Pro Standard and RStudio Server Pro Enterprise.<p>RStudio customers have made it clear to us that the future of data analysis is moving to a more elastic and flexible computational model.
RStudio Server Pro now allows you to execute R processes remotely, and introduces new packaging and pricing to reflect this new trend.</p><br><a href="https://rstudio.youcanbook.me" button type="button" style= "padding: 12px 20px; border: none; font-size: 12px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Contact Sales</a>&emsp;<a href="https://www.rstudio.com/products/rstudio-server-pro/" button type="button" style= "padding: 12px 20px; border: none; font-size: 12px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Learn More</a>&emsp;<a href="https://www.rstudio.com/products/rstudio-server-pro/evaluation/" button type="button" style= "padding: 12px 20px; border: none; font-size: 12px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Try a 45 Day Evaluation</a><br><br><h3 id="new-rstudio-server-pro-features">New RStudio Server Pro Features</h3><p>The most significant new feature of RStudio Server Pro is Launcher, the long-awaited capability to separate the execution of R processes from the server where RStudio Server Pro is installed. Launcher allows you to run RStudio sessions and ad-hoc R scripts within your existing cluster workload managers, so you can leverage your current infrastructure instead of provisioning load balancer nodes manually. 
Now organizations that want to use Kubernetes or other job managers can run interactive sessions or batch jobs remotely and scale them independently.</p><p><a href="https://solutions.rstudio.com/launcher/overview/">Learn more about Launcher here</a>.</p><p>Other <a href="https://blog.rstudio.com/2018/11/05/rstudio-rsp-1.2-features/">features exclusive to RStudio Server Pro 1.2 include</a> improved R version management and enhanced configuration reload.</p><p>In addition, RStudio Server Pro has all of the new features of RStudio v1.2:</p><ul><li><a href="https://blog.rstudio.com/2018/10/02/rstudio-1-2-preview-sql/">SQL</a>: live feedback on your SQL queries and autocompletion</li><li><a href="https://blog.rstudio.com/2018/10/29/rstudio-ide-custom-theme-support/">Custom Theme Support</a>: tune the editor colors exactly the way you like them</li><li><a href="https://blog.rstudio.com/2018/10/05/r2d3-r-interface-to-d3-visualizations/">D3 Visualization Support</a>: create and preview web-native D3 visualizations</li><li><a href="https://blog.rstudio.com/2018/10/09/rstudio-1-2-preview-reticulated-python/">Reticulated Python Support</a>: an embedded Python session with a REPL and autocompletion</li><li><a href="https://blog.rstudio.com/2018/10/16/rstudio-1-2-preview-stan/">Stan Support</a>: including syntax highlighting, compilation, and document outlines</li><li><a href="https://blog.rstudio.com/2018/10/23/rstudio-1-2-preview-plumber-integration/">Plumber Support</a>: create and interact with R APIs, and publish them to RStudio Connect</li><li>&hellip; <a href="https://www.rstudio.com/blog/rstudio-1-2-preview-the-little-things/">and more</a>.</li></ul><h3 id="new-named-user-packaging-and-pricing-for-rstudio-server-pro">New Named User Packaging and Pricing for RStudio Server Pro</h3><p>Effective today, RStudio is introducing RStudio Server Pro Standard and RStudio Server Pro Enterprise.
Standard and Enterprise are priced per user, and include the remote Launcher feature. Existing RStudio Server Pro customers may upgrade to Standard or Enterprise to take advantage of the remote Launcher feature, or continue to purchase RStudio Server Pro without Launcher under their current terms.</p><p><strong>RStudio Server Pro Standard for smaller teams - an affordable place to start</strong></p><p>RStudio Server Pro Standard is more affordable for smaller teams. Instead of $9,995 per year at a minimum, RStudio Server Pro Standard is now only $4,975 per year for 5 users on a single server. Additional Named Users are $995 each per year. Additional “Staging” servers for testing and “High Availability” servers for load balancing user sessions are optional. SMB, Academic, Volume, or Bundle discounts may apply.</p><p><strong>RStudio Server Pro Enterprise for larger teams - unrestricted servers</strong></p><p>RStudio Server Pro Enterprise eliminates restrictions on servers for larger teams. Containerized IT infrastructure using tools such as Docker and Kubernetes is increasingly popular. Larger teams often need more development, staging, production, and high availability servers and find license key management troublesome. RStudio Server Pro Enterprise starts at only $11,950 per year, allowing up to 10 Named Users to use the software on as many servers as needed. Additional Named Users are $1,195 each per year.
SMB, Academic, Volume, or Bundle discounts may apply.</p><p>The table below summarizes our new pricing and packages for RStudio Server Pro; visit <a href="https://www.rstudio.com/pricing/">RStudio Pricing on our website</a> to learn more.</p><p>RStudio Server Pro 1.2 Packaging and Pricing</p><table><thead><tr><th align="left">Package</th><th align="left">Annual Price</th><th align="left">Named Users</th><th align="left">Launcher</th><th align="left">License Type</th><th align="left">Server Licenses</th></tr></thead><tbody><tr><td align="left">RStudio Server Pro Enterprise</td><td align="left">$11,995</td><td align="left">10</td><td align="left">Yes</td><td align="left">Named User</td><td align="left">Unrestricted. Additional Named Users $1,195 each</td></tr><tr><td align="left">RStudio Server Pro Standard</td><td align="left">$4,975</td><td align="left">5</td><td align="left">Yes</td><td align="left">Named User per Server</td><td align="left">One included. Staging and high availability servers available for additional charge. Additional Named Users $995 each</td></tr><tr><td align="left">RStudio Server Pro*</td><td align="left">$9,995</td><td align="left">Unlimited</td><td align="left"></td><td align="left">Per Server</td><td align="left">One. 
Staging servers available for additional charge.</td></tr></tbody></table><p>*<strong>RStudio Server Pro per server licensing is available only to existing RStudio Server Pro customers who have purchased prior to May 2019.</strong></p><p>For questions about RStudio Server Pro, RStudio Server Pro Standard, or RStudio Server Pro Enterprise, please contact <a href="mailto:sales@rstudio.com">sales@rstudio.com</a> or your customer success representative.</p><p><strong>Additional information is also available on our Support FAQ</strong></p><ul><li><a href="https://support.rstudio.com/hc/en-us/articles/360022558754#named_user">Why did RStudio choose Named User packaging and pricing for RStudio Server Pro?</a></li><li><a href="https://support.rstudio.com/hc/en-us/articles/360022558754#launcher">Why is Launcher only available in RStudio Server Pro Standard and Enterprise?</a></li><li><a href="https://support.rstudio.com/hc/en-us/articles/360022558754#upgrades">How do current customers upgrade to RStudio Server Pro Standard or Enterprise?</a></li></ul></description></item><item><title>Introducing RStudio Team</title><link>https://www.rstudio.com/blog/introducing-rstudio-team/</link><pubDate>Thu, 09 May 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/introducing-rstudio-team/</guid><description><p><img src="https://www.rstudio.com/blog-images/2019-05-09-TEAM.png" style="width: 40%; float: right"/></p>RStudio is excited to announce RStudio Team, a new software bundle that makes it easier and more economical to adopt our commercially licensed and supported professional offerings.<p>RStudio Team includes RStudio Server Pro, RStudio Connect, and RStudio Package Manager. 
With RStudio Team, your data science team will be properly equipped to analyze data at scale using R and Python; manage R packages; and create and share plots, Shiny apps, R Markdown documents, REST APIs (with plumber), and even Jupyter Notebooks, with your entire organization.</p><br><a href="https://rstudio.youcanbook.me" button type="button" style= "padding: 12px 20px; border: none; font-size: 12px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Contact Sales</a>&emsp;<a href="https://www.rstudio.com/pricing/" button type="button" style= "padding: 12px 20px; border: none; font-size: 12px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Configure your own RStudio Team</a>&emsp;<a href="https://www.rstudio.com/quickstart-vm" button type="button" style= "padding: 12px 20px; border: none; font-size: 12px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Evaluate RStudio Team QuickStart VM</a>&emsp;<a href="https://www.rstudio.com/products/team/" button type="button" style= "padding: 12px 20px; border: none; font-size: 12px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Learn More</a><br><br><h3 id="rstudio-team-standard-and-rstudio-team-enterprise">RStudio Team Standard and RStudio Team Enterprise</h3><p>RStudio Team is available in two configurations: Standard and Enterprise.</p><table><thead><tr><th align="left"></th><th align="left">RStudio Team Standard</th><th align="left">RStudio Team Enterprise</th></tr></thead><tbody><tr><td align="left">Number of <a href="https://www.rstudio.com/products/rstudio-server-pro/">RStudio Server Pro Users</a></td><td align="left">5+</td><td align="left">10+</td></tr><tr><td align="left">Number of <a 
href="https://www.rstudio.com/products/connect/">RStudio Connect Users</a></td><td align="left">20+</td><td align="left">100+</td></tr><tr><td align="left">RStudio <a href="https://www.rstudio.com/products/package-manager/">Package Manager Version</a></td><td align="left">Base or Standard</td><td align="left">Enterprise</td></tr><tr><td align="left">Number of Key Activations</td><td align="left">One per product*</td><td align="left">Unrestricted</td></tr><tr><td align="left">Staging or High Availability Servers</td><td align="left">Optional Purchase</td><td align="left">Included</td></tr></tbody></table><p><strong>RStudio Team Standard - an affordable place to start for smaller teams</strong></p><p>Team Standard fits the needs and budgets of smaller businesses and data science departments. For 5 RStudio Server Pro users and 20 RStudio Connect users sharing RStudio Package Manager Base, RStudio Team Standard starts at $22,000 per year. SMB and Academic discounts can further reduce the starting price, ensuring that every professional data science team can afford to start out with the right solution.</p><p><strong>RStudio Team Enterprise - unrestricted servers for larger deployments</strong></p><p>Team Enterprise has it all. For 10 RStudio Server Pro users and 100 RStudio Connect users sharing RStudio Package Manager Enterprise, RStudio Team Enterprise starts at $58,000, including all development, production, staging, high-availability, and disaster-recovery servers you require. With RStudio Team Enterprise, servers are unrestricted so you can deploy the professional data science configuration your teams need and support the virtualized IT infrastructure your organization wants. 
SMB, Academic, and volume discounts also apply.</p><h3 id="visit-our-pricing-page-to-get-your-estimate-for-rstudio-team">Visit our pricing page to get your estimate for RStudio Team</h3><p>RStudio Team is a new bundled offering from RStudio that combines RStudio Server Pro, RStudio Connect, and RStudio Package Manager products. Now organizations using the open-source ecosystems of Python and R have what they need to analyze data effectively and inform critical business decisions across the enterprise with convenience, simplicity, and savings. Visit our <a href="https://www.rstudio.com/pricing/">pricing page</a> to estimate your own price for RStudio Team. Except for a few RStudio Team Standard options, all you need to provide is the number of RStudio Server Pro users you want and the number of RStudio Connect users you will have. For most scenarios, we will estimate your total price on the spot. Please contact us at <a href="mailto:sales@rstudio.com">sales@rstudio.com</a> to discuss your unique circumstances or to get a formal price quote.</p></description></item><item><title>RStudio 1.2 Released</title><link>https://www.rstudio.com/blog/rstudio-1-2-release/</link><pubDate>Tue, 30 Apr 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-2-release/</guid><description><p style="text-align: center;">We're excited to announce the <strong>official release of RStudio 1.2</strong>!</p><h3 id="whats-new-in-rstudio-12">What&rsquo;s new in RStudio 1.2?</h3><p>Over a year in the making, this new release of RStudio includes dozens of new productivity enhancements and capabilities. 
You&rsquo;ll now find RStudio a more comfortable workbench for working in <a href="https://blog.rstudio.com/2018/10/02/rstudio-1-2-preview-sql/">SQL</a>, <a href="https://blog.rstudio.com/2018/10/16/rstudio-1-2-preview-stan/">Stan</a>, <a href="https://blog.rstudio.com/2018/10/09/rstudio-1-2-preview-reticulated-python/">Python</a>, and <a href="https://blog.rstudio.com/2018/10/05/r2d3-r-interface-to-d3-visualizations/">D3</a>. Testing your R code is easier, too, with integrations for <a href="https://blog.rstudio.com/2018/10/18/shinytest-automated-testing-for-shiny-apps/">shinytest and testthat</a>. Create, test, and publish APIs in R with <a href="https://blog.rstudio.com/2018/10/23/rstudio-1-2-preview-plumber-integration/">Plumber</a>. And get more done with <a href="https://blog.rstudio.com/2019/03/14/rstudio-1-2-jobs/">background jobs</a>, which let you run R scripts while you work.</p><p>Underpinning it all is a new rendering engine based on modern Web standards, so RStudio Desktop looks sharp on displays large and small, and performs better everywhere &ndash; especially if you&rsquo;re using the latest Web technology in your visualizations, Shiny applications, and R Markdown documents. Don&rsquo;t like how it looks now? No problem&ndash;just <a href="https://blog.rstudio.com/2018/10/29/rstudio-ide-custom-theme-support/">make your own theme</a>.</p><p>You can read more about what&rsquo;s new in this release in the <a href="https://www.rstudio.com/products/rstudio/release-notes/">release notes</a>, or our <a href="https://blog.rstudio.com/categories/rstudio-ide">RStudio 1.2 blog series</a>.</p><h3 id="thank-you">Thank you!</h3><p>We&rsquo;d like to thank the open source community for helping us to make this release possible. Many of you used the preview releases for your day-to-day work and gave us invaluable bug reports, ideas, and feedback. 
We&rsquo;re grateful for your support&ndash;thank you for helping us to continue making RStudio the best workbench for data science!</p><p>All products based on RStudio have been updated. You can download the new release, RStudio 1.2.1335, here:</p><p><a href="https://www.rstudio.com/products/rstudio/download/">Download RStudio 1.2</a></p><p>Feedback on the new release is always welcome in the <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>. Update and let us know what you think!</p><p><em>Coming Soon: RStudio Pro 1.2 General Availability announcement</em></p></description></item><item><title>Shiny v1.3.2</title><link>https://www.rstudio.com/blog/shiny-1-3-2/</link><pubDate>Fri, 26 Apr 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-1-3-2/</guid><description><p>We&rsquo;re excited to announce the release of Shiny v1.3.2. This release has two main features: a new reactivity debugging tool we call <a href="https://rstudio.github.io/reactlog/"><code>reactlog</code></a>, and much faster serving of static file assets.</p><h2 id="introducing-reactlog-visually-debug-your-reactivity-issues">Introducing reactlog: Visually debug your reactivity issues</h2><p>Debugging faulty reactive logic can be challenging, as we&rsquo;ve <a href="https://shiny.rstudio.com/articles/debugging.html">written</a> and <a href="https://www.rstudio.com/resources/videos/debugging-techniques/">talked</a> about in the past. In particular, some of the most difficult Shiny app bugs to track down are when reactive expressions and observers re-execute either too often (i.e. plots that render multiple times in succession after a single change), or not often enough (i.e. outputs that don&rsquo;t update when you expected them to).</p><p>This release has an important new addition to the Shiny debugging toolbox: <strong><code>reactlog</code></strong>! 
To use <code>reactlog</code>, execute this line before running your Shiny app:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">options</span>(shiny.reactlog <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)</code></pre></div><p>This will instruct Shiny to keep a record of all the interactions between reactive objects.</p><p>Then, use your app, reproducing the problematic symptoms. Once you have done that, press <code>Ctrl+F3</code> (Mac users: <code>Cmd+F3</code>) from within your browser, and you&rsquo;ll see something like this:</p><img src="https://www.rstudio.com/blog-images/2019-04-26-shiny-1-3-2-pythagoras.gif" width="100%" alt="reactlog of a pythagoras theorem shiny application" style='border: 1px solid #ddd; box-shadow:5px 5px 5px #eee;'/><p>This screen lets you interactively explore the reactive history of your Shiny session. You can step forwards and backwards through time, watching as reactive objects execute, create and sever relationships, invalidate, etc.</p><h3 id="filtering-the-reactlog">Filtering the reactlog</h3><p>For medium and large Shiny apps, the reactive graph may be pretty crowded when visualized in two dimensions. Two <code>reactlog</code> features help you separate the signal from the noise.</p><ul><li>First, you can use the search field in the upper-right corner to filter by name (such as input or output ID, or the variable name of a reactive expression).</li></ul><img src="https://www.rstudio.com/blog-images/2019-04-26-shiny-1-3-2-search-by-name.gif" width="100%" alt="An example of filtering a reactlog graph by searching for labels" style='border: 1px solid #ddd; box-shadow:5px 5px 5px #eee;'/><ul><li>Second, you can double-click a node or edge in the graph to focus in on it, which will remove all unrelated reactive elements. 
Double-click on the background to restore the original view.</li></ul><img src="https://www.rstudio.com/blog-images/2019-04-26-shiny-1-3-2-filter-click.gif" width="100%" alt="An example of filtering a reactlog graph by double-clicking the graph" style='border: 1px solid #ddd; box-shadow:5px 5px 5px #eee;'/><p>Together, these features make it easy to find and focus on the relevant objects in your app.</p><p>You can find out more in <a href="https://resources.rstudio.com/rstudio-conf-2019/reactlog-2-0-debugging-the-state-of-shiny">this rstudio::conf talk</a> by Barret Schloerke, or read the docs at the <a href="https://rstudio.github.io/reactlog/"><code>reactlog</code> website</a>.</p><h2 id="improved-performance-for-serving-javascript-and-css-files">Improved performance for serving JavaScript and CSS files</h2><p>In previous versions of Shiny, every HTTP request was handled by R, including requests for static JavaScript and CSS files. For apps that have many add-on interactive components, there could be a dozen or more of these requests. As an R process becomes heavily loaded with long-running computations, the requests for these static files have to fight for a slice of R&rsquo;s attention.</p><p>This is most noticeable when one user&rsquo;s session affects the startup of another user&rsquo;s session. A single R process can serve multiple Shiny user sessions, and in previous versions of Shiny, a user&rsquo;s session could be blocked from loading startup-related JavaScript and CSS files because another user happened to be doing an intensive computation at that moment.</p><p>With the new version of Shiny, static files are always served up at lightning speed, no matter what&rsquo;s going on in R. We accomplished this by adding new static-file serving options to <strong><code>httpuv</code></strong>, using dedicated C++ code paths running on a background thread. 
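<p>Outside of Shiny, the same static-file machinery can be used directly. Here is a minimal sketch, assuming httpuv 1.5.0 or later and a local <code>www/</code> directory of assets (the directory name is illustrative):</p>

```r
library(httpuv)

# Serve the ./www directory at http://127.0.0.1:8080/.
# Requests for these files are handled by httpuv's dedicated C++ code
# paths on a background thread, so they never compete with long-running
# computations in the R process.
server <- startServer("127.0.0.1", 8080, list(
  staticPaths = list("/" = staticPath("www"))
))

# ... later, shut the server down:
# stopServer(server)
```

<p>Shiny wires this up for you automatically; the sketch above only illustrates the underlying httpuv feature.</p>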
This means that computations in R won&rsquo;t affect the serving of static files, and serving static files won&rsquo;t affect computations in R. The experience for users of heavily-loaded Shiny applications should be noticeably better. Note that it has always been possible with RStudio Connect and Shiny Server Pro to improve performance by increasing the number of R processes serving an application, but now Shiny itself is more efficient and multithreaded, so each R process can effectively handle more user sessions.</p><p>The best part is that you don&rsquo;t need to do anything to take advantage of these speed improvements—just upgrading Shiny to v1.3.2 will do it!</p><p>See the <a href="http://shiny.rstudio.com/reference/shiny/1.3.0/upgrade.html">full list of v1.3.0 changes</a> (and <a href="http://shiny.rstudio.com/reference/shiny/1.3.1/upgrade.html">v1.3.1</a>, <a href="http://shiny.rstudio.com/reference/shiny/1.3.2/upgrade.html">v1.3.2</a>) to learn about minor bug fixes and improvements we&rsquo;ve made in this release.</p><p><strong>Note:</strong> A number of users have reported that upgrading to Shiny v1.3.0 (or higher) breaks their apps when running behind an Nginx proxy: the HTML loads, but none of the styles are applied and none of the calculations run. This occurs when Nginx is subtly misconfigured. We&rsquo;ve posted details and a fix in <a href="https://community.rstudio.com/t/having-problems-with-shiny-v1-3-0-and-nginx/28180">this RStudio Community post</a>.</p></description></item><item><title>RStudio Package Manager 1.0.8 - System Requirements</title><link>https://www.rstudio.com/blog/rstudio-package-manager-1-0-8-system-requirements/</link><pubDate>Thu, 18 Apr 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-package-manager-1-0-8-system-requirements/</guid><description><p>Installing R packages on Linux systems has always been a risky affair. 
In RStudio Package Manager 1.0.8, we&rsquo;re giving administrators and R users the information they need to make installing packages easier. We&rsquo;ve also made it easier to use Package Manager offline and improved search performance.</p><h4 id="new-to-rstudio-package-manager">New to RStudio Package Manager?</h4><p><a href="https://rstudio.com/products/package-manager/">Download</a> the 45-day evaluation today to see how RStudio Package Manager can help you, your team, and your entire organization access and organize R packages. Learn more with our <a href="https://demo.rstudiopm.com">online demo server</a> or <a href="https://resources.rstudio.com/webinars/introduction-to-the-rstudio-package-manager-sean-lopp">latest webinar</a>.</p><img src="https://www.rstudio.com/blog/images/rspm-108-srdb.png" caption="System prerequisites for R packages" alt="System prerequisites for R packages"><h2 id="updates">Updates</h2><h3 id="introducing-system-prerequisites">Introducing System Prerequisites</h3><p>R packages can depend on one another, but they can also depend on software external to the R ecosystem. On Ubuntu 18.04, for example, in order to install the <code>curl</code> R package, you must have previously run <code>apt-get install libcurl</code>. R packages often note these dependencies inside their DESCRIPTION files, but this information is free-form text that varies by package. In the past, system administrators would need to manually parse these files. In order to install <code>ggplot2</code>, you&rsquo;d need to look at the system requirements for <code>ggplot2</code> and all its dependencies. This labor-intensive process rarely goes smoothly. 
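<p>The manual workflow described above can be sketched from R as follows. This is illustrative only, not a Package Manager API: the <code>libcurl4-openssl-dev</code> package name is an assumption for Ubuntu 18.04, and exact names vary by distribution and release.</p>

```r
# Illustrative sketch of manually satisfying a system prerequisite
# before a source install (Ubuntu 18.04; run the shell step as root).
# The -dev package name is an assumption; check your distribution's docs.
system("apt-get install -y libcurl4-openssl-dev")

# Only now will the source build find the libcurl headers it needs:
install.packages("curl")
```

<p>Package Manager's goal is to surface exactly these commands for you, rolled up across a package's full dependency tree, instead of leaving them to be discovered through failed builds.</p>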
Frequently, system dependencies are not uncovered until a package fails to install, often with a cryptic error message that can leave R users and administrators frantically searching StackOverflow.</p><p>To address this problem, we&rsquo;ve begun cataloging and testing <a href="https://docs.rstudio.com/rspm/1.0.8/admin/system-dependency-detection.html">system prerequisites</a>. The result is a list of install commands available for administrators and R users. We&rsquo;ve tested this list by installing all 14,024 CRAN packages across six Linux distributions.</p><p><img src="https://www.rstudio.com/post/2019-04-17-rstudio-package-manager-1-0-8-system-requirements_files/figure-html/unnamed-chunk-1-1.png" width="672" /></p><p>For any package, Package Manager shows you if there are system prerequisites and the commands you can use to install them. Today this support is limited to Linux, but we plan to support Windows and Mac requirements in the future. Package Manager automatically rolls up prerequisites for dependent R packages. As an example, the <code>httr</code> R package depends on the <code>curl</code> package, which depends on <code>libcurl</code>. Package Manager will show the <code>libcurl</code> prerequisite for the <code>httr</code> package&ndash;and for all of <code>httr</code>'s reverse dependencies!</p><h3 id="new-offline-and-air-gapped-downloader">New Offline and Air-Gapped Downloader</h3><p>In most cases, RStudio Package Manager provides the checks and governance controls needed by IT to bridge the gap between offline production systems and RStudio&rsquo;s public CRAN service. However, in certain cases it may be necessary to run RStudio Package Manager offline. Version 1.0.8 introduces <a href="https://docs.rstudio.com/rspm/1.0.8/admin/air-gapped.html">a new tool</a> to help offline customers. 
A new utility has been created to make cloning packages into an air-gapped environment safe and fast.</p><h2 id="other-improvements">Other Improvements</h2><p>In addition to these major changes, the new release includes the following updates:</p><ul><li>Support for <a href="https://docs.rstudio.com/rspm/1.0.8/admin/s3-config.html">using Amazon S3 for storage</a> is out of beta and ready for production systems.</li><li>Logs for <a href="https://docs.rstudio.com/rspm/1.0.8/admin/repositories.html#git-sources">Git sources</a> have been improved, making it easier to track down cases where a repository fails to build.</li><li>Package search and listing performance has been significantly improved.</li><li>The <a href="https://blog.rstudio.com/2019/03/13/rstudio-package-manager-1-0-6-readme/">support for README files</a> introduced in version 1.0.6 has been expanded to better support READMEs with links, badges, and images.</li></ul><img src="https://www.rstudio.com/blog/images/rspm-108-readme.png" caption="Even more README support" alt="Even more README support"><p>Please review the <a href="https://docs.rstudio.com/rspm/news">full release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>Upgrading to 1.0.8 from 1.0.6 will take less than five minutes. If you are upgrading from an earlier version, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><p>Don&rsquo;t see that perfect feature? Wondering why you should be worried about package management? 
Want to talk about other package-management strategies? <a href="mailto:sales@rstudio.com">Email us</a>, our product team is happy to help!</p><ul><li><a href="https://docs.rstudio.com/rspm/admin">Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2018/07/RStudio-Package-Manager-Overview.pdf">Overview PDF</a></li><li><a href="https://resources.rstudio.com/webinars/introduction-to-the-rstudio-package-manager-sean-lopp">Introductory Webinar</a></li><li><a href="https://demo.rstudiopm.com">Online Demo</a></li></ul></description></item><item><title>Winners of the 1st Shiny Contest</title><link>https://www.rstudio.com/blog/first-shiny-contest-winners/</link><pubDate>Fri, 05 Apr 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/first-shiny-contest-winners/</guid><description><p><img src="https://www.rstudio.com/blog-images/2019-01-07-shiny-contest.png" style="width: 40%; float: right"/></p><p>Back in January we <a href="https://blog.rstudio.com/2019/01/07/first-shiny-contest/">announced the first Shiny contest</a>. The time has come to share the results with you!</p><p>First and foremost, we were overwhelmed (in the best way possible!) by the 136 submissions! Reviewing all these submissions was incredibly inspiring and humbling. 
We really appreciate the time and effort each contestant put into building these apps, as well as submitting them as fully reproducible artifacts via RStudio Cloud.</p><p>Let’s start with a few stats about the contest submissions:</p><ul><li>There were 136 submissions from 122 unique app developers!</li><li>Approximately 92% of these developers submitted one entry for the contest and approximately 7% submitted two entries.</li><li>There was one developer who submitted three entries and one developer who submitted five!</li></ul><p>And here is a look at the growth of submissions over time…</p><p><img src="https://www.rstudio.com/blog-images/2019-04-05-contest-submissions.gif" style="width: 100%; float: center"/></p><p>So many of these apps were quite complex, very well-designed, and fun to interact with. Saying that selecting winners was difficult would be the biggest understatement of the year! But we promised to do it, so we did! Below we list the honorable mentions, runners up, and last but not least, the winners of the first Shiny contest.</p><p>Before we get to them though, a quick point of clarification: we had promised two winners, one in the novice and one in the open category. However, since we didn’t ask developers to self-select into these categories, it was very difficult to place apps into these categories post-hoc. So instead we picked four winners in various categories. 
At the end of the post we also discuss how this experience will help shape the definitions of winning categories in the next Shiny contest.</p><p>Over the next week we will be getting in touch with all the winners, runners up, and honorable mentions to arrange delivery of their awards and to highlight their submissions on the Shiny User Showcase.</p><div id="winners" class="section level2"><h2>Winners</h2><p>The four winners, presented here in no particular order, have won the following:</p><ul><li>One year of shinyapps.io Basic plan</li><li>All hex/RStudio stickers we can find</li><li>Any number of RStudio t-shirts, books, and mugs (worth up to $200)</li><li>Special &amp; persistent recognition by RStudio in the form of a winners page, and a badge that’ll be publicly visible on your RStudio Community profile</li><li>Half-an-hour one-on-one with a representative from the RStudio Shiny team for Q&amp;A and feedback</li></ul><div id="most-technically-impressive-isee" class="section level3"><h3>Most technically impressive: <a href="https://kevinrue.shinyapps.io/isee-shiny-contest/">iSEE</a></h3><p>iSEE (interactive SummarizedExperiment Explorer) by <a href="https://community.rstudio.com/u/kevinrue">Kevin Rue</a>, <a href="https://community.rstudio.com/u/csoneson">Charlotte Soneson</a>, <a href="https://community.rstudio.com/u/federicomarini">Federico Marini</a>, and <a href="https://github.com/LTLA">Aaron Lun</a> is designed for interactive exploration of high-throughput biological data sets. The deeper we dove into this app, the more impressed we were at its feature set. The data-visual-selection controls in each panel were well presented, and the dynamic, directional crosslinking feature is something we haven’t seen before. 
And it can even generate a reproducible R script!</p><p><img src="https://www.rstudio.com/blog-images/2019-04-05-iSEE.png" style="width: 100%; float: center"/></p><ul><li>Interact with <a href="https://kevinrue.shinyapps.io/isee-shiny-contest/">the app</a> on shinyapps.io</li><li><a href="https://community.rstudio.com/t/shiny-contest-submission-isee-interactive-and-reproducible-exploration-and-visualization-of-genomics-data/25136">RStudio Community post</a> to find out more about it</li><li>Reproduce the app on <a href="https://rstudio.cloud/project/230765">RStudio Cloud</a></li><li>View the code on <a href="https://github.com/kevinrue/isee-shiny-contest">GitHub</a></li></ul></div><div id="best-design-69-love-songs-a-lyrical-analysis" class="section level3"><h3>Best design: <a href="https://committedtotape.shinyapps.io/sixtyninelovesongs/">69 Love Songs: A Lyrical Analysis</a></h3><p>This app by <a href="https://community.rstudio.com/u/committedtotape/">David Smale</a> is a lyrical analysis of the three-volume concept album by the Magnetic Fields containing (yep, you guessed it) 69 love songs. We fell in love with the look of this app, and really appreciated that the font and colours used in the app have been chosen to match the album artwork. 
You don’t have to be a fan of the Magnetic Fields to appreciate the care and attention to detail that went into each panel!</p><p><img src="https://www.rstudio.com/blog-images/2019-04-05-sixty-nine-love-songs.gif" style="width: 100%; float: center"/></p><ul><li>Interact with <a href="https://committedtotape.shinyapps.io/sixtyninelovesongs/">the app</a> on shinyapps.io</li><li>Read the <a href="https://community.rstudio.com/t/shiny-contest-submission-lyrical-analysis-of-69-love-songs-by-magnetic-fields/25202">RStudio Community post</a> and the <a href="https://davidsmale.netlify.com/portfolio/69-love-songs/">blog post</a> to find out more about the motivation, design decisions, and technical details</li><li>Reproduce the app on <a href="https://rstudio.cloud/project/245439">RStudio Cloud</a></li></ul></div><div id="most-fun-hex-memory-game" class="section level3"><h3>Most Fun: <a href="https://dreamrs.shinyapps.io/memory-hex/">Hex Memory Game</a></h3><p>A brave handful of people have built small games in Shiny. 
It’s always impressive to us when people pull that off at all, but we haven’t seen one that works as well as Hex Memory Game created by <a href="https://community.rstudio.com/u/pvictor/">pvictor</a>. Not only that, but the code is super clean and easy to reason about.</p><p><img src="https://www.rstudio.com/blog-images/2019-04-05-hex-game.gif" style="width: 100%; float: center"/></p><ul><li>Interact with <a href="https://dreamrs.shinyapps.io/memory-hex/">the app</a> on shinyapps.io</li><li>Read the <a href="https://community.rstudio.com/t/shiny-contest-submission-hex-memory-game/25336">RStudio Community post</a> to find out more about the app and technical highlights</li><li>Reproduce the app on <a href="https://rstudio.cloud/project/250892">RStudio Cloud</a></li><li>View the code on <a href="https://github.com/dreamRs/memory-hex">GitHub</a></li></ul></div><div id="the-awww-award-pet-records" class="section level3"><h3>The “Awww” Award: <a href="https://jennadallen.shinyapps.io/pet-records-app/">Pet Records</a></h3><p>Think you’re a good pet owner? This app by <a href="https://community.rstudio.com/u/jallen1006/">Jenna Allen</a> will make you think again! Jenna, who describes herself as a digital nomad traveling with two dogs, Layla and Lloyd, has built this app for keeping track of her dogs’ medical and vaccine records. The timeline visualizations in the app are extremely effective, and the amount you can drill down – all the way to vaccine certificates and exam notes in PDF format! 
– is very impressive.</p><p><img src="https://www.rstudio.com/blog-images/2019-04-05-pet-records.gif" style="width: 100%; float: center"/></p><ul><li>Interact with <a href="https://jennadallen.shinyapps.io/pet-records-app/">the app</a> on shinyapps.io</li><li>Read the <a href="https://community.rstudio.com/t/shiny-contest-submission-an-app-to-visualize-and-share-my-dogs-medical-history/21511">RStudio Community post</a> and the <a href="https://www.jennadallen.com/post/a-shiny-app-to-visualize-and-share-my-dogs-medical-history/">blog post</a> to find out more about the app’s design and technical highlights</li><li>Reproduce the app on <a href="https://rstudio.cloud/project/168716">RStudio Cloud</a> (Note that for any images and documents stored in the app author’s S3 bucket, you’ll see “Error: Forbidden (HTTP 403)” on the RStudio Cloud version since credentials are not shared.)</li><li>View the code on <a href="https://github.com/jennaallen/dog_days">GitHub</a></li></ul></div></div><div id="runners-up" class="section level2"><h2>Runners up</h2><p>The following six apps are our runners up, and once again presented here in no particular order. Congratulations to the developers who have won the following:</p><ul><li>One year of shinyapps.io Basic plan</li><li>All hex/RStudio stickers we can find</li><li>Any number of RStudio t-shirts, books, and mugs (worth up to $200)</li></ul><div id="a-virtual-lab-for-teaching-physiology" class="section level3"><h3>A Virtual Lab for Teaching Physiology</h3><p>If we were judging solely by ambition of vision, this submission by <a href="https://community.rstudio.com/u/DGranjon">David Granjon</a> would have to be our winner! 
The centerpiece of this app is a strikingly detailed visNetwork, but the patient simulator idea is interesting as well.</p><p><img src="https://www.rstudio.com/blog-images/2019-04-05-virtual-patient-simulator.gif" style="width: 100%; float: center"/></p><ul><li>This submission includes two apps that can be accessed via the <a href="https://rinterface.com/AppsPhysiol.html">Apps.Physiol page</a><ul><li><a href="https://davidgranjon.shinyapps.io/entry_level/">Entry level app</a></li><li><a href="https://davidgranjon.shinyapps.io/virtual_patient_v2/">Virtual patient simulator</a></li></ul></li><li>Read the <a href="https://community.rstudio.com/t/shiny-contest-submission-a-virtual-lab-for-teaching-physiology/25348">RStudio Community post</a> to find out more about the motivation behind and design details of the apps</li><li>Reproduce the app on <a href="https://rstudio.cloud/project/257646">RStudio Cloud</a></li><li>View the code on <a href="https://github.com/DivadNojnarg/CaPO4Sim">GitHub</a></li></ul></div><div id="scotpho-online-profiles-tool" class="section level3"><h3>ScotPHO Online Profiles Tool</h3><p>This app by Jaime Villacampa, Zsanett Bahor, and Vicky Elliott was created to help people living and working in Scotland explore how geographical areas have changed over time or how they compare to other areas, across a range of indicators of health and wider determinants of health. The app is pretty complicated, but no more than it has to be with such a sprawling dataset behind it. Each part of the app has a carefully curated set of options that expose lots of power without being totally overwhelming. 
The context-sensitive Definition button and pervasive “Download data”/“Save chart” options are nice touches as well.</p><p><img src="https://www.rstudio.com/blog-images/2019-04-05-scotpho.gif" style="width: 100%; float: center"/></p><ul><li>Interact with <a href="https://scotland.shinyapps.io/ScotPHO_profiles_tool/">the app</a> on shinyapps.io</li><li>Read the <a href="https://community.rstudio.com/t/shiny-contest-submission-shiny-app-for-exploring-scottish-public-health-data-within-local-areas/25560">RStudio Community post</a></li><li>Reproduce the app on <a href="https://rstudio.cloud/project/256533">RStudio Cloud</a></li><li>View the code on <a href="https://github.com/ScotPHO/scotpho-profiles-tool">GitHub</a></li></ul></div><div id="tidytuesday.rocks" class="section level3"><h3>tidytuesday.rocks</h3><p>If you haven’t heard of #TidyTuesday, you’re missing out on one of the most dynamic virtual events in the R community. The tidytuesday.rocks app by <a href="https://community.rstudio.com/u/nsgrantham/">Neal Grantham</a> is a tastefully minimalist interface for exploring previous weeks’ datasets and community submissions for visualizations.</p><p><img src="https://www.rstudio.com/blog-images/2019-04-05-tidytuesday.gif" style="width: 100%; float: center"/></p><ul><li>Interact with <a href="https://nsgrantham.shinyapps.io/tidytuesdayrocks/">the app</a> on shinyapps.io</li><li>Read the <a href="https://community.rstudio.com/t/shiny-contest-submission-tidytuesday-rocks-an-interactive-catalogue-of-tidytuesday-tweets-from-2018/25205">RStudio Community post</a></li><li>Reproduce the app on <a href="https://rstudio.cloud/project/246977">RStudio Cloud</a></li><li>View the code on <a href="https://github.com/nsgrantham/tidytuesdayrocks">GitHub</a></li></ul></div><div id="cran-explorer" class="section level3"><h3>CRAN Explorer</h3><p>There are plenty of Shiny apps for exploring CRAN metadata, but none of them look as striking as this one by <a 
href="https://community.rstudio.com/u/nz-stefan">nz-stefan</a>! This is a really nice example of HTML Template usage; the separation between the R UI and the raw HTML UI is extremely clean.</p><p><img src="https://www.rstudio.com/blog-images/2019-04-05-cran-explorer.gif" style="width: 100%; float: center"/></p><ul><li>Interact with <a href="https://nz-stefan.shinyapps.io/cran-explorer/">the app</a> on shinyapps.io</li><li>Read the <a href="https://community.rstudio.com/t/shiny-contest-submission-cran-explorer/25669">RStudio Community post</a></li><li>Reproduce the app on <a href="https://rstudio.cloud/project/258634">RStudio Cloud</a></li><li>View the code on <a href="https://github.com/nz-stefan/cran-explorer">GitHub</a></li></ul></div><div id="the-shiny-lego-mosaic-creator" class="section level3"><h3>The Shiny LEGO mosaic creator</h3><p>This app by <a href="https://community.rstudio.com/u/rpodcast">Eric Nantz</a> is just fun! Upload any (relatively small) image and within seconds this app will design a LEGO mosaic for you, complete with a list of required bricks and build instructions!</p><p><img src="https://www.rstudio.com/blog-images/2019-04-05-shinylego.gif" style="width: 100%; float: center"/></p><ul><li>Interact with <a href="https://rpodcast.shinyapps.io/shinylego">the app</a> on shinyapps.io</li><li>Read the <a href="https://community.rstudio.com/t/shiny-contest-submission-the-shiny-lego-mosaic-creator/25648">RStudio Community post</a></li><li>Reproduce the app on <a href="https://rstudio.cloud/project/257906">RStudio Cloud</a></li><li>View the code on <a href="https://github.com/rpodcast/shinylego">GitHub</a></li></ul></div><div id="a-dashboard-for-conference-tweets" class="section level3"><h3>A Dashboard for Conference Tweets</h3><p>We have fond memories of rstudio::conf 2019, but nostalgia isn’t the only thing going for this dashboard by <a href="https://community.rstudio.com/u/grrrck/">Garrick Aden-Buie</a>. There are a lot of great tweets we 
missed the first time around, and we added some new follows. We also loved the “Top Emoji” plot (spoiler: the top emoji was 🤯!).</p><p><img src="https://www.rstudio.com/blog-images/2019-04-05-conf-tweets.png" style="width: 100%; float: center"/></p><ul><li>Interact with <a href="https://gadenbuie.shinyapps.io/tweet-conf-dash/">the app</a> on shinyapps.io</li><li>Read the <a href="https://community.rstudio.com/t/shiny-contest-submission-a-dashboard-for-conference-tweets/25745">RStudio Community post</a></li><li>Reproduce the app on <a href="https://rstudio.cloud/spaces/12362/project/258314">RStudio Cloud</a></li><li>View the code on <a href="https://github.com/gadenbuie/tweet-conf-dash">GitHub</a></li></ul></div></div><div id="honorable-mentions" class="section level2"><h2>Honorable mentions</h2><p>Remember how we said earlier that there were so many gems among the submissions and how it was so difficult to choose between them? Yeah, it was! The following twenty-one apps are the honorable mentions. 
The developers of these apps will receive one year of shinyapps.io Basic Plan and one RStudio t-shirt.</p><p>We have linked to the RStudio Community post for each of the submissions where you can read more about each app, interact with it, and reproduce it in RStudio Cloud.</p><ul><li><a href="https://community.rstudio.com/t/24743">Exploring large hospital data for better use of antimicrobials</a></li><li><a href="https://community.rstudio.com/t/23995">ShinyMRI - View MRI images in Shiny</a></li><li><a href="https://community.rstudio.com/t/23831">National Hockey League Play-by-Play App</a></li><li><a href="https://community.rstudio.com/t/25624">Reimagining NYC Neighborhoods with NewerHoods</a></li><li><a href="https://community.rstudio.com/t/25586">Voronoys - Understanding voters’ profile in Brazilian elections</a></li><li><a href="https://community.rstudio.com/t/25371">Career Path Exploration Tool</a></li><li><a href="https://community.rstudio.com/t/25278">Identifying real estate investment opportunities</a></li><li><a href="https://community.rstudio.com/t/22892">ctmmweb - a web app to analysis Animal tracking data</a></li><li><a href="https://community.rstudio.com/t/20997">Tetris-like game using Nanopore Flongle screenshots as starting fields</a></li><li><a href="https://community.rstudio.com/t/25637">The OCR Handwriting Game</a></li><li><a href="https://community.rstudio.com/t/25746">Stock Portfolio Monitor</a></li><li><a href="https://community.rstudio.com/t/25738">Interactive isochrone mapper for anywhere in the world!</a></li><li><a href="https://community.rstudio.com/t/25654">Real time public transport info for Dublin, Ireland</a></li><li><a href="https://community.rstudio.com/t/25639">ubeRideR - A shiny app to visualise Uber data</a></li><li><a href="https://community.rstudio.com/t/25605">Climate indicators and their effects on health at the small area level in Barcelona</a></li><li><a href="https://community.rstudio.com/t/25747">Animated leaflet to view NYC 
metro entries with Shiny</a></li><li><a href="https://community.rstudio.com/t/25207">Sentify - Spotify musical sentiment visualization</a></li><li><a href="https://community.rstudio.com/t/24001">Impact Replays - Relive the CFL Highlights and Play-by-Play</a></li><li><a href="https://community.rstudio.com/t/23256">Utah Lake Water Quality Profile Dashboard</a></li><li><a href="https://community.rstudio.com/t/21530">Create Visual Abstracts for Original Research</a></li><li><a href="https://community.rstudio.com/t/21418">Radio DJ Playlist Analyzer</a></li></ul></div><div id="next-shiny-contest" class="section level2"><h2>Next Shiny contest</h2><p>The first Shiny contest was not only fun to review, but we also learned a lot about how to structure it next time around. (Oh yeah, there will be a next time!) Our plan is to pre-identify clear categories for winners and announce them at the beginning of the contest. We’re pretty sure you will wow us again, and develop apps that are awe-inspiring and don’t fit into any of the categories we outlined, and we’ll want to update things again for the third round of the contest, but c’est la vie! We will also plan better for a high number of submissions so that we can turn the review around more quickly (thank you for your patience this time around!).</p></div></description></item><item><title>Summer Interns 2019</title><link>https://www.rstudio.com/blog/2019-03-25-summer-interns-2019/</link><pubDate>Mon, 25 Mar 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2019-03-25-summer-interns-2019/</guid><description><p>We received almost 400 applications for our <a href="https://www.rstudio.com/blog/2019-01-18-summer-internships-2019/">2019 internship program</a> from students with very diverse backgrounds. 
After interviewing several dozen people and making some very difficult decisions, we are pleased to announce that these twelve interns have accepted positions with us for this summer:</p><ul><li><p><strong><a href="https://github.com/thereseanders">Therese Anders</a></strong>: Calibrated Peer Review. Prototype tools to conduct experiments to see whether calibrated peer review is a useful and feasible feedback strategy in introductory data science classes and industry workshops. (mentor: Mine Çetinkaya-Rundel)</p></li><li><p><strong><a href="https://github.com/malcolmbarrett">Malcolm Barrett</a></strong>: R Markdown Enhancements. Tidy up and refactor the R Markdown code base. (mentor: Rich Iannone)</p></li><li><p><strong><a href="https://community.rstudio.com/u/jcblum">Julia Blum</a></strong>: RStudio Community Sustainability. Study <a href="https://community.rstudio.com">community.rstudio.com</a>, enhance documentation and processes, and onboard new users. (mentor: Curtis Kephart)</p></li><li><p><strong><a href="https://github.com/jyuu">Joyce Cahoon</a></strong>: Object Scrubbers. Help write a set of methods to scrub different types of objects to reduce their size on disk. (mentors: Max Kuhn and Davis Vaughan)</p></li><li><p><strong><a href="https://github.com/chendaniely">Daniel Chen</a></strong>: Grader Enhancements. Enhance <a href="https://github.com/rstudio-education/grader">grader</a> to identify students&rsquo; mistakes when doing automated tutorials. (mentor: Garrett Grolemund)</p></li><li><p><strong><a href="https://github.com/marlycormar">Marly Cormar</a></strong>: Production Testing Tools for Data Science Pipelines. Build on applicability domain methods from computational chemistry to create functions that can be included in a <code>dplyr</code> pipeline to perform statistical checks on data in production. (mentor: Max Kuhn)</p></li><li><p><strong><a href="https://github.com/dcossyleon">Desiree De Leon</a></strong>: Teaching and Learning with RStudio. 
Create a one-stop guide to teaching with RStudio similar to <a href="https://jupyter4edu.github.io/jupyter-edu-book/">Teaching and Learning with Jupyter</a>. (mentor: Alison Hill)</p></li><li><p><strong><a href="https://github.com/paleolimbot">Dewey Dunnington</a></strong>: <code>ggplot2</code> Enhancements. Contribute to <code>ggplot2</code> or an associated package (like <code>scales</code>) by writing R code for graphics and helping to manage a large, popular open source project. (mentor: Hadley Wickham)</p></li><li><p><strong><a href="https://github.com/MayaGans">Maya Gans</a></strong>: Tidy Blocks. Prototype and evaluate a block-based version of the tidyverse so that young students can do simple analysis using an interface like <a href="https://scratch.mit.edu/">Scratch</a>. (mentor: Greg Wilson)</p></li><li><p><strong><a href="https://github.com/leslie-huang">Leslie Huang</a></strong>: Shiny Enhancements. Enhance Shiny&rsquo;s UI, improve performance bottlenecks, fix bugs, and create a set of higher-order reactives for more sophisticated programming. (mentor: Barret Schloerke)</p></li><li><p><strong><a href="https://github.com/gracelawley">Grace Lawley</a></strong>: Tidy Practice. Develop practice projects so learners can practice tidyverse skills using interesting real-world data. (mentor: Alison Hill)</p></li><li><p><strong><a href="https://github.com/YimRegister">Yim Register</a></strong>: Data Science Training for Software Engineers. Develop course materials to teach basic data analysis to programmers using software engineering problems and data sets. 
(mentor: Greg Wilson)</p></li></ul><p>We are very excited to welcome them all to the RStudio family, and we hope you&rsquo;ll enjoy following their progress over the summer.</p></description></item><item><title>RStudio Connect 1.7.2</title><link>https://www.rstudio.com/blog/announcing-rstudio-connect-1-7-2/</link><pubDate>Fri, 22 Mar 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rstudio-connect-1-7-2/</guid><description><p>RStudio Connect 1.7.2 is ready to download, and this release contains some long-awaited functionality that we are excited to share. Several authentication and user-management tooling improvements have been added, including the ability to change authentication providers on an existing server, new group support options, and the official introduction of SAML as a supported authentication provider (currently a beta feature*). But that’s not all&hellip; keep reading to learn about great additions to the RStudio Connect UI, updates to Python support, and a brand new Admin dashboard view for tracking scheduled content.</p><figure><img src="https://www.rstudio.com/blog-images/rsc-172-content.png" alt="Content Expanded View"/> <figcaption><p>Content Expanded View - Improve discoverability with descriptions and images</p></figcaption></figure><h2 id="updates">Updates</h2><h3 id="authentication-migration-tools">Authentication Migration Tools</h3><p>It is now possible to delete users, transfer content ownership, and change authentication mechanisms for users and groups in RStudio Connect. 
This enables several workflows that were previously impossible:</p><ul><li>Migrate authentication providers when prompted by IT</li><li>Transition a Proof-of-Concept environment with “starter” authentication into a production context</li><li>Clean up and remove users who are no longer relevant for the system</li></ul><p>All of this functionality is available with the <a href="https://docs.rstudio.com/connect/admin/cli.html#cli-usermanager">usermanager CLI tool</a>. A specific walkthrough of these workflows is available in the <a href="https://docs.rstudio.com/connect/admin/authentication.html#change-auth-provider">RStudio Connect Admin Guide</a>.</p><h3 id="group-support-for-pam-and-proxied-authentication">Group Support for PAM and Proxied Authentication</h3><p>Group support has been enabled for all authentication providers in RStudio Connect. The following grid illustrates the type of group support available for the different authentication providers:</p><table><thead><tr><th>Authentication Provider</th><th>Local Groups</th><th>Remote Groups</th></tr></thead><tbody><tr><td>Password</td><td>Yes</td><td></td></tr><tr><td>LDAP / Active Directory</td><td></td><td>Yes</td></tr><tr><td>SAML</td><td>Yes</td><td></td></tr><tr><td>Google OAuth2</td><td>Yes</td><td></td></tr><tr><td>PAM</td><td>Yes</td><td></td></tr><tr><td>Proxied Authentication</td><td>Yes</td><td>Yes</td></tr></tbody></table><p>LDAP and Active Directory groups are managed by the authentication provider (i.e., are configured and maintained in your LDAP or Active Directory server). For the other authentication providers, groups are stored and managed inside RStudio Connect. They can be managed in the groups UI (under People) or via the <a href="https://docs.rstudio.com/connect/api/#groups">RStudio Connect Server API</a>.</p><h3 id="saml-authentication-beta-release">SAML Authentication (Beta Release)</h3><p>RStudio Connect now supports using SAML as an authentication provider to support single sign-on (SSO). 
If you use SAML as an authentication provider, we encourage you to try this feature in your test environment by integrating with your SAML Identity Provider. Any feedback you have to share will be appreciated.</p><blockquote><p>*SAML integration is a Beta feature of RStudio Connect. Beta features are supported and unlikely to face breaking changes in a future release. Any issues found in the feature will be addressed during the regular release schedule; they will not result in immediate patches or hotfixes. We encourage customers to try these features and welcome any feedback, but recommend the feature not be used in production until it is in general availability.</p></blockquote><h3 id="view-scheduled-content">View Scheduled Content</h3><p>Administrators can now review all scheduled content in the RStudio Connect dashboard. The Scheduled Content view helps you understand how server resources will be used over time. Scheduled content can be filtered by frequency of execution, letting you focus on the items that run most often.</p><figure><img src="https://www.rstudio.com/blog-images/rsc-172-scheduledUI.png" alt="View and filter scheduled content"/></figure><h3 id="usage-metrics-summaries">Usage Metrics Summaries</h3><p>A summary of recent usage is shown to content owners and administrators within the “Info” settings panel in the RStudio Connect dashboard. Metrics are displayed for Shiny applications and rendered/static content; they are not available for other content types.</p><figure><img src="https://www.rstudio.com/blog-images/rsc-172-usage.png" alt="Usage Metrics Summaries"/></figure><p>Usage data for the content item is summarized to show the last 30 days of activity across all associated versions and variants. Document content items will display a chart of the daily visit count and a total visit counter for the past 30-day period. 
Shiny applications will have the same statistics displayed, plus a metric for total user interaction time.</p><h3 id="python-support">Python Support</h3><p>In case you missed it, <a href="https://www.rstudio.com/blog/announcing-rstudio-connect-1-7-0/">RStudio Connect 1.7.0</a> introduced support for publishing Jupyter Notebooks as well as Shiny applications, R Markdown reports, and plumber APIs that <a href="https://www.rstudio.com/wp-content/uploads/2019/01/Using-Python-with-RStudio-Connect-1.7.0.pdf">combine R and Python</a>. Today, we’re excited to share that publishing Jupyter Notebooks is easier than ever; start by downloading the <a href="https://pypi.org/project/rsconnect-jupyter/">rsconnect-jupyter</a> Notebook extension, now available on PyPI.</p><figure><img src="https://www.rstudio.com/blog-images/rsc-172-python.png" alt="Jupyter Notebook Support"/></figure><h2 id="additional-updates">Additional Updates</h2><ul><li><strong>Generate Diagnostic Reports and Support Bundles</strong> This diagnostic report can be used by administrators to verify the status and configuration of your RStudio Connect instance. The report also helps you work with our support team by collecting information and logs from your environment to help us quickly identify common issues and reduce the amount of time required to resolve them. See the <a href="https://docs.rstudio.com/connect/admin/getting-started.html#need-help">Getting Started section of the Admin Guide</a> for more information.</li><li><strong>API Versioning Documentation</strong> The versioning scheme of the Connect Server API, including definitions for &ldquo;experimental&rdquo; endpoints and a deprecation strategy, is now included in the <a href="https://docs.rstudio.com/connect/api/">API Reference</a> documentation.</li><li><strong>Expanded Content View</strong> (Screenshot in introduction) Expanded view shows content descriptions and images in addition to the information available in the familiar compact view. 
It is available to all users who can view the content list. This expanded view can help viewers navigate and discover the valuable data products your data science teams create.</li></ul><h2 id="security--authentication-changes">Security &amp; Authentication Changes</h2><ul><li><strong>Browser Configurations</strong> Fixed an issue where certain browser configurations caused environment variable values to be stored in the browser’s autofill cache.</li><li><strong>OAuth2 Usernames</strong> Rules for generating OAuth2 usernames are documented in the Admin Guide section for <a href="https://docs.rstudio.com/connect/admin/authentication.html#authentication-oauth2">OAuth2</a>.</li><li><strong>LDAP Usernames</strong> Login failures due to case-sensitivity handling in LDAP usernames have been fixed. This fix also applies to proxied authentication when using a UniqueID distinct from the username.</li></ul><h2 id="deprecations-breaking-changes--bug-fixes">Deprecations, Breaking Changes &amp; Bug Fixes</h2><ul><li><strong>Breaking Change</strong> Publishers can no longer create groups. The creation of groups by publishers without consent from an administrator made it harder to ensure limited access to content. All publisher-owned groups currently in existence will remain, but any new group creation by publishers will be blocked. To restore the previous behavior and allow publishers to create groups, use the new setting: <code>Authorization.PublishersCanOwnGroups</code></li><li><strong>Breaking Change</strong> API requests with a malformed GUID in a path segment return a <code>400 Bad Request</code> HTTP status code rather than a <code>404 Not Found</code>.</li><li><strong>Bug Fix</strong> Shiny App usage historical information had the <code>started</code> timestamp stored in the local timezone while the <code>end</code> timestamp was in UTC. 
Now both are stored in UTC; existing records will be adjusted automatically during the course of the upgrade.</li></ul><p>Please review the <a href="http://docs.rstudio.com/connect/news">full release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>If you rely on publisher-created groups in RStudio Connect, please make note of the breaking changes described above and in the release notes.</p><p>Due to the bug fix on historical timestamp information for Shiny App usage, upgrades could take several minutes depending on the number of records to be adjusted.</p><p>Aside from the breaking changes above, there are no other special considerations. If you are upgrading from an earlier version, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) 
with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>sparklyr 1.0: Apache Arrow, XGBoost, Broom and TFRecords</title><link>https://www.rstudio.com/blog/sparklyr-1-0/</link><pubDate>Fri, 15 Mar 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/sparklyr-1-0/</guid><description><img src="https://www.rstudio.com/blog-images/2019-03-15-sparklyr-1-0-sparklyr-arrow-spark-small.png" style="display: none;" alt="sparklyr spark_apply() performance with arrow"/><p>With much excitement built over the past three years, we are thrilled to share that <a href="https://github.com/rstudio/sparklyr">sparklyr</a> <code>1.0</code> is now available on <a href="https://CRAN.R-project.org/package=sparklyr">CRAN</a>!</p><p>The <code>sparklyr</code> package provides an R interface to <a href="http://spark.apache.org">Apache Spark</a>. 
It supports <a href="https://dplyr.tidyverse.org/">dplyr</a>, <a href="https://spark.apache.org/mllib/">MLlib</a>, <a href="https://spark.rstudio.com/guides/streaming/">streaming</a>, <a href="https://spark.rstudio.com/extensions/">extensions</a>, and many other features; however, this particular release enables the following new features:</p><ul><li><strong><a href="#arrow">Arrow</a></strong> enables <strong>faster</strong> and <strong>larger</strong> data transfers between Spark and R.</li><li><strong><a href="#xgboost">XGBoost</a></strong> enables training <strong>gradient boosting</strong> models over distributed datasets.</li><li><strong><a href="#broom">Broom</a></strong> converts Spark&rsquo;s models into <strong>tidy</strong> formats that you know and love.</li><li><strong><a href="#tfrecords">TFRecords</a></strong> writes TensorFlow records from Spark to support <strong>deep learning</strong> workflows.</li></ul><p>This release also brings support for <a href="https://spark.apache.org/releases/spark-release-2-4-0.html">Spark 2.4</a>, the ability to collect and copy in batches, increased Livy performance, and many more improvements listed in the sparklyr <a href="https://github.com/rstudio/sparklyr/blob/master/NEWS.md">NEWS</a> file. You can install <code>sparklyr 1.0</code> from CRAN as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">sparklyr&#34;</span>)</code></pre></div><h2 id="arrow">Arrow</h2><p><a href="https://arrow.apache.org/">Apache Arrow</a> is a cross-language development platform for in-memory data; you can read more about this in the <a href="https://blog.rstudio.com/2018/04/19/arrow-and-beyond/">Arrow and beyond</a> blog post. 
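</p><p>As a minimal sketch of how this is enabled (our own condensed example, not additional API from the release): attaching the <code>arrow</code> package is all it takes, and <code>sparklyr</code> then routes transfers through Arrow automatically:</p><pre><code class="language-r">install.packages(&quot;arrow&quot;)  # the Apache Arrow runtime must also be installed

library(arrow)      # once attached, sparklyr transfers data via Arrow
library(sparklyr)

sc &lt;- spark_connect(master = &quot;local&quot;)
collected &lt;- dplyr::collect(sdf_len(sc, 10^6))  # now goes through Arrow
</code></pre><p>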
In <code>sparklyr 1.0</code>, we are embracing Arrow as an efficient bridge between R and Spark, conceptually:</p><p><img src="https://www.rstudio.com/blog-images/2019-03-15-sparklyr-1-0-sparklyr-arrow-spark.png" width="70%" alt="sparklyr using Apache Arrow diagram"/></p><p>In practice, this means faster data transfers and support for larger datasets; specifically, this improves <code>collect()</code>, <code>copy_to()</code> and <code>spark_apply()</code>. The following benchmarks make use of the <a href="http://bench.r-lib.org/">bench</a> package to measure performance with and without <code>arrow</code>.</p><p>We will first benchmark <code>copy_to()</code> over a dataframe with 1M and 10M rows. Notice that, with the default memory configuration, <code>copy_to()</code> can&rsquo;t handle 10M rows while <code>arrow</code> can.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)
sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">local&#34;</span>)
bench<span style="color:#666">::</span><span style="color:#06287e">press</span>(rows <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">10</span>^6, <span style="color:#40a070">10</span>^7), {
bench<span style="color:#666">::</span><span style="color:#06287e">mark</span>(arrow_on <span style="color:#666">=</span> {
<span style="color:#06287e">library</span>(arrow)
sparklyr_df <span style="color:#666">&lt;&lt;-</span> <span style="color:#06287e">copy_to</span>(sc, <span style="color:#06287e">data.frame</span>(y <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span>rows), overwrite <span style="color:#666">=</span> 
T)},
arrow_off <span style="color:#666">=</span> <span style="color:#06287e">if </span>(rows <span style="color:#666">&lt;=</span> <span style="color:#40a070">10</span>^6) {
<span style="color:#06287e">if </span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">arrow&#34;</span> <span style="color:#666">%in%</span> <span style="color:#06287e">.packages</span>()) <span style="color:#06287e">detach</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">package:arrow&#34;</span>)
sparklyr_df <span style="color:#666">&lt;&lt;-</span> <span style="color:#06287e">copy_to</span>(sc, <span style="color:#06287e">data.frame</span>(y <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span>rows), overwrite <span style="color:#666">=</span> T)} else <span style="color:#007020;font-weight:bold">NULL</span>, iterations <span style="color:#666">=</span> <span style="color:#40a070">4</span>, check <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>)})</code></pre></div><p><img src="https://www.rstudio.com/blog-images/2019-03-15-sparklyr-1-0-copy-to.png" alt=""></p><p>Next, we will benchmark <code>collect()</code> over 10M and 50M records; collecting 50M+ records is only possible with <code>arrow</code>.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">bench<span style="color:#666">::</span><span style="color:#06287e">press</span>(rows <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">10</span>^7, <span style="color:#40a070">5</span> <span style="color:#666">*</span> <span style="color:#40a070">10</span>^7), {
bench<span style="color:#666">::</span><span style="color:#06287e">mark</span>(arrow_on <span style="color:#666">=</span> {
<span style="color:#06287e">library</span>(arrow)
collected <span 
style="color:#666">&lt;-</span> <span style="color:#06287e">sdf_len</span>(sc, rows) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">collect</span>()},
arrow_off <span style="color:#666">=</span> <span style="color:#06287e">if </span>(rows <span style="color:#666">&lt;=</span> <span style="color:#40a070">10</span>^7) {
<span style="color:#06287e">if </span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">arrow&#34;</span> <span style="color:#666">%in%</span> <span style="color:#06287e">.packages</span>()) <span style="color:#06287e">detach</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">package:arrow&#34;</span>)
collected <span style="color:#666">&lt;-</span> <span style="color:#06287e">sdf_len</span>(sc, rows) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">collect</span>()} else <span style="color:#007020;font-weight:bold">NULL</span>, iterations <span style="color:#666">=</span> <span style="color:#40a070">4</span>, check <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>)})</code></pre></div><p><img src="https://www.rstudio.com/blog-images/2019-03-15-sparklyr-1-0-collect.png" alt=""></p><p>Last but not least, <code>spark_apply()</code> over 100K and 1M rows shows the most significant improvements. 
We measured a <strong>40x speedup</strong> when running R on Spark; additional details are available in the Arrow project <a href="https://arrow.apache.org/blog/2019/01/25/r-spark-improvements/">post</a>.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">bench<span style="color:#666">::</span><span style="color:#06287e">press</span>(rows <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">10</span>^5, <span style="color:#40a070">10</span>^6), {
bench<span style="color:#666">::</span><span style="color:#06287e">mark</span>(arrow_on <span style="color:#666">=</span> {
<span style="color:#06287e">library</span>(arrow)
<span style="color:#06287e">sdf_len</span>(sc, rows) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">spark_apply</span>(<span style="color:#666">~</span> .x <span style="color:#666">/</span> <span style="color:#40a070">2</span>) <span style="color:#666">%&gt;%</span> dplyr<span style="color:#666">::</span><span style="color:#06287e">count</span>() <span style="color:#666">%&gt;%</span> collect},
arrow_off <span style="color:#666">=</span> <span style="color:#06287e">if </span>(rows <span style="color:#666">&lt;=</span> <span style="color:#40a070">10</span>^5) {
<span style="color:#06287e">if </span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">arrow&#34;</span> <span style="color:#666">%in%</span> <span style="color:#06287e">.packages</span>()) <span style="color:#06287e">detach</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">package:arrow&#34;</span>)
<span style="color:#06287e">sdf_len</span>(sc, rows) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">spark_apply</span>(<span style="color:#666">~</span> .x <span style="color:#666">/</span> <span style="color:#40a070">2</span>) <span style="color:#666">%&gt;%</span> dplyr<span 
style="color:#666">::</span><span style="color:#06287e">count</span>() <span style="color:#666">%&gt;%</span> collect} else <span style="color:#007020;font-weight:bold">NULL</span>, iterations <span style="color:#666">=</span> <span style="color:#40a070">4</span>, check <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>)})</code></pre></div><p><img src="https://www.rstudio.com/blog-images/2019-03-15-sparklyr-1-0-spark-apply.png" alt=""></p><p>To use <code>arrow</code>, you will first have to install the Apache Arrow runtime followed by installing the R <code>arrow</code> package; additional instructions are available under <a href="https://spark.rstudio.com/guides/arrow">spark.rstudio.com/guides/arrow</a>.</p><h2 id="xgboost">XGBoost</h2><p><a href="https://github.com/rstudio/sparkxgb">sparkxgb</a> is a new <code>sparklyr</code> extension that can be used to train <a href="https://xgboost.ai/">XGBoost</a> models in Spark. <code>sparkxgb</code> is available on CRAN and can be installed as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">sparkxgb&#34;</span>)</code></pre></div><p>We can then use <code>xgboost_classifier()</code> to train and <code>ml_predict()</code> to predict over large datasets with ease:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparkxgb)
<span style="color:#06287e">library</span>(sparklyr)
<span style="color:#06287e">library</span>(dplyr)
sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">local&#34;</span>)iris <span style="color:#666">&lt;-</span> <span style="color:#06287e">copy_to</span>(sc, iris)xgb_model <span style="color:#666">&lt;-</span> <span style="color:#06287e">xgboost_classifier</span>(iris,Species <span style="color:#666">~</span> .,num_class <span style="color:#666">=</span> <span style="color:#40a070">3</span>,num_round <span style="color:#666">=</span> <span style="color:#40a070">50</span>,max_depth <span style="color:#666">=</span> <span style="color:#40a070">4</span>)xgb_model <span style="color:#666">%&gt;%</span><span style="color:#06287e">ml_predict</span>(iris) <span style="color:#666">%&gt;%</span><span style="color:#06287e">select</span>(Species, predicted_label, <span style="color:#06287e">starts_with</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">probability_&#34;</span>)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">glimpse</span>()</code></pre></div><pre><code>#&gt; Observations: ??#&gt; Variables: 5#&gt; Database: spark_connection#&gt; $ Species &lt;chr&gt; &quot;setosa&quot;, &quot;setosa&quot;, &quot;setosa&quot;, &quot;setosa&quot;, &quot;…#&gt; $ predicted_label &lt;chr&gt; &quot;setosa&quot;, &quot;setosa&quot;, &quot;setosa&quot;, &quot;setosa&quot;, &quot;…#&gt; $ probability_versicolor &lt;dbl&gt; 0.003566429, 0.003564076, 0.003566429, 0.…#&gt; $ probability_virginica &lt;dbl&gt; 0.001423170, 0.002082058, 0.001423170, 0.…#&gt; $ probability_setosa &lt;dbl&gt; 0.9950104, 0.9943539, 0.9950104, 0.995010…</code></pre><p>You can read more about <code>sparkxgb</code> under its <a href="https://github.com/rstudio/sparkxgb#sparkxgb">README</a> file. Note that Windows is currently unsupported.</p><h2 id="broom">Broom</h2><p>While support for <a href="https://broom.tidyverse.org/">broom</a> in Spark through <code>sparklyr</code> has been under development for quite some time, this release marks the completion of all modeling functions. 
For instance, we can now augment using an ALS model with ease:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">movies &lt;- data.frame(
  user   = c(1, 2, 0, 1, 2, 0),
  item   = c(1, 1, 1, 2, 2, 0),
  rating = c(3, 1, 2, 4, 5, 4)
)

copy_to(sc, movies) %&gt;%
  ml_als(rating ~ user + item) %&gt;%
  augment()</code></pre></div><pre><code># Source: spark&lt;?&gt; [?? x 4]
   user  item rating .prediction
  &lt;dbl&gt; &lt;dbl&gt;  &lt;dbl&gt;       &lt;dbl&gt;
1     2     2      5        4.86
2     1     2      4        3.98
3     0     0      4        3.88
4     2     1      1        1.08
5     0     1      2        2.00
6     1     1      3        2.80</code></pre><h2 id="tfrecords">TFRecords</h2><p><a href="https://github.com/rstudio/sparktf">sparktf</a> is a new <code>sparklyr</code> extension allowing you to write TensorFlow records in Spark.
This can be used to preprocess large amounts of data before processing them in GPU instances with Keras or TensorFlow. <code>sparktf</code> is now available on CRAN and can be installed as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">install.packages(&quot;sparktf&quot;)</code></pre></div><p>You can simply preprocess data in Spark and write it as TensorFlow records using <code>spark_write_tfrecord()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(sparktf)
library(sparklyr)

sc &lt;- spark_connect(master = &quot;local&quot;)

copy_to(sc, iris) %&gt;%
  ft_string_indexer_model(
    &quot;Species&quot;, &quot;label&quot;,
    labels = c(&quot;setosa&quot;, &quot;versicolor&quot;, &quot;virginica&quot;)
  ) %&gt;%
  spark_write_tfrecord(path = &quot;tfrecord&quot;)</code></pre></div><p>You can then use TensorFlow and Keras from R to load this recordset and train deep learning models; for instance, using <a href="https://tensorflow.rstudio.com/tools/tfdatasets/reference/tfrecord_dataset.html">tfrecord_dataset()</a>. Please read the <code>sparktf</code> <a href="https://github.com/rstudio/sparktf#sparktf">README</a> for more details.</p><h2 id="moarhttpsikym-cdncomentriesiconsoriginal000000574moar-catjpg"><a href="https://i.kym-cdn.com/entries/icons/original/000/000/574/moar-cat.jpg">Moar</a>?</h2><p>When connecting to Spark running in YARN, RStudio&rsquo;s connection pane can now launch YARN&rsquo;s web application.</p><img src="https://www.rstudio.com/blog-images/2019-03-15-sparklyr-1-0-rstudio-yarn.png" width=70% style="margin-left: 15px;" alt="RStudio Connections Pane YARN action"/><p>We also made it possible to copy and collect larger datasets by using callbacks. For instance, you can collect data incrementally in batches of 100K rows; this is configurable through the <code>sparklyr.collect.batch</code> setting.
The following example collects 300K rows in batches of 100K and prints the number of rows in each batch as it arrives; in practice, you would save each batch to disk and load it back later.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">sdf_len(sc, 3 * 10^5) %&gt;%
  collect(callback = ~ message(&quot;(&quot;, .y, &quot;) Collecting &quot;, nrow(.x), &quot; rows.&quot;))</code></pre></div><pre><code>(1) Collecting 100000 rows.
(2) Collecting 100000 rows.
(3) Collecting 100000 rows.</code></pre><p>For Livy connections, performance is improved when setting the <code>spark_version</code> parameter in <code>livy_config()</code>; this allows <code>sparklyr</code> to start a connection using JARs instead of loading sources.</p><p>In addition, <a href="https://spark.rstudio.com/extensions/#examples">extensions</a> are now also supported in Livy.
For example, you can run pagerank with Livy and <a href="https://github.com/rstudio/graphframes">graphframes</a> as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(graphframes)
library(sparklyr)

livy_service_start()
sc &lt;- spark_connect(master = &quot;local&quot;, method = &quot;livy&quot;, version = &quot;2.4.0&quot;)

gf_pagerank(gf_friends(sc), tol = 0.01, reset_probability = 0.15)</code></pre></div><pre><code>GraphFrame
Vertices:
Database: spark_connection
$ id &lt;chr&gt; &quot;f&quot;, &quot;g&quot;, &quot;a&quot;, &quot;e&quot;, &quot;d&quot;, &quot;b&quot;, &quot;c&quot;
$ name &lt;chr&gt; &quot;Fanny&quot;, &quot;Gabby&quot;, &quot;Alice&quot;, &quot;Esther&quot;, &quot;David&quot;, &quot;Bob&quot;, &quot;Charlie&quot;
$ age &lt;int&gt; 36, 60, 34, 32, 29, 36, 30
$ pagerank &lt;dbl&gt; 0.3283607, 0.1799821, 0.4491063, 0.3708523, 0.3283607, 2.6555078, 2.6878300
Edges:
Database: spark_connection
$ src &lt;chr&gt; &quot;a&quot;, &quot;b&quot;, &quot;e&quot;, &quot;e&quot;, &quot;c&quot;, &quot;a&quot;, &quot;f&quot;, &quot;d&quot;
$ dst &lt;chr&gt; &quot;b&quot;, &quot;c&quot;, &quot;f&quot;, &quot;d&quot;, &quot;b&quot;, &quot;e&quot;, &quot;c&quot;, &quot;a&quot;
$ relationship &lt;chr&gt;
&quot;friend&quot;, &quot;follow&quot;, &quot;follow&quot;, &quot;friend&quot;, &quot;follow&quot;, &quot;friend&quot;, &quot;follow&quot;, &quot;friend&quot;
$ weight &lt;dbl&gt; 0.5, 1.0, 0.5, 0.5, 1.0, 0.5, 1.0, 1.0</code></pre><p>The <a href="https://github.com/rstudio/sparklyr/blob/master/NEWS.md">sparklyr NEWS</a> contains a complete list of changes and features for this release. To catch up on previously released features, you can read the blog posts that got us here:</p><ul><li><a href="https://blog.rstudio.com/2018/10/01/sparklyr-0-9/">sparklyr 0.9</a>: Streams and Kubernetes.</li><li><a href="https://blog.rstudio.com/2018/05/14/sparklyr-0-8/">sparklyr 0.8</a>: Production pipelines and graphs.</li><li><a href="https://blog.rstudio.com/2018/01/29/sparklyr-0-7/">sparklyr 0.7</a>: Spark Pipelines and Machine Learning.</li><li><a href="https://blog.rstudio.com/2017/07/31/sparklyr-0-6/">sparklyr 0.6</a>: Distributed R and external sources.</li><li><a href="https://blog.rstudio.com/2017/01/24/sparklyr-0-5/">sparklyr 0.5</a>: Livy and dplyr improvements.</li><li><a href="https://blog.rstudio.com/2016/09/27/sparklyr-r-interface-for-apache-spark/">sparklyr 0.4</a>: R interface for Apache Spark.</li></ul><p>We hope you enjoy this exciting release!</p></description></item><item><title>RStudio 1.2 Preview: Jobs</title><link>https://www.rstudio.com/blog/rstudio-1-2-jobs/</link><pubDate>Thu, 14 Mar 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-2-jobs/</guid><description><p>When you run an R script in RStudio today, the R console waits for it to complete, and you can&rsquo;t do much with RStudio until the script is finished running.
When your R scripts take a long time to run, it can be difficult to get much done in RStudio while they do, unless you&rsquo;re willing to juggle multiple instances of RStudio.</p><p>In RStudio 1.2, we&rsquo;re introducing two new features to keep you productive while your code&rsquo;s working: <em>local jobs</em> and <em>remote jobs</em>. You can use these to run your scripts in the background while you continue to use the IDE.</p><p><img src="running-jobs.png" ></p><h2 id="local-jobs">Local jobs</h2><p>A &ldquo;local job&rdquo; is an R script that runs in a separate, dedicated R session. You can run any R script in a separate session by pulling down the Source menu and choosing <em>Source as Local Job</em>.</p><p><img src="source-as-local-job.png" alt="" title="Source script as local job"></p><p>This will give you some options for running your job.</p><p><img src="run-script-dialog.png" alt="" title="Dialog showing options for starting R script job"></p><p>By default, the job will run in a clean R session, and its temporary workspace will be discarded when the job is complete. This is the fastest and safest configuration, good for reproducible scripts that have no side effects.</p><p>However, if you want to feed data from your current R session into the job, or have the job return data to your current R session, change the dialog options as follows:</p><p><strong>Run job with copy of global environment</strong>: If ticked, this option saves your global environment and loads it into the job&rsquo;s R session before it runs. This is useful because it will allow your job to see all the same variables you can see in the IDE. Note that this can be slow if you have large objects in your environment.</p><p><strong>Copy job results</strong>: By default, the temporary workspace in which the job runs is not saved. 
If you&rsquo;d like to import data from your job back into your R session, you have a couple of choices:</p><p><em>Global environment</em>: This places all the R objects your job creates back in your R session&rsquo;s global environment. Use this option with caution! The objects created by the job will overwrite, without a warning, any objects that have the same name in your environment.</p><p><em>Results object</em>: This places all the R objects your job creates into a new environment named <code>yourscript_results</code>.</p><h3 id="lifetime">Lifetime</h3><p>Local jobs run as non-interactive child R processes of your main R process, which means that they will be shut down if R is. While your R session is running jobs:</p><ul><li>You will be warned if you attempt to close the window while jobs are still running (on RStudio Desktop)</li><li>Your R session will not be suspended (on RStudio Server)</li></ul><p>While local jobs are running, a progress bar will appear in the R console summarizing the progress of all running jobs.</p><p><img src="job-progress-summary.png" alt="" title="R console pane showing job progress tab"></p><h3 id="detailed-progress">Detailed progress</h3><p>The progress bar RStudio shows for your job represents the execution of each top-level statement in your R script. If you want a little more insight into which part of the script is currently running, you can use RStudio&rsquo;s <a href="https://support.rstudio.com/hc/en-us/articles/200484568-Code-Folding-and-Sections">code sections</a> feature. Add a section marker like this to your R script:</p><pre><code># Apply the model ----</code></pre><p>When your job reaches that line in your script, the name of the section will appear on the progress bar.</p><p><img src="job-progress-sections.png" alt="" title="Job progress bar showing section progress"></p><p>You can also emit output using the usual R mechanisms, such as <code>print</code>, <code>message</code>, and <code>cat</code>. 
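To make that concrete, here is a minimal sketch of a script suited to running as a local job; the file name, data, and section names are hypothetical, chosen only to illustrate section markers combined with message() output:

```r
# score-data.R -- a hypothetical script to run via Source as Local Job

# Generate input ----
message("Generating input data...")
data <- data.frame(x = rnorm(1000))

# Apply the model ----
message("Scoring ", nrow(data), " rows...")
data$score <- data$x * 2  # stand-in for a real model

# Save results ----
write.csv(data, "scored.csv", row.names = FALSE)
message("Done.")
```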
This output appears in the Jobs pane when you select your job.</p><p><img src="local-job-output.png" alt="" title="Jobs pane with output of a local job"></p><h3 id="scripting">Scripting</h3><p>You can script the creation of jobs using the <strong>rstudioapi</strong> package method <a href="https://www.rdocumentation.org/packages/rstudioapi/versions/0.9.0/topics/jobRunScript">jobRunScript</a>; it has options which correspond to each dialog option above. This makes it possible to automate and orchestrate more complicated sets of background tasks.</p><p>Note however that the IDE&rsquo;s background job runner is generally designed for one-off, interactive script runs. If you are writing R code and need to run a subtask asynchronously in a background R session, we recommend using the new <a href="https://callr.r-lib.org/">callr package</a> instead.</p><h2 id="remote-launcher-jobs">Remote (Launcher) jobs</h2><p>On RStudio Server Pro, you also have the option of running your R script on your company&rsquo;s compute infrastructure, using the new <a href="https://blog.rstudio.com/2018/11/05/rstudio-rsp-1.2-features/">Job Launcher</a>. To do this, select:</p><p><img src="source-as-launcher-job.png" alt="" title="Source script as launcher job"></p><p>When launching a job, you&rsquo;ll have the opportunity to specify how you want to run it, depending of course on the configuration the compute infrastructure exposes to RStudio Server. This can include settings like resource constraints as well as configuration parameters like which Docker image to use.</p><p><img src="launcher-job-options.png" alt="" title="Launcher job option dialog"></p><h3 id="monitoring-launcher-jobs">Monitoring launcher jobs</h3><p>Unlike local jobs, launcher jobs are <strong>independent from the R session</strong>. You can safely quit your R session without affecting any launcher jobs you may have started from it. 
Once you have started a job, you can see its status in the <em>Launcher</em> tab, which shows all your jobs (not just those launched from the current session).</p><p><img src="ide-launcher-tab.png" alt="" title="IDE tab showing status of launcher jobs"></p><p>You can also monitor the status and progress of your launcher jobs on your RStudio dashboard:</p><p><img src="rsp-jobs-dashboard.png" alt="" title="RStudio Server Pro dashboard showing executing jobs"></p><h2 id="showing-task-progress">Showing task progress</h2><p>RStudio&rsquo;s new Jobs pane can show more than just the progress of background jobs. It can also be scripted from R packages (and R code) to show status, progress, and output for any long-running task.</p><p>If you&rsquo;d like to show progress and/or output from a task using the jobs UI, refer to the <a href="https://www.rdocumentation.org/packages/rstudioapi/versions/0.9.0">rstudioapi documentation</a> for details; start with <code>addJob</code>, which creates a new job in the UI and returns a handle you can use to update the UI as the job progresses.</p><h2 id="wrap-up">Wrap up</h2><p>We hope RStudio&rsquo;s new Jobs functionality helps streamline your workflow and get the most out of your hardware, especially if you often work with R scripts that take time to execute. 
Try out the new functionality in the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.2 Preview Release</a> (stable release coming very soon), and let us know what you think on the <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a>!</p></description></item><item><title>RStudio Package Manager 1.0.6 - README</title><link>https://www.rstudio.com/blog/rstudio-package-manager-1-0-6-readme/</link><pubDate>Wed, 13 Mar 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-package-manager-1-0-6-readme/</guid><description><p>The 1.0.6 release of RStudio Package Manager helps R users understand packages. The primary feature in this release is embedded package READMEs, detailed below. If you&rsquo;re new to Package Manager, it is an on-premise product built to give teams and organizations reliable and consistent package management. Download an <a href="https://rstudio.com/products/package-manager">evaluation today</a>.</p><figure><img src="https://www.rstudio.com/blog-images/rspm-106-readmes.png" alt="View package READMEs in Package Manager"/> <figcaption><p>View package READMEs in Package Manager</p></figcaption></figure><h2 id="package-readmes">Package READMEs</h2><p>Many R packages have rich README files that can include:</p><ul><li>An introduction to the package</li><li>Examples for key functions</li><li>Badges to indicate download counts, build status, code coverage, and other metrics</li><li>Other helpful information, like the package&rsquo;s hex sticker!</li></ul><p>This information can help a new user when they are first introduced to a package, or help an experienced user or admin gauge package quality. Package READMEs distill and supplement the rich information available in vignettes, Description files, and help files.</p><p>Starting in version 1.0.6, READMEs are automatically shown alongside the traditional package metadata.
For CRAN packages, Package Manager will automatically show a README for the 12,000 CRAN packages that have them. READMEs are also displayed for internal packages <a href="https://docs.rstudio.com/rspm/1.0.6/admin/repositories.html#git-sources">sourced from Git</a> or <a href="https://docs.rstudio.com/rspm/1.0.6/admin/quickstarts.html#quickstart-local">local files</a>. These READMEs provide an easy way for package authors to document their code for colleagues, publicize new releases and features, and disseminate knowledge to team members.</p><h2 id="deprecations-breaking-changes-and-security-updates">Deprecations, Breaking Changes, and Security Updates</h2><ul><li>Version 1.0.6 includes a number of updates to Package Manager&rsquo;s built-in CRAN source. Customers using an internet-connected server <em>do not need to take any action</em>. Updates will be applied during the next CRAN sync. <em>Offline, air-gapped customers should follow <a href="https://docs.rstudio.com/rspm/1.0.6/admin/air-gapped.html#air-gapped-upgrade">these instructions</a> to re-fetch the CRAN data immediately after upgrading, and then run the <code>rspm sync</code> command.</em></li></ul><p>Please consult the full <a href="https://docs.rstudio.com/rspm/news/">release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>Please note the breaking changes and deprecations above. Upgrading to 1.0.6 from 1.0.4 will take less than five minutes. There will be a five-to-ten minute delay in the next CRAN sync following the upgrade. If you are upgrading from an earlier version, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><p>Don&rsquo;t see that perfect feature?
Wondering why you should be worried about package management? <a href="mailto:sales@rstudio.com">Email us</a>; our product team is happy to help!</p><ul><li><a href="https://docs.rstudio.com/rspm/admin">Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2018/07/RStudio-Package-Manager-Overview.pdf">Overview PDF</a></li><li><a href="https://resources.rstudio.com/webinars/introduction-to-the-rstudio-package-manager-sean-lopp">Introductory Webinar</a></li><li><a href="https://demo.rstudiopm.com">Online Demo</a></li></ul></description></item><item><title>Building tidy tools workshop</title><link>https://www.rstudio.com/blog/building-tidy-tools-workshop/</link><pubDate>Fri, 08 Mar 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/building-tidy-tools-workshop/</guid><description><p>Join RStudio Chief Data Scientist Hadley Wickham for his popular &ldquo;Building tidy tools&rdquo; workshop in Sydney, Australia! If you missed the sold-out course at rstudio::conf 2019, now is your chance.</p><p>Register here: <a href="https://www.rstudio.com/workshops/building-tidy-tools/">https://www.rstudio.com/workshops/building-tidy-tools/</a></p><p>You should take this class if you have some experience programming in R and you want to learn how to tackle larger-scale problems. You&rsquo;ll get the most out of it if you&rsquo;re already familiar with the basics of functions (i.e., you&rsquo;ve written a few) and are comfortable with R&rsquo;s basic data structures (vectors, matrices, arrays, lists, and data frames). There is approximately a 30% overlap in the material with Hadley&rsquo;s previous &ldquo;R Masterclass&rdquo;. However, the material has been substantially reorganised, so if you&rsquo;ve taken the R Masterclass in the past, you&rsquo;ll still learn a lot in this class.</p><h3 id="what-will-i-learn">What will I learn?</h3><p>This course has three primary goals.
You will:</p><ul><li>Learn efficient workflows for developing high-quality R functions, using the set of codified conventions of good package design. You&rsquo;ll also learn workflows for unit testing, which helps ensure that your functions do exactly what you think they do.</li><li>Master the art of writing functions that do one thing well and can be fluently combined to solve more complex problems. We&rsquo;ll cover common function-writing pitfalls and how to avoid them.</li><li>Learn best practices for API design, programming tools, object design in S3, and the tidy eval system for NSE.</li></ul><p><strong>When</strong> - WEDNESDAY, MAY 1, 2019, 8:00AM - THURSDAY, MAY 2, 2019, 5:00PM</p><p><strong>Where</strong> - The Westin Sydney, 1 Martin Pl, Sydney NSW 2000, Australia</p><p><strong>Who</strong> - Hadley Wickham, Chief Scientist at RStudio</p><p>Build your skills and learn from the best at this rare in-person workshop - the only Australian workshop from Hadley in 2019.</p><p>Register here: <a href="https://www.rstudio.com/workshops/building-tidy-tools/">https://www.rstudio.com/workshops/building-tidy-tools/</a></p><p>Discounts are available for 5 or more attendees from any organization, and for students.</p><p>Please email <a href="mailto:training@rstudio.com">training@rstudio.com</a> if you have any questions about the workshop that aren&rsquo;t answered on the registration page.</p></description></item><item><title>RStudio Instructor Training</title><link>https://www.rstudio.com/blog/2019-02-28-instructor-training/</link><pubDate>Thu, 28 Feb 2019 05:20:59 +0000</pubDate><guid>https://www.rstudio.com/blog/2019-02-28-instructor-training/</guid><description><p>We are pleased to announce the launch of RStudio&rsquo;s instructor training and certification program.
Its goal is to help people apply modern evidence-based teaching practices to teach data science using R and RStudio&rsquo;s products, and to help people who need such training find the trainers they need.</p><p>Like the training programs for flight instructors, the ski patrol, and <a href="https://carpentries.org/">the Carpentries</a>, ours distinguishes between instructors (who teach end users) and coaches (who teach instructors). This program focuses on instructors; anyone in the R community who wishes to become one is encouraged to apply by filling in <a href="https://goo.gl/forms/wQ6p8kqOnHxwi8152">this form</a>. Candidates must be proficient in the technical skills and tools they wish to teach; they can demonstrate this when applying by pointing us at materials they have previously developed and/or sharing a short video with us (such as a screencast or a recording of a conference talk).</p><p>There are three steps to becoming certified:</p><ol><li><p>Candidates must take part in a one-day training course on modern teaching methods similar to that offered at <a href="https://resources.rstudio.com/rstudio-conf-2019">rstudio::conf 2019</a>. (We will offer this course several times in the coming year.)</p></li><li><p>After completing that course, candidates must complete a one-hour exam on the material, then prepare and deliver a demonstration lesson on a topic assigned to them.</p></li><li><p>Finally, in order to ensure that instructors are proficient with the technical content they will be teaching, they must complete a practical examination and deliver a demonstration for each subject they wish to be certified to teach.</p></li></ol><p>Instructors must certify on a per-topic basis, just as pilots obtain ratings for different kinds of aircraft. We will initially offer certification on data analysis using the tidyverse and on Shiny. 
Other subjects, such as administering our professional products and using Connect, will be rolled out before the end of 2019.</p><p>As at present, we will advertise instructors on our web site, and instructors will be eligible for free teaching licenses to RStudio professional on-premises and cloud products for use in their training. They can teach in whatever format they want, from half-day workshops to multi-week courses, and charge whatever the market will bear, but will be required to make their materials available to RStudio for inspection (but not for use) if asked to do so. (And yes, there will be stickers&hellip;)</p><p>The training course will cost $500, and each examination will cost an additional $500; anyone who does not pass an exam can re-try once at a later date for an additional $500. Once certified, instructors will be required to pay $50/year in membership dues, and to re-qualify every three years. Exemptions will be provided on a case-by-case basis for those who might otherwise not be able to take part.</p><p>Finally, we will make the training materials we have developed available for instructors to use under Creative Commons licenses, and we will encourage them to collaborate on modifying and extending these materials and the curricula they develop themselves. Different instructors may wish to teach topics in a variety of ways to meet the needs of different audiences, and may choose to keep their materials private, but as with open source software, we believe that pooling effort will make everyone more effective.</p><p>If you would like to know more about this program, or would like to arrange training for staff at your company or institution, please contact <a href="mailto:certification@rstudio.com">certification@rstudio.com</a> or fill in <a href="https://goo.gl/forms/wQ6p8kqOnHxwi8152">this form</a>. 
If you are already listed on our trainers&rsquo; page, please keep an eye on your inbox for news as well.</p></description></item><item><title>Try out RStudio Connect on Your Desktop for Free</title><link>https://www.rstudio.com/blog/try-out-rstudio-connect-on-your-desktop/</link><pubDate>Wed, 13 Feb 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/try-out-rstudio-connect-on-your-desktop/</guid><description><p>Have you heard of RStudio Connect, but do not know where to start? Maybe you are trying to show your manager how Shiny applications can be deployed in production, or convince a DevOps engineer that R can fit into her existing tooling. Perhaps you want to explore the functionality of RStudio&rsquo;s Professional products to see if they fit the needs you have in your work.</p><p>Today, we are excited to announce the RStudio QuickStart, which allows you to try out RStudio Connect for free from your desktop.</p><p><img src="https://www.rstudio.com/blog-images/qs-home.png" alt="The RStudio QuickStart home page"></p><p>In many organizations, we find that R is already being used internally by individual data scientists and analysts for productive work on their desktops. For other organizations, R has been chosen as a standard for analytics and data science, but a process of exploration is necessary to understand what adoption of open source software looks like in the enterprise.</p><p>In all cases, we recommend users showcase our full server-side products to show how R can be useful across a team or department, rather than on a single user&rsquo;s machine. Furthermore, this process integrates R into an organization&rsquo;s IT practices and secures it as an analytic standard.
The QuickStart is a great first step towards this objective because it is free, easy, and comes pre-populated with some of the most common workflows that we see in enterprise use of our software.</p><p>To get started with your 45-day free evaluation, visit the <a href="https://www.rstudio.com/products/quickstart/">RStudio QuickStart</a> page for instructions. You should download and run the QuickStart and take the product tour. Discuss on <a href="https://community.rstudio.com/tags/c/r-admin/quickstart">RStudio Community</a> or <a href="mailto:sales@rstudio.com">contact our sales team</a> throughout your exploration if you have difficulties or questions about next steps!</p><ul><li>Visit the <a href="https://www.rstudio.com/products/quickstart/">QuickStart page</a></li><li>Download and run the QuickStart, then take the product tour</li><li><a href="mailto:sales@rstudio.com">Contact us</a> to help you set up RStudio Professional software on your production infrastructure</li></ul><blockquote><p>RStudio provides free and open source tools for data science and enterprise-ready professional software for teams to develop and share their work at scale. Now you can try out RStudio professional software on your desktop for free!</p></blockquote></description></item><item><title>rstudio::conf 2019 Workshop materials now available</title><link>https://www.rstudio.com/blog/rstudio-conf-2019-workshops/</link><pubDate>Wed, 06 Feb 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2019-workshops/</guid><description><p>rstudio::conf 2019 featured 15 workshops on tidyverse, Shiny, R Markdown, modeling and machine learning, deep learning, big data, and what they forgot to teach you about working with R. Some of the new workshops for this year touched on topics like putting Shiny applications into production at scale and R &amp; TensorFlow.
The conference also featured certification workshops on RStudio Professional Administrator and Train-the-trainer for tidyverse and Shiny.</p><p>Below is a list of all workshops we hosted, with links to materials. Even though the materials alone cannot replace the actual workshop experience, we hope that you&rsquo;ll find them useful. RStudio regularly hosts workshops throughout the year so please subscribe to <a href="https://www.rstudio.com/about/subscription-management/">training updates</a>. You can also find out more about each of the workshops at the <a href="https://github.com/rstudio/rstudio-conf/blob/master/2019/workshops.md">conference repository</a>.</p><table><thead><tr><th>Workshop</th><th>Instructor(s)</th></tr></thead><tbody><tr><td><a href="https://github.com/AmeliaMN/data-science-in-tidyverse">Introduction to Data Science in the Tidyverse</a></td><td>Amelia McNamara, Hadley Wickham</td></tr><tr><td><a href="https://rstd.io/tidytools19">Building Tidy Tools</a></td><td>Charlotte Wickham, Hadley Wickham</td></tr><tr><td><a href="https://rstd.io/wtf-2019-rsc">What They Forgot to Teach You About R</a></td><td>Jenny Bryan, Jim Hester</td></tr><tr><td><a href="https://github.com/dtkaplan/shinymark">Intro to Shiny and RMarkdown</a></td><td>Danny Kaplan</td></tr><tr><td><a href="https://arm.rbind.io/">Advanced R Markdown</a></td><td>Alison Hill, Yihui Xie</td></tr><tr><td><a href="https://github.com/aimeegott/RStudio-Conf-Intermediate-Shiny">Intermediate Shiny</a></td><td>Aimee Gott, Winston Chang</td></tr><tr><td><a href="https://github.com/kellobri/spc-app">Using Shiny in Production</a></td><td>Kelly O&rsquo;Briant, Sean Lopp</td></tr><tr><td><a href="https://github.com/topepo/rstudio-conf-2019">Applied Machine Learning</a></td><td>Max Kuhn, Alex Hayes, Davis Vaughan</td></tr><tr><td><a href="https://github.com/rstudio/conf_tensorflow_training_day2">Introduction to Deep Learning + Beyond the Basics</a></td><td>Sigrid Keydana, Kevin Kuo, Rick
Scavetta</td></tr><tr><td><a href="https://github.com/rstudio/bigdataclass">Big Data with R</a></td><td>Edgar Ruiz, James Blair</td></tr><tr><td><a href="https://github.com/rstudio-education/teaching-workshop-2019-01">Train-the-Trainer Certification Workshop</a></td><td>Greg Wilson</td></tr><tr><td><a href="http://teach-shiny.rbind.io">Shiny Train-the-Trainer Certification Workshop</a></td><td>Mine Çetinkaya-Rundel</td></tr><tr><td><a href="https://github.com/rstudio-education/teach-tidy">Tidyverse Train-the-Trainer Certification Workshop</a></td><td>Garrett Grolemund</td></tr><tr><td><a href="https://colorado.rstudio.com/rsc/pro-admin-training/overview/Overview.html">RStudio Professional Administrator Certification Workshop</a></td><td>Andrie de Vries</td></tr></tbody></table></description></item><item><title>Time Travel with RStudio Package Manager 1.0.4</title><link>https://www.rstudio.com/blog/time-travel-with-rstudio-package-manager-1-0-4/</link><pubDate>Wed, 30 Jan 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/time-travel-with-rstudio-package-manager-1-0-4/</guid><description><p>We all love packages. We don&rsquo;t love when broken package environments prevent us from reproducing our work. In version 1.0.4 of RStudio Package Manager, individuals and teams can navigate through repository checkpoints, making it easy to recreate environments and reproduce work. The new release also adds important security updates, improvements for Git sources, further access to retired packages, and beta support for Amazon S3 storage.</p><h4 id="new-to-rstudio-package-manager">New to RStudio Package Manager?</h4><p><a href="https://rstudio.com/products/package-manager/">Download</a> the 45-day evaluation today to see how RStudio Package Manager can help you, your team, and your entire organization access and organize R packages.
Learn more with our <a href="https://demo.rstudiopm.com">online demo server</a> or <a href="https://resources.rstudio.com/webinars/introduction-to-the-rstudio-package-manager-sean-lopp">latest webinar</a>.</p><figure><img src="https://www.rstudio.com/blog-images/rspm-104-calendar.png" alt="Easily navigate historical repositories"/> <figcaption><p>Easily navigate historical repositories</p></figcaption></figure><h2 id="updates">Updates</h2><h3 id="time-travel">Time Travel</h3><p>RStudio Package Manager automatically tracks every change to your repositories, whether you&rsquo;re adding new packages to a curated source, syncing the latest data from CRAN, or building a new commit of your internal package. These changes are efficiently stored as checkpoints. By default, R users installing packages will get the latest and greatest, but they can also install packages from any point in the past.</p><p>A <a href="https://demo.rstudiopm.com/client/#/repos/3/overview">new calendar</a> on a repository&rsquo;s setup page can be used to travel backwards in time. If you last used a project in November, you can install packages as they existed <strong>in your repository</strong> from that moment, making it much easier to guarantee your work is reproducible.</p><figure><img src="https://www.rstudio.com/blog-images/rspm-104-calendar.gif" alt="Time travel with a repository calendar"/> <figcaption><p>Time travel with a repository calendar</p></figcaption></figure><p>Alternatively, it is also possible to preemptively pin a project to a frozen checkpoint. This can be really useful in cases where you know you&rsquo;ll always want the same set of packages.
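In practice, pinning is just a matter of pointing `install.packages()` at the checkpointed repository URL. A minimal sketch, assuming a hypothetical Package Manager host (`rpkgs.example.com`) and checkpoint ID (`128`), following the URL pattern used later in this post:

```r
# Point R at a frozen checkpoint of a Package Manager repository.
# "rpkgs.example.com" and checkpoint "128" are placeholder values; use the
# URL shown in your repository's calendar view.
options(repos = c(CRAN = "https://rpkgs.example.com/cran/128"))

# Installs now resolve against the checkpoint, not the moving "latest".
install.packages("dplyr")
```

Because the checkpoint is encoded in the repository URL, anything that honors the `repos` option (interactive sessions, CI jobs, Docker builds) resolves the same package versions.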
For example, you can include a reference to a checkpoint inside of a Dockerfile to ensure that anytime the Docker image is rebuilt, you&rsquo;ll get the same packages and versions.</p><pre><code>RUN Rscript -e 'install.packages(..., repos = &quot;https://rpkgs.example.com/cran/128&quot;)'</code></pre><h3 id="new-storage-options">New Storage Options</h3><p>Version 1.0.4 adds beta support for storing packages on Amazon S3 instead of local or shared storage. In addition, we&rsquo;ve expanded the <a href="https://docs.rstudio.com/rspm/admin/appendix-configuration.html#appendix-configuration-filestorage">configuration options</a> for administrators to control exactly where and how Package Manager stores packages, data, and metrics.</p><h3 id="retired-packages">Retired Packages</h3><p>You may be familiar with archived packages - they are older versions of packages that are listed at the bottom of a package&rsquo;s information page.</p><figure><img src="https://www.rstudio.com/blog-images/rspm-104-archive.png" alt="Access archived versions"/> <figcaption><p>Access archived versions</p></figcaption></figure><p>Did you know that CRAN packages can be retired? &ldquo;Retirement&rdquo; occurs when every version of a package is placed in the archive and no version remains current. Packages can be retired for a variety of reasons: perhaps the maintainer is no longer fixing breaking changes, or the functionality has been replaced by a new package. While retired packages are typically not used by new projects, it can be useful to see if a package you&rsquo;re searching for is retired. Library management tools like <code>packrat</code> also make use of retired packages to recreate older environments.
In 1.0.4, retired packages show up in a repository with a special page indicating their status.</p><figure><img src="https://www.rstudio.com/blog-images/rspm-104-retire.png" alt="View retired packages"/> <figcaption><p>View retired packages</p></figcaption></figure><h3 id="git-source-improvements">Git Source Improvements</h3><p>RStudio Package Manager makes it easy to share R packages that live inside of Git, either internal packages or packages from GitHub. This release includes a number of quality-of-life improvements:</p><ul><li><p><strong>Subdirectories</strong> In version 1.0.4, we&rsquo;ve added support to build packages that live in sub-directories of a Git repository.</p></li><li><p><strong>SSH keys</strong> We&rsquo;ve added support for SSH keys that use a passphrase, and we&rsquo;ve significantly improved how SSH keys are used to access Git repos.</p></li><li><p><strong>Description Files</strong> Packages built from Git now have the commit SHA included in their DESCRIPTION file for reference.</p></li></ul><h2 id="deprecations-breaking-changes-and-security-updates">Deprecations, Breaking Changes, and Security Updates</h2><ul><li><p><strong>Breaking Change</strong> Version 1.0.4 introduces an important security enhancement that helps isolate package builds from the rest of Package Manager. If you are using packages from Git <strong>and</strong> running on RedHat/CentOS or inside of a Docker container, you may need to update your configuration. Follow <a href="https://docs.rstudio.com/rspm/1.0.4/admin/process-management.html">these instructions</a> for more information.</p></li><li><p>The use of SSH keys for accessing Git repositories has been improved by adding support for passphrases and isolated SSH agents.</p></li></ul><p>Please review the <a href="https://docs.rstudio.com/rspm/news">full release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>Please note the breaking changes and deprecations above.
Upgrading to 1.0.4 from 1.0.0 will take less than five minutes. If you are upgrading from an earlier beta version, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><p>Don&rsquo;t see that perfect feature? Wondering why you should be worried about package management? Want to talk about other package management strategies? <a href="mailto:sales@rstudio.com">Email us</a>, our product team is happy to help!</p><ul><li><a href="https://docs.rstudio.com/rspm/admin">Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2018/07/RStudio-Package-Manager-Overview.pdf">Overview PDF</a></li><li><a href="https://resources.rstudio.com/webinars/introduction-to-the-rstudio-package-manager-sean-lopp">Introductory Webinar</a></li><li><a href="https://demo.rstudiopm.com">Online Demo</a></li></ul></description></item><item><title>Summer Internships 2019</title><link>https://www.rstudio.com/blog/2019-01-18-summer-internships-2019/</link><pubDate>Fri, 18 Jan 2019 15:20:59 +0000</pubDate><guid>https://www.rstudio.com/blog/2019-01-18-summer-internships-2019/</guid><description><p>We are excited to announce the second formal summer internship program at RStudio. The goal of this program is to enable RStudio employees to collaborate with students to do work that will help both RStudio users and the broader R community, and help ensure that the community of R developers is as diverse as its community of users. Over the course of the internship, you will work with experienced data scientists, software developers, and educators to create and share new tools and ideas.</p><p>The internship pays approximately $12,000 USD (paid hourly), lasts up to 10-12 weeks, and will start around June 1, depending on your availability. Applications are open now and close at the end of February.
To qualify, you must currently be a student (broadly construed - if you think you&rsquo;re a student, you probably qualify) and have some experience writing code in R and using Git and GitHub. To demonstrate these skills, your application needs to include a link to a package, Shiny app, or data analysis repository on GitHub. It&rsquo;s OK if you create something specifically for this application: we just want to know that you&rsquo;re already familiar with the mechanics of collaborative development in R.</p><p>RStudio is a geographically distributed team which means you can be based anywhere in the United States (we hope to expand the program to support interns in other countries next year). This means that unless you are based in Boston or Seattle, you will be working 100% remotely, though you will meet with your mentor regularly online, and we will pay for you to travel to one face-to-face work sprint with them.</p><p>We are recruiting interns for the following projects:</p><p><strong>Calibrated Peer Review</strong> - Prototype some tools to conduct experiments to see whether calibrated peer review is a useful and feasible feedback strategy in introductory data science classes and industry workshops. (Mine Çetinkaya-Rundel)</p><p><strong>Tidy Blocks</strong> - Prototype and evaluate a block-based version of the tidyverse so that young students can do simple analysis using an interface like Scratch. (Greg Wilson)</p><p><strong>Data Science Training for Software Engineers</strong> - Develop course materials to teach basic data analysis to programmers using software engineering problems and data sets. (Greg Wilson)</p><p><strong>Tidy Practice</strong> - Develop practice projects for learners to tackle to practice tidyverse (or other) skills using interesting real-world data. 
(Alison Hill)</p><p><strong>Teaching and Learning with RStudio</strong> - Create a one-stop guide to teaching with RStudio similar to Teaching and Learning with Jupyter (<a href="https://jupyter4edu.github.io/jupyter-edu-book/">https://jupyter4edu.github.io/jupyter-edu-book/</a>) (Alison Hill)</p><p><strong>Grader Enhancements</strong> - <a href="https://github.com/rstudio-education/grader">grader</a> works with <a href="https://github.com/rstudio/learnr">learnr tutorials</a> to grade student code. This internship will enhance this ambitious project so that grader can identify students&rsquo; exact mistakes and help students do better. (Garrett Grolemund)</p><p><strong>Object Scrubbers</strong> - A lot of R objects contain elements that could be recreated, and these can result in large object sizes for large data sets. Also, terms, formulas, and other objects can carry the entire global environment with them when they are saved. This internship would help write a set of methods that would scrub different types of objects to reduce their size on disk. (Max Kuhn and Davis Vaughan)</p><p><strong>Production Testing Tools for Data Science Pipelines</strong> - This project will build on “applicability domain” methods from computational chemistry to create functions that can be included in a dplyr pipeline to perform statistical checks on data in production. (Max Kuhn)</p><p><strong>Shiny Enhancements</strong> - There are several Shiny and Shiny-related projects available, depending on the intern&rsquo;s interests and skill set. Possible topics include: Shiny UI enhancements, improving performance bottlenecks by rewriting in C and C++, fixing bugs, and creating a set of higher-order reactives for more sophisticated reactive programming. (Barret Schloerke)</p><p><strong>ggplot2 Enhancements</strong> - Contribute to ggplot2 or an associated package (like scales).
You&rsquo;ll write R code for graphics, but mostly you&rsquo;ll learn the challenges of managing a large, popular open source project, including the care needed to avoid breaking changes and the active gardening of issues. Your work will impact the millions of people who use ggplot2. (Hadley Wickham)</p><p><strong>R Markdown Enhancements</strong> - R Markdown is a cornerstone product of RStudio used by millions to create documents in their own publishing pipelines. The code base has grown organically over several years; the goal of this project is to refactor it. This involves tidying up inconsistencies in formatting, adding a comprehensive test suite, and improving consistency and coverage of documentation. (Rich Iannone)</p><p><strong><a href="https://docs.google.com/forms/d/e/1FAIpQLScUuZDuyfYS_MQ94qGC_3zKt5ykwFc9tytc5Ssu9Mlkg5CiMA/viewform?usp=sf_link">Apply now!</a></strong> Application deadline is February 22nd.</p><p>RStudio is committed to being a diverse and inclusive workplace. We encourage applicants of different backgrounds, cultures, genders, experiences, abilities and perspectives to apply. All qualified applicants will receive equal consideration without regard to race, color, national origin, religion, sexual orientation, gender, gender identity, age, or physical disability. However, applicants must legally be able to work in the United States.</p></description></item><item><title>RStudio Connect 1.7.0</title><link>https://www.rstudio.com/blog/announcing-rstudio-connect-1-7-0/</link><pubDate>Thu, 17 Jan 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rstudio-connect-1-7-0/</guid><description><p>RStudio Connect is the publishing platform for everything you create in R. In conversations with our customers, R users were excited to have a central place to share all their data products, but were facing a tough problem. Their colleagues working in Python didn&rsquo;t have the same option, leaving their work stranded on their desktops.
Today, we are excited to introduce the ability for data science teams to publish Jupyter Notebooks and mixed Python and R content to RStudio Connect.</p><p>Connect 1.7.0 is a major release that also includes many other significant improvements, such as programmatic deployment and historical event data. We encourage existing customers to <a href="https://rstudio.com/products/connect/download-commercial">upgrade</a>.</p><p>We also welcome anyone who has not yet experienced RStudio Connect to <a href="https://rstudio.com/products/connect/evaluation">try it today</a>!</p><figure><img src="https://www.rstudio.com/blog-images/rsc-170-jupyter.png" alt="Publishing to Connect from Jupyter"/> <figcaption><p>Publishing to Connect from Jupyter</p></figcaption></figure><h2 id="updates">Updates</h2><h3 id="python--r-reticulated--apps-apis-and-reports">Python &amp; R: Reticulated Apps, APIs, and Reports</h3><p>RStudio has often provided support for other languages frequently used with R. Earlier this year, we announced the <a href="https://rstudio.github.io/reticulate/">reticulate R package</a> and <a href="https://blog.rstudio.com/2018/10/09/rstudio-1-2-preview-reticulated-python">IDE support</a> for creating projects that use R and Python. Now, these projects are fully supported in RStudio Connect, as well.
Whether you’re creating a reticulated Shiny app, a Plumber API that calls Python, or an R Markdown document that mixes Python and R, RStudio Connect will automatically re-create both the R and Python environments!</p><figure><img src="https://www.rstudio.com/blog-images/rsc-170-api.png" alt="Reticulated API with Python Environment Logs"/> <figcaption><p>Reticulated API with Python Environment Logs</p></figcaption></figure><p>Note: Server administrators need to add a <a href="https://docs.rstudio.com/connect/1.6.11/admin/python.html">Python configuration</a> for RStudio Connect.</p><h3 id="jupyter-notebooks">Jupyter Notebooks</h3><p>Data science teams in the enterprise can include people who use RStudio, Jupyter Notebooks, or both. Now RStudio, Jupyter, and JupyterHub users can publish and share the data products they create every day in one convenient place. Jupyter Notebooks can be published to RStudio Connect using a <a href="https://docs.rstudio.com/rsconnect-jupyter">Jupyter extension</a>. Notebooks are published as static HTML files, or the notebook source can be published. When the source is included, RStudio Connect automatically restores the Python environment and the Jupyter Notebooks can be re-executed, emailed, and scheduled.</p><h3 id="programmatic-deployment">Programmatic Deployment</h3><p>As data products become critical to organizations, RStudio Connect users have requested more flexible deployment options. Enterprise workflows sometimes require approvals to publish to production environments. For example, content stored in Git may be published to separate QA or Production environments based on IT approvals.</p><p>In RStudio Connect 1.7.0, we’ve added support for <a href="https://docs.rstudio.com/connect/1.7.0/user/cookbook.html#cookbook-deploying">programmatic deployment</a> in the <a href="https://docs.rstudio.com/connect/1.7.0/api/#content">RStudio Connect Server API</a>. These new APIs let your deployment engineers craft custom deployment workflows.
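For a flavor of what such a workflow looks like from R, here is a sketch of an authenticated call against the Server API. The server address is a placeholder, the API key is read from an environment variable, and the exact routes for your Connect version are documented in the API reference linked above:

```r
# Sketch: list content items on a Connect server over its HTTP API.
# "connect.example.com" is a placeholder host, and CONNECT_API_KEY must
# hold a valid API key; check the Server API reference for exact routes.
library(httr)

server <- "https://connect.example.com"
resp <- GET(
  paste0(server, "/__api__/v1/content"),
  add_headers(Authorization = paste("Key", Sys.getenv("CONNECT_API_KEY")))
)
stop_for_status(resp)
str(content(resp))
```

A deployment script chains calls like this together: look up or create the content item, upload a new bundle, then trigger a deploy.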
We have created <a href="https://github.com/rstudio/connect-api-deploy-shiny">example scripts</a> showing how to use the content APIs to deploy a Shiny application.</p><figure><img src="https://www.rstudio.com/blog-images/rsc-170-handoff.png" alt="Architecture Diagrams for Deployment Strategies"/> <figcaption><p>Architecture Diagrams for Deployment Strategies</p></figcaption></figure><h3 id="historical-event-data">Historical Event Data</h3><p>RStudio Connect now <a href="https://docs.rstudio.com/connect/1.7.0/admin/historical-information.html#historical_events">collects and surfaces data</a> you can use to answer how often your data product is being viewed, whether it needs updating, and who is using it. For reports, dashboards, plots, and notebooks, RStudio Connect records “who, what, and when” for each visit. This data is available to publishers and admins through the <a href="https://docs.rstudio.com/connect/1.7.0/api/#instrumentation">Connect Server API</a>. We’ve created a <a href="https://github.com/sol-eng/connect-usage">sample dashboard</a> you can use out of the box or as a launch pad for your own analysis!</p><figure><img src="https://www.rstudio.com/blog-images/rsc-170-dash.png" alt="Dashboard of 30-day Usage Metrics"/> <figcaption><p>Dashboard of 30-day Usage Metrics</p></figcaption></figure><h3 id="security--authentication-changes">Security &amp; Authentication Changes</h3><ul><li><p><strong>Bundle Uploads</strong> Better protection against malicious bundle uploads that previously may have written outside the content directory.</p></li><li><p><strong>Brute-force Protection</strong> Brute-force attack protection has been added for interactive authentication attempts.</p></li><li><p><strong>User Profiles</strong> Admins can now prevent users from editing their profiles for all authentication providers. Additionally, Connect better handles email values supplied by authentication providers, fixing a bug that would prevent users from logging in.
See the <a href="http://docs.rstudio.com/connect/news">release notes</a> for details.</p></li><li><p><strong>Audit Improvements</strong> All modifications to users and groups done via the <code>usermanager</code> utility are reported in the audit logs.</p></li><li><p><strong>Unique Users</strong> RStudio Connect better enforces unique users. The <code>usermanager</code> utility has multiple updates, making it easier to fix broken user accounts.</p></li><li><p><strong>Proxied Auth Improvements</strong> Installations using a custom authentication provider can provide <a href="https://docs.rstudio.com/connect/1.7.0/admin/authentication.html#authentication-proxy">complete user profiles through proxy headers</a>, removing the need for users or admins to update profiles manually or via the Connect Server API.</p></li><li><p><strong>Programmatic Group Management</strong> The <a href="https://docs.rstudio.com/connect/1.7.0/api/#groups">Connect Server API</a> can be used to create and manage groups for installations using Password or OAuth2 authentication methods.</p></li></ul><h3 id="deprecations--breaking-changes">Deprecations &amp; Breaking Changes</h3><ul><li><p><strong>Breaking</strong> The deprecated <code>OAuth2.DiscoveryEndpoint</code> configuration value has been removed.</p></li><li><p><strong>Deprecation</strong> The configuration setting <code>Password.UserInfoEditableBy</code> is deprecated in favor of <code>Authorization.UserInfoEditableBy</code>. Future releases will remove the setting entirely.</p></li></ul><p>Please review the <a href="https://docs.rstudio.com/connect/news">full release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>Please note the breaking changes and deprecations above. If you are interested in adding Python support (for Jupyter Notebooks or Reticulated Python &amp; R content), please follow the instructions to configure <a href="https://docs.rstudio.com/connect/1.7.0/admin/python.html">Python for RStudio Connect</a>.
Upgrading to 1.7.0 from 1.6.10 can take upwards of ten minutes due to a data migration. If you are upgrading from an earlier version, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all your data science work (Shiny apps, R Markdown documents, Jupyter Notebooks, plots, dashboards, Plumber APIs, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>RStudio Server Pro is now available on Microsoft Azure Marketplace</title><link>https://www.rstudio.com/blog/rstudio-server-pro-is-now-available-on-microsoft-azure-marketplace/</link><pubDate>Tue, 08 Jan 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-server-pro-is-now-available-on-microsoft-azure-marketplace/</guid><description><img src="https://www.rstudio.com/blog-images/2019-01-08-rsp-azure.png" style="width: 40%; float: right"/><p>RStudio is excited to announce the availability of its flagship, enterprise-ready, integrated development environment for R in Azure Marketplace.</p><p><a
href="https://azuremarketplace.microsoft.com/en-us/marketplace/apps/rstudio-5237862.rstudioserverpro?tab=Overview">RStudio Server Pro for Azure</a> is an on-demand, commercially-licensed integrated development environment (IDE) for R on the Microsoft Azure Cloud. It offers all of the capabilities found in the popular RStudio open source IDE, plus turnkey convenience, enhanced security, the ability to manage multiple R versions and sessions, and more. It comes pre-configured with multiple versions of R, common systems libraries, and the most popular R packages.</p><p>RStudio Server Pro for Azure helps you adapt to your unique circumstances. It allows you to choose different Azure computing instances whenever a project requires it, and helps avoid the sometimes complicated processes for procuring on-premises software.</p><p>If the enhanced security, elegant support for multiple R versions and multiple sessions, and commercially licensed and supported features of RStudio Server Pro appeal to you, consider RStudio Server Pro for Azure!</p><p><a href="https://support.rstudio.com/hc/en-us/articles/360014725833-Getting-Started-with-RStudio-Server-Pro-for-Azure">Read the FAQ Getting Started with RStudio Server Pro for Azure</a></p></description></item><item><title>Announcing the 1st Shiny Contest</title><link>https://www.rstudio.com/blog/first-shiny-contest/</link><pubDate>Mon, 07 Jan 2019 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/first-shiny-contest/</guid><description><img src="https://www.rstudio.com/blog-images/2019-01-07-shiny-contest.png" style="width: 40%; float: right"/><p>Shiny apps are a great way to communicate your data science insights with striking, dynamic, interactive visualizations and reports. Over the years, we have loved interacting with the Shiny community and loved seeing and sharing all the exciting apps, dashboards, and interactive documents Shiny developers have produced.
We also love seeing Shiny developers openly sharing their code and process for building apps so that others can learn from them and improve their apps. In order to encourage more sharing, as well as to recognize the many outstanding ways people work with Shiny, we are happy to announce the first contest to recognize outstanding Shiny applications!</p><h2 id="criteria">Criteria</h2><p>Apps will be judged based on technical merit and/or on artistic achievement (e.g., UI design). We recognize that some apps may excel in one of these categories and some in the other, and some in both. Evaluation will be done keeping this in mind.</p><p>Evaluation will also take into account the feedback/reaction of other users in the submission posts in RStudio Community.</p><h2 id="requirements">Requirements</h2><p>There are only a few requirements to enter this contest:</p><ul><li>Your app should be in an <a href="http://rstudio.cloud">RStudio Cloud</a> project.</li><li>Your app should be <a href="http://shiny.rstudio.com/articles/shinyapps.html">deployed on shinyapps.io</a>.</li><li>Any data and code used in the app should be publicly available and/or openly licensed.</li></ul><p>If you’re new to <a href="http://rstudio.cloud">RStudio Cloud</a> and <a href="http://www.shinyapps.io/">shinyapps.io</a>, you can create an account for free. Additionally, you can find instructions specific to this contest <a href="https://docs.google.com/document/d/1p-5Ls2kEU9TUoUTQfBNqwEMPoAL0eNHceKRXDZ1koXc/edit?usp=sharing">here</a> and find the general RStudio Cloud guide <a href="https://rstudio.cloud/learn/guide">here</a>.</p><h2 id="need-inspiration">Need inspiration?</h2><p>Want to participate but need some inspiration to get started building an app? Take a look at the <a href="https://github.com/rfordatascience/tidytuesday">Tidy Tuesdays</a> datasets and feel free to use any of them as your starting point. 
If you’re just getting started with Shiny, the learning resources at <a href="http://shiny.rstudio.com/tutorial/">http://shiny.rstudio.com/tutorial/</a> might also come in handy.</p><p>Note that the app(s) you submit do not have to be based on data analyses. They could also be anything from an app for <a href="http://www.intro-stats.com/">teaching</a> to scaling a recipe for making <a href="https://hadley.shinyapps.io/eggnogr/">eggnog</a>.</p><p>Browsing the <a href="http://shiny.rstudio.com/gallery/">Shiny Gallery</a> and the <a href="https://www.rstudio.com/products/shiny/shiny-user-showcase/">Shiny User Showcase</a> might also provide lots of inspiration.</p><h2 id="awards">Awards</h2><h3 id="honorable-mention-prizes">Honorable Mention Prizes:</h3><ul><li>One year of shinyapps.io Basic plan</li><li>One RStudio t-shirt</li></ul><h3 id="runner-up-prizes">Runner Up Prizes:</h3><p>All awards above, plus</p><ul><li>All hex/RStudio stickers we can find</li><li>Any number of RStudio t-shirts, books, and mugs (worth up to $200)</li></ul><h3 id="grand-prizes">Grand Prizes:</h3><p><em>One in each category (Novice and Open)</em></p><p>All awards above, and</p><ul><li>Special &amp; persistent recognition by RStudio in the form of a winners page, and a badge that&rsquo;ll be publicly visible on your RStudio Community profile</li><li>Half-an-hour one-on-one with a representative from the RStudio Shiny team for Q&amp;A and feedback</li></ul><p>The names and work of all winners will be highlighted in the Shiny User Showcase and we will announce them on RStudio’s social platforms, including community.rstudio.com (unless the winner prefers not to be mentioned).</p><p>Of course, the main reward is knowing that you’ve helped future app developers!</p><p>This year’s competition will be judged by Joe Cheng and Mine Çetinkaya-Rundel.
Winners will be invited to serve as judges in future Shiny contests.</p><h2 id="submission">Submission</h2><p>To participate in this contest, please follow the link <a href="https://rstd.io/shiny-contest-2019">https://rstd.io/shiny-contest-2019</a> to create a new post in RStudio Community (you will be asked to sign up if you don’t have an account). The post title should start with “Shiny contest submission:“, followed by a short title to describe your application (e.g., “a Shiny app for mapping electric scooters in LA”). The post may describe features and highlights of the application, include screenshots and links to live examples and source repositories, and briefly explain key technical details (how the customization or extension was achieved).</p><p>In addition, the post should include</p><ul><li>link to an RStudio Cloud project with everything required to run your app (scripts, data, images, css, etc.), and</li><li>link to the deployed app on shinyapps.io.</li></ul><p>There is no limit on the number of entries one participant can submit. Please submit as many as you wish!</p><p>The deadline for the submission is March 8, 2019. You are welcome to either submit your existing Shiny apps or create one in two months! We will announce winners and their submissions in this blog, RStudio Community, and also on Twitter before March 22, 2019.</p><p>We’re looking forward to your submissions!</p></description></item><item><title>Thinking about going to rstudio::conf 2019? 
Act soon!</title><link>https://www.rstudio.com/blog/thinking-about-going-to-rstudio-conf-2019-act-soon/</link><pubDate>Wed, 28 Nov 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/thinking-about-going-to-rstudio-conf-2019-act-soon/</guid><description><p>rstudio::conf 2019, the conference for all things R and RStudio, is less than two months away and we’re looking forward to seeing everyone January 15 - 18 in Austin, TX.</p><br><p><a href="https://rstd.io/conf" button type="button" style= "padding: 12px 20px; border: none; font-size: 18px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Register for rstudio::conf 2019</a></p><br><p>Already registered?</p><br><a href="https://book.passkey.com/go/rstudio2019" button type="button" style= "padding: 12px 20px; border: none; font-size: 18px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Book your room at the Fairmont</a><br><p>If you haven’t registered for a workshop or the conference or booked your travel yet, we have some good news and some bad.</p><p>The bad news (because we want everyone to be able to attend) is that many workshops are sold out, and conference registrations were already at 91% of capacity as of November 28!</p><p>The good news is that several excellent workshops still have room, and we’ve just expanded the available meeting space and catering at the Fairmont to accommodate more conference registrations. Regrettably, we have no more academic seats available, but the discount for groups of five or more from a single organization is still available. 
The hotel also still has rooms under the conference block so you can make the most of your conference experience - and stay warm and dry should the weather turn chilly and wet.</p><p>So, if you want to get the amazing data science skills booster shot that is rstudio::conf, please register for the conference today. And if you haven’t picked a workshop yet, here are some workshops with at least 20 seats still available as of November 28, 2018.</p><p><strong>Using Shiny in Production</strong>: Shiny applications are being deployed in high-value, customer-facing, and/or enterprise-wide scenarios. Unfortunately, these deployments are often done without the benefit of best practices distilled by our teams who work with RStudio customers. Few know these practices as well as Sean Lopp and Kelly O’Briant. This workshop will help you and/or your IT colleagues who support your data scientists learn how to accelerate a successful Shiny application deployment in production scenarios.</p><p><strong>Advanced R Markdown</strong>: Few experienced R users are unaware of Yihui Xie or the many R packages he has authored and co-authored. This workshop, based on the new book R Markdown: The Definitive Guide by Yihui Xie, J.J. Allaire, and Garrett Grolemund, is a rare opportunity to learn and practice the things you never knew R Markdown could do, with Yihui himself and Alison Hill.</p><p><strong>Big Data with R</strong>: Edgar Ruiz returns with James Blair to deliver one of the most popular and highly rated workshops from last year. If your R work involves databases and big data clusters, this is the workshop for you.</p><p><strong>Shiny Train-the-Trainer Certification</strong>: Popular educator Mine Çetinkaya-Rundel leads this workshop for proficient Shiny users who want to build classes or internal workshops with RStudio’s beginner and intermediate curricula. 
The course will also cover how to use RStudio Cloud and its tutorials to jump-start your lessons.</p><p>Other workshops still available, but expected to sell out soon, include “What they Forgot to Teach you about R” (Instructor: Jenny Bryan); “Intermediate Shiny” (Instructor: Aimee Gott); and “Introduction to the Tidyverse” (Instructor: Amelia McNamara with Hadley Wickham)</p><p>We can’t wait to see you in Austin!</p></description></item><item><title>RStudio 1.2 Preview: The Little Things</title><link>https://www.rstudio.com/blog/rstudio-1-2-preview-the-little-things/</link><pubDate>Mon, 19 Nov 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-2-preview-the-little-things/</guid><description><p><em>Today, we’re continuing our blog series on new features in RStudio 1.2. If you’d like to try these features out for yourself, you can download a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release of RStudio 1.2</a>.</em></p><p>In this blog series thus far, we&rsquo;ve focused on the biggest new features in RStudio 1.2. Today, we&rsquo;ll take a look at some of the smaller ones.</p><h2 id="detect-missing-r-packages">Detect missing R packages</h2><p>Many R scripts open with calls to <code>library()</code> and <code>require()</code> to load the packages they need in order to execute. If you open an R script that references packages that you don&rsquo;t have installed, RStudio will now offer to install all the needed packages in a single click. 
No more typing <code>install.packages()</code> repeatedly until the errors go away!</p><img src="https://www.rstudio.com/blog-images/2018-11-19-missing-packages.png" style="width: 466px"/><p>If this isn&rsquo;t helpful in your workflow, disable it in <em>Options -&gt; Code -&gt; Diagnostics -&gt; [x] Prompt to install missing R packages</em>.</p><h2 id="create-powerpoint-presentations">Create PowerPoint presentations</h2><p>R Markdown is a great tool for making presentations &ndash; in addition to being convenient (no more copying and pasting results and graphs!), your whole presentation becomes a reproducible document. Now you can <a href="https://bookdown.org/yihui/rmarkdown/powerpoint-presentation.html">author PowerPoint presentations in R Markdown</a>, thanks to the <a href="https://pandoc.org/MANUAL.html#producing-slide-shows-with-pandoc">new PowerPoint presentation support in Pandoc 2</a>.</p><p>To make a new PowerPoint presentation, go to <em>File -&gt; New File -&gt; R Markdown -&gt; Presentation -&gt; PowerPoint</em>. 
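<p>If you prefer working from the console, here is a minimal sketch of such a document built with base R. Only the <code>output: powerpoint_presentation</code> header comes from this post; the file name and slide text are invented for illustration.</p>

```r
# Sketch: write a minimal R Markdown file targeting PowerPoint output.
# The file name and slide content are invented; the key line is the
# `output: powerpoint_presentation` YAML header described above.
rmd <- file.path(tempdir(), "slides.Rmd")
writeLines(c(
  "---",
  "title: My Presentation",
  "output: powerpoint_presentation",
  "---",
  "",
  "## First slide",
  "",
  "- Authored in R Markdown, rendered via Pandoc 2"
), rmd)

# Rendering requires the rmarkdown package and Pandoc 2:
# rmarkdown::render(rmd)
```

<p>Rendering the file produces a .pptx document alongside it, just as the Knit button does.</p>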
You can also make a PowerPoint presentation out of one of your existing presentations by opening the Knit menu&hellip;</p><img src="https://www.rstudio.com/blog-images/2018-11-19-r-markdown-powerpoint.png" style="width: 311px"/><p>&hellip; or just by including <code>output: powerpoint_presentation</code> in your presentation&rsquo;s YAML header.</p><p>When working with PowerPoint, using the <em>Knit</em> command will cause your slideshow to re-open in PowerPoint right where you left off, so it&rsquo;s easy to iterate on changes.</p><p>See our <a href="https://support.rstudio.com/hc/en-us/articles/360004672913-Rendering-PowerPoint-Presentations-with-RStudio">guide to rendering PowerPoint presentations with RStudio</a> for more details, including how to use columns, templates, and speaker notes.</p><h2 id="filter-data-with-histograms">Filter data with histograms</h2><p>The data viewer in RStudio 1.1 lets you filter numeric columns by dragging a pair of endpoints to select the range of interest.</p><img src="https://www.rstudio.com/blog-images/2018-11-19-filter-sliders.png" style="width: 237px"/><p>Many of you let us know that you wanted to be able to enter the exact values you wanted to filter by, or type just one value. We&rsquo;ve implemented this, and we&rsquo;ve also added a histogram to make it easy to see the distribution of the data in the column at a glance.</p><img src="https://www.rstudio.com/blog-images/2018-11-19-filter-histogram.png" style="width: 269px"/><p>Just brush (drag over) the section of the histogram you&rsquo;re interested in to apply a filter. You can also type your own range into the textbox, or even a single value if that&rsquo;s what you&rsquo;re interested in.</p><h2 id="show-hidden-files">Show hidden files</h2><p>RStudio&rsquo;s <em>Files</em> pane has traditionally not shown hidden files, with the exception of a handful that it knows to be useful (for example, your <code>.Rprofile</code>). 
Sometimes, however, you really want to be able to see everything &ndash; and now you can. Click <em>More</em> on the Files pane, then check <em>Show Hidden Files</em>.</p><img src="https://www.rstudio.com/blog-images/2018-11-19-show-hidden-files.png" style="width: 483px"/><h2 id="click-to-force-promises">Click to force promises</h2><p>You might have noticed that sometimes &ndash; especially while debugging &ndash; values in the Environment pane look disabled and show an expression rather than a value. For example, try typing <code>data(&quot;AirPassengers&quot;)</code>. You&rsquo;ll see this in your Environment pane:</p><img src="https://www.rstudio.com/blog-images/2018-11-19-force-promises.png" style="width: 246px"/><p>These values are called &ldquo;promises&rdquo; and represent function arguments or other unevaluated expressions (read more about promises in the <a href="http://adv-r.had.co.nz/Computing-on-the-language.html">non-standard evaluation chapter of Advanced R</a>). (Why doesn&rsquo;t RStudio show the value right away? The Environment pane tries hard to avoid causing side effects, so it doesn&rsquo;t evaluate unevaluated expressions.)</p><p>If you <em>do</em> want to see the value right away, you can now just click it in the Environment pane. RStudio will call <code>force()</code> on the promise for you, and you can see its value immediately.</p><h2 id="explore-list-columns">Explore list columns</h2><p>Sometimes the data in a data frame isn&rsquo;t simple; if the data frame was derived from nested data (such as JSON) some of its values may themselves be lists (read the <a href="https://jennybc.github.io/purrr-tutorial/ls13_list-columns.html">purrr tutorial on list columns</a> for more). 
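<p>If you want to build such a column yourself to experiment with, here is a quick base-R sketch; the column names and values are invented for illustration.</p>

```r
# Sketch: a data frame with a list column, similar to what nested JSON
# produces after import. Names and values are invented for illustration.
df <- data.frame(id = 1:2)
df$tags <- list(c("a", "b"), c("x", "y", "z"))

lengths(df$tags)   # each cell holds a whole vector: 2 and 3 elements
df$tags[[2]]       # the second cell is the vector c("x", "y", "z")
```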
In RStudio 1.2, the data viewer makes these columns easy to explore by rendering the list values:</p><img src="https://www.rstudio.com/blog-images/2018-11-19-list-columns.png" style="width: 493px"/><p>Click on any list value to open it in the Object Explorer.</p><h2 id="search-in-connections">Search in Connections</h2><p>If you connect to a database with a lot of tables, it can be tedious to scroll to the one you want. You can now type part of the table&rsquo;s name in the new Search box instead:</p><img src="https://www.rstudio.com/blog-images/2018-11-19-search-connections.png" style="width: 423px"/><p><em>Note that, for performance reasons, this search field doesn&rsquo;t search all the object names in your database; it only filters the ones that are already visible.</em></p><h2 id="custom-knitr-engines-in-r-notebooks">Custom knitr engines in R Notebooks</h2><p>One of the things that makes knitr powerful is its support for custom engines. In RStudio 1.2, we&rsquo;ve made R Notebooks more extensible by integrating support for these custom engines. Here&rsquo;s an example of an R Markdown document with a custom engine called <code>data</code>, which evaluates a chunk&rsquo;s contents as raw data (in the style of a <a href="https://www.tldp.org/LDP/abs/html/here-docs.html">bash heredoc</a> for data):</p><pre><code>---
title: Hello Text Data
---

Register the `data` engine.

```{r}
knitr::knit_engines$set(data = function(options) {
  assign(options$output.var,
         read.table(text = options$code),
         envir = knitr::knit_global())
  NULL
})
```

Use the `data` engine to evaluate this chunk, and save results to `x`.

```{data, output.var='x'}
26 A L
30 A L
18 A M
20 B H
```

Print the resultant value.

```{r}
x
```</code></pre><p>In RStudio 1.2, you can run all of this code in an R Notebook. 
The results appear right beneath the chunk:</p><img src="https://www.rstudio.com/blog-images/2018-11-19-custom-knit-engine.png" style="width: 561px"/><h2 id="improved-package-and-repo-management">Improved package and repo management</h2><p>We&rsquo;ve revamped the <em>Packages</em> section of Options. Now you can manage primary <em>and</em> secondary repositories from inside RStudio.</p><img src="https://www.rstudio.com/blog-images/2018-11-19-packages-pane.png" style="width: 595px"/><p>(Hint: this feature works great with the new <a href="https://www.rstudio.com/products/package-manager/">RStudio Package Manager</a>.)</p><p>You might also have noticed that we&rsquo;ve added web links to every package in the Packages pane. These will take you to the package&rsquo;s homepage &ndash; even if it&rsquo;s on GitHub rather than CRAN.</p><img src="https://www.rstudio.com/blog-images/2018-11-19-browse-package.png" style="width: 494px"/><h2 id="magic-source-comments">Magic !source comments</h2><p>Have you ever wanted to customize the behavior of the <em>Source</em> command in RStudio? Maybe you have an R script that needs some pre- or post- actions before <code>source()</code>, or you&rsquo;re working with a file for which <code>source()</code> doesn&rsquo;t make sense at all &ndash; for instance, when using <a href="https://github.com/jimhester/altparsers">alternate R parsers</a>.</p><p>Now you can make RStudio&rsquo;s <em>Source</em> command do whatever you like, using a magic comment like the following:</p><pre><code># !source altparsers::src</code></pre><p>This tells RStudio to use <code>altparsers::src()</code> instead of <code>source()</code> when you invoke the <em>Source</em> command. 
You can customize the behavior even more by using <code>.file</code> or <code>.code</code> in the magic command to refer to the filename and its contents, respectively.</p><h2 id="wrapup">Wrapup</h2><p>We&rsquo;re always looking for little ways to make RStudio a more comfortable environment for your day-to-day work with R, and we hope these small changes add up for you. We very much appreciate your feedback and ideas &ndash; many of the above were suggestions from the community. Download the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.2 Preview</a> to try these features out, and visit the <a href="https://community.rstudio.com/c/rstudio-ide">community forum</a> to let us know what you think!</p></description></item><item><title>Shiny 1.2.0: Plot caching</title><link>https://www.rstudio.com/blog/shiny-1-2-0/</link><pubDate>Tue, 13 Nov 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-1-2-0/</guid><description><p>We&rsquo;re pleased to announce the CRAN release of Shiny v1.2.0! This release features Plot Caching, an important new tool for improving performance and scalability in Shiny apps.</p><p>If you&rsquo;re not familiar with the term &ldquo;caching&rdquo;, it just means that when we perform a time-consuming operation, we save (cache) the results so that the next time that operation is requested, we can skip the actual operation and instantly fetch the previously cached results. Shiny&rsquo;s reactive expressions do some amount of caching for you already, and you can use more explicit techniques to cache the various operations you might do to your data (using the <a href="https://github.com/r-lib/memoise">memoise</a> package, or manually saving intermediate data frames to disk as CSV or RDS, for two examples).</p><p>Plots, being very common and (potentially) expensive-to-compute outputs, are great candidates for caching, and in theory you can use <code>renderImage</code> to accomplish this. 
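<p>To make the caching idea concrete, the &ldquo;manual&rdquo; approach mentioned above can be sketched in a few lines of base R; the helper names here are invented, and in practice the memoise package does this more robustly.</p>

```r
# Sketch of manual result caching. Helper names are invented; see the
# memoise package for a production-ready version of this idea.
cache <- new.env(parent = emptyenv())

cached <- function(key, expr) {
  if (!exists(key, envir = cache)) {
    assign(key, force(expr), envir = cache)  # compute once and store
  }
  get(key, envir = cache)                    # later calls skip the computation
}

slow_mean <- function(x) { Sys.sleep(0.1); mean(x) }

cached("m1", slow_mean(1:10))  # computed and stored (returns 5.5)
cached("m1", slow_mean(1:10))  # fetched from the cache; slow_mean not re-run
```

<p>Because <code>expr</code> is a lazily evaluated promise, the expensive computation only runs on a cache miss.</p>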
But because Shiny&rsquo;s <code>renderPlot</code> function contains a lot of complex infrastructure code, it&rsquo;s actually quite a difficult task. Despite some valiant <a href="https://stackoverflow.com/questions/24192570/caching-plots-in-r-shiny">attempts</a>, all of the examples we&rsquo;ve seen in the wild have had serious limitations we wanted to overcome.</p><p>Shiny v1.2.0 introduces a new function, <code>renderCachedPlot</code>, that makes it much easier to add plot caching to your app.</p><h3 id="using-rendercachedplot">Using <code>renderCachedPlot</code></h3><p>Let&rsquo;s take a simple, but expensive, plot output:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">output<span style="color:#666">$</span>plot <span style="color:#666">&lt;-</span> <span style="color:#06287e">renderPlot</span>({<span style="color:#06287e">ggplot</span>(diamonds, <span style="color:#06287e">aes</span>(carat, price, color <span style="color:#666">=</span> <span style="color:#666">!</span><span style="color:#666">!</span>input<span style="color:#666">$</span>color_by)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>()})</code></pre></div><p>The <code>diamonds</code> dataset has 53,940 rows. On my laptop, this takes about 1580 milliseconds (1.58 seconds). Perhaps that&rsquo;s fast enough for doing exploratory data analysis, but it&rsquo;s slower than we&rsquo;d like for a high traffic Shiny app.</p><p>We can tell Shiny to cache this plot in two steps.</p><ol><li>Change <code>renderPlot</code> to <code>renderCachedPlot</code>.</li><li>Provide a suitable <code>cacheKeyExpr</code>. This is an expression that Shiny will use to determine which invocations of <code>renderCachedPlot</code> should be considered equivalent to each other. 
(In this case, two plots with different <code>input$color_by</code> values can&rsquo;t be considered the &ldquo;same&rdquo; plot, so the <code>cacheKeyExpr</code> needs to have <code>input$color_by</code>.)</li></ol><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">output<span style="color:#666">$</span>plot <span style="color:#666">&lt;-</span> <span style="color:#06287e">renderCachedPlot</span>({<span style="color:#06287e">ggplot</span>(diamonds, <span style="color:#06287e">aes</span>(carat, price, color <span style="color:#666">=</span> <span style="color:#666">!</span><span style="color:#666">!</span>input<span style="color:#666">$</span>color_by)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>()}, cacheKeyExpr <span style="color:#666">=</span> { input<span style="color:#666">$</span>color_by })</code></pre></div><p>With these code changes, the first time a plot with a particular <code>input$color_by</code> is requested, it will take the normal amount of time. But the next time it is requested, it will be almost instant, as the previously rendered plot will be reused.</p><h3 id="benchmarking-the-results">Benchmarking the results</h3><p>To quantify the performance difference between the cached and uncached versions, I turned each into a minimal Shiny app (<a href="https://gist.github.com/jcheng5/1f09a0939ae45fd36f286a158bcb0dfb">source</a>). This app simply provides the <code>color_by</code> input using the new <code>varSelectInput</code> control, and then renders the output using either of the two code examples above. 
Then I used our (still-in-development) <a href="https://resources.rstudio.com/rstudio-conf-2018/scaling-shiny-sean-lopp">Shiny load testing tools</a> to record a test script, and &ldquo;replay&rdquo; it against both versions of the app, each running in a single R process.</p><p>I tested the <code>renderPlot</code> version of the app with 5 concurrent users, and the <code>renderCachedPlot</code> version with 25, 50, and 100 concurrent users. The difference in performance is as dramatic as we&rsquo;d expect:</p><img src="https://www.rstudio.com/blog-images/2018-11-05-shiny-1-2-0.png" width="500" alt="A chart showing that renderCachedPlot with 100 users is faster than renderPlot with 5 users"/><p>With only five concurrent users, the latency is already pretty bad with the <code>renderPlot</code> version. (Note that this isn&rsquo;t intended to represent typical performance with Shiny apps in general! We chose a particularly torturous ggplot on purpose.)</p><p>On the other hand, the <code>renderCachedPlot</code> version doesn&rsquo;t break a sweat with 50 concurrent users; and even at 100 concurrent users, the latency is still acceptable.</p><h3 id="when-to-use-plot-caching">When to use plot caching</h3><p>A Shiny app is a good candidate for plot caching if:</p><ol><li>The app has plot outputs that are time-consuming to generate,</li><li>These plots are a significant fraction of the total amount of time the app spends thinking, and</li><li>Most users are likely to request the same few plots.</li></ol><p>Since our test has a pretty expensive plot as its only output, and our load testing tools simulate <em>n</em> users all doing the same exact thing, these numbers reflect a best-case scenario for plot caching.</p><p>Shiny can store your cached plots in memory, on disk, or with another backend like <a href="https://redis.io/">Redis</a>. You also have a number of options for limiting the size of the cache. 
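<p>For example, an app can opt into a size-capped disk cache at startup. This is a sketch based on the plot-caching article&rsquo;s API; the directory path and the 50&nbsp;MB limit are arbitrary choices, not recommendations.</p>

```r
library(shiny)

# Sketch: store cached plots on disk instead of in memory, capped in size.
# The path and limit are arbitrary; see the plot caching article for details.
shinyOptions(cache = diskCache("./plot-cache", max_size = 50 * 1024^2))
```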
Be sure to read <a href="http://shiny.rstudio.com/articles/plot-caching.html">this article</a> to get the most out of this feature!</p><p>In future releases of Shiny, we hope to build on this foundation we&rsquo;ve laid to dramatically speed up other areas of Shiny apps, like reactive expressions and non-plot outputs. In the meantime, we hope you find plot caching to be a useful addition to your performance toolkit!</p><h2 id="other-changes-in-shiny-v120">Other changes in Shiny v1.2.0</h2><ul><li><p>Upgrade FontAwesome from 4.7.0 to 5.3.1 and make <code>icon</code> tags browsable, which means they will display in a web browser or RStudio viewer by default (<a href="https://github.com/rstudio/shiny/issues/2186">#2186</a>). Note that if your application or library depends on FontAwesome directly using custom CSS, you may need to make some or all of the changes recommended in <a href="https://fontawesome.com/how-to-use/on-the-web/setup/upgrading-from-version-4">Upgrade from Version 4</a>. Font Awesome icons can also now be used in static R Markdown documents.</p></li><li><p>Address <a href="https://github.com/rstudio/shiny/issues/174">#174</a>: Added <code>datesdisabled</code> and <code>daysofweekdisabled</code> as new parameters to <code>dateInput()</code>. This resolves <a href="https://github.com/rstudio/shiny/issues/174">#174</a> and exposes the underlying arguments of <a href="http://bootstrap-datepicker.readthedocs.io/en/latest/options.html#datesdisabled">Bootstrap Datepicker</a>. <code>datesdisabled</code> expects a character vector with values in <code>yyyy/mm/dd</code> format and <code>daysofweekdisabled</code> expects an integer vector with day integer ids (Sunday=0, Saturday=6). The default value for both is <code>NULL</code>, which leaves all days selectable. Thanks, @nathancday! 
<a href="https://github.com/rstudio/shiny/pull/2147">#2147</a></p></li></ul><p>See the <a href="http://shiny.rstudio.com/reference/shiny/1.2.0/upgrade.html">full changelog for Shiny v1.2.0</a> for other minor improvements and bug fixes we&rsquo;ve made in this release.</p></description></item><item><title>RStudio 1.2 Preview - New Features in RStudio Pro</title><link>https://www.rstudio.com/blog/rstudio-rsp-1.2-features/</link><pubDate>Mon, 05 Nov 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-rsp-1.2-features/</guid><description><p><em>Today, we&rsquo;re continuing our blog series on new features in RStudio 1.2. If you’d like to try these features out for yourself, you can download a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release of RStudio Pro 1.2</a>.</em></p><p>We&rsquo;ve added some great new features to RStudio Pro for v1.2, which includes not only Server Pro, but also the new and improved Pro Desktop. Let&rsquo;s get started!</p><h2 id="rstudio-server-pro">RStudio Server Pro</h2><h3 id="the-job-launcher">The Job Launcher</h3><p>Perhaps the biggest new change in v1.2 is the Job Launcher. This allows you to run RStudio sessions and ad-hoc R scripts within your already existing cluster workload managers, such as Kubernetes, allowing you to leverage your existing infrastructure instead of provisioning load balancer nodes manually. 
At release, we will support the following clusters:</p><ul><li><a href="https://kubernetes.io">Kubernetes</a></li><li><a href="https://slurm.schedmd.com">Slurm</a></li></ul><p>The following diagram shows an example of how you can use the Job Launcher with Kubernetes to painlessly scale RStudio Server Pro across potentially hundreds of nodes.</p><p><img src="https://www.rstudio.com/blog-images/2018-11-05-launcher-sessions.png" alt="Launcher Sessions"></p><p>When starting RSP sessions via the Launcher, users will still use the same home page that they are familiar with, but will have additional options for controlling the creation of their sessions within your distributed environment of choice.</p><p><img src="https://www.rstudio.com/blog-images/2018-11-05-launcher-home-page.png" alt="Launcher Homepage"></p><p>We determined that most RSP users were already using Slurm and Kubernetes, so integration with them was added first. However, the Job Launcher is an extensible system that makes it fairly simple to develop plugins to target different cluster types. We plan to develop more plugins in the future, and would love to hear from you about what we should tackle next! At present, we plan to add support for <a href="https://www.ibm.com/us-en/marketplace/hpc-workload-management">LSF</a>.</p><p>For more information on launching ad-hoc jobs, see our upcoming blog post on background jobs. For more information on using the Job Launcher with RStudio Server Pro, see the <a href="http://docs.rstudio.com/ide/server-pro/1.2.1086-1/job-launcher.html">documentation</a>.</p><h3 id="improved-r-version-management">Improved R Version Management</h3><p>We&rsquo;ve improved management of various versions of R within your environments, allowing you to:</p><ul><li>Label each version so users can have a friendly name associated with each version. 
This makes it easy to differentiate between similar versions for different environments, such as when running parallel versions of Microsoft R and Vanilla R.</li><li>Execute an arbitrary script when the version is loaded, perhaps to dynamically alter any important environment variables (such as LD_LIBRARY_PATH)</li><li>Load an arbitrary environment module, if using Environment Modules (see <a href="https://en.wikipedia.org/wiki/Environment_Modules_(software)">environment modules</a>)</li></ul><p>When specifying a label, users will see the label on the home page, as well as within the session itself when accessing the version switch menu. The following screenshots show an example where the R version 3.1.3 was given the label <code>My Special R Version</code>.</p><p><img src="https://www.rstudio.com/blog-images/2018-11-05-home-page-label.png" alt="Home Page Label"><img src="https://www.rstudio.com/blog-images/2018-11-05-session-version-select-label.png" alt="Session Version Select Label"></p><p>For a more detailed guide on configuring R Versions, see the <a href="http://docs.rstudio.com/ide/server-pro/1.2.1086-1/r-versions.html#extended-r-version-definitions">documentation</a>.</p><h3 id="configuration-reload">Configuration Reload</h3><p>We&rsquo;ve added the ability to reload some of Server Pro&rsquo;s configuration at run-time, without having to stop the server and interrupt users&rsquo; sessions. Currently, the following is supported:</p><ul><li>Reloading <code>/etc/rstudio/load-balancer</code> to add new nodes or remove existing nodes. 
Note that when removing nodes, removed nodes need to have their RStudio processes stopped by running <code>sudo rstudio-server stop</code> on that node before invoking the configuration reload.</li><li>Reloading the list of server R versions specified in <code>/etc/rstudio/r-versions</code>.</li></ul><p>In order to perform the configuration reload, simply edit the above files as desired and then send the <code>SIGHUP</code> signal to the <code>rserver</code> executable, like so:</p><pre><code>pidof rserver | sudo xargs kill -s SIGHUP</code></pre><h2 id="rstudio-pro-desktop">RStudio Pro Desktop</h2><p>With the release of RStudio v1.2 we are excited to announce the RStudio Pro Desktop, a fully licensed platform that provides enterprise users with an enhanced version of RStudio Desktop that comes with professional priority support. The Pro Desktop will be built on over time to include new capabilities and integrations with other RStudio professional products.</p><h3 id="bundled-odbc-drivers">Bundled ODBC Drivers</h3><p>Pro Desktop now adds support for installing the <a href="https://www.rstudio.com/products/drivers">RStudio Pro Drivers</a> for connecting to various ODBC data sources, such as MongoDB, Oracle, and PostgreSQL (just to name a few!).</p><p>Connecting to a database is simple - just click on the New Connection button under the Connections pane, and you&rsquo;ll be greeted with a dialog from which to select your database type.</p><p><img src="https://www.rstudio.com/blog-images/2018-11-05-odbc-connections.png" alt="Database Connections"></p><p>When connecting to a data source for the first time, you will be prompted to install the ODBC package. 
Simply click yes, and then you will be able to connect to many of the most popular databases available today!</p><p><img src="https://www.rstudio.com/blog-images/2018-11-05-odbc-install.png" alt="ODBC install"></p><p>For more information on database connectivity within RStudio Pro, see the <a href="http://db.rstudio.com">documentation</a>.</p><hr><p>If you&rsquo;re interested in giving the new RStudio Pro features a try, please <a href="https://www.rstudio.com/products/rstudio/download/preview">download the RStudio 1.2 preview</a>. For more detailed documentation on RStudio Pro features, see the <a href="http://docs.rstudio.com/ide/server-pro/1.2.1086-1">admin guide</a>.</p></description></item><item><title>RStudio IDE Custom Theme Support</title><link>https://www.rstudio.com/blog/rstudio-ide-custom-theme-support/</link><pubDate>Mon, 29 Oct 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-ide-custom-theme-support/</guid><description><p>We&rsquo;re excited to announce that RStudio v1.2 has added support for custom editor themes. Custom editor themes will allow you to adjust the background color of the editor and syntax highlighting of code in RStudio to better suit your own personal style.</p><p>New editor themes can be added to RStudio by importing a tmTheme or sharing an existing rstheme file. The tmTheme file format was first introduced for the TextMate editor, and has since become a standard theme format. The rstheme format is specific to RStudio.</p><h2 id="importing-a-custom-theme">Importing a Custom Theme</h2><p>Before you can add a theme to RStudio, you&rsquo;ll have to find a theme in the right format. This <a href="https://tmtheme-editor.herokuapp.com">online tmTheme editor</a> will allow you to create your own tmThemes or download an existing theme from a large collection of themes. 
If you are interested in writing your own theme be sure to read this <a href="https://rstudio.github.io/rstudio-extensions/rstudio-theme-creation.html">RStudio Extensions article about writing themes</a>.</p><p>Once you have a tmTheme or rstheme file for your favorite theme or themes, you can import it to RStudio. Follow the instructions below to import a theme.</p><ol><li><p>In the menu bar, open the &ldquo;Tools&rdquo; menu.</p></li><li><p>From the drop down menu, choose &ldquo;Global Options&rdquo;.<img src="https://www.rstudio.com/blog-images/2018-10-29-import-theme-steps-1-and-2.png" align="center"/></p></li><li><p>In the pane on the left hand side of the options window, click &ldquo;Appearance&rdquo;.</p></li><li><p>To import a theme, click on the &ldquo;Add&hellip;&rdquo; button.<img src="https://www.rstudio.com/blog-images/2018-10-29-import-theme-steps-3-and-4.png" align="center"/></p></li><li><p>In the file browser, navigate to the location where you&rsquo;ve saved your theme file.<img src="https://www.rstudio.com/blog-images/2018-10-29-import-theme-step-5.png" align="center"/></p></li><li><p>If prompted to install R packages, select &ldquo;Yes&rdquo;.<img src="https://www.rstudio.com/blog-images/2018-10-29-import-theme-step-6.png" align="center"/></p></li><li><p>You should now see your newly added theme in the list of editor themes. Simply click the &ldquo;Apply&rdquo; button to apply your theme to RStudio.<img src="https://www.rstudio.com/blog-images/2018-10-29-import-theme-step-7.png" align="center"/></p></li></ol><img src="https://www.rstudio.com/blog-images/2018-10-29-night-owl.png" align="center"/><p>The theme pictured in these examples is called Night Owlish, and was adapted from the Night Owl theme by RStudio&rsquo;s own Mara Averick. 
It can be found on <a href="https://github.com/batpigandme/night-owlish">her GitHub page</a>.</p><h2 id="removing-a-custom-theme">Removing a Custom Theme</h2><p>If you&rsquo;ve accidentally added a theme, or you want to add an updated version, you can remove the theme from RStudio. To do so, follow the instructions below.</p><ol><li><p>As above, navigate to the Appearance Preferences Pane in the Global Options.</p></li><li><p>If the theme you wish to remove is the active theme, be sure to switch to a different theme first.</p></li><li><p>Select the theme you wish to remove from the list of themes and click the &ldquo;Remove&rdquo; button.<img src="https://www.rstudio.com/blog-images/2018-10-29-remove-theme-step-1.png" align="center"/></p></li><li><p>Select &ldquo;Yes&rdquo; when prompted for confirmation.<img src="https://www.rstudio.com/blog-images/2018-10-29-remove-theme-step-2.png" align="center"/></p></li></ol><h2 id="sharing-themes">Sharing Themes</h2><p>If you&rsquo;ve found (or made) a really cool theme that you want to share, you can do so just by sharing the tmTheme or rstheme file. Then the recipient can import it as per the instructions in the <a href="#importing-a-custom-theme">Importing a Custom Theme section</a>. There is no difference between sharing the tmTheme file and the rstheme file that is generated after the theme is imported into RStudio, unless you or someone else has made changes to the rstheme file itself.</p><p>rstheme files can be found in the <code>.R</code> directory under your home directory. On Windows, the path is <code>C:\Users\&lt;your user account&gt;\Documents\.R\rstudio\themes</code>. On all other operating systems, the path is <code>~/.R/rstudio/themes</code>.</p><h2 id="some-of-our-favorite-themes">Some of Our Favorite Themes</h2><p>To find out more about themes in RStudio, check out this <a href="https://support.rstudio.com/hc/en-us/articles/115011846747-Using-RStudio-Themes">support article about themes</a>. 
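</p><p>If you prefer to script these steps, recent versions of the <code>rstudioapi</code> package expose theme helpers; here is a sketch (the theme file name is illustrative, and the calls only take effect when run inside RStudio itself):</p>

```r
# Sketch: manage themes programmatically via rstudioapi (run inside RStudio).
# The theme file name below is a placeholder, not shipped with RStudio.
library(rstudioapi)
if (isAvailable("1.2")) {
  addTheme("night-owlish.rstheme", apply = TRUE)  # import and activate a theme
  # removeTheme("Night Owlish")                   # later, remove it by name
}
```

<p>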
In the meantime, here is RStudio styled using some of our favorite themes:</p><p><a href="https://github.com/dempfi/ayu">Ayu Dark, Light, and Mirage by dempfi</a>:<img src="https://www.rstudio.com/blog-images/2018-10-29-ayu-dark.png" align="center"/>Ayu Dark</p><img src="https://www.rstudio.com/blog-images/2018-10-29-ayu-mirage.png" align="center"/>Ayu Mirage<img src="https://www.rstudio.com/blog-images/2018-10-29-ayu-light.png" align="center"/>Ayu Light<p><a href="https://tmtheme-editor.herokuapp.com/#!/editor/theme/Candy%20Brights">Candy Brights</a>:<img src="https://www.rstudio.com/blog-images/2018-10-29-candy-brights.png" align="center"/></p><p><a href="https://github.com/randy3k/dotfiles/blob/master/.R/rstudio/themes/Wombat.rstheme">Wombat, by randy3k</a>:<img src="https://www.rstudio.com/blog-images/2018-10-29-wombat.png" align="center"/></p><p>This theme is an example of a theme where the rstheme file was modified directly. Without editing the rstheme file, it wouldn&rsquo;t have been possible to change the style of non-editor elements of RStudio, like the tabs above the different panes. To learn more about creating new custom themes for RStudio, take a look at this <a href="https://rstudio.github.io/rstudio-extensions/rstudio-theme-creation.html">RStudio Extensions article about writing themes</a>.</p><p>We look forward to seeing what great new themes the RStudio community comes up with!</p><p>You can download the RStudio 1.2 Preview Release at <a href="https://www.rstudio.com/products/rstudio/download/preview/">https://www.rstudio.com/products/rstudio/download/preview/</a>. 
If you have any questions or comments, please get in touch with us on the <a href="https://community.rstudio.com/c/rstudio-ide">community forums</a>.</p></description></item><item><title>RStudio 1.2 Preview: Plumber Integration</title><link>https://www.rstudio.com/blog/rstudio-1-2-preview-plumber-integration/</link><pubDate>Tue, 23 Oct 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-2-preview-plumber-integration/</guid><description><p>The <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.2 Preview Release</a> makes it even easier to create RESTful Web APIs in R using the <a href="https://cran.r-project.org/web/packages/plumber/index.html">plumber</a> package.</p><blockquote><p>plumber is a package that converts your existing R code to a web API using a handful of special one-line comments.</p></blockquote><p>RStudio 1.2 provides the following plumber-related productivity enhancements:</p><ul><li>push-button local server execution for development and testing</li><li>easy API publishing to <a href="https://rstudio.com/products/connect">RStudio Connect</a></li><li>automatic API documentation and interactive execution via Swagger</li><li>create a new Plumber API project or add an API to an existing directory</li></ul><p>A full discussion of Web APIs and the plumber package is beyond the scope of this article; for a primer, check out <a href="https://rviews.rstudio.com/2018/07/23/rest-apis-and-plumber/">R Views: REST APIs and Plumber</a>.</p><p>Let&rsquo;s take a look at the new features.</p><h2 id="creating-an-api">Creating an API</h2><p>On the RStudio main menu, select <strong>File / New File / Plumber API</strong>.</p><p>RStudio will offer to install plumber and any dependencies, if necessary. 
Then, give your API a folder name and a location for that folder:</p><p><img src="https://www.rstudio.com/post/2018-10-23-rstudio-1-2-preview-plumber_files/new_plumber_api.png" alt="New Plumber API Dialog"></p><p>An R source file named <strong>plumber.R</strong> containing sample APIs is opened in RStudio. This file shows the essentials of the plumber-specific annotations that identify and document APIs. For the example input above, the location would be <code>~/code/MyAPI/HelloWorld/plumber.R</code>.</p><p><img src="https://www.rstudio.com/post/2018-10-23-rstudio-1-2-preview-plumber_files/plumber_source.png" alt="plumber.R source file"></p><h2 id="creating-a-plumber-api-project">Creating a Plumber API project</h2><p>If a recent version of the plumber package is already installed, you can also create a new RStudio project via <strong>File / New Project / New Directory / New Plumber API Project</strong>. You may have to scroll down the list of project types to see the plumber option:</p><p><img src="https://www.rstudio.com/post/2018-10-23-rstudio-1-2-preview-plumber_files/new_project.png" alt="New Project Dialog"></p><p>This will prompt for the same information as the other approach, but the result is a standalone RStudio project containing the <strong>plumber.R</strong> file with the same sample APIs.</p><p><img src="https://www.rstudio.com/post/2018-10-23-rstudio-1-2-preview-plumber_files/new_project_details.png" alt="New Project Details Dialog"></p><h2 id="running-a-local-api-server">Running a Local API server</h2><p>Comments beginning with <strong><code>#*</code></strong> followed by plumber-specific annotations such as <strong><code>@get</code></strong> identify this as a plumber API file. 
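</p><p>As a concrete sketch, here is what such an annotated file looks like. The <code>/echo</code> and <code>/plot</code> endpoints below follow the sample template that RStudio generates; treat the exact bodies as illustrative:</p>

```r
# Sketch of a plumber.R file; endpoints mirror the generated sample APIs.

#* Echo back the input
#* @param msg The message to echo
#* @get /echo
function(msg = "") {
  list(msg = paste0("The message is: '", msg, "'"))
}

#* Plot a histogram of 100 random draws
#* @png
#* @get /plot
function() {
  hist(rnorm(100))
}
```

<p>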
RStudio adds the <strong>Run API</strong> button in place of the <strong>Run</strong> and <strong>Source</strong> buttons seen on a generic R source file.</p><blockquote><p>The plumber package also supports a <strong><code>#'</code></strong> prefix, but these are not recognized by RStudio; be sure to use <strong><code>#*</code></strong> if you are using RStudio to author APIs, otherwise the <strong>Run API</strong> button will not be shown.</p></blockquote><p>Clicking this button will start a local http server and display the auto-generated API documentation and test page in the location currently selected in the <strong>Run API</strong> button&rsquo;s dropdown.</p><p><img src="https://www.rstudio.com/post/2018-10-23-rstudio-1-2-preview-plumber_files/run_api.png" alt="Run API Button"></p><p>When an API is running, the <strong>Run API</strong> button changes to <strong>Reload API</strong>; use this after making changes to your API source files to test them out. To stop running the APIs, click the stop icon in the RStudio Console or the pane or window showing the API preview.</p><h2 id="interacting-with-the-apis">Interacting with the APIs</h2><p>The plumber package auto-generates a page showing all the APIs defined in your project and displays it in RStudio.</p><p><img src="https://www.rstudio.com/post/2018-10-23-rstudio-1-2-preview-plumber_files/swagger_overview.png" alt="Swagger Overview"></p><p>In addition to providing documentation, you can use this page to make test calls to the APIs. Click on the first API, <strong>/echo</strong>, to expand it. Then click <strong>Try it out</strong>. Enter some text in the edit box, and click <strong>Execute</strong> to make a REST call to the API. Scroll down to see both an example of how to construct the REST call, e.g. 
via <strong>curl</strong>, and the JSON response.</p><p><img src="https://www.rstudio.com/post/2018-10-23-rstudio-1-2-preview-plumber_files/swagger_echo.png" alt="Swagger Echo"></p><p>Try the other APIs as well, and you will see that the output is not limited to <code>JSON</code> text; the <strong>/plot</strong> API returns an image, which is shown inline.</p><p><img src="https://www.rstudio.com/post/2018-10-23-rstudio-1-2-preview-plumber_files/swagger_plot.png" alt="Swagger Plot"></p><h2 id="breakpoints">Breakpoints</h2><p>Authoring APIs directly in RStudio provides a nice workflow. As your APIs become more complex, you may need to debug the code to figure out what&rsquo;s happening.</p><p>In an R source file with plumber annotations, inserting <strong>browser()</strong> in the source code at the point you wish to start debugging will trigger a breakpoint. Then RStudio&rsquo;s features can be used to examine and modify the program state, single-step, and so on. For more on RStudio debugging facilities, see <a href="https://support.rstudio.com/hc/en-us/articles/205612627-Debugging-with-RStudio">this article</a>.</p><h2 id="publishing-to-connect">Publishing to Connect</h2><p>Now that you have some APIs, you need to make them available to others. The easiest way is to publish to an <a href="https://rstudio.com/products/connect">RStudio Connect</a> server. With Connect, you get push-button publishing and can leverage features such as load balancing, authentication, and access control.</p><blockquote><p>A file named <strong>plumber.R</strong> containing plumber annotations is required to publish an API to RStudio Connect.</p></blockquote><p>Next to the <strong>Run API</strong> button is the Publish button. Click that to begin the publishing process.</p><p><img src="https://www.rstudio.com/post/2018-10-23-rstudio-1-2-preview-plumber_files/connect_publish.png" alt="Connect Publish"></p><p>Upon completion, the browser launches to show your API on Connect. 
From there, use Connect tools and options to manage the APIs. The Publish button in RStudio may then be used to republish the APIs after you&rsquo;ve made changes.</p><h2 id="publishing-elsewhere">Publishing elsewhere</h2><p>The plumber docs provide <a href="https://www.rplumber.io/docs/hosting.html">details on how to publish to other services</a> such as DigitalOcean, and how to leverage Docker containers for running plumber APIs.</p><h1 id="resources">Resources</h1><ul><li><a href="https://www.rplumber.io/">plumber website</a></li><li><a href="https://www.rstudio.com/resources/videos/plumbing-apis-with-plumber/">Webinar: Plumbing APIs with Plumber</a></li><li><a href="https://rviews.rstudio.com/2018/07/23/rest-apis-and-plumber/">R Views: REST APIs and Plumber</a></li><li><a href="http://docs.rstudio.com/connect/1.5.4/user/publishing.html#publishing-plumber-apis">RStudio Connect: Publishing APIs</a></li></ul></description></item><item><title>Summer Intern Projects</title><link>https://www.rstudio.com/blog/2018-10-22-summer-intern-projects/</link><pubDate>Mon, 22 Oct 2018 14:59:20 +0000</pubDate><guid>https://www.rstudio.com/blog/2018-10-22-summer-intern-projects/</guid><description><p>This summer we had <a href="https://blog.rstudio.com/2018/04/18/summer-interns/">five interns</a> participate in our <a href="https://blog.rstudio.com/2018/02/12/summer-interns/">internship program</a>. Each intern was here for 10 weeks and worked closely with a mentor or team. Everyone jumped right in and contributed quickly. We are excited about the progress our interns made and wanted to share it with you here!</p><p>Fanny Chow (<a href="https://twitter.com/fannystats">@Fannystats</a>) worked with Max Kuhn on bootstrapping methods in the rsample package. 
Check out her blog post <a href="http://fbchow.rbind.io/2018/07/27/rstudio-summer-internship/">here</a>.</p><p>Alex Hayes (<a href="https://twitter.com/alexpghayes?lang=en">@alexpghayes</a>) worked on a major new release of the <code>broom</code> package featuring more modern behavior, new tidiers, new documentation, and some refactoring of package internals. He wrote a <a href="http://www.alexpghayes.com/blog/a-summer-with-rstudio/">blog post</a> about his experience.</p><p>Dana Seidel (<a href="https://twitter.com/dpseidel">@dpseidel</a>) worked with Hadley this summer preparing the scales 1.0.0 release and improving themes and secondary axis functionality in ggplot2. Read more about her summer projects on her <a href="https://www.danaseidel.com/2018-09-01-ATidySummer/">blog</a> and take a look at <a href="https://www.danaseidel.com/MeetUpSlides">slides</a> from her September 19th talk introducing the scales package to RLadies-SF.</p><p>Timothy Mastny (<a href="https://twitter.com/timmastny">@timmastny</a>) worked with Winston Chang and the Shiny team on Sass. Sass is a CSS preprocessor that will help make dynamic themes for Shiny apps and other R packages. He built this <a href="https://github.com/rstudio/sass">R package</a> during the internship.</p><p>Irene Steves (<a href="https://twitter.com/i_steves">@i_steves</a>) worked with Jenny Bryan on The Tidies of March. Here&rsquo;s a <a href="https://irene.rbind.io/post/summer-rstudio/">link</a> to her blog post. 
Irene also <a href="https://isteves.github.io/paris/rladies.html#1">presented</a> at R-Ladies in Paris.</p><p>Thank you again to our interns and Hadley, Max, Jenny, Winston, and Dave!</p></description></item><item><title>shinytest - Automated testing for Shiny apps</title><link>https://www.rstudio.com/blog/shinytest-automated-testing-for-shiny-apps/</link><pubDate>Thu, 18 Oct 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shinytest-automated-testing-for-shiny-apps/</guid><description><p>Continuing our series on new features in the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio v1.2</a> preview release, we would like to introduce <a href="https://rstudio.github.io/shinytest/">shinytest</a>. <code>shinytest</code> is a package to perform automated testing for Shiny apps, which allows us to:</p><ul><li><p>Record Shiny tests with ease.</p></li><li><p>Run and troubleshoot Shiny tests.</p></li></ul><p><code>shinytest</code> is available on <a href="https://cran.r-project.org/package=shinytest">CRAN</a>, supported in <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio v1.2</a> preview and can be installed as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">shinytest&#34;</span>)</code></pre></div><img src="https://www.rstudio.com/blog-images/2018-10-18-rstudio-record-shinytest.png" alt="RStudio shinytest compare results" style="display: none"/><h2 id="recording-tests">Recording Tests</h2><p>This is the general procedure for recording tests:</p><ul><li>Run <code>recordTest()</code> to launch the app in a test recorder.</li><li>Create the tests by interacting with the application and telling the recorder to snapshot the state at various points.</li><li>Quit the test recorder. 
When you do this, three things will happen:<ul><li>The test script will be saved in a .R file in a subdirectory of the application named <code>tests/</code>.</li><li>If you are running in the RStudio IDE, it will automatically open this file in the editor.</li><li>The test script will be run, and the snapshots will be saved in a subdirectory of the <code>tests/</code> directory.</li></ul></li></ul><p>To record tests from <code>R</code>, run the following:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(shinytest)
<span style="color:#60a0b0;font-style:italic"># Launch the target app (replace with the correct path)</span>
<span style="color:#06287e">recordTest</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">path/to/app&#34;</span>)</code></pre></div><p>To record a test from RStudio v1.2, when an application file (app.R, server.R, ui.R, or global.R) is open in the editor, a button labeled Run App will show at the top of the editor pane. If you click on the small black triangle next to Run App, a menu will appear.</p><img src="https://www.rstudio.com/blog-images/2018-10-18-rstudio-new-shinytest.png" alt="RStudio new shinytest menu" style="border: solid 1px #DDD; max-width: 600px"/><p>In a separate R process, this launches the Shiny application to be tested. We’ll refer to this as the <strong>target app</strong>. This also launches a special Shiny application in the current R process which displays the target app in an iframe and has some controls outside the iframe. We’ll refer to this as the <strong>recorder app</strong>. 
You will see something like this:</p><img src="https://www.rstudio.com/blog-images/2018-10-18-rstudio-record-shinytest.png" alt="RStudio record shinytest window" style="border: solid 1px #DDD"/><p>On the left is the target app (in this case, the “Shiny Widgets Gallery”), and on the right is the recorder app (titled “Test event recorder”). Note that you may need to make the browser window wider because the recorder panel occupies some space.</p><p>The panel on the right displays some controls for the test recorder, as well as a list of <strong>Recorded events</strong>. As you interact with the target app – in other words, when you set inputs on the app – you will see those interactions recorded in the Recorded events list.</p><p>For testing a Shiny application, setting inputs is only one part. It’s also necessary to check that the application produces the correct outputs. This is accomplished by taking <strong>snapshots</strong> of the application’s state.</p><p>There are two ways to record output values. One way is to take a snapshot of the application’s state. This will record all input values, output values, and exported values (more on exported values later). To do this, click the “Take snapshot” button on the recorder app.</p><h2 id="running-tests">Running Tests</h2><p>When you quit the test recorder, it will automatically run the test script. There are three separate components involved in running the tests:</p><ol><li><p>First is the <strong>test driver</strong>. This is the R process that coordinates the testing and controls the web browser. When working on creating tests interactively, this is the R process that you use.</p></li><li><p>Next is the <strong>Shiny process</strong>, also known as the <strong>server</strong>. This is the R process that runs the target Shiny application.</p></li><li><p>Finally, there is the <strong>web browser</strong>, also known as the <strong>client</strong>, which connects to the server. 
This is a headless web browser – one which renders the web page internally, but doesn’t display the content to the screen (PhantomJS).</p></li></ol><p>When you exit the test recorder, it will by default automatically run the test script, and will print something like this:</p><pre><code>Saved test code to /path/to/app/tests/mytest.R
Running mytest.R
====== Comparing mytest ...
No existing snapshots at mytest-expected/. This is a first run of tests.
Updating baseline snapshot at tests/mytest-expected
Renaming tests/mytest-current
      =&gt; tests/mytest-expected.</code></pre><p>Behind the scenes, it runs <code>testApp()</code>. You can manually run the tests with this:</p><pre><code>testApp(&quot;myshinyapp&quot;, &quot;mytest&quot;)</code></pre><p>From RStudio v1.2, you can simply choose <strong>Run Tests</strong> from the drop-down menu in your Shiny app source file:</p><img src="https://www.rstudio.com/blog-images/2018-10-18-rstudio-run-shinytest.png" alt="RStudio Run Tests menu" style="border: solid 1px #DDD; max-width: 600px"/><h2 id="subsequent-test-runs">Subsequent Test Runs</h2><p>After the initial test run, you can run the tests again in the future to check for changes in your application’s behavior.</p><p>If there are any differences between the current and expected results, you’ll see this output in <code>R</code>:</p><pre><code>Running mytest.R
====== Comparing mytest ...
Differences detected between mytest-current/ and mytest-expected/:

    Name      Status
    001.json  != Files differ
    001.png   != Files differ
Would you like to view the differences between expected and current results [y/n]?</code></pre><p>When running inside RStudio, failed tests are visible under the <strong>Issues</strong> tab.</p><img src="https://www.rstudio.com/blog-images/2018-10-18-rstudio-shinytest-failed.png" alt="RStudio shinytest failed results" style="border: solid 1px #DDD; max-width: 600px"/><p>For each test with different results, you can see the differences between the expected and current results. 
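</p><p>Pulling the pieces together, the record-and-replay loop looks like this from the R console (the paths and test name are illustrative, and both calls require a Shiny app and the headless browser to be available):</p>

```r
# Record once, then replay on demand; shinytest compares snapshots for you.
library(shinytest)

recordTest("path/to/app")          # opens the recorder; saves tests/mytest.R
testApp("path/to/app", "mytest")   # replays the script and compares snapshots
```

<p>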
For screenshots, the differences will be highlighted in red. You can also choose different ways of viewing the differences in screenshots:</p><img src="https://www.rstudio.com/blog-images/2018-10-18-rstudio-shinytest-compare.png" alt="RStudio shinytest compare results" style="border: none"/><p>For additional information on <code>shinytest</code>, please visit <a href="https://rstudio.github.io/shinytest">rstudio.github.io/shinytest</a>.</p><h2 id="testing-code">Testing Code</h2><p>While <code>shinytest</code> is well suited for testing Shiny applications, you can also consider testing particular functions using the <a href="https://github.com/r-lib/testthat">testthat</a> package. While we won&rsquo;t discuss the <code>testthat</code> package in detail in this post, we would like to highlight a couple of improvements in RStudio v1.2.</p><p>First, similar to <code>shinytest</code> tests, you can now run specific <code>testthat</code> tests from each source file. This is useful for quickly validating specific functionality or troubleshooting broken tests. A new <strong>Run Tests</strong> command is available on the top-right of each test file:</p><img src="https://www.rstudio.com/blog-images/2018-10-18-rstudio-testthat-run.png" alt="RStudio testthat Run Tests command" style="border: none; max-width: 600px"/><p>Second, when tests fail, you can switch to a new <strong>Issues</strong> tab to browse a list of test failures. You can also double-click each entry to open the file associated with the failure:</p><img src="https://www.rstudio.com/blog-images/2018-10-18-rstudio-testthat-results.png" alt="RStudio testthat results" style="border: none; max-width: 600px"/><h2 id="try-it-out">Try it Out!</h2><p>You can try this new functionality in the RStudio v1.2 Preview Release at <a href="https://www.rstudio.com/products/rstudio/download/preview/">https://www.rstudio.com/products/rstudio/download/preview/</a>. 
If you have any questions or comments, please get in touch with us on the <a href="https://community.rstudio.com/c/rstudio-ide">community forums</a>.</p></description></item><item><title>Announcing RStudio Package Manager</title><link>https://www.rstudio.com/blog/announcing-rstudio-package-manager/</link><pubDate>Wed, 17 Oct 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rstudio-package-manager/</guid><description><p>We’re excited to announce the general availability of our newest RStudio professional product, <a href="https://rstudio.com/products/package-manager">RStudio Package Manager</a>. RStudio Package Manager helps your team, department, or entire organization centralize and organize R packages.</p><p>Get started with the <a href="https://rstudio.com/products/package-manager/eval">45-day evaluation</a> today!</p><p>With more than 13,000 packages in the R ecosystem, managing the packages you and your teams need can be challenging. R users naturally want the latest, but everyone benefits from reproducibility, stability, and security in production code.</p><p><img src="https://www.rstudio.com/blog-images/rspm-overview.png" alt=""></p><p>RStudio Package Manager is an on-premises server product that allows R users and IT to work together to create a central repository for R packages. 
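</p><p>From the R user&rsquo;s side, consuming such a repository is just a matter of pointing the session at it with standard tooling; a minimal sketch (the repository URL is hypothetical):</p>

```r
# Point this R session at a hypothetical internal Package Manager repository.
options(repos = c(RSPM = "https://rspm.example.com/cran/latest"))
# install.packages("ggplot2")  # would now resolve from the internal repository
```

<p>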
RStudio Package Manager supports your team wherever they run R, from bash scripts and Docker containers to RStudio, RStudio Server (Pro), Shiny Server (Pro), and RStudio Connect.</p><p><img src="https://www.rstudio.com/blog-images/rspm-stakeholders.png" alt=""></p><p>Administrators set up the server using a scriptable command line interface (CLI), and R users install packages from the server with their existing tools.</p><p><img src="https://www.rstudio.com/blog-images/rspm-release-setup.png" alt=""></p><p>We’ve spent the last year working with alpha and beta customers to ensure RStudio Package Manager is ready to meet the needs of your development and production use cases.</p><h3 id="cran">CRAN</h3><p>If you’re an R user, you know about CRAN. If you’re someone who helps R users get access to CRAN, you probably know about network exceptions on every production node. With RStudio Package Manager, you can enable your R users to access CRAN without requiring a network exception on every production node. You can also automate CRAN updates on your schedule. You can choose to optimize disk usage and only download the packages you need or, alternatively, download everything up-front for completely offline networks.</p><p>Currently, RStudio Package Manager does not serve binary packages from CRAN &ndash; only source packages. This limitation won&rsquo;t affect server-based users, but may impact desktop users. Future versions of RStudio Package Manager will address this limitation.</p><p><img src="https://www.rstudio.com/blog-images/rspm-release-cran.png" alt=""></p><h3 id="subsets-of-cran">Subsets of CRAN</h3><p>We know some projects call for even tighter restrictions. 
RStudio Package Manager helps admins create approved subsets of CRAN packages, and ensures that those subsets stay stable even as packages are added or updated over time.</p><p><img src="https://www.rstudio.com/blog-images/rspm-release-cmd.png" alt=""></p><h3 id="internal-packages-and-packages-from-github">Internal Packages and Packages from GitHub</h3><p>Sharing internal R code has never been easier. Administrators can add internal packages using the CLI. If your internal packages live in Git, RStudio Package Manager can automatically track your Git repositories and make commits or tagged releases accessible to users. The same tools make it painless to supplement CRAN with R packages from GitHub.</p><p><img src="https://www.rstudio.com/blog-images/rspm-release-git.png" alt=""></p><h3 id="optimized-for-r">Optimized for R</h3><p>Regardless of your use case, RStudio Package Manager provides a seamless experience optimized for R users. Packages are versioned, automatically keeping older versions accessible to users, tools like Packrat, and platforms like RStudio Connect.</p><p><img src="https://www.rstudio.com/blog-images/rspm-release-archive.png" alt=""></p><p>RStudio Package Manager also versions the repositories themselves, preserving the ability to always return the same set of R packages or receive the latest versions.</p><p><img src="https://www.rstudio.com/blog-images/rspm-release-repoversion.png" alt=""></p><p>RStudio Package Manager records usage statistics. These metrics help administrators conduct audits and give R users an easy way to discover popular and useful packages.</p><p><img src="https://www.rstudio.com/blog-images/rspm-release-usage.png" alt=""></p><h3 id="download-today">Download Today</h3><p>Get started with the <a href="https://rstudio.com/products/package-manager/eval">45-day evaluation</a> or check out our <a href="http://demo.rstudiopm.com">demo server</a>. 
Read the <a href="http://docs.rstudio.com/rspm/admin">admin guide</a> for answers to more questions and a guide to installation and setup. <a href="mailto:sales@rstudio.com">Contact Sales</a> for more information.</p></description></item><item><title>RStudio 1.2 Preview: Stan</title><link>https://www.rstudio.com/blog/rstudio-1-2-preview-stan/</link><pubDate>Tue, 16 Oct 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-2-preview-stan/</guid><description><p><img src="https://www.rstudio.com/blog-images/2018-10-16-rstudio-1-2-preview-stan_files/stan_logo.png" width="100px" align="right" style="margin-left: 16px;" alt="Stan Logo" /></p><p>We previously discussed improved support in RStudio v1.2 for <a href="https://blog.rstudio.com/2018/10/02/rstudio-1-2-preview-sql/">SQL</a>, <a href="https://blog.rstudio.com/2018/10/05/r2d3-r-interface-to-d3-visualizations/">D3</a>, <a href="https://blog.rstudio.com/2018/10/09/rstudio-1-2-preview-reticulated-python/">Python</a>, and <a href="https://blog.rstudio.com/2018/10/11/rstudio-1-2-preview-cpp/">C/C++</a>. Today, we&rsquo;re excited to announce improved support for the <a href="http://mc-stan.org/">Stan programming language</a>. The Stan programming language makes it possible for researchers and analysts to write high-performance and scalable statistical models.</p><blockquote><p>Stan® is a state-of-the-art platform for statistical modeling and high-performance statistical computation. 
Thousands of users rely on Stan for statistical modeling, data analysis, and prediction in the social, biological, and physical sciences, engineering, and business.</p></blockquote><p>With RStudio v1.2, we now provide:</p><ul><li><p>Improved, context-aware autocompletion for Stan files and chunks</p></li><li><p>A document outline, which allows for easy navigation between Stan code blocks</p></li><li><p>Inline diagnostics, which help to find issues while you develop your Stan model</p></li><li><p>The ability to interrupt Stan parallel workers launched within the IDE</p></li></ul><p>Together, these features bring the editing experience in Stan programs in line with what you&rsquo;re familiar with in R.</p><h2 id="autocompletion">Autocompletion</h2><p>RStudio provides autocompletion results for Stan functions, drawing from the set of pre-defined Stan keywords and functions. The same autocompletion features you might be familiar with in R, like fuzzy matching, are now also available in Stan programs.</p><p><img src="https://www.rstudio.com/blog-images/2018-10-16-rstudio-1-2-preview-stan_files/autocomplete.png" width=720 /></p><p>As with R, RStudio will also provide a small tooltip describing the arguments accepted by a particular function.</p><p><img src="https://www.rstudio.com/blog-images/2018-10-16-rstudio-1-2-preview-stan_files/autocomplete-2.png" width=722 /></p><h2 id="document-outline">Document Outline</h2><p>The document outline allows for easy navigation between Stan blocks. 
This can be especially useful as your model definition grows in size, and you need to quickly reference the different blocks used in your program.</p><p><img src="https://www.rstudio.com/blog-images/2018-10-16-rstudio-1-2-preview-stan_files/document-outline.png" width=720 /></p><h2 id="diagnostics">Diagnostics</h2><p>RStudio now uses the Stan parser to provide inline diagnostics, and will report any problems discovered as you prepare your model.</p><p><img src="https://www.rstudio.com/blog-images/2018-10-16-rstudio-1-2-preview-stan_files/diagnostics.png" width=717 /></p><p>If the Stan compiler discovers any issues in your model, RStudio&rsquo;s diagnostics will show you exactly where those issues live so you can fix them quickly and easily.</p><h2 id="worker-interruption">Worker Interruption</h2><p>One aspect that had previously made working with Stan in RStudio frustrating was the inability to interrupt parallel Stan workers. This meant that attempts to fit a computationally expensive model could not be interrupted, and the only remedy was to restart the IDE or forcefully shut down the workers through another mechanism.</p><p>We&rsquo;re very happy to share that this limitation has been lifted with RStudio v1.2. Now, when fitting a Stan model with parallel workers, you can interrupt the workers as you would any regular R code &ndash; either use the Escape key, or press the Stop button in the Console pane.</p><h2 id="try-it-out">Try it Out</h2><p>With the improvements to Stan integration coming in RStudio v1.2, getting started with Stan has never been easier. 
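</p><p>If you haven&rsquo;t written Stan before, a minimal program illustrates the block structure that the document outline and diagnostics operate on (this toy normal model is our own illustrative example):</p>

```stan
// Toy model: estimate the mean and scale of N observations.
data {
  int&lt;lower=0&gt; N;
  vector[N] y;
}
parameters {
  real mu;
  real&lt;lower=0&gt; sigma;
}
model {
  y ~ normal(mu, sigma);
}
```

<p>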
If you&rsquo;d like to get more familiar with Stan, we think these resources will be helpful:</p><ul><li>Stan website: <a href="http://mc-stan.org/">http://mc-stan.org/</a></li><li>RStan &ldquo;Getting Started&rdquo; guide: <a href="https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started">https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started</a></li><li>Tutorials and Examples using Stan: <a href="http://mc-stan.org/users/documentation/tutorials">http://mc-stan.org/users/documentation/tutorials</a></li></ul><p>You can download the RStudio v1.2 Preview Release at <a href="https://www.rstudio.com/products/rstudio/download/preview/">https://www.rstudio.com/products/rstudio/download/preview/</a>. If you have any questions or comments, please get in touch with us on the <a href="https://community.rstudio.com/c/rstudio-ide">community forums</a>.</p></description></item><item><title>RStudio 1.2 Preview: C/C++ and Rcpp</title><link>https://www.rstudio.com/blog/rstudio-1-2-preview-cpp/</link><pubDate>Thu, 11 Oct 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-2-preview-cpp/</guid><description><p>We’ve now discussed the improved support in RStudio v1.2 for <a href="https://blog.rstudio.com/2018/10/02/rstudio-1-2-preview-sql/">SQL</a>, <a href="https://blog.rstudio.com/2018/10/05/r2d3-r-interface-to-d3-visualizations/">D3</a>, and <a href="https://blog.rstudio.com/2018/10/09/rstudio-1-2-preview-reticulated-python/">Python</a>. 
Today, we’ll talk about IDE support for C/C++ and <a href="http://www.rcpp.org/">Rcpp</a>.</p><p>The IDE has had excellent support for C/C++ since RStudio v0.99, including:</p><ul><li>Tight integration with the <a href="http://www.rcpp.org/">Rcpp</a> package</li><li>Code completion</li><li>Source diagnostics as you edit</li><li>Code snippets</li><li>Auto-indentation</li><li>Navigable list of compilation errors</li><li>Code navigation (go to definition)</li></ul><p>The major new C/C++ feature in RStudio v1.2 is an upgrade to <a href="https://clang.llvm.org/docs/Tooling.html">libclang</a> (our underlying completion and diagnostics engine). The update improves performance and adds compatibility with modern <a href="https://en.wikipedia.org/wiki/C%2B%2B17">C++17</a> language features.</p><div id="rcpp" class="section level2"><h2>Rcpp</h2><p>RStudio integrates closely with <a href="http://www.rcpp.org/">Rcpp</a>, which allows you to easily write performant C++ code and use that code in your R session. For example, the following chunk defines a simple Gibbs sampler:</p><pre class="cpp"><code>#include &lt;Rcpp.h&gt;
using namespace Rcpp;

// [[Rcpp::export]]
NumericMatrix gibbs(int N, int thin) {
  NumericMatrix mat(N, 2);
  double x = 0, y = 0;
  for (int i = 0; i &lt; N; i++) {
    for (int j = 0; j &lt; thin; j++) {
      x = R::rgamma(3.0, 1.0 / (y * y + 4));
      y = R::rnorm(1.0 / (x + 1), 1.0 / sqrt(2 * x + 2));
    }
    mat(i, 0) = x;
    mat(i, 1) = y;
  }
  return mat;
}</code></pre><p>Such C++ code can be used both in standalone files (e.g. as part of an R package, or when prototyping locally) and within an R Markdown document (within an <code>Rcpp</code> chunk).
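<p>For readers who want to follow the sampling logic itself, here is the same Gibbs sampler sketched in plain Python using only the standard library (an illustration for comparison; it is not part of the Rcpp workflow, and the random number generators differ from R's):</p>

```python
import math
import random

def gibbs(n, thin, seed=1):
    """Plain-Python sketch of the Gibbs sampler from the Rcpp example above."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    samples = []
    for _ in range(n):
        for _ in range(thin):
            # x | y ~ Gamma(shape = 3, scale = 1 / (y^2 + 4))
            x = rng.gammavariate(3.0, 1.0 / (y * y + 4))
            # y | x ~ Normal(mean = 1 / (x + 1), sd = 1 / sqrt(2x + 2))
            y = rng.gauss(1.0 / (x + 1), 1.0 / math.sqrt(2 * x + 2))
        # keep only every `thin`-th draw, as in the C++ version
        samples.append((x, y))
    return samples

rows = gibbs(10, 10)
```

<p>The two conditional draws mirror the <code>R::rgamma()</code> and <code>R::rnorm()</code> calls in the C++ code; both parameterize the gamma distribution by shape and scale.</p>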
In each case, we use <code>Rcpp::sourceCpp()</code> to compile and link the code – after this, any exported functions can be called like any other R function in your session.</p><pre class="r"><code>gibbs(10, 10)</code></pre><pre><code>##         [,1]   [,2]
##  [1,] 0.3488 0.9850
##  [2,] 0.9290 0.8519
##  [3,] 2.0505 0.8685
##  [4,] 0.5318 1.2941
##  [5,] 0.6710 0.8434
##  [6,] 0.1064 0.8212
##  [7,] 0.5903 0.7238
##  [8,] 0.6834 0.7078
##  [9,] 0.5379 0.5887
## [10,] 0.1863 0.9741</code></pre><p>Thanks to the abstractions provided by Rcpp, the code implementing <code>gibbs()</code> in C++ is nearly identical to the code you’d write in R, but runs <a href="http://dirk.eddelbuettel.com/blog/2011/07/14/">20 times faster</a>.</p></div><div id="code-completion" class="section level2"><h2>Code Completion</h2><p>RStudio provides autocompletion support in C++ source files, and can autocomplete symbols used from R’s C API, Rcpp, and any other libraries you may have imported.</p><p><img src="https://www.rstudio.com/blog/images/2018-10-11-rstudio-preview-cpp-autocomplete.png" width="538px" /></p><p>We also now provide autocompletion results for the headers you’d like to use in your program.</p><p><img src="https://www.rstudio.com/blog/images/2018-10-11-rstudio-preview-cpp-autocomplete-2.png" width="543px" /></p></div><div id="diagnostics" class="section level2"><h2>Diagnostics</h2><p>RStudio also provides code diagnostics, alerting you to any issues that might exist in your code.</p><p><img src="https://www.rstudio.com/blog/images/2018-10-11-rstudio-preview-cpp-diagnostics.png" width="545px" /></p></div><div id="updated-libclang" class="section level2"><h2>Updated Libclang</h2><p>On Windows and macOS, we’ve updated the bundled version of <code>libclang</code> from 3.5.0 to 5.0.2.
With this, RStudio gains improved support for modern C++: all standards from C++ 11, C++ 14 and C++ 17 are now supported.</p><p>On Linux, we now default to the version of <code>libclang</code> provided by your package manager, so that RStudio can make use of new and improved C++ tooling as it becomes available on your system. (Currently, Ubuntu 18.04 provides <code>libclang</code> 6.0.0)</p></div><div id="try-it-out" class="section level2"><h2>Try it Out</h2><p>If you are new to C++ or Rcpp, you might be surprised at how easy it is to get started. There are lots of great resources available, including:</p><ul><li><p>Rcpp website: <a href="http://www.rcpp.org/" class="uri">http://www.rcpp.org/</a></p></li><li><p>Rcpp book: <a href="http://www.rcpp.org/book/" class="uri">http://www.rcpp.org/book/</a></p></li><li><p>Tutorial for users new to C++: <a href="http://adv-r.had.co.nz/Rcpp.html" class="uri">http://adv-r.had.co.nz/Rcpp.html</a></p></li><li><p>Gallery of examples: <a href="http://gallery.rcpp.org/" class="uri">http://gallery.rcpp.org/</a></p></li></ul><p>You can download the RStudio 1.2 Preview Release at <a href="https://www.rstudio.com/products/rstudio/download/preview/" class="uri">https://www.rstudio.com/products/rstudio/download/preview/</a>. 
If you have any questions or comments, please get in touch with us on the <a href="https://community.rstudio.com/c/rstudio-ide">community forums</a>.</p></div></description></item><item><title>RStudio 1.2 Preview: Reticulated Python</title><link>https://www.rstudio.com/blog/rstudio-1-2-preview-reticulated-python/</link><pubDate>Tue, 09 Oct 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-2-preview-reticulated-python/</guid><description><img id="reticulated-python" src="https://rstudio.github.io/reticulate/images/reticulated_python.png" width=200 align=right style="margin-left: 15px;" alt="reticulated python"/><p>One of the primary focuses of <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio v1.2</a> is improved support for other languages frequently used with R. Last week on the blog we talked about new features for working with <a href="https://blog.rstudio.com/2018/10/02/rstudio-1-2-preview-sql/">SQL</a> and <a href="https://blog.rstudio.com/2018/10/05/r2d3-r-interface-to-d3-visualizations/">D3</a>. Today we&rsquo;re taking a look at enhancements we&rsquo;ve made around the <a href="https://rstudio.github.io/reticulate/">reticulate</a> package (an R interface to Python).</p><p>The <a href="https://rstudio.github.io/reticulate/">reticulate</a> package makes it possible to embed a Python session within an R process, allowing you to import Python modules and call their functions directly from R. If you are an R developer who uses Python for some of your work, or a member of a data science team that uses both languages, reticulate can dramatically streamline your workflow.
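<p>To give a concrete flavor of what such an embedded chunk can look like, here is a small, standard-library-only Python sketch of the kind you might run from an R Notebook (the data and names are invented for illustration):</p>

```python
# A small data-manipulation chunk of the kind you might embed via reticulate.
import statistics

flights = [
    {"carrier": "AA", "delay": 12.0},
    {"carrier": "AA", "delay": 5.0},
    {"carrier": "UA", "delay": 30.0},
    {"carrier": "UA", "delay": 18.0},
]

# Mean delay per carrier; a top-level object like `mean_delay` is what the
# R session sees through reticulate's automatic conversion.
mean_delay = {
    carrier: statistics.mean(
        row["delay"] for row in flights if row["carrier"] == carrier
    )
    for carrier in sorted({row["carrier"] for row in flights})
}
```

<p>From an R session, a top-level Python object such as <code>mean_delay</code> is then reachable as <code>py$mean_delay</code>, with conversion to an R object handled automatically.</p>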
New features in <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio v1.2</a> related to reticulate include:</p><ol><li><p>Support for executing reticulated Python chunks within <a href="https://rmarkdown.rstudio.com/r_notebooks.html">R Notebooks</a>.</p></li><li><p>Display of <a href="https://matplotlib.org/">matplotlib</a> plots within both notebook and console execution modes.</p></li><li><p>Line-by-line execution of Python code using the reticulate <code>repl_python()</code> function.</p></li><li><p>Sourcing Python scripts using the reticulate <code>source_python()</code> function.</p></li><li><p>Code completion and inline help for Python.</p></li></ol><p>Note that for data science projects that are Python-only, we still recommend IDEs optimized for that, such as <a href="https://jupyterlab.readthedocs.io/en/stable/">JupyterLab</a>, <a href="https://www.jetbrains.com/pycharm/">PyCharm</a>, <a href="https://code.visualstudio.com/docs/languages/python">Visual Studio Code</a>, <a href="https://rodeo.yhat.com/">Rodeo</a>, and <a href="https://www.spyder-ide.org/">Spyder</a>. 
However, if you are using reticulated Python within an R project then RStudio provides a set of tools that we think you will find very useful.</p><h2 id="installation">Installation</h2><p>You can download the RStudio v1.2 preview release here: <a href="https://www.rstudio.com/rstudio/download/preview/">https://www.rstudio.com/rstudio/download/preview/</a>.</p><p>All of the features described below require that you have previously installed the reticulate package, which you can do as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">reticulate&#34;</span>)</code></pre></div><h2 id="r-notebooks">R Notebooks</h2><p>R Notebooks have been enhanced to support executing Python chunks using the reticulate Python engine. For example, here we use pandas to do some data manipulation then plot the results with ggplot2:</p><img src="https://www.rstudio.com/blog-images/rstudio-1-2-preview-reticulated-python/rmarkdown_reticulate.png" class="screenshot"/><p>Python objects all exist in a single persistent session so are usable across chunks just like R objects. R and Python objects are also shared across languages with conversions done automatically when required (e.g. 
from Pandas data frame to R data frame or NumPy 2D array to R matrix).</p><p>The article on <a href="https://rstudio.github.io/reticulate/articles/calling_python.html">Calling Python from R</a> describes the various ways to access Python objects from R as well as functions available for more advanced interactions and conversion behavior.</p><p>R Notebooks can also display matplotlib plots inline when they are printed from Python chunks:</p><p><img src="https://www.rstudio.com/blog-images/rstudio-1-2-preview-reticulated-python/rmarkdown_reticulate_matplotlib.png" class="screenshot" /></p><p>See the article on the reticulate <a href="https://rstudio.github.io/reticulate/articles/r_markdown.html">R Markdown Python Engine</a> for full details on using Python chunks within R Markdown documents, including how to call Python code from R chunks and vice-versa.</p><h2 id="python-scripts">Python Scripts</h2><p>You can execute code from Python scripts line-by-line using the <strong>Run</strong> button (or Ctrl+Enter) in the same way as you execute R code line-by-line. RStudio will automatically switch into reticulate&rsquo;s <code>repl_python()</code> mode whenever you execute lines from a Python script:</p><p><img src="https://www.rstudio.com/blog-images/rstudio-1-2-preview-reticulated-python/repl_python.png" class="screenshot" /></p><p>Type <code>exit</code> from the Python REPL to exit back into R (RStudio will also automatically switch back to R mode whenever you execute code from an R script).</p><p>Any Python objects created within the REPL are immediately available to the R session via the <code>reticulate::py</code> object (e.g. 
in the example above you could access the pandas object via <code>py$s</code>).</p><p>In addition, RStudio now provides code completion and inline help for Python scripts:</p><p><img src="https://www.rstudio.com/blog-images/rstudio-1-2-preview-reticulated-python/code_completion.png" class="screenshot" /></p><h2 id="sourcing-scripts">Sourcing Scripts</h2><p>Click the editor&rsquo;s <strong>Source Script</strong> button (or the Ctrl+Shift+Enter shortcut) within a Python source file to execute a script using reticulate&rsquo;s <code>source_python()</code> function:</p><p><img src="https://www.rstudio.com/blog-images/rstudio-1-2-preview-reticulated-python/source_python.png" class="screenshot" /></p><p>Objects created within the script will be made available as top-level objects in the R global environment.</p><h2 id="why-reticulate">Why reticulate?</h2><p>Since we released the package, we&rsquo;re often asked what the source of the name &ldquo;reticulate&rdquo; is.</p><p>Here&rsquo;s what <a href="https://en.wikipedia.org/wiki/Reticulated_python">Wikipedia</a> says about the reticulated python:</p><blockquote><p>The reticulated python is a species of python found in Southeast Asia. They are the world&rsquo;s longest snakes and longest reptiles&hellip;The specific name, reticulatus, is Latin meaning &ldquo;net-like&rdquo;, or reticulated, and is a reference to the complex colour pattern.</p></blockquote><p>And here&rsquo;s the <a href="https://www.merriam-webster.com/dictionary/reticulate">Merriam-Webster</a> definition of reticulate:</p><blockquote><p>1: resembling a net or network; especially : having veins, fibers, or lines crossing a reticulate leaf. 
2: being or involving evolutionary change dependent on genetic recombination involving diverse interbreeding populations.</p></blockquote><p>The package enables you to <em>reticulate</em> Python code into R, creating a new breed of project that weaves together the two languages.</p><p>The <a href="https://www.rstudio.com/rstudio/download/preview/">RStudio v1.2 Preview Release</a> provides lots of enhancements for reticulated Python. Check it out and let us know what you think on <a href="https://community.rstudio.com/c/rstudio-ide">RStudio Community</a> and <a href="https://github.com/rstudio/rstudio/issues">GitHub</a>.</p><p><em><strong>UPDATE:</strong></em> <em>Nov. 27, 2019</em><br><em>Learn more about <a href="https://rstudio.com/solutions/python-and-r/">how R and Python work together in RStudio</a>.</em></p><style type="text/css">.screenshot, .illustration {margin-bottom: 10px;margin-top: 10px;border: solid 1px #cccccc;width: 95%;}</style></description></item><item><title>r2d3 - R Interface to D3 Visualizations</title><link>https://www.rstudio.com/blog/r2d3-r-interface-to-d3-visualizations/</link><pubDate>Fri, 05 Oct 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r2d3-r-interface-to-d3-visualizations/</guid><description><img src="https://www.rstudio.com/blog-images/r2d3-r-interface-to-d3-visualizations/r2d3-hex.png" width=180 align="right" style="border: none; margin-left: 10px;"/><p>As part our series on new features in the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio v1.2 Preview Release</a>, we&rsquo;re pleased to announce the <a href="https://rstudio.github.io/r2d3/">r2d3 package</a>, a suite of tools for using custom <a href="https://d3js.org/">D3 visualizations</a> with R.</p><p>RStudio v1.2 includes several features to help optimize your development experience with <strong>r2d3</strong>. We&rsquo;ll describe these features below, but first a bit more about the package. 
Features of <strong>r2d3</strong> include:</p><ul><li><p>Translating R objects into D3-friendly data structures</p></li><li><p>Publishing D3 visualizations to the web</p></li><li><p>Incorporating D3 scripts into <a href="https://rmarkdown.rstudio.com/">R Markdown</a> reports, presentations, and dashboards</p></li><li><p>Creating interactive D3 applications with <a href="https://shiny.rstudio.com/">Shiny</a></p></li><li><p>Distributing D3-based <a href="http://www.htmlwidgets.org">htmlwidgets</a> in R packages</p></li></ul><div style="clear: both"></div><br/><p>With <strong>r2d3</strong>, you can bind data from R to D3 visualizations like the ones found on <a href="https://github.com/d3/d3/wiki/Gallery">https://github.com/d3/d3/wiki/Gallery</a>, <a href="https://bl.ocks.org/">https://bl.ocks.org/</a>, and <a href="https://vida.io/explore">https://vida.io/explore</a>:</p><div style="margin-top: 20px; margin-bottom: 10px;"><p><a href="https://rstudio.github.io/r2d3/articles/gallery/chord/"><img src="https://www.rstudio.com/blog-images/r2d3-r-interface-to-d3-visualizations/chord_thumbnail.png" width="28%" class="illustration gallery-thumbnail"/></a> <a href="https://rstudio.github.io/r2d3/articles/gallery/bubbles/"><img src="https://www.rstudio.com/blog-images/r2d3-r-interface-to-d3-visualizations/bubbles_thumbnail.png" width="28%" class="illustration gallery-thumbnail"/></a> <a href="https://rstudio.github.io/r2d3/articles/gallery/cartogram/"><img src="https://www.rstudio.com/blog-images/r2d3-r-interface-to-d3-visualizations/cartogram_thumbnail.png" width="28%" class="illustration gallery-thumbnail"/></a></p></div><p>D3 visualizations created with <strong>r2d3</strong> work just like R plots within RStudio, R Markdown documents, and Shiny applications.</p><p>You can install the <strong>r2d3</strong> package from CRAN as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span
style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">r2d3&#34;</span>)</code></pre></div><h2 id="d3-scripts">D3 Scripts</h2><p>To use <strong>r2d3</strong>, write a D3 script and then pass R data to it using the <code>r2d3()</code> function. For example, here’s a simple D3 script that draws a bar chart (“barchart.js”):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-js" data-lang="js"><span style="color:#60a0b0;font-style:italic">// !preview r2d3 data=c(0.3, 0.6, 0.8, 0.95, 0.40, 0.20)</span><span style="color:#60a0b0;font-style:italic"></span>
<span style="color:#007020;font-weight:bold">var</span> barHeight <span style="color:#666">=</span> <span style="color:#007020">Math</span>.floor(height <span style="color:#666">/</span> data.length);
svg.selectAll(<span style="color:#4070a0">&#39;rect&#39;</span>)
  .data(data)
  .enter()
    .append(<span style="color:#4070a0">&#39;rect&#39;</span>)
      .attr(<span style="color:#4070a0">&#39;width&#39;</span>, <span style="color:#007020;font-weight:bold">function</span>(d) { <span style="color:#007020;font-weight:bold">return</span> d <span style="color:#666">*</span> width; })
      .attr(<span style="color:#4070a0">&#39;height&#39;</span>, barHeight)
      .attr(<span style="color:#4070a0">&#39;y&#39;</span>, <span style="color:#007020;font-weight:bold">function</span>(d, i) { <span style="color:#007020;font-weight:bold">return</span> i <span style="color:#666">*</span> barHeight; })
      .attr(<span style="color:#4070a0">&#39;fill&#39;</span>, <span style="color:#4070a0">&#39;steelblue&#39;</span>);</code></pre></div><p>To render the script within R, call the <code>r2d3()</code> function:</p><pre><code>library(r2d3)
r2d3(data = c(0.3, 0.6, 0.8, 0.95, 0.40, 0.20), script = &quot;barchart.js&quot;)</code></pre><p>This results in the following visualization:</p><img
src="https://www.rstudio.com/blog-images/r2d3-r-interface-to-d3-visualizations/bar_chart.png" class="illustration" width=600/><h3 id="d3-variables">D3 Variables</h3><p>Note that data is provided to the script using the <code>data</code> argument to the <code>r2d3()</code> function. This data is then automatically made available to the D3 script. There are a number of other special variables available within D3 scripts, including:</p><ul><li><code>data</code> — The R data converted to JavaScript</li><li><code>svg</code> — The SVG container for the visualization</li><li><code>width</code> — The current width of the container</li><li><code>height</code> — The current height of the container</li><li><code>options</code> — Additional options provided by the user</li><li><code>theme</code> — Colors for the current theme</li></ul><p>When you are learning D3 or translating D3 examples for use with R, it’s important to keep in mind that D3 examples will generally include code to load data, create an SVG or other root element, and establish a width and height for the visualization.</p><p>On the other hand, with <strong>r2d3</strong>, these variables are <em>provided automatically</em>, so they do not need to be created. The reasons these variables are provided automatically are:</p><ol><li><p>So that you can dynamically bind data from R to visualizations; and</p></li><li><p>So that <strong>r2d3</strong> can automatically handle dynamic resizing for your visualization. Most D3 examples have a static size. This is fine for an example, but not very robust for including the visualization within a report, dashboard, or application.</p></li></ol><h2 id="rstudio-v12-and-r2d3">RStudio v1.2 and r2d3</h2><p>The <a href="https://www.rstudio.com/rstudio/download/preview/">RStudio v1.2 preview release</a> includes support for previewing D3 scripts as you write them.
To try this out, install the preview release, then create a D3 script using the new file menu:</p><img src="https://www.rstudio.com/blog-images/r2d3-r-interface-to-d3-visualizations/new_script.png" class="screenshot" width=600/><p>A simple template for a D3 script (the barchart.js example shown above) is provided by default. You can use the <strong>Preview</strong> command (Ctrl+Shift+Enter) to render the visualization:</p><img src="https://www.rstudio.com/blog-images/r2d3-r-interface-to-d3-visualizations/rstudio_preview.png" class="screenshot" width=600/><p>You might wonder where the data comes from for the preview. Note that there is a special comment at the top of the D3 script:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-js" data-lang="js"><span style="color:#60a0b0;font-style:italic">// !preview r2d3 data=c(0.3, 0.6, 0.8, 0.95, 0.40, 0.20)</span></code></pre></div><p>This comment enables you to specify the data (along with any other arguments to the <code>r2d3()</code> function) to use for the preview.</p><h2 id="r-markdown">R Markdown</h2><p>RStudio v1.2 also includes support for rendering <strong>r2d3</strong> visualizations within R Markdown documents and R Notebooks.
There is a new <code>d3</code> R Markdown engine which works like this:</p><pre><code>&#96;&#96;&#96;{r setup}
library(r2d3)
bars &lt;- c(10, 20, 30)
&#96;&#96;&#96;

&#96;&#96;&#96;{d3 data=bars, options=list(color = 'orange')}
svg.selectAll('rect')
  .data(data)
  .enter()
    .append('rect')
      .attr('width', function(d) { return d * 10; })
      .attr('height', '20px')
      .attr('y', function(d, i) { return i * 22; })
      .attr('fill', options.color);
&#96;&#96;&#96;</code></pre><img src="https://www.rstudio.com/blog-images/r2d3-r-interface-to-d3-visualizations/rmarkdown-1.png" class="illustration" width=600/><p>Note that in order to use the <code>d3</code> engine you need to add <code>library(r2d3)</code> to the setup chunk (as illustrated above).</p><p>You can also, of course, call the <code>r2d3()</code> function from within an R code chunk:</p><pre><code>---
output: html_document
---

&#96;&#96;&#96;{r}
library(r2d3)
r2d3(data = c(0.3, 0.6, 0.8, 0.95, 0.40, 0.20), script = "barchart.js")
&#96;&#96;&#96;</code></pre><img src="https://www.rstudio.com/blog-images/r2d3-r-interface-to-d3-visualizations/bar_chart.png" class="illustration" width=600/><h2 id="shiny">Shiny</h2><p>The <code>renderD3()</code> and <code>d3Output()</code> functions enable you to include D3 visualizations within Shiny applications:</p><pre><code>library(shiny)
library(r2d3)

ui &lt;- fluidPage(
  inputPanel(
    sliderInput(&quot;bar_max&quot;, label = &quot;Max:&quot;,
                min = 0.1, max = 1.0, value = 0.2, step = 0.1)
  ),
  d3Output(&quot;d3&quot;)
)

server &lt;- function(input, output) {
  output$d3 &lt;- renderD3({
    r2d3(
      runif(5, 0, input$bar_max),
      script = system.file(&quot;examples/baranims.js&quot;, package = &quot;r2d3&quot;)
    )
  })
}

shinyApp(ui = ui, server = server)</code></pre><img src="https://www.rstudio.com/blog-images/r2d3-r-interface-to-d3-visualizations/baranim-1.gif" class="illustration" width=600/><p>See the article on <a href="https://rstudio.github.io/r2d3/articles/shiny.html">Using r2d3 with Shiny</a> to learn more (including how to create custom Shiny inputs that respond to user interaction with D3
visualizations).</p><h2 id="try-it-out">Try It Out</h2><p>To try out <strong>r2d3</strong>, start by installing the package from CRAN:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">r2d3&#34;</span>)</code></pre></div><p>Then, download the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio v1.2 Preview Release</a> and head over to <a href="https://rstudio.github.io/r2d3/">https://rstudio.github.io/r2d3/</a> for complete documentation on using the package.</p><p>If you aren&rsquo;t familiar with D3, check out these links to learn the basics and see some examples that might inspire your own work:</p><ul><li><p><a href="https://rstudio.github.io/r2d3/articles/learning_d3.html">Learning D3</a> — Suggested resources for learning how to create D3 visualizations.</p></li><li><p><a href="https://rstudio.github.io/r2d3/articles/gallery.html">Gallery of Examples</a> — Learn from a wide variety of example D3 visualizations.</p></li></ul><p>We hope that the <strong>r2d3</strong> package opens up many new horizons for creating custom interactive visualizations with R!</p><style type="text/css">.screenshot, .illustration {margin-bottom: 10px;margin-top: 10px;border: solid 1px #cccccc;}</style></description></item><item><title>RStudio 1.2 Preview: SQL Integration</title><link>https://www.rstudio.com/blog/rstudio-1-2-preview-sql/</link><pubDate>Tue, 02 Oct 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-1-2-preview-sql/</guid><description><p>The <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.2 Preview Release</a>, available today, dramatically improves support and interoperability with many new programming languages and platforms, including SQL, D3, Python, Stan, and C++.
Over the next few weeks on the blog, we&rsquo;re going to be taking a look at improvements for each of these in turn.</p><p>Today, we&rsquo;re looking at SQL, and as a motivating example, we&rsquo;re going to connect to a sample <em>Chinook</em> database to get a list of album titles.</p><h2 id="keyring-integration">Keyring Integration</h2><p>We&rsquo;ll start by connecting to the database. When connecting to databases that use usernames and passwords, it&rsquo;s not uncommon to see passwords stored in plain text in the connection string. It&rsquo;s not good practice, but it&rsquo;s understandable; it can be a big hassle to store and retrieve the password securely.</p><p>In RStudio 1.2, we&rsquo;ve made it much easier to secure your credentials. RStudio now integrates with the <a href="https://github.com/r-lib/keyring">keyring</a> package. Your password is stored, secure and encrypted, on your system&rsquo;s credential store (such as the MacOS Keychain or Windows Credential Store), so you can share your R code without leaking your password.</p><p>Instead of being prompted to make the password part of the connection string, you&rsquo;ll get a prompt to save it to your keyring.</p><p><img src="https://www.rstudio.com/blog-images/2018-10-02-rstudio-preview-sql-keyring.png" style="width: 631px" /></p><p>You can also take advantage of RStudio&rsquo;s API to prompt for secrets in your own packages. See <a href="https://support.rstudio.com/hc/en-us/articles/360000969634-Using-Keyring">Using Keyring</a> for more information.</p><h2 id="instant-query">Instant Query</h2><p>Great, we&rsquo;re connected; it&rsquo;s time to make a query! It&rsquo;s now a lot easier to build and execute SQL queries in RStudio. 
First, use the SQL button to generate a new SQL file with the open connection:</p><p><img src="https://www.rstudio.com/blog-images/2018-10-02-rstudio-preview-sql-template.png" style="width: 906px" /></p><h2 id="autocompletion">Autocompletion</h2><p>Now we need to refine our query with the fields we&rsquo;re interested in. RStudio can now autocomplete table names and field names associated with a connection. This works in <code>.sql</code> files, R Markdown documents, and R Notebooks. We&rsquo;ll use this to pick up the name of the <code>Title</code> field without extra typing or guessing.</p><p><img src="https://www.rstudio.com/blog-images/2018-10-02-rstudio-preview-sql-autocomplete.png" style="width: 492px" /></p><h2 id="instant-preview">Instant Preview</h2><p>You&rsquo;ll notice that there&rsquo;s a magic comment RStudio added to the top of the file:</p><pre><code>-- !preview conn=con</code></pre><p>This comment tells RStudio to execute the query against the open connection named <code>con</code>. We can now click <em>Preview</em> or press <em>Ctrl + Shift + Enter</em> to run the query. Results appear in a new tab:</p><p><img src="https://www.rstudio.com/blog-images/2018-10-02-rstudio-preview-sql-preview.png" style="width: 490px" /></p><p>You can also preview every time you save, if you&rsquo;re iterating quickly on your query and want to watch the results take shape as you go.</p><h2 id="filter">Filter</h2><p>Finally, you can now filter the list of displayed tables in the Connections pane by name. This is very useful when your database has a lot of tables!</p><p><img src="https://www.rstudio.com/blog-images/2018-10-02-rstudio-preview-sql-filter.png" style="width: 463px" /></p><h2 id="try-it-out">Try it out!</h2><p>If you&rsquo;d like to try these new features out now, you can download the latest preview release of RStudio from <a href="https://www.rstudio.com/products/rstudio/download/preview/">https://www.rstudio.com/products/rstudio/download/preview/</a>. 
If you do, we&rsquo;d very much appreciate your feedback on the <a href="https://community.rstudio.com/c/rstudio-ide">RStudio Community Forum</a>!</p></description></item><item><title>sparklyr 0.9: Streams and Kubernetes</title><link>https://www.rstudio.com/blog/sparklyr-0-9/</link><pubDate>Mon, 01 Oct 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/sparklyr-0-9/</guid><description><p>Today we are excited to share that a new release of <a href="https://spark.rstudio.com/">sparklyr</a> is <a href="https://CRAN.R-project.org/package=sparklyr">available on CRAN</a>! This <code>0.9</code> release enables you to:</p><ul><li>Create Spark structured <strong>streams</strong> to process real time data from many data sources using <a href="https://dplyr.tidyverse.org/">dplyr</a>, <a href="https://CRAN.R-project.org/package=DBI">SQL</a>, <a href="https://spark.rstudio.com/guides/pipelines/">pipelines</a>, and arbitrary R code.</li><li>Monitor connection progress with upcoming <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview 1.2</a> features and support for properly <strong>interrupting</strong> Spark jobs from R.</li><li>Use <strong>Kubernetes</strong> clusters with <code>sparklyr</code> to simplify deployment and maintenance.</li></ul><p>In addition, <code>sparklyr 0.9</code> adds support for <strong>Spark 2.3.1</strong> and <strong>Spark 2.2.3</strong> and <a href="https://github.com/rstudio/sparklyr/blob/master/NEWS.md#broom">extends</a> <a href="https://broom.tidyverse.org/">broom</a> models in <code>sparklyr</code>. An extensive list of improvements and fixes is available in the <a href="https://github.com/rstudio/sparklyr/blob/master/NEWS.md">sparklyr NEWS</a> file.</p><h2 id="streams">Streams</h2><p>Spark <a href="https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html">structured streams</a> provide parallel and fault-tolerant data processing, useful when analyzing real time data. 
You create a stream in <code>sparklyr</code> by defining sources, transformations, and a destination:</p><ul><li>The <strong>sources</strong> are defined using any of the <code>stream_read_*()</code> functions to read streams of data from various data sources.</li><li>The <strong>transformations</strong> can be specified using <code>dplyr</code>, <code>SQL</code>, scoring pipelines or R code through <code>spark_apply()</code>.</li><li>The <strong>destination</strong> is defined with the <code>stream_write_*()</code> functions; it is often also referred to as a sink.</li></ul><p>For instance, after connecting with <code>sc &lt;- spark_connect(master = &quot;local&quot;)</code>, the simplest stream we can define is one to continuously copy text files between a <code>source</code> folder and a <code>destination</code> folder as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">stream_read_text</span>(sc, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">source/&#34;</span>) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">stream_write_text</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">destination/&#34;</span>)</code></pre></div><pre><code>Stream: 1857a67b-38f7-4f78-8a4c-959594bf0c70
Status: Waiting for next trigger
Active: TRUE</code></pre><p>Once this is executed, <code>sparklyr</code> creates the stream and starts running it; the stream will be destroyed when the R session terminates or when <code>stream_stop()</code> is called on the stream instance.</p><p>There are many useful use cases for streams. For example, you can use streams to analyze access logs in an Amazon S3 bucket in real time. 
The following example creates a stream over an S3 bucket containing access logs, parses the log entries with the <a href="https://CRAN.R-project.org/package=webreadr">webreadr</a> package through <a href="https://spark.rstudio.com/guides/distributed-r/">spark_apply()</a>, finds the most accessed objects using <code>dplyr</code>, and writes the results into an in-memory data frame:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">urls_stream &lt;- stream_read_text(sc, &#34;s3a://your-s3-bucket/&#34;) %&gt;%
  spark_apply(
    ~ webreadr::read_s3(paste(c(.x$line, &#34;&#34;), collapse = &#34;\n&#34;)),
    columns = lapply(webreadr::read_s3(&#34;\n&#34;), class)
  ) %&gt;%
  group_by(uri) %&gt;%
  summarize(n = n()) %&gt;%
  arrange(desc(n)) %&gt;%
  stream_write_memory(&#34;urls_stream&#34;, mode = &#34;complete&#34;)</code></pre></div><p>Now that the <code>urls_stream</code> is running, we can view the data being processed through:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">stream_view</span>(urls_stream)</code></pre></div><img src="https://www.rstudio.com/blog-images/2018-10-01-sparklyr-stream-view.png" alt="RStudio monitoring sparklyr job" style="width: 100%"/><p>You can also easily display streaming data using <a href="https://shiny.rstudio.com/">Shiny</a>. Use the <code>sparklyr::reactiveSpark()</code> function to create a Shiny reactive from streaming data that can then be used to interact with other Shiny components and visualizations.</p><p>For instance, we can create a Shiny app using Spark streams that counts words from text files under a <code>source/</code> folder as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(shiny)
library(sparklyr)
library(ggplot2)
library(dplyr)

sc &lt;- spark_connect(master = &#34;local&#34;)
dir.create(&#34;source&#34;)

reactiveCount &lt;- stream_read_text(sc, &#34;source/&#34;) %&gt;%
  ft_tokenizer(&#34;line&#34;, &#34;tokens&#34;) %&gt;%
  ft_stop_words_remover(&#34;tokens&#34;, &#34;words&#34;) %&gt;%
  transmute(words = explode(words)) %&gt;%
  filter(nchar(words) &gt; 0) %&gt;%
  group_by(words) %&gt;%
  summarize(n = n()) %&gt;%
  arrange(desc(n)) %&gt;%
  filter(n &gt; 100) %&gt;%
  reactiveSpark()

ui &lt;- fluidPage(plotOutput(&#34;wordsPlot&#34;))

server &lt;- function(input, output) {
  output$wordsPlot &lt;- renderPlot({
    reactiveCount() %&gt;%
      head(n = 10) %&gt;%
      ggplot() + aes(x = words, y = n) + geom_bar(stat = &#34;identity&#34;)
  })
}

shinyApp(ui = ui, server = server)</code></pre></div><p>We can then write Jane Austen&rsquo;s books to this folder, starting with <code>writeLines(janeaustenr::emma, &quot;source/emma.txt&quot;)</code> and similar code for the remaining ones; each time a book is saved, the Shiny app updates accordingly:</p><img src="https://www.rstudio.com/blog-images/2018-10-01-sparklyr-shiny-app-books.gif" alt="Shiny app using Spark stream to count words in Emma" style="border: solid 1px #DDD"/><p>You can learn more about <code>sparklyr</code> streaming at <a href="https://spark.rstudio.com/guides/streaming/">https://spark.rstudio.com/guides/streaming/</a>.</p><h2 id="monitoring-and-interrupting-jobs">Monitoring and Interrupting Jobs</h2><p>In <code>sparklyr 0.9</code>, you can now gracefully interrupt long-running operations and reuse the Spark connection to execute other operations. This is useful when you execute a query or modeling function that is taking longer than expected, or when you didn&rsquo;t quite execute the code you wanted to. 
For example:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"># Stop the following long-running operation with `ctrl+c` or &#39;stop&#39; in RStudio
sdf_len(sc, 10) %&gt;% spark_apply(~ Sys.sleep(60 * 10))

# Start a new operation without having to restart the Spark context.
sdf_len(sc, 10)</code></pre></div><p>While running <code>sparklyr 0.9</code> under <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview 1.2</a>, long-running jobs will be displayed in the console tab and under the Jobs panel:</p><img src="https://www.rstudio.com/blog-images/2018-10-01-sparklyr-monitored-connections.png" alt="RStudio monitoring sparklyr job" style="width: 80%"/><h2 id="kubernetes">Kubernetes</h2><p><code>sparklyr 0.9</code> enables support for Kubernetes. 
A cluster from a properly configured client can be launched as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(config <span style="color:#666">=</span> <span style="color:#06287e">spark_config_kubernetes</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">k8s://hostname:8443&#34;</span>))</code></pre></div><p>Please note that Spark on Kubernetes requires a proper container image; see <code>?spark_config_kubernetes</code> for details. In addition, Kubernetes support <a href="https://spark.apache.org/docs/2.3.0/running-on-kubernetes.html">was just added in Spark 2.3.0</a> and the Kubernetes scheduler is currently experimental in Spark.</p><p>We hope you enjoy all the new features in sparklyr 0.9! You can read more about these features and others at <a href="https://spark.rstudio.com/">https://spark.rstudio.com/</a>, get help from the R community at <a href="https://community.rstudio.com/tags/sparklyr">https://community.rstudio.com/tags/sparklyr</a>, and report issues with sparklyr at <a href="https://github.com/rstudio/sparklyr">https://github.com/rstudio/sparklyr</a>.</p></description></item><item><title>RStudio Connect 1.6.8 - Emails, APIs, and Titles</title><link>https://www.rstudio.com/blog/rstudio-connect-1-6-8-emails-apis-and-titles/</link><pubDate>Thu, 20 Sep 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-6-8-emails-apis-and-titles/</guid><description><p>RStudio Connect 1.6.8 includes additions to custom emails, new user endpoints in the RStudio Connect Server API, support for content descriptions and title changes, and important security and authentication improvements.</p><p><img src="/blog-images/rsc-168-email.png" style="height: 400px"/></p><h2 id="updates">Updates</h2><ul><li><strong>R Markdown 
Reports</strong> have access to environment variables containing <a href="http://docs.rstudio.com/connect/1.6.8/user/r-markdown.html#r-markdown-including-urls">metadata about the report on RStudio Connect</a>. This addition is especially important for <a href="http://docs.rstudio.com/connect/1.6.8/user/r-markdown.html#r-markdown-email-customization">custom emails</a>. In case you missed it, recent versions of RStudio Connect allow data scientists to distribute beautiful emails that can include plots, tables, and dynamically generated text. The new metadata can be used to craft an email that includes a link back to the original report or dashboard.</li></ul><p><img src="/blog-images/rsc-168-footer.png" style="width: 350px"/></p><ul><li><strong>Users in the Connect Server API</strong> RStudio Connect v1.6.6 introduced the ability to list user information with the <a href="http://docs.rstudio.com/connect/1.6.8/api">Connect Server API</a>. Version 1.6.8 extends the Server API to include endpoints for creating and updating users. These endpoints can be used with PAM, proxy, or built-in authentication providers. Future releases will include API support for LDAP/AD and OAuth authentication. As an example, if you are using proxy authentication, you can now programmatically create a user and give them access to content before they log in. The API request can specify all of a user&rsquo;s attributes or only some. Users will be asked to complete their profile on first login. 
See the new <a href="http://docs.rstudio.com/connect/1.6.8/admin/user-management.html#user-provisioning">admin documentation</a> on user provisioning for more information.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-bash" data-lang="bash">curl -v -X POST https://connect.example.com/__api__/v1/users \
  -H <span style="color:#4070a0">&#34;Content-Type: application/json&#34;</span> \
  -H <span style="color:#4070a0">&#34;Authentication: Key ***API_KEY***&#34;</span> \
  -d <span style="color:#4070a0">&#39;{
    &#34;username&#34;: &#34;john_doe&#34;,
    &#34;first_name&#34;: &#34;John&#34;,
    &#34;last_name&#34;: &#34;Doe&#34;,
    &#34;email&#34;: &#34;jdoe@example.com&#34;
  }&#39;</span></code></pre></div><ul><li><strong>Content Titles and Descriptions</strong> The “Info” settings panel allows publishers and collaborators to edit the content title or - new this release - add a content description. The title and description are visible to administrators and to viewers with access to the content. 
Future RStudio Connect releases will add support for content images, and incorporate all of this information into the content listing page - stay tuned!</li></ul><p><img src="/blog-images/rsc-168-info.png" style="width: 350px"/></p><p><em>Note</em>: At this time, changing the content title in RStudio Connect does not update the title in the RStudio IDE publish dialog.</p><h2 id="security--authentication-changes">Security &amp; Authentication Changes</h2><ul><li><p><strong>TLS Versions</strong> RStudio Connect now supports the <code>HTTPS.MinimumTLS</code> <a href="http://docs.rstudio.com/connect/1.6.8/admin/appendix-configuration.html#appendix-configuration-https">configuration setting</a>, which can be used to change the TLS version in use. Specific TLS ciphers can be prohibited using the <code>HTTPS.ProhibitedCiphers</code> configuration setting. Before you consider making your server more restrictive, ensure that all supported clients (both browsers <em>and</em> R) support the more restrictive settings.</p></li><li><p><strong>LDAP / Active Directory Groups</strong> For LDAP or AD installations, administrators should add the <a href="http://docs.rstudio.com/connect/1.6.8/admin/authentication.html#ldap-or-ad-configuration-settings"><code>LDAP.GroupUniqueIdAttribute</code></a> setting to identify which directory attribute uniquely identifies a group. 
Existing installations will continue working, but an upcoming release will require this setting on start-up.</p></li><li><p>The <a href="http://docs.rstudio.com/connect/1.6.8/admin/cli.html#cli-usermanager">CLI <code>usermanager</code> utility</a> has improved support for group management and offers better support for installations using LDAP/AD.</p></li></ul><h2 id="deprecations--breaking-changes">Deprecations &amp; Breaking Changes</h2><ul><li><p><strong>Breaking Change</strong> The configuration value <code>Server.SenderEmail</code> is validated at start-up, and invalid email addresses will prevent RStudio Connect from starting.</p></li><li><p><strong>Breaking Change</strong> <code>Applications.EnvironmentBlacklist</code>, deprecated in 1.6.6, has been removed in favor of <code>Applications.ProhibitedEnvironment</code>.</p></li><li><p><strong>Breaking Change</strong> <code>LDAP.WhitelistedLoginGroups</code>, deprecated in v1.6.6, has been removed in favor of <code>LDAP.PermittedLoginGroups</code>.</p></li><li><p>The <code>xhr-streaming</code> SockJS protocol has been disabled for Microsoft Edge to fix a bug where Shiny applications became unresponsive. Shiny applications will fall back to a different protocol automatically and will work without any changes.</p></li></ul><p>Please review the <a href="http://docs.rstudio.com/connect/news">full release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>If you use LDAP or Active Directory, please take note of the LDAP changes described above and in the release notes. Aside from the deprecations and breaking changes above, there are no other special considerations, and upgrading should only take a few seconds. 
If you are upgrading from an earlier version, be sure to consult the release notes for the intermediate releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>Radix for R Markdown</title><link>https://www.rstudio.com/blog/radix-for-r-markdown/</link><pubDate>Wed, 19 Sep 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/radix-for-r-markdown/</guid><description><p>Today we&rsquo;re excited to announce <a href="https://rstudio.github.io/radix/">Radix</a>, a new R Markdown format optimized for scientific and technical communication. Features of Radix include:</p><ul><li><p>Reader-friendly typography that adapts well to mobile devices.</p></li><li><p>Flexible <a href="https://rstudio.github.io/radix/figures.html">figure layout</a> options (e.g. 
displaying figures at a larger width than the article text).</p></li><li><p>Tools for making articles <a href="https://rstudio.github.io/radix/citations.html">easily citeable</a>, as well as for generating <a href="https://rstudio.github.io/radix/citations.html#google-scholar">Google Scholar</a> compatible citation metadata.</p></li><li><p>The ability to incorporate JavaScript and D3-based <a href="https://rstudio.github.io/radix/interactivity.html">interactive visualizations</a>.</p></li><li><p>A variety of ways to <a href="https://rstudio.github.io/radix/publish_article.html">publish articles</a>, including support for publishing sets of articles as a <a href="https://rstudio.github.io/radix/website.html">Radix website</a>.</p></li><li><p>The ability to <a href="https://rstudio.github.io/radix/blog.html">create a blog</a> composed of a collection of Radix articles.</p></li></ul><p>Radix is based on the <a href="https://github.com/distillpub/template">Distill web framework</a>, which was originally created for use in the Distill Machine Learning Journal. Radix combines the technical authoring features of Distill with <a href="https://rmarkdown.rstudio.com/">R Markdown</a>.</p><p>Below we&rsquo;ll demonstrate some of the key features of Radix. To learn more about installing and using Radix, check out the <a href="https://rstudio.github.io/radix/">Radix for R Markdown</a> website.</p><h2 id="figure-layout">Figure layout</h2><p>Radix provides many flexible options for laying out figures. While the main text column in Radix articles is relatively narrow (optimized for comfortable reading), figures can occupy a larger region. 
For example:</p><figure><img src="https://www.rstudio.com/blog-images/2018-09-17-radix-wider-layouts.png" class="screenshot"/></figure><p>For figures you want to emphasize or that require lots of visual space, you can also create layouts that occupy the entire width of the screen:</p><figure><img src="https://www.rstudio.com/blog-images/2018-09-17-radix-fullscreen-layout.png" class="screenshot"/></figure><p>Of course, some figures and notes are only ancillary and are therefore better placed in the margin:</p><figure><img src="https://www.rstudio.com/blog-images/2018-09-17-radix-footnotes-and-asides.png" class="screenshot"/></figure><h2 id="citations-and-metadata">Citations and metadata</h2><p>Radix articles support including citations and a corresponding bibliography using standard R Markdown citation syntax.</p><p>In addition, when you provide a <code>citation_url</code> metadata field for your article, a citation appendix that makes it easy for others to cite your article is automatically generated:</p><figure><img src="https://rstudio.github.io/radix/images/citation.png" class="screenshot"/></figure><p>Radix also automatically includes standard <a href="http://ogp.me/">Open Graph</a> and <a href="https://developer.twitter.com/en/docs/tweets/optimize-with-cards/overview/abouts-cards">Twitter Card</a> metadata. This makes links to your article display rich metadata when shared in various places:</p><figure><img src="https://www.rstudio.com/blog-images/2018-09-17-radix-metadata.png" class="screenshot"/></figure><h2 id="creating-a-blog">Creating a blog</h2><p>You can publish a series of Radix articles as either a website or a blog. 
For example, the <a href="https://blogs.rstudio.com/tensorflow/">TensorFlow for R</a> blog is implemented using Radix:</p><figure><img src="https://www.rstudio.com/blog-images/2018-09-17-radix-blog.png" class="screenshot"/></figure><p>To learn more, see the article on <a href="https://rstudio.github.io/radix/blog.html">creating a blog with Radix</a>.</p><h2 id="getting-started">Getting started</h2><p>To create an <a href="https://rmarkdown.rstudio.com">R Markdown</a> document that uses the Radix format, first install the <strong>radix</strong> R package:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">devtools<span style="color:#666">::</span><span style="color:#06287e">install_github</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rstudio/radix&#34;</span>)</code></pre></div><p>Using Radix requires Pandoc v2.0 or higher. If you are using RStudio then you should use RStudio v1.2.718 or higher (which comes bundled with Pandoc v2.0). You can download the preview release of RStudio v1.2 at <a href="https://www.rstudio.com/products/rstudio/download/preview/">https://www.rstudio.com/products/rstudio/download/preview/</a>.</p><p>Next, use the <strong>New R Markdown</strong> dialog within RStudio to create a new Radix article:</p><figure><img src="https://rstudio.github.io/radix/images/new_radix_article.png" class="screenshot"/></figure><p>This will give you a minimal new Radix document.</p><p>Then, check out the <a href="https://rstudio.github.io/radix/">Radix for R Markdown</a> website to learn more about what&rsquo;s possible. 
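<p>For orientation, the YAML front matter of a new Radix article looks roughly like this. This is a sketch based on the Radix documentation; the title, author, and date values are placeholders:</p>

```yaml
---
title: "Untitled"
description: |
  A short description of what the article covers.
author:
  - name: "Jane Doe"            # placeholder
    url: https://example.com    # placeholder
date: "September 19, 2018"
output: radix::radix_article
---
```

<p>The <code>output: radix::radix_article</code> line is what selects the Radix format when the document is knit.</p>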
Happy authoring!</p><style type="text/css">.screenshot {border: 1px solid rgba(0, 0, 0, 0.2);}</style></description></item><item><title>Deadline extended for rstudio::conf(2019) abstract submissions</title><link>https://www.rstudio.com/blog/rstudio-conf-2019-submission-deadline-extended/</link><pubDate>Fri, 14 Sep 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2019-submission-deadline-extended/</guid><description><p><a href="https://www.rstudio.com/conference/">rstudio::conf</a>, the conference on all things R and RStudio, will take place January 17 and 18, 2019 in Austin, Texas, preceded by Training Days on January 15 and 16.</p><p>We&rsquo;ve received requests from a number of you for permission to submit talk/e-poster abstracts after the deadline (this Saturday, September 15). In response, we&rsquo;re extending the deadline by a week for everyone; <strong>the new submission deadline is September 22,</strong> a week from Saturday. We&rsquo;ll still notify you of our decision on October 1.</p><p>See our <a href="https://blog.rstudio.com/2018/08/20/rstudio-conf-2019-contributed-talks-eposters/">earlier post</a> for submission guidelines.</p><p><a href="https://goo.gl/forms/zMI5Jcy4FpU6X7Sb2" style="padding: 12px 20px; border: none; font-size: 18px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0 1px 3px 0 rgba(0,0,0,0.10);">Apply now!</a></p></description></item><item><title>Getting started with deep learning in R</title><link>https://www.rstudio.com/blog/getting-started-with-deep-learning-in-r/</link><pubDate>Wed, 12 Sep 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/getting-started-with-deep-learning-in-r/</guid><description><p>There are good reasons to get into deep learning: Deep learning has been outperforming &ldquo;classical&rdquo; techniques in areas like image recognition and natural language processing for a while now, and 
it has the potential to bring interesting insights even to the analysis of tabular data. For many R users interested in deep learning, the hurdle is not so much the mathematical prerequisites (as many have a background in statistics or empirical sciences), but rather how to get started in an efficient way.</p><p>This post will give an overview of some materials that should prove useful. In the case that you don&rsquo;t have that background in statistics or similar, we will also present a few helpful resources to catch up with &ldquo;the math&rdquo;.</p><h2 id="keras-tutorials">Keras tutorials</h2><p>The easiest way to get started is using the Keras API. It is a high-level, declarative (in feel) way of specifying a model, training and testing it, originally developed in <a href="http://keras.io">Python</a> by Francois Chollet and ported to R by JJ Allaire.</p><p>Check out the tutorials on the <a href="https://tensorflow.rstudio.com/keras/">Keras website</a>: They introduce basic tasks like classification and regression, as well as basic workflow elements like saving and restoring models, or assessing model performance.</p><ul><li><p><a href="https://tensorflow.rstudio.com/keras/articles/tutorial_basic_classification.html">Basic classification</a> gets you started doing image classification using the <em>Fashion MNIST</em> dataset.</p></li><li><p><a href="https://tensorflow.rstudio.com/keras/articles/tutorial_basic_text_classification.html">Text classification</a> shows how to do sentiment analysis on movie reviews, and includes the important topic of how to preprocess text for deep learning.</p></li><li><p><a href="https://tensorflow.rstudio.com/keras/articles/tutorial_basic_regression.html">Basic regression</a> demonstrates the task of predicting a continuous variable by example of the famous Boston housing dataset that ships with Keras.</p></li><li><p><a href="https://tensorflow.rstudio.com/keras/articles/tutorial_overfit_underfit.html">Overfitting and 
underfitting</a> explains how you can assess if your model is under- or over-fitting, and what remedies to take.</p></li><li><p>Last but not least, <a href="https://tensorflow.rstudio.com/keras/articles/tutorial_save_and_restore.html">Save and restore models</a> shows how to save checkpoints during and after training, so you don&rsquo;t lose the fruit of the network&rsquo;s labor.</p></li></ul><p>Once you&rsquo;ve seen the basics, the website also has more advanced information on implementing custom logic, monitoring and tuning, as well as using and adapting pre-trained models.</p><h2 id="videos-and-book">Videos and book</h2><p>If you want a bit more conceptual background, the <a href="https://bit.ly/2oPtXWv">Deep Learning with R in motion</a> video series provides a nice introduction to basic concepts of machine learning and deep learning, including things often taken for granted, such as derivatives and gradients.</p><figure><a href="https://bit.ly/2oPtXWv"><img src="https://blogs.rstudio.com/tensorflow/posts/2018-09-07-getting-started/images/dl_in_motion.png" style="border: 1px solid rgba(0, 0, 0, 0.2);"></a><figcaption><em>Example from Deep Learning with R in motion, video 2.7, From Derivatives to Gradients</em></figcaption></figure><p>The first 2 components of the video series (<a href="https://bit.ly/2oPtXWv">Getting Started</a> and the <a href="https://bit.ly/2MY6YHj">MNIST Case Study</a>) are free. 
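</p><p>As a quick taste of the workflow these tutorials teach, here is a minimal sketch in R (this assumes the <code>keras</code> package with a TensorFlow backend is installed; the layer sizes and epoch count are illustrative, not a recommendation from the tutorials):</p><pre><code># A small dense network for 28x28 grayscale images and 10 classes,
# as in the Fashion MNIST tutorial
library(keras)

model &lt;- keras_model_sequential() %&gt;%
  layer_flatten(input_shape = c(28, 28)) %&gt;%
  layer_dense(units = 128, activation = "relu") %&gt;%
  layer_dense(units = 10, activation = "softmax")

model %&gt;% compile(
  optimizer = "adam",
  loss = "sparse_categorical_crossentropy",
  metrics = "accuracy"
)

# train_images and train_labels as returned by dataset_fashion_mnist()
model %&gt;% fit(train_images, train_labels, epochs = 5)
</code></pre><p>With data loaded, <code>evaluate()</code> and <code>predict()</code> then close the loop from training to assessment.</p><p>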
The remainder of the videos introduce different neural network architectures by way of detailed case studies.</p><a href="https://www.amazon.com/Deep-Learning-R-Francois-Chollet/dp/161729554X"><img src="https://blogs.rstudio.com/tensorflow/posts/2018-09-07-getting-started/images/dlwR.png" style="width: 15%; border: 1px solid rgba(0, 0, 0, 0.2); float:right; margin-left: 20px; margin-right: 40px;"></a><p>The series is a companion to the <a href="https://www.amazon.com/Deep-Learning-R-Francois-Chollet/dp/161729554X">Deep Learning with R</a> book by Francois Chollet and JJ Allaire.</p><p>Like the videos, the book has excellent, high-level explanations of deep learning concepts. At the same time, it contains lots of ready-to-use code, presenting examples for all the major architectures and use cases (including fancy stuff like variational autoencoders and GANs).</p><br/><h2 id="inspiration">Inspiration</h2><p>If you&rsquo;re not pursuing a specific goal, but in general curious about what can be done with deep learning, a good place to follow is the <a href="https://blogs.rstudio.com/tensorflow/">TensorFlow for R Blog</a>. There, you&rsquo;ll find applications of deep learning to business as well as scientific tasks, as well as technical expositions and introductions to new features.</p><figure><a href="https://blogs.rstudio.com/tensorflow/"><img src="https://www.rstudio.com/blog-images/2018-09-12-tensorflow-blog.png" style="border: 1px solid rgba(0, 0, 0, 0.2);"/></a></figure><p>In addition, the <a href="https://blogs.rstudio.com/tensorflow/gallery.html">TensorFlow for R Gallery</a> highlights several case studies that have proven especially useful for getting started in various areas of application.</p><h2 id="reality">Reality</h2><p>Once the ideas are there, realization should follow, and for most of us the question will be: Where can I actually <em>train</em> that model? 
As soon as real-world-size images are involved, or other kinds of higher-dimensional data, you&rsquo;ll need a modern, high-performance GPU, so training on your laptop won&rsquo;t be an option anymore.</p><p>There are a few different ways you can train in the cloud:</p><ul><li><p>RStudio provides <a href="https://tensorflow.rstudio.com/tools/cloud_server_gpu.html">Amazon EC2 AMIs for cloud GPU instances</a>. The AMI has both RStudio Server and the R TensorFlow package suite preinstalled.</p></li><li><p>You can also try out <a href="https://tensorflow.rstudio.com/tools/cloud_desktop_gpu.html">Paperspace cloud GPU desktops</a> (again with RStudio and the R TensorFlow package suite preinstalled).</p></li><li><p>The <em>cloudml</em> package provides an <a href="https://tensorflow.rstudio.com/tools/cloudml/articles/getting_started.html">interface to the Google Cloud Machine Learning engine</a>, which makes it easy to submit batch GPU training jobs to CloudML.</p></li></ul><h2 id="more-background">More background</h2><p>If you don&rsquo;t have a very &ldquo;mathy&rdquo; background, you might feel that you&rsquo;d like to supplement the concepts-focused approach from <em>Deep Learning with R</em> with a bit more low-level basics (just as some people feel the need to know at least a bit of C or Assembler when learning a high-level language).</p><p>Personal recommendations for such cases would include Andrew Ng&rsquo;s <a href="https://www.coursera.org/specializations/deep-learning">deep learning specialization</a> on Coursera (videos are free to watch), and the book(s) and recorded lectures on linear algebra by <a href="https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/">Gilbert Strang</a>.</p><p>Of course, the ultimate reference on deep learning, as of today, is the <a href="https://www.deeplearningbook.org">Deep Learning</a> textbook by Ian Goodfellow, Yoshua Bengio and Aaron Courville.
The book covers everything from background in linear algebra, probability theory, and optimization, via basic architectures such as CNNs or RNNs, on to unsupervised models on the frontier of the very latest research.</p><h2 id="getting-help">Getting help</h2><p>Last but not least, should you encounter problems with the software (or with mapping your task to runnable code), a good idea is to create a GitHub issue in the respective repository, e.g., <a href="https://github.com/rstudio/keras/">rstudio/keras</a>.</p><p>Best of luck on your deep learning journey with R!</p></description></item><item><title>Shiny Server (Pro) 1.5.8</title><link>https://www.rstudio.com/blog/shiny-server-pro-1-5-8/</link><pubDate>Tue, 04 Sep 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-server-pro-1-5-8/</guid><description><p><a href="https://www.rstudio.com/products/shiny/shiny-server/">Shiny Server 1.5.8.921 and Shiny Server Pro 1.5.8.985 are now available.</a></p><p>This release includes support for listening on IPv6 addresses. It also fixes issues with servers that have home directories mounted over NFS with <code>root_squash</code>, and with networks that use double-bind LDAP with restrictive permissions on user accounts.</p><p>Finally, this release changes the default SSL/TLS configuration in Shiny Server Pro to remove support for the obsolete and insecure TLSv1 protocol.</p><h3 id="shiny-server-158921">Shiny Server 1.5.8.921</h3><ul><li><p>Upgrade to Node v8.11.3.</p></li><li><p>Added support for listening on IPv6 addresses.</p></li><li><p>X-Powered-By response header now reports &ldquo;Shiny Server&rdquo; instead of &ldquo;Express&rdquo;.</p></li><li><p>Resolve permissions issues when the log directory is on an NFS mount with <code>root_squash</code>. The <code>log_as_user</code> directive was intended to work for these situations, but would fail in common configurations.
It should now work.</p></li><li><p><code>log_file_mode</code> no longer respects the process umask, and the default has been changed from <code>0660</code> to <code>0640</code>.</p></li><li><p>The exit code of the shiny-server process was always 0, regardless of the reason the process exited. Now a non-zero exit code is used if the process was terminated by a signal, an unhandled error crashed the process, or loading of the shiny-server.conf config file failed during startup.</p></li></ul><h3 id="shiny-server-pro-158985">Shiny Server Pro 1.5.8.985</h3><p>The above changes, plus:</p><ul><li><p>For LDAP double-bind authentication, use the base_bind account to iterate the user&rsquo;s groups (rather than the user&rsquo;s own LDAP account, which sometimes does not have permissions to see its own groups).</p></li><li><p>Added the <code>auth_ignore_case</code> directive, which can be used to treat <code>required_user</code> and <code>required_group</code> directives as case-insensitive. Disabled by default, as it&rsquo;s only safe to use on systems that prevent the creation of users/groups whose names vary from existing users/groups only by case.</p></li><li><p>For SSL/TLS configurations, remove support for TLSv1 by default (SSLv2 and v3 were already not supported).
If a stricter or looser policy is desired, this can be achieved by adding <code>ssl_min_version</code> as a child directive of <code>ssl</code>; valid values for <code>ssl_min_version</code> are <code>tlsv1</code>, <code>tlsv11</code>, and <code>tlsv12</code>.</p></li></ul></description></item><item><title>rstudio::conf(2019) contributed talks & e-posters</title><link>https://www.rstudio.com/blog/rstudio-conf-2019-contributed-talks-eposters/</link><pubDate>Mon, 20 Aug 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2019-contributed-talks-eposters/</guid><description><p><a href="https://www.rstudio.com/conference/">rstudio::conf</a>, the conference on all things R and RStudio, will take place January 17 and 18, 2019 in Austin, Texas, preceded by Training Days on January 15 and 16. We are pleased to announce that this year’s conference includes contributed <strong>talks</strong> and <strong>e-posters</strong>!</p><p>There are fifteen contributed talk slots, each 20 minutes long, scheduled alongside talks by RStudio employees and invited speakers. We expect most talks to be attended by 400+ people, and are looking for interesting topics and engaging speakers.</p><p>Twenty e-posters will be shown during the opening reception on Thursday evening: we’ll provide a big screen, power, internet, drinks and snacks; you’ll provide a laptop with an innovative display or demo. Posters are a great opportunity to showcase your work and engage one-on-one with other attendees.
The opening reception is open to all conference attendees, and was very well attended last year.</p><p>We are particularly interested in submissions that have one or more of these qualities:</p><ul><li>Showcase the use of R and RStudio’s tools to solve real problems.</li><li>Expand the tidyverse to reach new domains and audiences.</li><li>Combine R with other world-class tools, like Python, TensorFlow, and Spark.</li><li>Communicate using R, whether it’s building on top of RMarkdown, Shiny, ggplot2, or something else altogether.</li><li>Discuss how to teach R effectively.</li></ul><p>If accepted, you’ll receive complimentary registration for the conference. (If you have already registered, we’ll refund your registration.)</p><p>Applications close Sept 15, and you’ll be notified of our decision on Oct 1.</p><p><a href="https://goo.gl/forms/zMI5Jcy4FpU6X7Sb2" style="padding: 12px 20px; border: none; font-size: 18px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Apply now!</a></p></description></item><item><title>R/Medicine Conference</title><link>https://www.rstudio.com/blog/r-medicine-conference/</link><pubDate>Mon, 13 Aug 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-medicine-conference/</guid><description><p>We are less than one month away from the <a href="http://www.r-medicine.com/">R/Medicine conference</a>! Tickets are still available to connect with the R users who are advancing the way we think about human health.</p><p>The goal of the R/Medicine conference is to promote the use of the R programming environment and the R ecosystem in medical research and clinical practice.
In addition to showcasing novel tools, algorithms and methods for analyzing medical and clinical data, we hope the conference will provide a forum for collaboration within the community.</p><p>The keynote speakers will be:</p><ul><li><a href="https://statweb.stanford.edu/~tibs/">Robert Tibshirani</a> (<a href="https://twitter.com/robtibshirani">@robtibshirani</a>), Professor of Biomedical Data Science and Statistics at Stanford University</li><li><a href="https://ischool.illinois.edu/people/faculty/vcs">Victoria Stodden</a> (<a href="https://twitter.com/victoriastodden">@victoriastodden</a>), Associate Professor of Information Sciences at the University of Illinois at Urbana-Champaign</li><li><a href="https://www.linkedin.com/in/michael-lawrence-74a9b482">Michael Lawrence</a> (<a href="https://twitter.com/lawremi">@lawremi</a>), Core Member of R and Bioconductor and Computational Biologist at Genentech</li><li><a href="https://medicine.yale.edu/intmed/people/harlan_krumholz.profile">Harlan M Krumholz</a> (<a href="https://twitter.com/hmkyale">@hmkyale</a>), Professor of Medicine and Professor in the Institute for Social and Policy Studies, Yale University.</li></ul><p>Conference talks will address the use of R in medical applications from Phase I clinical trial design through the analysis of the efficacy of medical therapies in public use.
Topics include clinical trial design, the analysis of clinical trial data, personalized medicine, the analysis of patient records, the analysis of genetic data, the visualization of medical data, and reproducible research.</p><p>Additionally, <a href="https://stat.duke.edu/people/mine-cetinkaya-rundel">Mine Cetinkaya-Rundel</a> will conduct an introductory workshop on <a href="https://shiny.rstudio.com/">Shiny</a>, and <a href="http://www.columbia.edu/~bg2382/">Ben Goodrich</a> will offer a workshop on <a href="http://mc-stan.org/">Stan</a>.</p><p>The conference will be held at <a href="https://www.google.com/maps/place/Omni+New+Haven+Hotel+at+Yale/@41.3057418,-72.9296893,17z/data=!3m1!4b1!4m7!3m6!1s0x89e7d84b45005aef:0x8cfc0fe0a0ceb073!5m1!1s2018-06-03!8m2!3d41.3057378!4d-72.9274953">The Omni Hotel</a>, 155 Temple Street, New Haven, CT, just off the Yale campus.</p><p>Tickets for the conference are $650 for industry attendees, $480 for academics, and $250 for students. You can register <a href="http://www.cvent.com/events/r-medicine-2018/event-summary-a6a3cda221d54495abe9711a2c33ca60.aspx">here</a>.</p><p>Don’t miss the opportunity to attend this inaugural event.</p></description></item><item><title>rstudio::conf(2019) diversity scholarships</title><link>https://www.rstudio.com/blog/rstudio-conf-2019-diversity-scholarships/</link><pubDate>Fri, 10 Aug 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2019-diversity-scholarships/</guid><description><p>rstudio::conf(2019) continues our tradition of diversity scholarships, and this year we’re doubling the program (again!)
to 38 recipients:</p><ul><li><p>32 domestic diversity scholarships available to anyone living in the US or Canada who is a member of a group that is under-represented at rstudio::conf.</p></li><li><p>6 international scholarships available to residents of Africa, South or Central America, or Mexico.</p></li></ul><p>Both groups will receive complimentary conference and workshop registration, and funds for travel and accommodation (up to $1000 for domestic candidates and $3000 for international). At the conference, scholars will also have networking opportunities with past diversity scholarship recipients as well as with leaders in the field.</p><p>In the long run, we hope that the rstudio::conf participants reflect the full diversity of the world around us. We believe in building on-ramps so that people from diverse backgrounds can learn R, build their knowledge, and then contribute back to the community. We also recognize that there are many parts of the world that do not offer easy access to R events. This year we identified Africa and South America / Mexico as regions that, broadly speaking, have had fewer R events in the recent past. We will continue to re-evaluate the regions where our scholarships can have the greatest impact and will adjust this program as rstudio::conf grows.</p><p>Scholarship applications will be evaluated on two main criteria:</p><ul><li><p>How will attending the conference impact you? What important problems will you be able to tackle that you couldn’t before? You will learn the most if you already have some experience with R, so show us what you’ve achieved so far.</p></li><li><p>How will you share your knowledge with others? We can’t help everyone, so we’re particularly interested in helping those who will go back to their communities and spread the love.
Show us how you’ve shared your skills and knowledge in the past and tell us what you plan to do in the future.</p></li></ul><p>The scholarships are competitive, so please don&rsquo;t waste words on generalities. Instead, get right to specifics about you and your achievements as they correspond to these criteria.</p><p><a href="https://goo.gl/forms/Mx6RwC2Bt9skAMMW2" style="padding: 12px 20px; border: none; font-size: 18px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Apply now!</a></p></description></item><item><title>What they forgot to teach you about R</title><link>https://www.rstudio.com/blog/what-they-forgot-to-teach-you-about-r/</link><pubDate>Tue, 07 Aug 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/what-they-forgot-to-teach-you-about-r/</guid><description><p>Join Jenny Bryan and Jim Hester of RStudio for this two-day hands-on workshop designed for experienced R and RStudio users who want to (re)design their R lifestyle! If you missed this sold-out course at rstudio::conf 2018, now is your chance.</p><p>Register here: <a href="https://www.rstudio.com/workshops/what-they-forgot-to-teach-you-about-r/">https://www.rstudio.com/workshops/what-they-forgot-to-teach-you-about-r/</a></p><p>In this workshop you’ll learn holistic workflows that address the most common sources of friction in data analysis. We’ll work on project-oriented workflows, version control for data science (Git/GitHub!), and how to plan for collaboration, communication, and iteration (incl. RMarkdown). In terms of your R skills, expect to come away with new knowledge of your R installation, how to maintain it, robust strategies for working with the file system, and ways to use the purrr package for repetitive tasks.</p><p>You should take this workshop if you’ve been using R for a while and you feel like writing R code is not what’s holding you back the most.
You’ve realized that you have more pressing “meta” problems that no one seems to talk about: how to divide your work into projects and scripts, how to expose your work to others, and how to get more connected to the R development scene. The tidyverse is not an explicit focus of the course (other than the purrr segment), and you can certainly work through the content without it. But you should expect a great deal of <a href="https://www.tidyverse.org/">tidyverse</a> exposure.</p><p>This course is taught by Jenny Bryan and Jim Hester.</p><p>Jenny is a Software Engineer and Data Scientist at RStudio and Adjunct Professor of Statistics at the University of British Columbia. Jenny is widely hailed for making GitHub a catalyst rather than an impediment to R happiness.</p><p>Jim is a software engineer on the tidyverse team at RStudio, with a background in Bioinformatics and Genomics. He is the author and maintainer of a number of R packages including covr, devtools, glue, readr, and more&hellip;</p><p><strong>When</strong> - 8 a.m. to 5 p.m. Thursday, October 4th and Friday, October 5th</p><p><strong>Where</strong> - 1900 5th Avenue, Seattle, WA, 98101 - <a href="http://www.westinseattle.com/">The Westin Seattle</a></p><p><strong>Who</strong> - Jenny Bryan &amp; Jim Hester</p><p>Register here: <a href="https://www.rstudio.com/workshops/what-they-forgot-to-teach-you-about-r/">https://www.rstudio.com/workshops/what-they-forgot-to-teach-you-about-r/</a></p><p>This workshop would be particularly effective for groups of 2 or more co-workers who want to reach and practice some shared decisions about workflow. As such, discount codes will be available for 2 or more attendees from any one organization.
Email <a href="mailto:training@rstudio.com">training@rstudio.com</a> if you have any questions about the workshop that you don’t find answered on the registration page.</p></description></item><item><title>rstudio::conf 2019 is open for registration!</title><link>https://www.rstudio.com/blog/rstudio-conf-2019-is-open-for-registration/</link><pubDate>Tue, 31 Jul 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2019-is-open-for-registration/</guid><description><p><a href="https://rstd.io/conf">rstudio::conf</a>, the conference for all things R and RStudio, will take place January 17 and 18, 2019 (Thursday and Friday) in Austin, Texas. It will be preceded by Training Days on January 15 and 16 (Tuesday and Wednesday). Early Bird registration is now open!</p><p><a href="http://rstd.io/conf"><img src="https://user-images.githubusercontent.com/163582/43474805-7a13ee82-94b9-11e8-8694-440cf874fcf6.png" alt="Conference banner"></a></p><p><strong>Conference: Thursday January 17 - Friday January 18, 2019</strong></p><p>Join host and RStudio Chief Scientist <strong>Hadley Wickham</strong> along with keynote speakers <strong>David Robinson</strong>, Chief Data Scientist at DataCamp; <strong>Felienne Hermans</strong>, Assistant Professor at Delft University of Technology, founder of Infotron, teacher of Lego Mindstorms, co-founder of the Joy of Programming conference and a host of Software Engineering Radio; and <strong>Joe Cheng</strong>, RStudio CTO and creator of Shiny and RPubs, to explore the state of the art and future of data science.</p><p>Learn from and interact with these outstanding invited speakers and R innovators:</p><style>table { margin-left: 0; }</style><table><thead><tr><th>Speaker</th><th>Role</th><th>Affiliation</th></tr></thead><tbody><tr><td>Angela Bassa</td><td>Director of Data Science</td><td>iRobot</td></tr><tr><td>Karl Broman</td><td>Professor of Biostatistics</td><td>University of Wisconsin</td></tr><tr><td>Alice
Daish</td><td>Data Scientist</td><td>The Lego Group</td></tr><tr><td>Miles McBain</td><td>Science Engineering Faculty</td><td>Queensland University of Technology</td></tr><tr><td>Amelia McNamara</td><td>Assistant Professor</td><td>University of St Thomas</td></tr><tr><td>Karthik Ram</td><td>Data Scientist</td><td>UC Berkeley</td></tr><tr><td>Mary Rudis</td><td>Math Faculty</td><td>American Mathematical Association of Two Year Colleges</td></tr><tr><td>Kara Woo</td><td>Research Scientist</td><td>Sage Bionetworks</td></tr></tbody></table><p>Find out what RStudio is working on from the people who make the materials and tools you use. You’ll hear from well known RStudio data scientists and engineers like:</p><table><thead><tr><th>Speaker</th><th>Role</th></tr></thead><tbody><tr><td>J.J. Allaire</td><td>CEO &amp; Software Engineer</td></tr><tr><td>Jeff Allen</td><td>Software Engineer</td></tr><tr><td>Mara Averick</td><td>Tidyverse Developer Advocate</td></tr><tr><td>Jenny Bryan</td><td>Software Engineer</td></tr><tr><td>Mine Çetinkaya-Rundel</td><td>Data Scientist and Professional Educator</td></tr><tr><td>Winston Chang</td><td>Software Engineer</td></tr><tr><td>Gabor Csardi</td><td>Software Engineer</td></tr><tr><td>Garrett Grolemund</td><td>Data Scientist and Professional Educator</td></tr><tr><td>Sigrid Keydana</td><td>TensorFlow Developer Advocate</td></tr><tr><td>Max Kuhn</td><td>Software Engineer</td></tr><tr><td>Wes McKinney</td><td>Director of Ursa Labs</td></tr><tr><td>Yihui Xie</td><td>Software Engineer</td></tr></tbody></table><p>The conference will feature more than 60 sessions, with three tracks designed to expand your understanding of what’s possible with R and RStudio. The full agenda will be published in October.</p><p><strong>A call for papers will be announced at the end of August on this blog and on social media. 
Please subscribe to updates on <a href="https://www.rstudio.com/conference/">https://www.rstudio.com/conference/</a> to make sure you see it. Selected speakers will be announced in October.</strong></p><p><strong>Optional Training Days:</strong> Tuesday January 15 - Wednesday January 16, 2019</p><p>Preceding the conference, RStudio will offer two days of optional in-person training. This year, your workshop choices include:</p><table><thead><tr><th>3 Introductory Workshops</th><th>Instructor</th></tr></thead><tbody><tr><td>Data Science in the Tidyverse (2 days)</td><td>Amelia McNamara &amp; Hadley Wickham</td></tr><tr><td>Intro to Shiny and RMarkdown (2 days)</td><td>Danny Kaplan (Macalester College)</td></tr><tr><td>Intro to R and TensorFlow (1 day)</td><td>Andrie de Vries, Kevin Kuo &amp; Sigrid Keydana</td></tr></tbody></table><table><thead><tr><th>8 Intermediate / Advanced Workshops</th><th>Instructor</th></tr></thead><tbody><tr><td>Applied Machine Learning (2 days)</td><td>Max Kuhn &amp; Alex Hayes</td></tr><tr><td>Intermediate Shiny (2 days)</td><td>Aimee Gott (Mango Solutions) &amp; Winston Chang</td></tr><tr><td>Building Tidy Tools (2 days)</td><td>Charlotte Wickham (Oregon State University) &amp; Hadley Wickham</td></tr><tr><td>What They Forgot to Teach You About R (2 days)</td><td>Jenny Bryan</td></tr><tr><td>Big Data with R (2 days)</td><td>Edgar Ruiz &amp; James Blair</td></tr><tr><td>Advanced R and TensorFlow (1 day)</td><td>Andrie de Vries, Kevin Kuo &amp; Sigrid Keydana</td></tr><tr><td>Advanced R Markdown (2 days)</td><td>Yihui Xie</td></tr><tr><td>Shiny in Production (2 days)</td><td>Sean Lopp</td></tr></tbody></table><table><thead><tr><th>3 Workshops for Partners, Professionals and Administrators</th><th>Instructor</th></tr></thead><tbody><tr><td>Tidyverse Trainer Certification (2 days)</td><td>Garrett Grolemund</td></tr><tr><td>Shiny Trainer Certification (1 day)</td><td>Mine
Çetinkaya-Rundel</td></tr><tr><td>RStudio Professional Administrator Certification (2 days)</td><td>Cole Arendt &amp; Nathan Stephens</td></tr></tbody></table><p><strong>Who should go?</strong></p><p><a href="https://rstd.io/conf">rstudio::conf</a> is for RStudio users, R administrators, and RStudio partners who want to learn how to write better Shiny applications, explore all the capabilities of R Markdown, work effectively with Spark or TensorFlow, build predictive models, understand the tidyverse of tools for data science, build tidy tools themselves, discover production-ready development &amp; deployment practices, earn certification as a trainer for Shiny or the Tidyverse, or become a certified administrator of RStudio professional products.</p><p><strong>Why do people go to rstudio::conf?</strong></p><p>Because there is simply no better way to learn about all things R &amp; RStudio.</p><blockquote><p>“#rstudioconf: the best conference of any kind I have ever been to 🤩 Thanks @rstudio!”</p></blockquote><blockquote><p>“Just the first hour of conference already made worth it all my trip from Brazil!”</p></blockquote><blockquote><p>“First day back from #rstudioconf and my Shiny app is already vastly improved!”</p></blockquote><blockquote><p>“#rstudioconf was illuminating, so much gold on the struggles and successes of scaling #rstats for enterprise. It’s fascinating to see the community wrestle with these issues in a rapidly changing ecosystem. Thanks to @rstudio and all the speakers. See ya in Austin!”</p></blockquote><blockquote><p>“MASSIVE thank you to everyone @rstudio for your hard work to make #rstudioconf the best event each year for #rstat users. 
the genuine passion all speakers bring to the stage, the no-nonsense non-sales-pitchy talks, and the steadfast commitment to making #R better for all.”</p></blockquote><script src="https://fast.wistia.com/embed/medias/rz2ehf04zi.jsonp" async></script><script src="https://fast.wistia.com/assets/external/E-v1.js" async></script><div class="wistia_responsive_padding" style="padding:56.25% 0 0 0;position:relative;"><div class="wistia_responsive_wrapper" style="height:100%;left:0;position:absolute;top:0;width:100%;"><div class="wistia_embed wistia_async_rz2ehf04zi videoFoam=true" style="height:100%;position:relative;width:100%"><div class="wistia_swatch" style="height:100%;left:0;opacity:0;overflow:hidden;position:absolute;top:0;transition:opacity 200ms;width:100%;"><img src="https://fast.wistia.com/embed/medias/rz2ehf04zi/swatch" style="filter:blur(5px);height:100%;object-fit:contain;width:100%;" alt="" onload="this.parentNode.style.opacity=1;" /></div></div></div></div><br><p>Also, it’s at the new and fantastic Fairmont Hotel in Austin, Texas, “The Live Music Capital of the World.”</p><p><strong>What should I do now?</strong></p><p>Be an early bird! Attendance is limited. All seats are available on a first-come, first-served basis. Early Bird registration discounts are available (Conference only) and a capped number of Academic discounts are also available for eligible students and faculty.</p><p>Stay tuned for information about diversity scholarships.
We’ll announce the application process at the end of August, and we’re offering twice as many as last year!</p><p>If all tickets available for a particular workshop are sold out before you are able to purchase, we apologize in advance!</p><br><p><a href="https://rstd.io/conf" button type="button" style= "padding: 12px 20px; border: none; font-size: 18px; border-radius: 3px; cursor: pointer; background-color: #4c83b6; color: #fff; box-shadow: 0, 1px, 3px, 0px, rgba(0,0,0,0.10);">Register</a></p><br><p>Please go to <a href="https://rstd.io/conf">rstudio::conf</a> to purchase.</p><p>We hope to see you in Austin at rstudio::conf 2019!</p><p>For questions or issues registering, please email <a href="mailto:conf@rstudio.com">conf@rstudio.com</a>.</p></description></item><item><title>Announcing the 1st Bookdown Contest</title><link>https://www.rstudio.com/blog/first-bookdown-contest/</link><pubDate>Fri, 27 Jul 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/first-bookdown-contest/</guid><description><p>Since the release of <a href="https://github.com/rstudio/bookdown">the <strong>bookdown</strong> package</a> in 2016, there have been a large number of books written and published with <strong>bookdown</strong>. Currently there are about 200 books (including tutorials and notes) listed on <a href="https://bookdown.org">bookdown.org</a> alone! We have also heard about other applications of <strong>bookdown</strong> based on custom templates (e.g., dissertations).</p><p>As popular as <strong>bookdown</strong> is becoming, especially with teachers, researchers, and students, we know it can take a lot of time to tailor <strong>bookdown</strong> to meet the special typesetting requirements of your institution or publisher. As it is today, future graduate students will have to spend many hours reinventing a thesis template, instead of focusing on writing content in R Markdown! 
Fortunately, we are sure that there are already elegant and reusable <strong>bookdown</strong> applications, which would greatly benefit future users.</p><p>With that in mind, we are happy to announce the first contest to recognize outstanding <strong>bookdown</strong> applications!</p><p><img src="https://user-images.githubusercontent.com/163582/43284090-651365f8-90e0-11e8-8092-a9b10775fda0.png" alt="The first bookdown contest"></p><h2 id="criteria">Criteria</h2><p>There are no hard judging criteria for this contest, but in general, we&rsquo;d prefer these types of applications:</p><ul><li>Publicly and freely accessible (both source documents and the output). If the full source and output cannot be shared publicly, we expect at least a full demo that can be shared (the demo could contain only placeholder content).</li><li>Not tightly tied to a particular output format, which means you should use as few raw LaTeX commands or HTML tags as possible in the body of the book (using the <code>includes</code> options is totally fine, e.g., including custom LaTeX content in the preamble). An exception can be made for dissertations, since they are typically in the PDF format.</li><li>Has some minimal examples or clear instructions for other users to easily create similar applications.</li><li>Uses new output formats based on <strong>bookdown</strong>&rsquo;s built-in output formats (such as <code>bookdown::html_book</code> or <code>bookdown::pdf_document2</code>).</li><li>Has creative and elegant styling for HTML and/or PDF output based on either the default templates in <strong>bookdown</strong> or completely new custom templates.</li></ul><p>We&rsquo;d also like to see non-English applications, such as books written in CJK (Chinese, Japanese, Korean), right-to-left, or other languages, since there are additional challenges in typesetting with these languages.</p><p>Note that the applications do not have to be technical books or even books at all.
They could be novels, diaries, collections of poems/essays, course notes, or data analysis reports.</p><h2 id="awards">Awards</h2><p>Honorable Mention Prizes (ten):</p><ul><li>One signed copy of &ldquo;<a href="https://www.crcpress.com/product/isbn/9781138700109">bookdown: Authoring Books and Technical Documents with R Markdown</a>&rdquo;.</li><li>One RStudio t-shirt.</li></ul><p>Runner Up Prizes (two): All awards above, plus</p><ul><li>All hex/RStudio stickers we can find.</li><li>Any number of <a href="https://www.rstudio.com/about/gear/">RStudio t-shirts and mugs</a> (within $200).</li></ul><p>Grand Prize (one): All awards above, with three more signed books related to R Markdown</p><ul><li><a href="https://www.crcpress.com/p/book/9781138359338">R Markdown: The Definitive Guide</a></li><li><a href="https://www.crcpress.com/p/book/9781498716963">Dynamic Documents with knitr, 2nd edition</a></li><li><a href="https://www.crcpress.com/p/book/9780815363729">blogdown: Creating Websites with R Markdown</a></li></ul><p>The names and work of all winners will be highlighted in a gallery on the <a href="https://bookdown.org">bookdown.org</a> website, and we will announce them on RStudio’s social platforms, including <a href="https://community.rstudio.com">community.rstudio.com</a> (unless the winner prefers not to be mentioned).</p><p>Of course, the main reward is knowing that you’ve helped future writers!</p><h2 id="submission">Submission</h2><p>To participate in this contest, please follow the link <a href="http://rstd.io/bookdown-contest">http://rstd.io/bookdown-contest</a> to create a new post in RStudio Community (you will be asked to sign up if you don&rsquo;t have an account). The post title should start with &ldquo;Bookdown contest submission:&rdquo;, followed by a short title to describe your application (e.g., &ldquo;a PhD thesis template for Iowa State&rdquo;).
The post may describe features and highlights of the application, include screenshots and links to live examples and source repositories, and briefly explain key technical details (how the customization or extension was achieved).</p><p>There is no limit on the number of entries one participant can submit. Please submit as many as you wish!</p><p>The deadline for submissions is October 1st, 2018. You are welcome to either submit your existing <strong>bookdown</strong> applications (even a PhD thesis you wrote two years ago), or create one in two months! We will announce winners and their submissions on this blog, in RStudio Community, and on Twitter before Oct 15th, 2018.</p><p>I (Yihui) will be the main judge this year. Winners of this year will be invited to serve as judges next year. I&rsquo;ll consider both the above criteria and the feedback/reaction of other users in the submission posts in RStudio Community (such as the number of likes that a post receives).</p><p>Looking forward to your submissions!</p></description></item><item><title>RStudio Connect 1.6.6 - Custom Emails</title><link>https://www.rstudio.com/blog/rstudio-connect-1-6-6-custom-emails/</link><pubDate>Thu, 26 Jul 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-6-6-custom-emails/</guid><description><p>We are excited to announce RStudio Connect 1.6.6! This release caps a series of improvements to RStudio Connect’s ability to deliver your work to others.</p><p><img src="https://www.rstudio.com/blog-images/rsc-166-email-demo.png" alt=""></p><h2 id="custom-email">Custom Email</h2><p>The most significant change in RStudio Connect 1.6.6 is the new ability for publishers to customize the emails sent to others when they update their data products. In RStudio Connect, it is already possible to schedule the execution of R Markdown documents and send emails to subscribers notifying them of new versions of content.
<strong>In this release, publishers can customize whether or not an email is sent, add email attachments, specify the email subject line, and dynamically build beautiful email messages with plots and tables produced by your analysis.</strong></p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">email &lt;- blastula::compose_email(body = "
  Hello Team,
  Great job! We closed {today()} at {final_sales}.
  {add_ggplot(p, width = 6, height = 6)}
  - Jim
")

if (sales &gt; 10000) {
  rmarkdown::output_metadata$set(
    rsc_email_subject = glue('Sales at {final_sales} for {today()}'),
    rsc_email_body_html = email$html_str,
    rsc_email_images = email$images,
    rsc_email_attachments = c('sales_summary.pptx', 'sales_data.xlsx')
  )
} else {
  rmarkdown::output_metadata$set(rsc_email_suppress_scheduled = TRUE)
}</code></pre></div><p>All customizations are done using code in the underlying R Markdown document. The embedded code provides complete control over the email, but does not impact the result of the rendered report. For example, a report about sales numbers could be set up to only email end users if a critical threshold is reached. Full examples are available in the RStudio Connect <a href="http://docs.rstudio.com/connect/1.6.6/user/r-markdown.html#r-markdown-email-body">user guide</a>.</p><h2 id="other-updates">Other Updates</h2><ul><li><strong>Historical Reports</strong> RStudio Connect currently allows users to view previously rendered reports. In RStudio Connect 1.6.6, when users are viewing a report with a history, they can open and share a link directly to the historical versions, or send an email including the historic content.</li><li><strong>Instrumentation</strong> RStudio Connect 1.6.6 will track usage events and record information such as who uses the content, what content was used, and when content was viewed. We don&rsquo;t provide access to this data yet, but in future releases, this information will be accessible to publishers to help answer questions like, &ldquo;How many users viewed my application this month?&rdquo;</li><li>The <code>usermanager alter</code> command can now be used to manage whether a user is locked or unlocked. See the admin guide for details and other updates to the <code>usermanager</code> command.</li><li><strong>User Listing in the Connect Server API</strong> The public Connect Server API now includes an endpoint to list user information. See the <a href="http://docs.rstudio.com/connect/1.6.6/user/cookbook.html#use-offset-pagination">user guide</a> for details.</li></ul><h2 id="security-authentication-changes">Security &amp; Authentication Changes</h2><ul><li><strong>Removing the &ldquo;Anyone&rdquo; Option</strong> New <a href="http://docs.rstudio.com/connect/1.6.6/admin/content-management.html#limiting-allowed-viewership">configuration options</a> can be used to limit how widely publishers are allowed to distribute their content.</li><li><strong>The People Tab</strong> In certain scenarios, it is undesirable for RStudio Connect viewers to be able to see the profiles of other RStudio Connect users. The <code>Applications.UsersListingMinRole</code> setting can now be used to prevent certain roles from seeing other profiles on the People tab. Users limited in this way will still see other user profiles in the content settings panel, but only for content they can access.</li><li><strong>LDAP / Active Directory Changes</strong> RStudio Connect no longer relies on the distinguished name (DN) of a user. Existing installations will continue working, but administrators should use the new <code>LDAP.UniqueIdAttribute</code> to tell RStudio Connect which <a href="http://docs.rstudio.com/connect/1.6.6/admin/authentication.html#unique-id-attribute">LDAP attribute identifies users</a>.</li><li>A new <code>HTTP.ForceSecure</code> option is available, which sets the <code>Secure</code> flag on RStudio Connect browser cookies. This setting adds support for the <code>Secure</code> flag when RStudio Connect is used behind an HTTPS-terminating proxy. See the existing <code>HTTPS.Permanent</code> setting if you plan to use RStudio Connect to terminate HTTPS.</li></ul><h2 id="deprecations-breaking-changes">Deprecations &amp; Breaking Changes</h2><ul><li><strong>Breaking Change</strong> In RStudio Connect 1.6.6, the <code>--force</code> flag in the <code>usermanager alter</code> command has been changed to <code>--force-demoting</code>.</li><li><strong>Breaking Change</strong> All URLs referring to users and groups now use generated IDs in place of IDs that may have contained identifying information. Existing bookmarks to specific user or group pages may need to be updated, and pending account confirmation emails will need to be resent.</li><li><code>Applications.EnvironmentBlacklist</code> is deprecated in favor of <code>Applications.ProhibitedEnvironment</code>, and <code>LDAP.WhitelistedLoginGroups</code> is deprecated in favor of <code>LDAP.PermittedLoginGroups</code>. Both settings will be removed in the next release.</li></ul><p>Please review the full <a href="http://docs.rstudio.com/connect/1.6.6/news/">release notes</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>If you use LDAP or Active Directory, please take note of the LDAP changes described above and in the <a href="http://docs.rstudio.com/connect/1.6.6/news/">release notes</a>. Aside from the deprecations above, there are no other special considerations, and upgrading should take less than 5 minutes. If you&rsquo;re upgrading from a release older than v1.6.4, be sure to consider the &ldquo;Upgrade Planning&rdquo; notes from the intervening releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>The Revamped bookdown.org Website</title><link>https://www.rstudio.com/blog/revamped-bookdown-org/</link><pubDate>Wed, 25 Jul 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/revamped-bookdown-org/</guid><description><p>Since we announced the <strong>bookdown</strong> package <a href="https://www.rstudio.com/blog/announcing-bookdown/">in 2016</a>, there have been a large number of books, reports, notes, and tutorials written with this package and published to <a href="https://bookdown.org">https://bookdown.org</a>. We were excited to see that! At the same time, however, maintaining the list of books on bookdown.org has become more and more difficult because I must update the list manually to filter out books that are only skeletons or not built with <strong>bookdown</strong> (such as slides). It was not only time-consuming for me, but also delayed the exhibition of many awesome books.</p><p><img src="https://bookdown.org/yihui/bookdown/images/logo.png" alt="The bookdown logo"></p><p>Today I&rsquo;m happy to introduce the revamped bookdown.org website and to let you know how you may contribute your books there or help us improve the website.
The full source of the website is hosted in the <a href="https://github.com/rstudio/bookdown.org">rstudio/bookdown.org</a> repository on GitHub (special thanks to <a href="https://github.com/cderv">Christophe Dervieux</a> and <a href="https://github.com/tcgriffith">TC Zhang</a> for the great help).</p><h2 id="the-archive-page">The archive page</h2><p>We list all books published to bookdown.org that have substantial content on the <a href="https://bookdown.org/home/archive/">Archive</a> page. This page also contains a few books published elsewhere (e.g., <em>Fundamentals of Data Visualization</em> by Claus O. Wilke). The list is automatically generated by scraping the homepages of books. If you see any inaccurate information about your own book on this page, you may need to correct the information in your book source documents (e.g., <code>index.Rmd</code>) and re-publish the book. Then we can scrape your book again to reflect the correct information. You can also contribute links to your books published elsewhere by submitting pull requests on GitHub. Please read the <a href="https://bookdown.org/home/about/">About</a> page for detailed instructions.</p><h2 id="the-homepage">The homepage</h2><p>On the homepage, we feature a small subset of books written in <strong>bookdown</strong>. These books are typically either published or nearly complete. If you see an interesting/useful book written in <strong>bookdown</strong>, you may suggest that we add it to the homepage, whether or not you are its author. Again, please see the <a href="https://bookdown.org/home/about/">About</a> page for instructions.</p><h2 id="the-tags-page">The tags page</h2><p>To make it a little easier for you to find the books that you are interested in, we created a list of tags to classify books on the <a href="https://bookdown.org/home/tags/">Tags</a> page. The current classification method is quite rudimentary, however. We only match the tags against the descriptions of books.
In the future, we may support custom keywords or tags in the <strong>bookdown</strong> package, so authors can provide their own tags. You are welcome to submit pull requests to <a href="https://github.com/rstudio/bookdown.org/edit/master/R/tags.txt">improve the existing tags</a>.</p><h2 id="the-authors-page">The authors page</h2><p>We also list all books by authors on the <a href="https://bookdown.org/home/authors/">Authors</a> page. Note that if a book has multiple authors, they are listed together, and the book is not displayed on the individual authors&rsquo; cards.</p><h2 id="so-they-have-authored-195-books-where-is-yours">So they have authored 195 books. Where is yours?</h2><p>We are happy (<a href="https://twitter.com/_ColinFay/status/1012964820004548609">as happy as Colin Fay</a>) to see that it is totally practical to publish books with <strong>bookdown</strong> and enjoy the simplicity of R Markdown at the same time. For authors, if we missed your excellent book on bookdown.org, please <a href="https://github.com/rstudio/bookdown.org/edit/master/R/staging.txt">do not hesitate to add it yourself</a>. The best time to write a book was 20 years ago. The second best time is now. We are looking forward to your own book on bookdown.org, and we hope readers will enjoy all these free and open-source books on bookdown.org.</p></description></item><item><title>Announcing the R Markdown Book</title><link>https://www.rstudio.com/blog/announcing-the-r-markdown-book/</link><pubDate>Fri, 13 Jul 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-the-r-markdown-book/</guid><description><p>It is exciting for us to see the amazing growth of the R Markdown ecosystem over the four years since the creation of R Markdown in 2014. Now you can author many types of documents, and build a wide range of applications based on R Markdown. As an effort to unite and improve the documentation of the R Markdown base package (<strong>rmarkdown</strong>) and several other extensions (such as <strong>bookdown</strong>, <strong>blogdown</strong>, <strong>pkgdown</strong>, <strong>flexdashboard</strong>, <strong>tufte</strong>, <strong>xaringan</strong>, <strong>rticles</strong>, and <strong>learnr</strong>) in one place, we authored a book titled &ldquo;<em>R Markdown: The Definitive Guide</em>&rdquo;, which is to be published by Chapman &amp; Hall/CRC in about two weeks.</p><p><a href="https://bookdown.org/yihui/rmarkdown/"><img src="https://bookdown.org/yihui/rmarkdown/images/cover.png" alt="R Markdown: The Definitive Guide"></a></p><p>You <a href="https://www.crcpress.com/p/book/9781138359338">can pre-order a copy</a> now if you like. Our publisher is generous enough to allow us to provide a complete online version of this book at <a href="https://bookdown.org/yihui/rmarkdown/">https://bookdown.org/yihui/rmarkdown/</a>, which you can always read for free. The full source of this book is also freely and publicly available in the GitHub repo <a href="https://github.com/rstudio/rmarkdown-book">https://github.com/rstudio/rmarkdown-book</a>.</p><p>Please feel free to let us know if you have any questions or suggestions regarding this book.
You are always welcome to send GitHub pull requests to help us improve the book. (When you are reading the online version of the book, you may click the Edit button on the toolbar to edit the source file of a page, and follow the guide on GitHub to create a pull request.) We hope you will find this book useful.</p></description></item><item><title>RStudio Connect v1.6.4.2 - Security Update</title><link>https://www.rstudio.com/blog/rstudio-connect-v1-6-4-2-security-update/</link><pubDate>Mon, 09 Jul 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-v1-6-4-2-security-update/</guid><description><p>A security vulnerability in a third-party library used by RStudio Connect was uncovered during a security audit last week. We have confirmed that this vulnerability has not been used against any of the RStudio Connect instances we host, and are unaware of it being exploited on any customer deployments. Under certain conditions, this vulnerability could compromise the session of a user who was tricked into visiting a specially crafted URL. The issue affects all versions of RStudio Connect up to and including 1.6.4.1, but none of our other products. We have prepared a hotfix: <a href="https://www.rstudio.com/products/connect/">v1.6.4.2</a>.</p><p>RStudio remains committed to providing the most secure product possible. We regularly perform internal security audits against RStudio Connect in order to ensure the product’s security.</p><p>As part of the responsible disclosure process, we will provide additional details about the vulnerability, and how to verify that you have not been affected, in the coming weeks, once customers have had time to update their systems.
For now, <strong>please update your RStudio Connect installations to version 1.6.4.2 as soon as possible</strong>.</p></description></item><item><title>Shiny 1.1.0: Scaling Shiny with async</title><link>https://www.rstudio.com/blog/shiny-1-1-0/</link><pubDate>Tue, 26 Jun 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-1-1-0/</guid><description><p>This is a significant release for <strong><a href="https://shiny.rstudio.com">Shiny</a></strong>, with a major new feature that was nearly a year in the making: support for asynchronous operations!</p><p>Without this capability, when Shiny performs long-running calculations or tasks on behalf of one user, it stalls progress for all other Shiny users that are connected to the same process. Therefore, Shiny apps that feature long-running calculations or tasks have generally been deployed using many R processes, each serving a small number of users; this works, but is not the most efficient approach. Such applications now have an important new tool in the toolbox to improve performance under load.</p><p>Shiny async is implemented via integration with the <strong><a href="https://github.com/HenrikBengtsson/future">future</a></strong> and <strong><a href="https://rstudio.github.io/promises/">promises</a></strong> packages. These two packages are used together:</p><ol><li><strong>Use <code>future</code> to perform long-running operations in a worker process that runs in the background</strong>, leaving Shiny processes free to serve other users in the meantime. 
This yields much better responsiveness under load, and much more predictable latency.</li><li><strong>Use <code>promises</code> to handle the result of each long-running background operation back in the Shiny process</strong>, where additional processing can occur, such as further data manipulation, or displaying to the user via a reactive output.</li></ol><p>If your app has a small number of severe performance bottlenecks, you can use this technique to get massively better responsiveness under load. For example, if the <code>httr::GET</code> call in this server function takes 30 seconds to complete:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">server &lt;- function(input, output, session) {
  r &lt;- reactive({
    httr::GET(url) %&gt;%
      httr::content("parsed")
  })

  output$plot &lt;- renderPlot({
    r() %&gt;%
      ggplot(aes(speed, dist)) + geom_point()
  })
}</code></pre></div><p>then the entire R process is stalled for those 30 seconds.</p><p>We can rewrite it asynchronously like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(promises)
library(future)
plan(multisession)

server &lt;- function(input, output, session) {
  r &lt;- reactive({
    future(httr::GET(url)) %...&gt;%
      httr::content("parsed")
  })

  output$plot &lt;- renderPlot({
    r() %...&gt;% {
      ggplot(., aes(speed, dist)) + geom_point()
    }
  })
}</code></pre></div><p>Even if the <code>httr::GET(url)</code> takes 30 seconds, the <code>r</code> reactive executes almost instantly, and returns control to the caller. The code inside <code>future(...)</code> is executed in a different R process that runs in the background, and whenever its result becomes available (i.e. in 30 seconds), the right-hand side of <code>%...&gt;%</code> will be executed with that result. (<code>%...&gt;%</code> is called a &ldquo;promise pipe&rdquo;; it works similarly to a magrittr pipe that knows how to wait for and &ldquo;unwrap&rdquo; promises.)</p><p>If the original (synchronous) code appeared in a Shiny app, then during that 30 seconds, the R process is stuck dealing with the download and can&rsquo;t respond to any requests being made by other users. But with the async version, the R process only needs to kick off the operation, and then is free to service other requests.
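</p><p>The same future-plus-promise-pipe pattern works outside of Shiny too. Below is a minimal, self-contained sketch (assuming the <code>promises</code>, <code>future</code>, and <code>later</code> packages are installed; the explicit <code>run_now()</code> loop stands in for the event loop that Shiny normally runs for you):</p>

```r
library(promises)
library(future)
plan(multisession)

# Kick off a slow computation in a background R process. future() returns
# immediately; the promise pipe schedules the next step to run once the
# result becomes available.
result <- NULL
p <- future({
  Sys.sleep(1)
  sum(1:10)
}) %...>% (function(x) result <<- x)

# The calling process is free to do other work here. A plain script that
# needs the value can pump the event loop until the chain has run:
while (is.null(result)) later::run_now(0.1)
result
```

<p>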
This means other users will only have to wait milliseconds, not minutes, for the app to respond.</p><h3 id="case-study">Case study</h3><p>We&rsquo;ve created a <a href="https://rstudio.github.io/promises/articles/casestudy.html">detailed case study</a> that walks through the async conversion of a realistic example app. This app processes low-level logging data from RStudio&rsquo;s CRAN mirrors, to let us explore the heaviest downloaders for each day.</p><p><img src="https://rstudio.github.io/promises/articles/case-study-tab3.png" alt=""></p><p>To load test this example app, we launched 50 sessions of simulated load, with a 5 second delay between each launch, and directed this traffic to a single R process. We then rewrote the app to use futures and promises, and reran the load test with this async version. (The tools we used to perform the load testing are not yet publicly available, but you can refer to <a href="https://www.rstudio.com/resources/rstudioconf-2018/scaling-shiny/">Sean Lopp&rsquo;s talk at rstudio::conf 2018</a> for a preview.)</p><p>Under these conditions, the finished async version displays significantly lower (mean) response times than the original. 
In the table below, &ldquo;HTTP traffic&rdquo; refers to requests that are made during page load time, and &ldquo;reactive processing&rdquo; refers to the time between the browser sending a reactive input value and the server returning updated reactive outputs.</p><style>table td:first-child, table th:first-child {text-align:left !important;}table td, table th {text-align:right !important;}</style><table><thead><tr><th>Response type</th><th>Original</th><th>Async</th><th>Delta</th></tr></thead><tbody><tr><td>HTTP traffic</td><td>605 ms</td><td>139 ms</td><td>-77%</td></tr><tr><td>Reactive processing</td><td>10.7 sec</td><td>3.48 sec</td><td>-67%</td></tr></tbody></table><h3 id="learn-more">Learn more</h3><p>Visit the <a href="https://rstudio.github.io/promises/">promises</a> website to learn more, or watch my <a href="https://www.rstudio.com/resources/videos/scaling-shiny-apps-with-async-programming-june-2018/">recent webinar</a> on Shiny async.</p><p>See the <a href="https://shiny.rstudio.com/reference/shiny/1.1.0/upgrade.html">full changelog</a> for Shiny v1.1.0.</p><h2 id="related-packages">Related packages</h2><p>Over the last year, we created or enhanced several other packages to support async Shiny:</p><ul><li>The <strong><a href="https://rstudio.github.io/promises/">promises</a></strong> package (released 2018-04-13) mentioned above provides the actual API you&rsquo;ll use to do async programming in R. We implemented this as a separate package so that other parts of the R community, not just Shiny users, can take advantage of these techniques. The promises package was inspired by the basic ideas of <a href="https://developers.google.com/web/fundamentals/primers/promises">JavaScript promises</a>, but also has significantly improved syntax and extensibility to make it work well with R and Shiny.
Currently, promises is most useful when used with the <a href="https://cran.r-project.org/package=future">future</a> package by <a href="https://github.com/HenrikBengtsson">Henrik Bengtsson</a>.</li><li><strong><a href="https://cran.r-project.org/package=later">later</a></strong> (released 2017-06-25) adds a low-level feature to R that is critical to async programming: the ability to schedule R code to be executed in the future, within the same R process. You can do all sorts of cool stuff on top of this, as some people are <a href="https://yihui.name/en/2017/10/later-recursion/">discovering</a>.</li><li><strong><a href="https://cran.r-project.org/package=httpuv">httpuv</a></strong> (1.4.0 released 2018-04-19) has long been the HTTP web server that Shiny, and most other web frameworks for R, sit on top of. Version 1.4.0 adds support for asynchronous handling of HTTP requests, and also adds a dedicated I/O-handling thread for greatly improved performance under load.</li></ul><p>In the coming weeks, you can also expect updates for async compatibility to <strong><a href="https://www.htmlwidgets.org">htmlwidgets</a></strong>, <strong><a href="https://plot.ly/r/">plotly</a></strong>, and <strong><a href="https://rstudio.github.io/DT/">DT</a></strong>. 
Most other HTML widgets will automatically become async compatible once htmlwidgets is updated.</p></description></item><item><title>RStudio Connect v1.6.4</title><link>https://www.rstudio.com/blog/rstudio-connect-v1-6-4/</link><pubDate>Tue, 19 Jun 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-v1-6-4/</guid><description><p>RStudio Connect version 1.6.4 is <a href="https://www.rstudio.com/products/connect/">now available</a>!</p><p>There are a few breaking changes and a handful of new features that are highlighted below. We encourage you to upgrade as soon as possible!</p><h2 id="breaking">Breaking</h2><p>Please take note of important breaking changes before upgrading.</p><h3 id="pandoc-2">Pandoc 2</h3><p>RStudio Connect includes Pandoc 1 and will now also include Pandoc 2. Admins do not need to install either.</p><p>If you have deployed content with rmarkdown version 1.9 or higher, then that content will now use Pandoc 2 at runtime. This brings in several bug fixes and enables some new functionality, but does introduce some backwards incompatibilities. To protect older versions of rmarkdown, Pandoc 1 will still be used for content deployed with any rmarkdown version prior to 1.9. Content not using the rmarkdown package will have Pandoc 2 available.</p><p>Pandoc is dynamically made available to content when it is executed, so content using the newer version of rmarkdown will see Pandoc 2 immediately upon upgrading RStudio Connect, whether or not you have updated the content recently. The types of backwards incompatibilities we expect are issues like minor white-space rendering differences.</p><h3 id="r-markdown-rendering">R Markdown Rendering</h3><p>The R Markdown rendering environment has been updated, which will break a certain class of R Markdown documents. No action is needed for the majority of R Markdown documents. 
Publishers will need to rewrite R Markdown documents that depended on locally preserving and storing state in between renderings.</p><p>The update isolates renderings and protects against clashes caused by concurrent writes, but also means that files written to the local directory during a render will not be present or available the next time that the report is rendered.</p><p>For example, a report that writes a CSV file to disk on day 1 at a local location, <code>write.csv('data.csv')</code>, and then on day 2 reads the same CSV, <code>read.csv('data.csv')</code>, will no longer work. Publishers should refactor this type of R Markdown document to write data to a database or a shared directory that is not <a href="http://docs.rstudio.com/connect/1.6.4/admin/process-management.html#process-management-sandboxing">sandboxed</a>. For instance, to <code>/app-data/data.csv</code>.</p><h2 id="new-features">New Features</h2><h3 id="file-download">File Download</h3><p>When a user accesses a <code>Microsoft Word</code> file or some other file type that is not rendered in the browser, Connect previously downloaded the content immediately. We have added a download page that simplifies the presentation of browser-unfriendly file types.</p><p><img src="images/rsc-164-big-download-button.png" width="600" alt="Big Download Button showing example with Microsoft Word Document"></p><h3 id="content-filtering">Content Filtering</h3><p>The RStudio Connect Dashboard now includes interactive labels for tag filters in the content listing view. This simplifies keeping track of complex searches, especially when returning to the Dashboard with saved filter state.</p><p><img src="images/rsc-164-breadcrumbs.png" width="600" alt="Example showing breadcrumbs of tag filters in Content listing view along with a search"></p><h3 id="log-download">Log Download</h3><p>The Connect UI truncates log files to show the latest output. However, when someone downloads log files, the downloaded file is no longer truncated. 
This makes it easier for a developer to inspect asset behavior with the full log file available on Connect.</p><h3 id="user-management">User Management</h3><p>Connect now allows administrators to filter the users list by multiple account statuses. The last day that each user was active is now displayed along with the user list.</p><p><img src="images/rsc-164-user-page.png" width="800" alt="Example of the user page showing multiple selection and last active date"></p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>Besides the breaking changes above, there are no special precautions to be aware of when upgrading from v1.6.2 to v1.6.4. You can expect the installation and startup of v1.6.4 to be complete in under a minute.</p><p>If you’re upgrading from a release older than v1.6.2, be sure to consider the “Upgrade Planning” notes from the intervening releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) 
with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>Applied Machine Learning Workshop</title><link>https://www.rstudio.com/blog/2018-04-11-applied-machine-learning-workshop/</link><pubDate>Tue, 15 May 2018 09:24:00 -0400</pubDate><guid>https://www.rstudio.com/blog/2018-04-11-applied-machine-learning-workshop/</guid><description><p>Join Max Kuhn of RStudio for his popular Applied Machine Learning Workshop in Washington D.C.! If you missed his sold-out course at rstudio::conf 2018, now is your chance.</p><p>Register here: <a href="https://www.rstudio.com/workshops/applied-machine-learning/">https://www.rstudio.com/workshops/applied-machine-learning/</a></p><p>This two-day course will provide an overview of using R for supervised learning. The session will step through the process of building, visualizing, testing, and comparing models that are focused on prediction. The goal of the course is to provide a thorough workflow in R that can be used with many different regression or classification techniques. 
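As a small flavor of that kind of workflow (a minimal sketch, not course material; mtcars and a plain `lm()` stand in for the real case studies, using the rsample and yardstick packages covered in the course):

```r
library(rsample)    # resampling and data splitting
library(yardstick)  # performance metrics

set.seed(123)
# Split the data into training and testing sets
car_split <- initial_split(mtcars, prop = 0.8)
car_train <- training(car_split)
car_test  <- testing(car_split)

# Fit a simple regression model on the training set
fit <- lm(mpg ~ wt + hp, data = car_train)

# Evaluate predictions on the held-out test set
results <- data.frame(
  truth    = car_test$mpg,
  estimate = predict(fit, car_test)
)
rmse(results, truth, estimate)
```

The same split/fit/measure pattern carries over unchanged when `lm()` is swapped for any other regression or classification technique.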
Case studies on real data will be used to illustrate the functionality, and several different predictive models will be demonstrated.</p><p>The course focuses on both high-level approaches to modeling (e.g., the caret package) and newer modeling packages in the tidyverse: recipes, rsample, yardstick, and tidyposterior. Basic familiarity with R and the tidyverse is required.</p><p>This course is taught by Dr. Max Kuhn, a Software Engineer at RStudio. He is the author or maintainer of several R packages for predictive modeling including caret, Cubist, C50 and others. He routinely teaches classes in predictive modeling at rstudio::conf, Predictive Analytics World, and useR!, and his publications include work on neuroscience biomarkers, drug discovery, molecular diagnostics and response surface methodology. He and Kjell Johnson wrote the <a href="https://www.amazon.com/Applied-Predictive-Modeling-Max-Kuhn/dp/1461468485/">award-winning book Applied Predictive Modeling</a> in 2013.</p><p><strong>When</strong> - 8 a.m. to 5 p.m., Wednesday, August 15th and Thursday, August 16th</p><p><strong>Where</strong> - 20F Conference Center, 20 F Street NW, Suite 1000, Washington D.C.</p><p><strong>Who</strong> - Dr. Max Kuhn</p><p>Register here: <a href="https://www.rstudio.com/workshops/applied-machine-learning/">https://www.rstudio.com/workshops/applied-machine-learning/</a></p><p>Discounts are available for 5 or more attendees from any organization. 
Email <a href="mailto:training@rstudio.com">training@rstudio.com</a> if you have any questions about the workshop that you don’t find answered on the registration page.</p></description></item><item><title>sparklyr 0.8: Production pipelines and graphs</title><link>https://www.rstudio.com/blog/sparklyr-0-8/</link><pubDate>Mon, 14 May 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/sparklyr-0-8/</guid><description><p>We&rsquo;re pleased to announce that <a href="https://CRAN.R-project.org/package=sparklyr">sparklyr 0.8</a> is now available on CRAN! Sparklyr provides an R interface to Apache Spark. It supports dplyr syntax for working with Spark DataFrames and exposes the full range of machine learning algorithms available in Spark ML. You can also learn more about Apache Spark and sparklyr at <a href="http://spark.rstudio.com">spark.rstudio.com</a> and the <a href="https://www.rstudio.com/resources/webinars/introducing-an-r-interface-for-apache-spark/">sparklyr webinar series</a>. In this version, we added support for Spark 2.3, Livy 0.5, and various enhancements and bug fixes. For this post, we&rsquo;d like to highlight a new feature from Spark 2.3 and introduce the mleap and graphframes extensions.</p><h2 id="parallel-cross-validation">Parallel Cross-Validation</h2><p>Spark 2.3 supports parallelism in hyperparameter tuning. In other words, instead of training each model specification serially, you can now train them in parallel. This can be enabled by setting the <code>parallelism</code> parameter in <code>ml_cross_validator()</code> or <code>ml_train_validation_split()</code>. 
Here&rsquo;s an example:</p><pre><code class="language-{r," data-lang="{r,">library(sparklyr)
sc &lt;- spark_connect(master = &quot;local&quot;, version = &quot;2.3.0&quot;)
iris_tbl &lt;- sdf_copy_to(sc, iris)

# Define the pipeline
labels &lt;- c(&quot;setosa&quot;, &quot;versicolor&quot;, &quot;virginica&quot;)
pipeline &lt;- ml_pipeline(sc) %&gt;%
  ft_vector_assembler(
    c(&quot;Sepal_Width&quot;, &quot;Sepal_Length&quot;, &quot;Petal_Width&quot;, &quot;Petal_Length&quot;),
    &quot;features&quot;
  ) %&gt;%
  ft_string_indexer_model(&quot;Species&quot;, &quot;label&quot;, labels = labels) %&gt;%
  ml_logistic_regression()

# Specify hyperparameter grid
grid &lt;- list(logistic = list(
  elastic_net_param = c(0.25, 0.75),
  reg_param = c(1e-3, 1e-4)
))

# Create the cross validator object
cv &lt;- ml_cross_validator(
  sc, estimator = pipeline, estimator_param_maps = grid,
  evaluator = ml_multiclass_classification_evaluator(sc),
  num_folds = 3, parallelism = 4
)

# Train the models
cv_model &lt;- ml_fit(cv, iris_tbl)
</code></pre><p>Once the models are trained, you can inspect the performance results by using the newly available helper function <code>ml_validation_metrics()</code>:</p><pre><code class="language-{r}" data-lang="{r}">ml_validation_metrics(cv_model)
spark_disconnect(sc)
</code></pre><h2 id="pipelines-in-production">Pipelines in Production</h2><p>Earlier this year, we <a href="https://blog.rstudio.com/2018/01/29/sparklyr-0-7/">announced support for ML Pipelines in sparklyr</a>, and discussed how one can persist models onto disk. While that workflow is appropriate for batch scoring of large datasets, we also wanted to enable real-time, low-latency scoring using pipelines developed with sparklyr. 
To enable this, we&rsquo;ve developed the <a href="https://CRAN.R-project.org/package=mleap">mleap</a> package, available on CRAN, which provides an interface to the <a href="https://github.com/combust/mleap">MLeap</a> open source project.</p><p>MLeap allows you to use your Spark pipelines in any Java-enabled device or service. This works by serializing Spark pipelines which can later be loaded into the Java Virtual Machine (JVM) for scoring without requiring a Spark cluster. This means that software engineers can take Spark pipelines exported with sparklyr and easily embed them in web, desktop or mobile applications.</p><p>To get started, simply grab the package from CRAN and install the necessary dependencies:</p><pre><code class="language-{r," data-lang="{r,">install.packages(&quot;mleap&quot;)
library(mleap)
install_maven()
install_mleap()
</code></pre><pre><code class="language-{r," data-lang="{r,">library(mleap)
</code></pre><p>Then, build a pipeline as usual:</p><pre><code class="language-{r," data-lang="{r,">library(sparklyr)
sc &lt;- spark_connect(master = &quot;local&quot;, version = &quot;2.2.0&quot;)
mtcars_tbl &lt;- sdf_copy_to(sc, mtcars)

# Create a pipeline and fit it
pipeline &lt;- ml_pipeline(sc) %&gt;%
  ft_binarizer(&quot;hp&quot;, &quot;big_hp&quot;, threshold = 100) %&gt;%
  ft_vector_assembler(c(&quot;big_hp&quot;, &quot;wt&quot;, &quot;qsec&quot;), &quot;features&quot;) %&gt;%
  ml_gbt_regressor(label_col = &quot;mpg&quot;)

pipeline_model &lt;- ml_fit(pipeline, mtcars_tbl)
</code></pre><p>Once we have the pipeline model, we can export it via <code>ml_write_bundle()</code>:</p><pre><code class="language-{r," data-lang="{r,"># Export model
model_path &lt;- file.path(tempdir(), &quot;mtcars_model.zip&quot;)
transformed_tbl &lt;- ml_transform(pipeline_model, mtcars_tbl)
ml_write_bundle(pipeline_model, transformed_tbl, model_path)
spark_disconnect(sc)
</code></pre><p>At this point, we&rsquo;re ready to use <code>mtcars_model.zip</code> in other applications. 
Notice that the following code does not require Spark:</p><pre><code class="language-{r," data-lang="{r,"># Import model
model &lt;- mleap_load_bundle(model_path)

# Create a data frame to be scored
newdata &lt;- tibble::tribble(
  ~qsec, ~hp, ~wt,
  16.2,  101, 2.68,
  18.1,  99,  3.08
)

# Transform the data frame
transformed_df &lt;- mleap_transform(model, newdata)
dplyr::glimpse(transformed_df)
</code></pre><p>Notice that MLeap requires Spark 2.0 to 2.3. You can find additional details in the <a href="https://spark.rstudio.com/guides/mleap/">production pipelines</a> guide.</p><h2 id="graph-analysis">Graph Analysis</h2><p>The other extension we&rsquo;d like to highlight is <a href="https://CRAN.R-project.org/package=graphframes">graphframes</a>, which provides an interface to the <a href="https://graphframes.github.io/">GraphFrames</a> Spark package. GraphFrames allows us to run graph algorithms at scale using a DataFrame-based API.</p><p>Let&rsquo;s see graphframes in action through a quick example, where we analyze the relationships among packages on CRAN.</p><pre><code class="language-{r," data-lang="{r,">library(graphframes)
library(dplyr)

sc &lt;- spark_connect(master = &quot;local&quot;, version = &quot;2.1.0&quot;)

# Grab list of CRAN packages and their dependencies
available_packages &lt;- available.packages(contrib.url(&quot;https://cloud.r-project.org/&quot;)) %&gt;%
  `[`(, c(&quot;Package&quot;, &quot;Depends&quot;, &quot;Imports&quot;)) %&gt;%
  as_tibble() %&gt;%
  transmute(
    package = Package,
    dependencies = paste(Depends, Imports, sep = &quot;,&quot;) %&gt;%
      gsub(&quot;\\n|\\s+&quot;, &quot;&quot;, .)
  )

# Copy data to Spark
packages_tbl &lt;- sdf_copy_to(sc, available_packages, overwrite = TRUE)

# Create a tidy table of dependencies, which define the edges of our graph
edges_tbl &lt;- packages_tbl %&gt;%
  mutate(dependencies = dependencies %&gt;%
    regexp_replace(&quot;\\\\(([^)]+)\\\\)&quot;, &quot;&quot;)) %&gt;%
  ft_regex_tokenizer(&quot;dependencies&quot;, &quot;dependencies_vector&quot;,
    pattern = &quot;(\\s+)?,(\\s+)?&quot;, to_lower_case = FALSE) %&gt;%
  transmute(src = package, dst = explode(dependencies_vector)) %&gt;%
  filter(!dst %in% c(&quot;R&quot;, &quot;NA&quot;))
</code></pre><p>Once we have an edges table, we can easily create a <code>GraphFrame</code> object by calling <code>gf_graphframe()</code> and running PageRank:</p><pre><code class="language-{r}" data-lang="{r}"># Create a GraphFrame object
g &lt;- gf_graphframe(edges = edges_tbl)

# Run the PageRank algorithm
pagerank &lt;- gf_pagerank(g, tol = 0.01)

pagerank %&gt;%
  gf_vertices() %&gt;%
  arrange(desc(pagerank))
</code></pre><p>We can also collect a sample of the graph locally for visualization:</p><pre><code class="language-{r}" data-lang="{r}">library(gh)
library(visNetwork)

list_repos &lt;- function(username) {
  gh(&quot;/users/:username/repos&quot;, username = username) %&gt;%
    vapply(&quot;[[&quot;, &quot;&quot;, &quot;name&quot;)
}

rlib_repos &lt;- list_repos(&quot;r-lib&quot;)
tidyverse_repos &lt;- list_repos(&quot;tidyverse&quot;)

base_packages &lt;- installed.packages() %&gt;%
  as_tibble() %&gt;%
  filter(Priority == &quot;base&quot;) %&gt;%
  pull(Package)

top_packages &lt;- pagerank %&gt;%
  gf_vertices() %&gt;%
  arrange(desc(pagerank)) %&gt;%
  head(75) %&gt;%
  pull(id)

edges_local &lt;- g %&gt;%
  gf_edges() %&gt;%
  filter(src %in% !!top_packages &amp;&amp; dst %in% !!top_packages) %&gt;%
  rename(from = src, to = dst) %&gt;%
  collect()

vertices_local &lt;- g %&gt;%
  gf_vertices() %&gt;%
  filter(id %in% top_packages) %&gt;%
  mutate(
    group = case_when(
      id %in% !!rlib_repos ~ &quot;r-lib&quot;,
      id %in% !!tidyverse_repos ~ &quot;tidyverse&quot;,
      id %in% !!base_packages ~ &quot;base&quot;,
      TRUE ~ &quot;other&quot;
    ),
    title = id
  ) %&gt;%
  collect()

visNetwork(vertices_local, edges_local, width = &quot;100%&quot;) %&gt;%
  visEdges(arrows = &quot;to&quot;)

spark_disconnect(sc)
</code></pre><p><img src="https://user-images.githubusercontent.com/163582/39633449-a677b02a-4f7d-11e8-82ab-27c1205430cf.png" style="display: 
none;" /></p><p>Notice that GraphFrames currently supports Spark 2.0 and 2.1. You can find additional details in the <a href="https://spark.rstudio.com/graphframes/">graph analysis</a> guide.</p></description></item><item><title>Enterprise Advocate</title><link>https://www.rstudio.com/blog/2018-05-11-enterprise-advocate/</link><pubDate>Fri, 11 May 2018 13:41:10 -0400</pubDate><guid>https://www.rstudio.com/blog/2018-05-11-enterprise-advocate/</guid><description><p>We are looking for our next <a href="https://hire.withgoogle.com/public/jobs/rstudiocom/view/P_AAAAAACAAADFZoly7Lojez?trackingTag=rStudioBlog">Enterprise Advocate</a> to join the RStudio team. See what <a href="https://www.linkedin.com/in/peter-knast-808828b/">Pete Knast</a>, Global Director of New Business, has to say about working at RStudio and the Enterprise Advocate role.</p><p><img src="https://www.rstudio.com/blog-images/uploads/_mg_2145.jpg" alt="null"></p><p><em>When did you join RStudio and what made you interested in working here?</em></p><p>I joined in early 2014. I was excited by RStudio since I love helping people, and as an open source company, RStudio seemed like a great way to reach a lot of people and get to assist with numerous interesting use cases.</p><p><em>What types of projects do you work on?</em></p><p>I get to work on the front lines, corresponding directly with our customers. Since my focus is new business, this means I am helping open source users take the next step in their use of R. Sometimes this means helping IT organizations understand how RStudio can integrate with corporate security/authentication protocols, and sometimes it involves showing off various Shiny applications. The types of projects always vary, which keeps me on my toes.</p><p><em>What do you enjoy about working at RStudio?</em></p><p>One would be that my colleagues are not only extremely smart but very genuine, so there are always fun conversations in and outside of work. 
Another big reason would be the various use cases I get to play a part in. Since each customer has a different application area or industry, I never get bored when I learn about how RStudio offerings are being applied.</p><p><em>What types of qualities do you look for when hiring an Enterprise Advocate?</em></p><p>Two words that come to mind are humble and smart. Also, since RStudio is so popular but our company is small, we have a high volume of customers to connect with, so high energy is also a must. If you enjoy solving problems, not only will you find the role a good fit, but you will have the chance to help bring solutions to large corporations and assist in resolving issues that can even cure diseases.</p><p><em>What are the goals for someone new to this role?</em></p><p>To establish themselves in the R and data science community as a trusted consultant. When you wake up from a dream that involves R, you have made it. :)</p><p>If you think you or someone you know might be a good fit for this role and want to know more, check it out <a href="https://hire.withgoogle.com/public/jobs/rstudiocom/view/P_AAAAAACAAADFZoly7Lojez?trackingTag=rStudioBlog">here</a>.</p></description></item><item><title>leaflet 2.0.0</title><link>https://www.rstudio.com/blog/leaflet-2-0-0/</link><pubDate>Thu, 10 May 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/leaflet-2-0-0/</guid><description><p><a href="http://rstudio.github.io/leaflet/">leaflet</a> 2.0.0 is now on CRAN!</p><p>The leaflet R package wraps the <a href="https://leafletjs.com/">Leaflet.js</a> JavaScript library, and this release of the R package marks a major upgrade from the outdated Leaflet.js 0.7.x to the current Leaflet.js 1.x (specifically, 1.3.1).</p><p>Leaflet.js 1.x includes some non-backward-compatible API changes versus 0.7.x. If you’re using only R code to create your Leaflet maps, these changes should not affect you. If you are using custom JavaScript, some changes may be required to your code. 
Please see the Leaflet.js <a href="http://leafletjs.com/reference-1.3.0.html">reference documentation</a> and <a href="https://github.com/Leaflet/Leaflet/blob/master/CHANGELOG.md">changelog</a>.</p><p>Big thanks to <a href="https://twitter.com/timelyportfolio">@timelyportfolio</a> and <a href="https://www.karambelkar.info/about/">Bhaskar Karambelkar</a> for their significant contributions to this release!</p><div id="leaflet.extras-and-leaflet.esri" class="section level2"><h2>leaflet.extras and leaflet.esri</h2><p>Two additional packages by Bhaskar, leaflet.extras and leaflet.esri, have been updated on CRAN to utilize the latest Leaflet.js library bindings. leaflet.extras extends the Leaflet R package using various Leaflet.js plugins, offering features like heatmaps, additional marker icons, and drawing tools. leaflet.esri provides access to ArcGIS services, based on the <a href="https://esri.github.io/esri-leaflet/">ESRI leaflet plugin</a>.</p><pre class="r"><code>library(leaflet)
library(leaflet.extras)

leaflet(quakes) %&gt;%
  addTiles() %&gt;%
  addHeatmap(lng = ~long, lat = ~lat, radius = 8)
</code></pre><!-- content of http://rpubs.com/barret/leaflet-2-0-0-quakes of the above code --><iframe src="https://rstudio-pubs-static.s3.amazonaws.com/388042_d0a5b2ee3b2d44858e9cfc55edb109bf.html" width="100%" height="400px" frameborder="0"></iframe></div><div id="full-changelog" class="section level2"><h2>Full changelog</h2><div id="breaking-changes" class="section level3"><h3>Breaking Changes</h3><ul><li>Update to latest Leaflet.js 1.x (v1.3.1). Please see the Leaflet.js <a href="http://leafletjs.com/reference-1.3.0.html">reference documentation</a> and <a href="https://github.com/Leaflet/Leaflet/blob/master/CHANGELOG.md">change log</a>.</li><li>Previously, labels were implemented using the 3rd party extension <a href="https://github.com/Leaflet/Leaflet.labelExtension">Leaflet.label</a>. Leaflet.js 1.x now provides this functionality natively. 
There are some minor differences to note:</li><li>If you are using custom JavaScript to create labels, you’ll need to change references to <code>L.Label</code> to <code>L.Tooltip</code>.</li><li>Tooltips are now displayed with default Leaflet.js styling.</li><li>In custom JavaScript extensions, change all <code>*.bindLabel()</code> to <code>*.bindTooltip()</code>.</li><li>All Leaflet.js plugins updated to versions compatible with Leaflet.js 1.x.</li></ul></div><div id="known-issues" class="section level3"><h3>Known Issues</h3><ul><li>The default CSS z-index of the Leaflet map has changed; for some Shiny applications, the map now covers elements that are intended to be displayed on top of the map. This issue has been fixed on GitHub (<code>devtools::install_github(&quot;rstudio/leaflet&quot;)</code>). For now, you can work around this in the CRAN version by including this line in your application UI:</li></ul><pre class="r"><code>tags$style(&quot;.leaflet-map-pane { z-index: auto; }&quot;)</code></pre></div><div id="features" class="section level3"><h3>Features</h3><ul><li>Added more providers for <code>addProviderTiles()</code>: OpenStreetMap.CH, OpenInfraMap, OpenInfraMap.Power, OpenInfraMap.Telecom, OpenInfraMap.Petroleum, OpenInfraMap.Water, OpenPtMap, OpenRailwayMap, OpenFireMap, SafeCast.</li><li>Add <code>groupOptions</code> function. Currently the only option is letting you specify zoom levels at which a group should be visible.</li><li>Added support for drag events.</li><li>Added <code>method</code> argument to <code>addRasterImage()</code> to enable nearest neighbor interpolation when projecting categorical rasters.</li><li>Added an <code>'auto'</code> method for <code>addRasterImage()</code>. Projected factor results are coerced into factors.</li><li>Added <code>data</code> parameter to remaining <code>addXXX()</code> methods, including addLegend.</li><li>Added <code>preferCanvas</code> argument to <code>leafletOptions()</code>. 
</li></ul></div><div id="bug-fixes-and-improvements" class="section level3"><h3>Bug Fixes and Improvements</h3><ul><li>Relative protocols are used where possible when adding tiles. In RStudio 1.1.x on Linux and Windows, there is a known issue of ‘<a href="https://" class="uri">https://</a>’ routes failing to load, though they work within browsers (rstudio/rstudio#2661).</li><li><code>L.multiPolyline</code> was absorbed into <code>L.polyline</code>, which also accepts an <a href="http://leafletjs.com/reference-1.3.0.html#polyline">array of polyline information</a>.</li><li>Fixed bug where icons were anchored to the top-center by default, not center-center.</li><li>Fixed bug where markers would not appear in self contained knitr files.</li><li><code>L.Label</code> is now <code>L.Tooltip</code> in Leaflet.js. <code>labelOptions()</code> now translates the old options <code>clickable</code> to <code>interactive</code> and <code>noHide</code> to <code>permanent</code>.</li><li>Fix a bug where the default <code>addTiles()</code> would not work with .html files served directly from the filesystem.</li><li>Fix bug with accessing columns in formulas when the data source is a Crosstalk SharedData object wrapping a spatial data frame or sf object.</li><li>Fix strange wrapping behavior for legend, especially common for Chrome when browser zoom level is not 100%.</li><li>Fix incorrect opacity on NA entry in legend.</li><li>Ensure type safety of <code>.indexOf(stamp)</code>.</li><li><code>validateCoords()</code> warns on invalid polygon data.</li></ul></div></div></description></item><item><title>RStudio Connect v1.6.2</title><link>https://www.rstudio.com/blog/rstudio-connect-v1-6-2/</link><pubDate>Wed, 09 May 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-v1-6-2/</guid><description><p>RStudio Connect version 1.6.2 is <a href="https://www.rstudio.com/products/connect/">now available</a>!</p><p>There are a handful of new features that are highlighted below. 
We encourage you to upgrade as soon as possible!</p><h2 id="recommend-building-r-from-source">Recommend Building R from Source</h2><p>If you have installed R using <code>apt-get</code>, <code>yum</code>, or some other package manager, we recommend that you <a href="http://docs.rstudio.com/connect/1.6.2/admin/getting-started.html#r-source">install R from source</a> instead. This protects you from application breakages when the system version of R is upgraded. We have updated our documentation to reflect these best practices concerning R administration for use with Connect.</p><p>Installing R from source allows installing multiple versions of R side by side, and allows content to persist as published without risk of breaking during an upgrade to the version of R. This also allows publishers to publish to a version of R that more closely approximates their development environment.</p><h2 id="user-filtering">User Filtering</h2><p>For Connect implementations with many users, we have added features to the User page that allow administrators to filter users by various states. Last release, we added the ability for administrators to filter between users that are counting against the Connect license and those that are inactive. This release, we also exposed the ability for administrators to filter inactive users into those that were manually locked and those that are not counting against your license due to inactivity.</p><img src="https://www.rstudio.com/blog/images/rsc-162-user-filter.png" width="300" alt="Admin user filter showing active, locked, and inactive users"><h2 id="other-changes">Other Changes</h2><p><strong>Connect Server API</strong></p><p>The Connect Server API was introduced in Connect 1.6.0. This release, we added <a href="http://docs.rstudio.com/connect/1.6.2/api/#get-audit-logs">an endpoint called <code>/audit_logs</code></a> where audit logs can be accessed and paged. 
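A minimal sketch of calling that endpoint from R with the httr package (not from the original post; the `__api__/audit_logs` path, the `limit` parameter, and the `CONNECT_SERVER`/`CONNECT_API_KEY` environment variables are assumptions about a typical installation, so check the API docs for the exact URL on your server):

```r
library(httr)

# Assumed environment variables identifying your Connect server and API key
server  <- Sys.getenv("CONNECT_SERVER", "https://connect.example.com/")
api_key <- Sys.getenv("CONNECT_API_KEY")

# Build the audit-logs request URL (path and query parameter assumed)
url <- paste0(server, "__api__/audit_logs?limit=25")

# Connect API keys are sent in an Authorization: Key <key> header
resp <- GET(url, add_headers(Authorization = paste("Key", api_key)))
logs <- content(resp)
```

Paging works by following the cursor links returned in the response body until no further results are available.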
There is an example of how to use the API in the <a href="http://docs.rstudio.com/connect/1.6.2/user/cookbook.html#get-all-audit-logs">Connect Server API Cookbook</a>. Stay tuned as we add endpoints to the Connect Server API to allow programmatic interaction with the features of RStudio Connect.</p><p><strong>Content Filtering</strong></p><p>Content searching and content filtering state now persists when a user navigates to a new page and then returns. To return to the default content screen, the user can select “Reset all filters.”</p><p><strong>Email Customization</strong></p><p>In the last few releases, Connect has added features that allow publishers of R Markdown documents to customize the email output of scheduled R Markdown reports. In this release, publishers gain the ability to optionally suppress attachments or suppress email output altogether.</p><p>These options can be set in the YAML header for an R Markdown document, but are probably more useful when <a href="http://docs.rstudio.com/connect/1.6.2/user/r-markdown.html#r-markdown-email-suppress-scheduled">set inline by editing <code>rmarkdown::output_metadata</code></a>. For instance, in financial market analysis, I might decide that variance is not significant enough to warrant an email update, and thereby ensure email updates only occur when there is critical information to deliver.</p><p><strong>New Icons</strong></p><p>Connect previously used <a href="https://en.gravatar.com/">Gravatar icons</a>. We have changed this to standard monogram icons, which should fit better with offline installations and other enterprise applications.</p><img src="images/rsc-162-icon.png" width="250" alt="Side-by-side comparison of Gravatar icon to monogram icon"><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>There are no special precautions to be aware of when upgrading from v1.6.0 to v1.6.2. 
You can expect the installation and startup of v1.6.2 to be complete in under a minute.</p><p>If you’re upgrading from a release older than v1.6.0, be sure to consider the “Upgrade Planning” notes from the intervening releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>Arrow and beyond: Collaborating on next generation tools for open source data science</title><link>https://www.rstudio.com/blog/arrow-and-beyond/</link><pubDate>Thu, 19 Apr 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/arrow-and-beyond/</guid><description><p>Two years ago, <a href="https://github.com/wesm">Wes McKinney</a> and <a href="https://github.com/hadley">Hadley Wickham</a> got together to discuss some of the systems challenges facing the Python and R communities. 
Data science teams inevitably work with multiple languages and systems, so it&rsquo;s critical that data flow seamlessly and efficiently between these environments. Wes and Hadley wanted to explore opportunities to collaborate on tools for improving interoperability between Python, R, and external compute and storage systems. This discussion led to the creation of the <a href="https://github.com/wesm/feather">feather</a> file format, a very fast on-disk format for storing data frames that can be read and written to by multiple languages.</p><p>Feather was a successful project, and has made it easier for thousands of data scientists and data engineers to collaborate across language boundaries. In this post, we want to update you on how we think about cross-language collaboration, and share some exciting new plans.</p><h2 id="beyond-file-based-interoperability">Beyond file-based interoperability</h2><p>File-based interoperability is a great first step, but is fundamentally clunky: to communicate between R and Python running on the same computer, you have to save out from one and load into the other. What if there were some way to share data in memory without having to copy objects or round trip to disk?</p><p>You may have experienced a taste of this if you’ve tried the <a href="https://rstudio.github.io/reticulate/">reticulate</a> package. It makes it possible to use Python objects and functions from R. But reticulate is focused on solving only one part of the problem, for R and Python. It doesn’t help pass data from R to Julia, or Julia to Python, or Python to <a href="https://spark.apache.org/">Apache Spark</a>. What if there were some way to share data between multiple languages without having to write a translation layer between every pair of languages? 
That challenge is the inspiration for the <a href="https://arrow.apache.org/">Apache Arrow</a> project, which defines a standardized, language independent, columnar memory format for analytics and data science.</p><h2 id="a-new-data-science-runtime">A new data science runtime</h2><p>The Apache Arrow project has been making great progress, so we can now start to think about what could be built on top of that foundation. Modern hardware platforms provide huge opportunities for optimization (cache pipelining, CPU parallelism, GPUs, etc.), which should allow us to use a laptop to interactively analyze 100GB datasets. We should also be getting dramatically better performance when building models and visualizing data on smaller datasets.</p><p>We think that the time has come to build a modern data science runtime environment that takes advantage of the computational advances of the last 20 years, and can be used from many languages (in the same way that <a href="http://jupyter.org/">Project Jupyter</a> has built an interactive data science environment that supports many languages). We don&rsquo;t think that it makes sense to build this type of infrastructure for a single language, as there are too many difficult problems, and we need diverse viewpoints to solve them. Wes has been thinking and talking publicly about <a href="https://www.slideshare.net/wesm/shared-infrastructure-for-data-science">shared infrastructure for data science</a> for some time, and recently RStudio and Wes have been talking about what we could do to begin making this a reality.</p><p>These discussions have culminated in a plan to work closely together on building a new data science runtime powered by Apache Arrow. What might this new runtime look like? 
Here are some of the things currently envisioned:</p><ul><li><p>A core set of C++ shared libraries with bindings for each host language</p></li><li><p>Reusable operator “kernels” containing functions utilizing Arrow format as input and output. This includes pandas-style array functions, as well as SQL-style relational operations (joins, aggregations, etc.)</p></li><li><p>Runtime in-memory format based on the Arrow columnar format, with auxiliary data structures that can be described by composing Arrow data structures</p></li><li><p>Multithreaded graph dataflow-based execution engine for efficient evaluation of lazy data frame expressions created in the host language</p></li><li><p>Subgraph compilation using LLVM; optimization of common operator patterns</p></li><li><p>Support for user-defined operators and function kernels</p></li><li><p>Comprehensive interoperability with existing data representations (e.g., data frames in R, pandas / NumPy in Python)</p></li><li><p>New front-end interfaces for host languages (e.g., dplyr and other &ldquo;tidy&rdquo; front ends for R, evolution of pandas for Python)</p></li></ul><p>When you consider the scope and potential impact of the project, it&rsquo;s hopefully easy to see why language communities need to come together around making it happen rather than work in their own silos.</p><h2 id="ursa-labs">Ursa Labs</h2><p>Today, Wes has <a href="http://wesmckinney.com/blog/announcing-ursalabs/">announced Ursa Labs</a>, an independent open-source development lab that will serve as the focal point for the development of a new cross-language data science runtime powered by Apache Arrow. <a href="https://ursalabs.org">Ursa Labs</a> isn&rsquo;t a startup company and won&rsquo;t have its own employees. Instead, a variety of individuals and organizations will contribute to the effort.</p><p>RStudio will serve as a host organization for Ursa Labs, providing operational support and infrastructure (e.g., help with hiring, DevOps, QA, etc.) 
which will enable Wes and others to dedicate 100% of their time and energy to creating great open-source software.</p><p>Hadley will be a key technical advisor to Ursa, and collaborate with Wes on the design and implementation of the data science runtime. Hadley and his team will also build a dplyr back end, as well as other tidy interfaces to the new system.</p><p>It might sound strange to hear that Wes, who is so closely associated with Python, will be working with RStudio. It might also sound strange that RStudio will be investing in tools that are useful for R and Python users alike. Aren&rsquo;t these languages and tools out to succeed at each other&rsquo;s expense? That&rsquo;s not how we see it. Rather, we are inspired to work together by the common desire to make the languages our users love more successful. Languages are vocabularies for interacting with computation, and like human vocabularies, are rich and varied. We succeed as tool builders by understanding the users that embrace our languages, and by building tools perfectly suited to their needs. That&rsquo;s what Wes, Hadley, and RStudio have been doing for many years, and we think everyone will be better off if we do it together!</p><p>We are tremendously excited to see the fruits of this work, and to continue R&rsquo;s tradition of providing fluent and powerful interfaces to state-of-the-art computational environments. 
Check out the <a href="https://ursalabs.org">Ursa Labs</a> website for additional details and to find out how you can get involved with the project!</p><style type="text/css">.title {font-size: 1.0em;}</style></description></item><item><title>Summer Interns</title><link>https://www.rstudio.com/blog/2018-04-18-summer-interns/</link><pubDate>Wed, 18 Apr 2018 16:58:36 -0400</pubDate><guid>https://www.rstudio.com/blog/2018-04-18-summer-interns/</guid><description><p>We were thrilled by the response to our <a href="https://blog.rstudio.com/2018/02/12/summer-interns/">summer internship program</a>. After carefully reviewing over 250 applications, we have made our final selections. Here is a brief description of each intern and the projects they will be working on this summer.</p><h2 id="fanny-chow-bootstrapping-methods">Fanny Chow, bootstrapping methods</h2><p>Fanny is a master&rsquo;s student working with Max Kuhn this summer. Previously she studied Statistics and International Relations at UC Davis. She is particularly interested in privacy and interpretability in machine learning.</p><h2 id="alex-hayes-broom">Alex Hayes, broom</h2><p>Alex Hayes will graduate from Rice University this spring with a degree in statistics. He&rsquo;s particularly interested in improving modelling interfaces in R. Alex will be working with Dave Robinson on processing pull requests and reorganizing the broom package. <a href="http://www.alexpghayes.com/">Personal website</a></p><h2 id="dana-seidel-ggplot2">Dana Seidel, ggplot2</h2><p>Dana Seidel is a PhD candidate at UC Berkeley joining RStudio this summer to work on ggplot2 development with Hadley Wickham. Her graduate work is focused on understanding the movement and space use of large mammals, relying heavily on R&rsquo;s spatial and modelling libraries. 
In 2016 she worked as an intern on Google Maps&rsquo; GeoDataAnalytics team and in the last year she helped to teach an introductory data science course at UC Berkeley, training students in R, RStudio, and the tidyverse.</p><h2 id="timothy-mastny-shiny">Timothy Mastny, Shiny</h2><p>Tim is a graduate student at the University of Nebraska, transitioning into data science and software development from the aerospace industry. He will be joining Winston Chang and the Shiny team as an intern, working on open issues and UI improvements. In his spare time, he likes to dabble in open-source R packages and watch horror movies. <a href="https://timmastny.rbind.io/">Personal Website</a> <a href="https://github.com/tmastny">Github</a></p><h2 id="irene-steves-tidies-of-march">Irene Steves, Tidies of March</h2><p>Irene discovered R through the organismal biology/ecology world, and is excited to join RStudio as an intern for the summer of 2018. She will be working with Jenny Bryan to develop the Tidies of March, a series of coding challenges inspired by the Advent of Code. <a href="https://github.com/isteves">Github</a></p><p>Please welcome our Summer interns!</p></description></item><item><title>RStudio Connect 1.6.0 - A Year in the Making!</title><link>https://www.rstudio.com/blog/rstudio-connect-1-6-0-a-year-in-the-making/</link><pubDate>Thu, 12 Apr 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-6-0-a-year-in-the-making/</guid><description><p>We’re pleased to announce RStudio Connect 1.6.0. 
Connect 1.6.0 caps a year of significant updates and we encourage all customers to upgrade.</p><iframe src="https://rstudio.wistia.com/medias/ocoqxrklya?wvideo=ocoqxrklya" title="Wistia video player" allowtransparency="true" frameborder="0" scrolling="no" class="wistia_embed" name="wistia_embed" allowfullscreen mozallowfullscreen webkitallowfullscreen oallowfullscreen msallowfullscreen width="100%" height="400px"></iframe><h2 id="new-content-types">New Content Types</h2><p><strong>Plumber APIs</strong></p><p>RStudio Connect introduces support for deploying <a href="http://docs.rstudio.com/connect/1.6.0/user/index.html#plumber">Plumber APIs</a>, allowing any R function to be served as an HTTP RESTful service. API support makes it easy to integrate R into external systems. Connect handles scaling and securing published APIs.</p><p><strong>TensorFlow Serving</strong></p><p>The new TensorFlow R bindings enable R users to tap into the powerful TensorFlow library. In conjunction with tfdeploy, Connect helps R users <a href="http://docs.rstudio.com/connect/1.6.0/user/index.html#tensorflow-model-apis">deploy deep learning models</a> built in TensorFlow to be accessed and used by external systems.</p><h2 id="improvements-for-shiny-and-r-markdown">Improvements for Shiny and R Markdown</h2><p>Connect users continue to share analytics through reports, dashboards, and web applications. New features make these favorite tools even more powerful.</p><p><strong>Runtime Settings</strong></p><p>Application publishers have the ability to fine-tune performance settings for their content, including timeout and scaling settings. Administrators can enforce <a href="http://docs.rstudio.com/connect/1.6.0/admin/appendix-configuration.html#appendix-configuration-scheduler">global defaults and caps</a>.</p><p><strong>Environment Variables</strong></p><p>Trying to avoid saving your password or API key in your code? 
Connect supports <a href="http://docs.rstudio.com/connect/1.6.0/admin/security-auditing.html#application-environment-variables">encrypted secrets</a>, which can be managed through the web interface. Secrets are injected into the R process as environment variables.</p><p><strong>Report Histories</strong></p><p>The output from ad-hoc and scheduled R Markdown documents is saved and accessible, making it easy to compare prior runs.</p><h2 id="other-big-updates">Other Big Updates</h2><p><strong>Content Organization and Discovery</strong></p><p>To organize all this content, Connect added <a href="http://docs.rstudio.com/connect/1.6.0/admin/content-management.html#tags">tags</a>. Administrators create a tag schema that publishers can use to organize content into categories. Viewers are able to filter and search for content by tag. In addition to tags, users can search for content by title.</p><p><strong>Best-in-Class Security</strong></p><p>Connect has added additional security features, including support for <a href="http://docs.rstudio.com/connect/1.6.0/admin/appendix-configuration.html#appendix-configuration-authentication">CAPTCHA</a>, <a href="http://docs.rstudio.com/connect/1.6.0/admin/security-auditing.html#web-sudo-mode">“Web Sudo Mode”</a>, improved <a href="http://docs.rstudio.com/connect/1.6.0/admin/authentication.html#user-attribute-editability">User Management</a>, and support for <a href="http://docs.rstudio.com/connect/1.6.0/admin/process-management.html#process-management-pam-sessions">Kerberos and PAM Sessions</a>.</p><p><strong>Scaling</strong></p><p>Connect can be scaled horizontally by adding <a href="http://docs.rstudio.com/connect/1.6.0/admin/high-availability.html">execution servers</a>. This type of scaling ensures Connect can handle mission-critical content and guarantee high availability. 
We’ve used a Connect cluster to support a Shiny app with 10,000 simultaneous users!</p><p>We’ve also added new licensing mechanisms to support transient servers and <a href="https://www.rstudio.com/wp-content/uploads/2018/03/RStudio_Docker_3-9-2018.pdf">containerized deployments</a>.</p><h2 id="looking-ahead">Looking Ahead</h2><p>We’re excited to be working on two new features that are previewed in Connect 1.6.0.</p><p><strong>Email Customization</strong></p><p>In 1.5.14, we introduced the ability to customize the subject line for emails. 1.6.0 adds support for <a href="http://docs.rstudio.com/connect/1.6.0/user/r-markdown.html#r-markdown-email-attachments">specifying attachments</a>, like Excel files, generated by a report. Look out for more options in the future. Share analytics by pushing the most important results directly to your audience’s inbox.</p><p><strong>Connect Server API</strong></p><p>Connect’s web interface helps users manage their content interactively from a browser. However, we know there are times when you’d rather write code to automate certain tasks. To this end, we’re excited to announce the <a href="http://docs.rstudio.com/connect/1.6.0/api/">Connect Server API</a>. The Server API lets you “drive” RStudio Connect programmatically. This release includes a small piece of the API, an endpoint that lists the R versions available on the server. We’ll gradually roll out more of the API. Email us if you have specific functionality you would be excited about automating!</p><h2 id="1514-to-160">1.5.14 to 1.6.0?</h2><p>Normally we outline the incremental improvements from release to release. While this blog post has highlighted changes throughout the 1.5 series, there are specific updates in 1.6.0. For full details see the <a href="http://docs.rstudio.com/connect/1.6.0/news/">1.6.0 release notes</a>.</p><h2 id="deprecation-announcement">Deprecation Announcement</h2><p>Version 1.6.0 no longer supports Ubuntu 12.04 or Internet Explorer 10. 
For more information see <a href="https://www.rstudio.com/about/platform-deprecation-strategy/">RStudio’s deprecation policy</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>There are no special precautions to be aware of when upgrading from v1.5.14 to v1.6.0. You can expect the installation and startup of v1.6.0 to be complete in under a minute.</p><p>If you’re upgrading from a release older than v1.5.14, be sure to consider the “Upgrade Planning” notes from the intervening releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect/">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. 
Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>Shiny Server (Pro) 1.5.7</title><link>https://www.rstudio.com/blog/shiny-server-pro-1-5-7/</link><pubDate>Wed, 11 Apr 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-server-pro-1-5-7/</guid><description><p><a href="https://www.rstudio.com/products/shiny/shiny-server/">Shiny Server 1.5.7.907 and Shiny Server Pro 1.5.7.954 are now available.</a></p><p>Highlights for this release are a major-version Node upgrade, support for HTTP gzip/deflate compression and (optionally) <a href="https://en.wikipedia.org/wiki/Secure_cookies">secure cookies</a>, and numerous bug fixes. We&rsquo;ve also dropped support for some Linux distro versions that have reached their end of life.</p><h3 id="shiny-server-157907">Shiny Server 1.5.7.907</h3><ul><li><p>Upgrade to Node v8.10.0.</p></li><li><p>Dropped support for Ubuntu 12.04 and SLES 11.</p></li><li><p>Support gzip/deflate compression for HTTP responses. 
You can disable this if necessary with the directive &ldquo;http_allow_compression no;&rdquo; at the top level of shiny-server.conf.</p></li><li><p>Don&rsquo;t color log output if stdout is not a terminal.</p></li></ul><h3 id="shiny-server-pro-157954">Shiny Server Pro 1.5.7.954</h3><p>The above changes, plus:</p><ul><li><p>Rename CSRF token cookie from XSRF-TOKEN to SSP-CSRF, so as not to conflict with other Angular apps being served from the same host.</p></li><li><p>Fix bug where dashboard could show incorrect or even negative values from RAM usage.</p></li><li><p>Fix bugs retrieving LDAP/Active Directory groups when group_filter contains an extensible match operator (which is the default for auth_active_dir).</p></li><li><p>Fix bug where server could crash with &ldquo;render is not defined&rdquo;.</p></li><li><p>Add <code>secure_cookies always;</code> directive, which adds the HTTP cookie flag &ldquo;secure&rdquo; to our session cookies. Note that this should only be used if all authenticated apps and the admin dashboard are ONLY accessible via https, either through Shiny Server Pro&rsquo;s built-in TLS support or via a proxy.</p></li></ul></description></item><item><title>Building tidy tools workshop</title><link>https://www.rstudio.com/blog/building-tidy-tools-workshop/</link><pubDate>Mon, 09 Apr 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/building-tidy-tools-workshop/</guid><description><p>Join RStudio Chief Data Scientist Hadley Wickham for his popular Building tidy tools workshop in San Francisco! If you missed the sold-out course at rstudio::conf 2018, now is your chance.</p><p>Register here: <a href="https://www.rstudio.com/workshops/extending-the-tidyverse/">https://www.rstudio.com/workshops/extending-the-tidyverse/</a></p><p>You should take this class if you have some experience programming in R and you want to learn how to tackle larger scale problems. 
You&rsquo;ll get the most out of it if you&rsquo;re already familiar with the basics of functions (i.e. you&rsquo;ve written a few) and are comfortable with R’s basic data structures (vectors, matrices, arrays, lists, and data frames). There is probably a 30% overlap in the material with Hadley&rsquo;s previous &ldquo;R Masterclass&rdquo;. However, the material has been substantially reorganised, so if you&rsquo;ve taken the R Masterclass in the past, you&rsquo;ll still learn a lot in this class.</p><h3 id="what-will-i-learn">What will I learn?</h3><p>This course has three primary goals. You will:</p><ul><li>Learn efficient workflows for developing high-quality R functions, using the set of conventions codified by a package. You&rsquo;ll also learn workflows for unit testing, which helps ensure that your functions do exactly what you think they do.</li><li>Master the art of writing functions that do one thing well and can be fluently combined together to solve more complex problems. We&rsquo;ll cover common function writing pitfalls and how to avoid them.</li><li>Learn how to write collections of functions that work well together, and adhere to existing conventions so they&rsquo;re easy to pick up for newcomers. We&rsquo;ll discuss API design, functional programming tools, the basics of object design in S3, and the tidy eval system for NSE.</li></ul><p><strong>When</strong> - 9 a.m. to 5 p.m. Thursday May 17th and Friday the 18th</p><p><strong>Where</strong> - The Westin, 1 Old Bayshore Hwy, Millbrae, CA 94030</p><p><strong>Who</strong> - Hadley Wickham, Chief Scientist at RStudio</p><p>Build your skills and learn from the best at this rare in-person workshop - the only West coast workshop from Hadley in 2018.</p><p>Register here: <a href="https://www.rstudio.com/workshops/extending-the-tidyverse/">https://www.rstudio.com/workshops/extending-the-tidyverse/</a></p><p>As of today, there are just 30+ seats left. 
Discounts are available for 5 or more attendees from any organization. Email <a href="mailto:training@rstudio.com">training@rstudio.com</a> if you have any questions about the workshop that you don’t find answered on the registration page.</p></description></item><item><title>DT 0.4: Editing Tables, Smart Filtering, and More</title><link>https://www.rstudio.com/blog/dt-0-4/</link><pubDate>Thu, 29 Mar 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dt-0-4/</guid><description><p>It has been <a href="https://www.rstudio.com/blog/dt-an-r-interface-to-the-datatables-library/">more than two years</a> since we announced the initial version of the <strong>DT</strong> package. Today we want to highlight a few significant changes and new features in the recent releases v0.3 and v0.4. The full changes can be found in the <a href="https://github.com/rstudio/DT/releases">release notes</a>.</p><h2 id="editable-tables">Editable tables</h2><p>Now you can make a table editable through the new argument <code>datatable(..., editable = TRUE)</code>. Then you will be able to edit a cell by double-clicking on it. This feature works in both client-side and server-side (Shiny) processing modes. See <a href="https://github.com/rstudio/DT/pull/480">here for examples</a>.</p><p><img src="https://user-images.githubusercontent.com/163582/38057156-1c0cce84-32a4-11e8-84ac-1c93ec60684e.png" alt="An editable tables"></p><h2 id="smart-filtering-in-the-server-side-processing-mode">Smart filtering in the server-side processing mode</h2><p>Searching in the server-side processing mode has enabled <a href="https://datatables.net/reference/option/search.smart">the &ldquo;smart&rdquo; mode</a> by default. Previously, this mode only works in the client-side processing mode. If you want to disable the smart filtering, you can set the initialization option in <code>datatable()</code> (e.g., <code>options = list(search = list(smart = FALSE))</code>). 
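Put together, a minimal sketch combining the two features above (the dataset and option values are just for illustration):

```r
library(DT)

# An editable table with server-side "smart" filtering switched off
datatable(
  iris,
  editable = TRUE,
  options = list(search = list(smart = FALSE))
)
```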
With smart filtering, spaces in the global search keyword for the table have a special meaning: each returned record in the table should match <em>all</em> of the words separated by spaces. For example, a keyword &ldquo;1234 abc&rdquo; will match every record in the table that contains both &ldquo;1234&rdquo; and &ldquo;abc&rdquo; (in previous versions, this was just treated as a single keyword).</p><p><img src="https://user-images.githubusercontent.com/163582/38057956-bedfad96-32a6-11e8-815c-73abce2d74bc.png" alt="Smart filters in DataTables"></p><h2 id="shift--clicking-for-row-selection">Shift + Clicking for row selection</h2><p>After you have enabled <a href="https://rstudio.github.io/DT/shiny.html">row selection</a>, you can hold the <code>Shift</code> key and click to select multiple consecutive rows.</p><h2 id="dtoutput-and-renderdt-for-shiny-apps">DTOutput() and renderDT() for Shiny apps</h2><p>We have added functions <code>DTOutput()</code> and <code>renderDT()</code> as aliases of <code>dataTableOutput()</code> and <code>renderDataTable()</code>, respectively. This is because the latter two often collide with functions of the same names in <strong>shiny</strong>. We recommend using <code>DTOutput()</code> and <code>renderDT()</code> in Shiny apps, so that you don&rsquo;t have to worry about forgetting the <code>DT::</code> qualifier (e.g., <code>DT::renderDataTable</code>). Naming is hard, and this was perhaps my biggest mistake in the initial version of <strong>DT</strong>. I was too optimistic that <code>DT::renderDataTable()</code> could quickly and easily replace <code>shiny::renderDataTable()</code> so we could drop the latter. It turned out that the two were not completely compatible initially, and diverged more and more later (<strong>DT</strong> has many more features).</p><p>Versions 0.3 and 0.4 also include several bug fixes, and we appreciate all the bug reports and pull requests from <strong>DT</strong> users. 
In particular, we want to thank Xianying Tan (<a href="https://github.com/shrektan">@shrektan</a>) for his many helpful pull requests to implement new features and fix bugs.</p><p>Again, the full documentation is at <a href="https://rstudio.github.io/DT/">https://rstudio.github.io/DT/</a>. Please use <a href="https://github.com/rstudio/DT/issues">Github issues</a> if you want to file bug reports or feature requests, and use <a href="https://stackoverflow.com/questions/tagged/dt">StackOverflow</a> or <a href="https://community.rstudio.com">RStudio Community</a> if you have questions.</p></description></item><item><title>reticulate: R interface to Python</title><link>https://www.rstudio.com/blog/reticulate-r-interface-to-python/</link><pubDate>Mon, 26 Mar 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/reticulate-r-interface-to-python/</guid><description><p>We are pleased to announce the <strong>reticulate</strong> package, a comprehensive set of tools for interoperability between Python and R. The package includes facilities for:</p><img src="https://rstudio.github.io/reticulate/images/reticulated_python.png" width=200 align=right style="margin-left: 15px;" alt="reticulated python"/><ul><li><p>Calling Python from R in a variety of ways including R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session.</p></li><li><p>Translation between R and Python objects (for example, between R and Pandas data frames, or between R matrices and NumPy arrays).</p></li><li><p>Flexible binding to different versions of Python including virtual environments and Conda environments.</p></li></ul><p>Reticulate embeds a Python session within your R session, enabling seamless, high-performance interoperability. 
If you are an R developer who uses Python for some of your work, or a member of a data science team that uses both languages, reticulate can dramatically streamline your workflow!</p><p>You can install the <strong>reticulate</strong> package from CRAN as follows:</p><pre><code>install.packages(&quot;reticulate&quot;)</code></pre><p>Read on to learn more about the features of reticulate, or see the <a href="https://rstudio.github.io/reticulate">reticulate website</a> for detailed documentation on using the package.</p><h2 id="python-in-r-markdown">Python in R Markdown</h2><p>The <strong>reticulate</strong> package includes a Python engine for <a href="http://rmarkdown.rstudio.com">R Markdown</a> with the following features:</p><ul><li><p>Run Python chunks in a single Python session embedded within your R session (shared variables/state between Python chunks)</p></li><li><p>Printing of Python output, including graphical output from <a href="https://matplotlib.org/">matplotlib</a>.</p></li><li><p>Access to objects created within Python chunks from R using the <code>py</code> object (e.g. <code>py$x</code> would access an <code>x</code> variable created within Python from R).</p></li><li><p>Access to objects created within R chunks from Python using the <code>r</code> object (e.g. <code>r.x</code> would access an <code>x</code> variable created within R from Python).</p></li></ul><div style="clear: both;"></div><p>Built-in conversion for many Python object types is provided, including <a href="http://www.numpy.org/">NumPy</a> arrays and <a href="https://pandas.pydata.org/">Pandas</a> data frames. 
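A small sketch of this two-way conversion (assumes a Python installation with pandas available; the column names are just for illustration):

```r
library(reticulate)

pd <- import("pandas")

# A named R list becomes a Python dict, which pandas turns into a DataFrame;
# because import() converts by default, the result comes back as an R data.frame
df <- pd$DataFrame(list(x = 1:3, y = c("a", "b", "c")))
is.data.frame(df)
```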
For example, you can use Pandas to read and manipulate data, then easily plot the Pandas data frame using <a href="http://ggplot2.org/">ggplot2</a>:</p><img src="https://rstudio.github.io/reticulate/images/rmarkdown_engine_zoomed.png" class="screenshot"/><p>Note that the reticulate Python engine is enabled by default within R Markdown whenever reticulate is installed.</p><p>See the <a href="https://rstudio.github.io/reticulate/articles/r_markdown.html">R Markdown Python Engine</a> documentation for additional details.</p><h2 id="importing-python-modules">Importing Python modules</h2><p>You can use the <code>import()</code> function to import any Python module and call it from R. For example, this code imports the Python <code>os</code> module and calls the <code>listdir()</code> function:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(reticulate)
os <span style="color:#666">&lt;-</span> <span style="color:#06287e">import</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">os&#34;</span>)
os<span style="color:#666">$</span><span style="color:#06287e">listdir</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">.&#34;</span>)</code></pre></div><pre><code> [1] &quot;.git&quot; &quot;.gitignore&quot; &quot;.Rbuildignore&quot; &quot;.RData&quot;
 [5] &quot;.Rhistory&quot; &quot;.Rproj.user&quot; &quot;.travis.yml&quot; &quot;appveyor.yml&quot;
 [9] &quot;DESCRIPTION&quot; &quot;docs&quot; &quot;external&quot; &quot;index.html&quot;
[13] &quot;index.Rmd&quot; &quot;inst&quot; &quot;issues&quot; &quot;LICENSE&quot;
[17] &quot;man&quot; &quot;NAMESPACE&quot; &quot;NEWS.md&quot; &quot;pkgdown&quot;
[21] &quot;R&quot; &quot;README.md&quot; &quot;reticulate.Rproj&quot; &quot;src&quot;
[25] &quot;tests&quot; &quot;vignettes&quot;</code></pre><p>Functions and other data within Python modules and classes 
can be accessed via the <code>$</code> operator (analogous to the way you would interact with an R list, environment, or reference class).</p><p>Imported Python modules support code completion and inline help:</p><img src="https://rstudio.github.io/reticulate/images/reticulate_completion.png" class="screenshot"/><p>See <a href="https://rstudio.github.io/reticulate/articles/calling_python.html">Calling Python from R</a> for additional details on interacting with Python objects from within R.</p><h2 id="sourcing-python-scripts">Sourcing Python scripts</h2><p>You can source any Python script just as you would source an R script using the <code>source_python()</code> function. For example, if you had the following Python script <em>flights.py</em>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-python" data-lang="python"><span style="color:#007020;font-weight:bold">import</span> <span style="color:#0e84b5;font-weight:bold">pandas</span>

<span style="color:#007020;font-weight:bold">def</span> <span style="color:#06287e">read_flights</span>(<span style="color:#007020">file</span>):
    flights <span style="color:#666">=</span> pandas<span style="color:#666">.</span>read_csv(<span style="color:#007020">file</span>)
    flights <span style="color:#666">=</span> flights[flights[<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">dest</span><span style="color:#4070a0">&#39;</span>] <span style="color:#666">==</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">ORD</span><span style="color:#4070a0">&#34;</span>]
    flights <span style="color:#666">=</span> flights[[<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">carrier</span><span style="color:#4070a0">&#39;</span>, <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">dep_delay</span><span style="color:#4070a0">&#39;</span>, <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">arr_delay</span><span style="color:#4070a0">&#39;</span>]]
    flights <span style="color:#666">=</span> flights<span style="color:#666">.</span>dropna()
    <span style="color:#007020;font-weight:bold">return</span> flights</code></pre></div><p>Then you can source the script and call the <code>read_flights()</code> function as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">source_python</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">flights.py&#34;</span>)
flights <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_flights</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">flights.csv&#34;</span>)

<span style="color:#06287e">library</span>(ggplot2)
<span style="color:#06287e">ggplot</span>(flights, <span style="color:#06287e">aes</span>(carrier, arr_delay)) <span style="color:#666">+</span> <span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span> <span style="color:#06287e">geom_jitter</span>()</code></pre></div><p>See the <a href="https://rstudio.github.io/reticulate/reference/source_python.html"><code>source_python()</code></a> documentation for additional details on sourcing Python code.</p><h2 id="python-repl">Python REPL</h2><p>If you want to work with Python interactively you can call the <code>repl_python()</code> function, which provides a Python REPL embedded within your R session. Objects created within the Python REPL can be accessed from R using the <code>py</code> object exported from reticulate. 
For example:</p><img src="https://rstudio.github.io/reticulate/images/python_repl.png" class="screenshot"/><p>Enter <code>exit</code> within the Python REPL to return to the R prompt.</p><p>Note that Python code can also access objects from within the R session using the <code>r</code> object (e.g. <code>r.flights</code>). See the <a href="https://rstudio.github.io/reticulate/reference/repl_python.html"><code>repl_python()</code></a> documentation for additional details on using the embedded Python REPL.</p><h2 id="type-conversions">Type conversions</h2><p>When calling into Python, R data types are automatically converted to their equivalent Python types. When values are returned from Python to R they are converted back to R types. Types are converted as follows:</p><table><thead><tr><th>R</th><th>Python</th><th>Examples</th></tr></thead><tbody><tr><td>Single-element vector</td><td>Scalar</td><td><code>1</code>, <code>1L</code>, <code>TRUE</code>, <code>&quot;foo&quot;</code></td></tr><tr><td>Multi-element vector</td><td>List</td><td><code>c(1.0, 2.0, 3.0)</code>, <code>c(1L, 2L, 3L)</code></td></tr><tr><td>List of multiple types</td><td>Tuple</td><td><code>list(1L, TRUE, &quot;foo&quot;)</code></td></tr><tr><td>Named list</td><td>Dict</td><td><code>list(a = 1L, b = 2.0)</code>, <code>dict(x = x_data)</code></td></tr><tr><td>Matrix/Array</td><td>NumPy ndarray</td><td><code>matrix(c(1,2,3,4), nrow = 2, ncol = 2)</code></td></tr><tr><td>Data Frame</td><td>Pandas DataFrame</td><td><code> data.frame(x = c(1,2,3), y = c(&quot;a&quot;, &quot;b&quot;, &quot;c&quot;))</code></td></tr><tr><td>Function</td><td>Python function</td><td><code>function(x) x + 1</code></td></tr><tr><td>NULL, TRUE, FALSE</td><td>None, True, False</td><td><code>NULL</code>, <code>TRUE</code>, <code>FALSE</code></td></tr></tbody></table><p>If a Python object of a custom class is returned then an R reference to that object is returned. 
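To make the conversion table concrete, here are Python-side values matching several of its rows (a sketch; the variable names are ours, for illustration only, not part of the reticulate API):

```python
# Python values corresponding to rows of the R/Python conversion table above.
scalar = 1.0                  # <-> single-element R vector, e.g. 1
vec    = [1.0, 2.0, 3.0]      # <-> multi-element R vector, c(1.0, 2.0, 3.0)
tup    = (1, True, "foo")     # <-> mixed-type R list, list(1L, TRUE, "foo")
named  = {"a": 1, "b": 2.0}   # <-> named R list, list(a = 1L, b = 2.0)
blank  = None                 # <-> R NULL
```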
You can call methods and access properties of the object just as if it were an instance of an R reference class.</p><h2 id="learning-more">Learning more</h2><p>The <a href="https://rstudio.github.io/reticulate/">reticulate website</a> includes comprehensive documentation on using the package, including the following articles that cover various aspects of using reticulate:</p><ul><li><p><a href="https://rstudio.github.io/reticulate/articles/calling_python.html">Calling Python from R</a> — Describes the various ways to access Python objects from R as well as functions available for more advanced interactions and conversion behavior.</p></li><li><p><a href="https://rstudio.github.io/reticulate/articles/r_markdown.html">R Markdown Python Engine</a> — Provides details on using Python chunks within R Markdown documents, including how to call Python code from R chunks and vice versa.</p></li><li><p><a href="https://rstudio.github.io/reticulate/articles/versions.html">Python Version Configuration</a> — Describes facilities for determining which version of Python is used by reticulate within an R session.</p></li><li><p><a href="https://rstudio.github.io/reticulate/articles/python_packages.html">Installing Python Packages</a> — Documentation on installing Python packages from PyPI or Conda, and managing package installations using virtualenvs and Conda environments.</p></li><li><p><a href="https://rstudio.github.io/reticulate/articles/package.html">Using reticulate in an R Package</a> — Guidelines and best practices for using reticulate in an R package.</p></li><li><p><a href="https://rstudio.github.io/reticulate/articles/arrays.html">Arrays in R and Python</a> — Advanced discussion of the differences between arrays in R and Python and the implications for conversion and interoperability.</p></li></ul><h2 id="why-reticulate">Why reticulate?</h2><p>From the <a href="https://en.wikipedia.org/wiki/Reticulated_python">Wikipedia</a> article on the reticulated 
python:</p><blockquote><p>The reticulated python is a species of python found in Southeast Asia. They are the world&rsquo;s longest snakes and longest reptiles&hellip;The specific name, reticulatus, is Latin meaning &ldquo;net-like&rdquo;, or reticulated, and is a reference to the complex colour pattern.</p></blockquote><p>From the <a href="https://www.merriam-webster.com/dictionary/reticulate">Merriam-Webster</a> definition of reticulate:</p><blockquote><p>1: resembling a net or network; especially : having veins, fibers, or lines crossing a reticulate leaf. 2: being or involving evolutionary change dependent on genetic recombination involving diverse interbreeding populations.</p></blockquote><p>The package enables you to <em>reticulate</em> Python code into R, creating a new breed of project that weaves together the two languages.</p><p><em><strong>UPDATE:</strong></em> <em>Nov. 27, 2019</em><br><em>Learn more about <a href="https://rstudio.com/solutions/python-and-r/">how R and Python work together in RStudio</a>.</em></p><style type="text/css">.screenshot, .illustration {margin-bottom: 20px;border: 1px solid #ddd;box-shadow: 5px 5px 5px #eee;}</style></description></item><item><title>Platform Deprecation Strategy</title><link>https://www.rstudio.com/blog/platform-deprecation-strategy/</link><pubDate>Wed, 07 Mar 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/platform-deprecation-strategy/</guid><description><p>In an effort to streamline product development, maintenance, and support to ensure the best experience for our users, we have created a strategy for operating system and browser deprecation. This will allow us to focus our work on modern platforms, and to encourage best practices in R development.</p><p>This policy applies to all of our products and packages, and has been posted to the website <a href="https://www.rstudio.com/about/platform-deprecation-strategy/">here</a>. 
This strategy is included in our Support Agreement, the full text of which can be found <a href="https://www.rstudio.com/about/support-agreement/">here</a>.</p><p>The current support end dates have been chosen based on the OS or browser end-of-life dates, and are generally aligned with the dates on which the latest version of R can no longer be installed on them. Note that the first deprecation takes place on April 2, 2018, when Internet Explorer 10 and Ubuntu 12.04 will no longer be supported on new releases of our software.</p></description></item><item><title>RStudio Connect v1.5.14</title><link>https://www.rstudio.com/blog/rstudio-connect-v1-5-14/</link><pubDate>Fri, 02 Mar 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-v1-5-14/</guid><description><p>RStudio Connect v1.5.14 is now available! This release includes support for secure environment variables, customizing email subject lines, and beta support for serving TensorFlow models.</p><p>This release introduces beta support for SuSE and will be the last version of RStudio Connect to support Ubuntu 12.04 and Internet Explorer 10. Contact <a href="mailto:sales@rstudio.com">sales@rstudio.com</a> for more information on supported platforms.</p><h2 id="secure-environment-variables">Secure Environment Variables</h2><p><img src="rsc-1514-e.png" alt="Connect interface for specifying environment variables"></p><p>Environment variables for executable content can be configured in the RStudio Connect Dashboard. Values assigned to environment variables are available from within your R code using the <code>Sys.getenv</code> function. Once saved, environment variables are obscured in the Connect dashboard and are encrypted at rest.</p><p>These variables are a good way to specify values that should not be embedded in code, such as API keys, database credentials, or other kinds of sensitive data. 
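For example, code that reads a credential from the environment rather than hard-coding it might look like the following (a Python sketch using only the standard library; the variable name <code>DEMO_API_KEY</code> is hypothetical, and in R the equivalent lookup is <code>Sys.getenv(&quot;DEMO_API_KEY&quot;)</code>):

```python
import os

def get_api_key(name="DEMO_API_KEY"):
    """Return a secret from the environment; never hard-code it in source."""
    value = os.environ.get(name)
    if value is None:
        # Fail fast (or fall back) rather than shipping an embedded secret.
        raise RuntimeError(f"{name} is not set")
    return value
```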
Environment variables can also be used with the <a href="https://github.com/rstudio/config">config package</a> to manage configuration options across environments, including development, staging, and production.</p><h2 id="email-customization">Email Customization</h2><p><img src="rsc-1514-emails.png" alt="Inbox showing emails from Connect with custom subject lines"></p><p>RStudio Connect has always been able to send emails notifying users when new or scheduled versions of R Markdown reports are available, but now version 1.5.14 allows report authors to <a href="http://docs.rstudio.com/connect/1.5.14/user/r-markdown.html#email-customization">customize the email subject line</a>.</p><p>The subject line can be set using a new function in the rmarkdown package (version &gt;= 1.8). Reports can hard-code a custom subject line, or the subject line can be computed dynamically in the report. For example, in the image above, the subject line was generated with the code:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">rmarkdown<span style="color:#666">::</span>output_metadata<span style="color:#666">$</span><span style="color:#06287e">set</span>(&#34;rsc_email_subject&#34; <span style="color:#666">=</span>
  <span style="color:#06287e">paste0</span>(params<span style="color:#666">$</span>ticker, &#34; is &#34;, sign, &#34; today by $&#34;, change))</code></pre></div><p>Convey the most important results of an analysis to stakeholders at a glance!</p><h2 id="tensorflow-models">TensorFlow Models</h2><p><img src="rsc-1514-tensorflow.png" alt="Connect hosting a sample TensorFlow model"></p><p>RStudio recently announced a set of packages giving R users powerful access to TensorFlow. In v1.5.14, models developed in TensorFlow can be deployed to Connect and served as RESTful APIs.</p><p>As an example, a data scientist can write and train a TensorFlow model for fraud detection using R. 
The data scientist can then export and deploy the model to RStudio Connect. Once deployed, non-R systems such as front-end web applications or back-end ETL services can call the model using standard HTTP requests to classify whether or not new records are fraudulent.</p><p>Like other content on Connect, deployed TensorFlow models can be scaled, and access can be restricted through API keys. The <a href="https://TensorFlow.rstudio.com">RStudio TensorFlow site</a> has more details on model creation and deployment.</p><h2 id="other-improvements">Other Improvements</h2><ul><li>Beta support for running RStudio Connect on SuSE Linux Enterprise Server versions 12 sp3+. Contact <a href="mailto:sales@rstudio.com">sales@rstudio.com</a> if you are interested in running Connect on SLES.</li><li><strong>BREAKING</strong>: <code>[Applications].ConnectionTimeout</code> and <code>[Applications].ReadTimeout</code> settings deprecated in version 1.5.12 have been removed from the configuration. They have been replaced with <code>[Scheduler].ConnectionTimeout</code> and <code>[Scheduler].ReadTimeout</code>, respectively.</li><li>Version 1.5.12 introduced a new license expiration warning shown to publishers and administrators. As of v1.5.14, this warning can be permanently disabled using the <code>[Licensing].ExpirationUIWarning</code> configuration flag.</li><li>The <code>-unstable</code> flag has been removed from the <code>migrate</code> utility&rsquo;s <code>rebuild-packrat</code> command. 
The admin guide contains more information on <a href="http://docs.rstudio.com/connect/1.5.14/admin/files-directories.html#server-migrations">server migration</a>.</li></ul><p>All changes in v1.5.14 and previous versions are available in the <a href="http://docs.rstudio.com/connect/1.5.14/news/">release news</a>.</p><h2 id="deprecation-announcement">Deprecation Announcement</h2><p>Version 1.5.14 is the last version of RStudio Connect that will support Ubuntu 12.04 and Internet Explorer 10.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>There are no special precautions to be aware of when upgrading from v1.5.12 apart from the breaking changes listed above and in the release notes. You can expect the installation and startup of v1.5.14 to be complete in under a minute.</p><p>If you’re upgrading from a release older than v1.5.12, be sure to consider the “Upgrade Planning” notes from the intervening releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. 
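As described in the TensorFlow Models section above, content served by Connect as a RESTful API can be called from any HTTP-capable client. A minimal, hedged sketch using only the Python standard library (the endpoint URL and the request/response JSON shape are hypothetical; consult your deployment's documentation for the actual contract):

```python
# Minimal HTTP client for a model served as a REST API (stdlib only).
# The "instances" payload shape below is an assumption for illustration,
# not a documented Connect contract.
import json
import urllib.request

def score(endpoint, record):
    """POST one record as JSON and return the decoded JSON response."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps({"instances": [record]}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

A call might then look like <code>score(&quot;https://connect.example.com/content/42/predict&quot;, {&quot;amount&quot;: 250.0})</code>, where the URL is whatever address the deployment is given.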
Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>Summer interns</title><link>https://www.rstudio.com/blog/summer-interns/</link><pubDate>Mon, 12 Feb 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/summer-interns/</guid><description><p>We are excited to announce the first formal summer internship program at RStudio. The goal of our internship program is to enable RStudio employees to collaborate with current students (broadly construed: if you think of yourself as a student, you qualify) to create impactful and useful applications that will help both RStudio users and the broader R community, and help ensure that the community of R developers is representative of the community of R users. You will have the opportunity to work with some of the most influential data scientists and R developers and work on widely used R packages.</p><p>To be qualified for the internship, you need some existing experience writing code in R and using git + GitHub. To demonstrate these skills, your application needs to include a link to a package, Shiny app, or data analysis repository on GitHub. 
It&rsquo;s ok if you create it specially for this application; we just want some evidence that you’re already familiar with the basic mechanics of collaborative development in R.</p><p>RStudio is a geographically distributed team which means you can be based anywhere in the USA (next year, we’ll try and support international interns too). That means, unless you are based in Boston or Seattle, you will be working 100% remotely, although we will pay for travel to one face-to-face meeting with your mentor. You will meet with your mentor for at least an hour a week, but otherwise you’ll be working on your own.</p><h2 id="projects">Projects</h2><p>We are recruiting interns for the following five projects:</p><ul><li><p><strong>Bootstrapping methods</strong>: Implement 1) classic bootstrap methods (confidence intervals and other methods) to work with <a href="https://topepo.github.io/rsample/">rsample</a>, <a href="https://topepo.github.io/yardstick/">yardstick</a>, and potentially <a href="http://infer.netlify.com/">infer</a> as well as 2) modern bootstrap methods for performance estimation (e.g. 632, 632+ estimates) for rsample.</p><p><strong>Skills needed</strong>: knowledge of bootstrapping methods (e.g. Ch 5 of Davidson and Hinkley) and tidyverse tools and packages. C++ would be advantageous but not required.</p><p><strong>Mentor</strong>: Max Kuhn</p></li><li><p><strong>broom</strong>: broom provides a bunch of methods to turn models into tidy data frames. It’s widely used but has lacked developer bandwidth to move it forward. Your job will be to resolve as many pull requests and issues as possible, while thinking about how to re-organise broom for long term maintainability.</p><p><strong>Skills needed</strong>: experience with one or more modelling packages in R; strong communication skills</p><p><strong>Mentor</strong>: David Robinson.</p></li><li><p><strong>ggplot2</strong>: ggplot2 is one of the biggest and most used packages in the tidyverse. 
In this internship you will learn enough about the internals that you can start contributing. You will learn the challenges of working with a large existing codebase, in an environment when any API change is likely to affect existing code.</p><p><strong>Skills needed</strong>: experience creating ggplot2 graphics for data analysis; previous package development experience.</p><p><strong>Mentor</strong>: Hadley Wickham</p></li><li><p><strong>Shiny</strong>: Shiny lets R programmers quickly create interactive web applications with R. The focus of this internship will be on addressing open issues and working on general user interface improvements. You will learn about how Shiny works, and gain experience working on a project that is at the interface of data analysis and web programming.</p><p><strong>Skills needed</strong>: experience with JavaScript and CSS; some experience creating your own Shiny apps.</p><p><strong>Mentor</strong>: Winston Chang</p></li><li><p><strong>The Tidies of March</strong>: Construct ~30 tidyverse data analysis exercises inspired by the <a href="https://adventofcode.com/">Advent of Code</a>. The main goal is to create an Advent of Code type of experience, but where the exercises cultivate and reward mastery of R, written in an idiomatic tidyverse style.</p><p><strong>Skills needed</strong>: documented experience using the tidyverse to analyze data and an appreciation of coding style/taste. Experience with the R ecosystem for making websites.</p><p><strong>Mentor</strong>: Jenny Bryan.</p></li></ul><h2 id="apply-now">Apply now!</h2><p>The internship pays $USD 6000, lasts 10 weeks, and will start around June 1. 
Applications close March 12.</p><p><a href="https://goo.gl/forms/cNCKg55JTehlH12h1">Apply now!</a></p><p>We value diverse viewpoints, and we encourage people with diverse backgrounds and experiences to apply.</p></description></item><item><title>TensorFlow for R</title><link>https://www.rstudio.com/blog/tensorflow-for-r/</link><pubDate>Tue, 06 Feb 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/tensorflow-for-r/</guid><description><p>Over the past year we’ve been hard at work on creating R interfaces to <a href="https://tensorflow.rstudio.com/">TensorFlow</a>, an open-source machine learning framework from Google. We are excited about TensorFlow for many reasons, not the least of which is its state-of-the-art infrastructure for deep learning applications.</p><p>In the 2 years since it was initially open-sourced by Google, TensorFlow has rapidly become the <a href="https://twitter.com/fchollet/status/871089784898310144?lang=en">framework of choice</a> for both machine learning practitioners and researchers. On Saturday, we formally announced our work on TensorFlow during J.J. Allaire’s keynote at <a href="https://www.rstudio.com/conference/">rstudio::conf</a>:</p><iframe width="711" height="400" src="https://www.youtube.com/embed/atiYXm7JZv0?rel=0" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe><p>In the keynote, J.J. 
describes not only the work we’ve done on TensorFlow but also discusses deep learning more broadly (what it is, how it works, and where it might be relevant to users of R in the years ahead).</p><div id="new-packages-and-tools" class="section level2"><h2>New packages and tools</h2><p>The R interface to TensorFlow consists of a suite of R packages that provide a variety of interfaces to TensorFlow for different tasks and levels of abstraction, including:</p><ul><li><p><a href="https://tensorflow.rstudio.com/keras/">keras</a>—A high-level interface for neural networks, with a focus on enabling fast experimentation.</p></li><li><p><a href="https://tensorflow.rstudio.com/tfestimators/">tfestimators</a>—Implementations of common model types such as regressors and classifiers.</p></li><li><p><a href="https://tensorflow.rstudio.com/tensorflow/">tensorflow</a>—Low-level interface to the TensorFlow computational graph.</p></li><li><p><a href="https://tensorflow.rstudio.com/tools/tfdatasets/">tfdatasets</a>—Scalable input pipelines for TensorFlow models.</p></li></ul><p>Besides the various R interfaces to TensorFlow, there are tools to help with training workflow, including real-time feedback on training metrics within the RStudio IDE:</p><p><img src="https://www.rstudio.com/blog-images/2018-02-06-keras-training-metrics.gif" /></p><p>The <a href="https://tensorflow.rstudio.com/tools/tfruns/">tfruns package</a> provides tools to track and manage TensorFlow training runs and experiments:</p><p><img src="https://www.rstudio.com/blog-images/2018-02-06-tfruns.png" style="border: solid 1px #cccccc;" /></p></div><div id="access-to-gpus" class="section level2"><h2>Access to GPUs</h2><p>Training convolutional or recurrent neural networks can be extremely computationally expensive, and benefits significantly from access to a recent high-end NVIDIA GPU. However, most users don’t have this sort of hardware available locally. 
To address this we have provided a number of ways to use GPUs in the cloud, including:</p><ul><li><p>The <a href="https://tensorflow.rstudio.com/tools/cloudml/">cloudml package</a>, an R interface to Google’s hosted machine learning engine.</p></li><li><p><a href="https://tensorflow.rstudio.com/tools/cloud_server_gpu.html#amazon-ec2">RStudio Server with Tensorflow-GPU for AWS</a> (an Amazon EC2 image preconfigured with NVIDIA CUDA drivers, TensorFlow, the TensorFlow for R interface, as well as RStudio Server).</p></li><li><p>Detailed instructions for setting up an Ubuntu 16.04 <a href="https://tensorflow.rstudio.com/tools/cloud_desktop_gpu.html">cloud desktop with a GPU</a> using the Paperspace service.</p></li></ul><p>There is also documentation on <a href="https://tensorflow.rstudio.com/tools/local_gpu.html">setting up a GPU</a> on your local workstation if you already have the required NVIDIA GPU hardware.</p></div><div id="learning-resources" class="section level2"><h2>Learning resources</h2><p>We’ve also made a significant investment in learning resources, all of these resources are available on the TensorFlow for R website at <a href="https://tensorflow.rstudio.com" class="uri">https://tensorflow.rstudio.com</a>.</p><p>Some of the learning resources include:</p><table><colgroup><col width="21%" /><col width="78%" /></colgroup><tbody><tr class="odd"><td><a href="https://www.amazon.com/Deep-Learning-R-Francois-Chollet/dp/161729554X"><img class="nav-image illustration" src="https://images.manning.com/720/960/resize/book/a/4e5e97f-4e8d-4d97-a715-f6c2b0eb95f5/Allaire-DLwithR-HI.png" width=250/></a></td><td><a href="https://www.amazon.com/Deep-Learning-R-Francois-Chollet/dp/161729554X">Deep Learning with R</a> <br/>Deep Learning with R is meant for statisticians, analysts, engineers, and students with a reasonable amount of R experience but no significant knowledge of machine learning and deep learning. 
You’ll learn from more than 30 code examples that include detailed commentary and practical recommendations. You don’t need previous experience with machine learning or deep learning: this book covers from scratch all the necessary basics. You don’t need an advanced mathematics background, either—high school level mathematics should suffice in order to follow along.</td></tr><tr class="even"><td><a href="https://github.com/rstudio/cheatsheets/raw/master/keras.pdf"><img class="nav-image illustration" src="https://tensorflow.rstudio.com/learn/images/resources-cheatsheet.png" width=250/></a></td><td><a href="https://github.com/rstudio/cheatsheets/raw/master/keras.pdf">Deep Learning with Keras Cheatsheet</a> <br/>A quick reference guide to the concepts and available functions in the R interface to Keras. Covers the various types of Keras layers, data preprocessing, training workflow, and pre-trained models.</td></tr><tr class="odd"><td><a href="https://tensorflow.rstudio.com/gallery/"><img class="nav-image illustration" src="https://tensorflow.rstudio.com/learn/images/keras-customer-churn.png" width=250/></a></td><td><a href="https://tensorflow.rstudio.com/gallery/">Gallery</a> <br/>In-depth examples of using TensorFlow with R, including detailed explanatory narrative as well as coverage of ancillary tasks like data preprocessing and visualization. A great resource for taking the next step after you’ve learned the basics.</td></tr><tr class="even"><td><a href="https://tensorflow.rstudio.com/learn/examples.html"><img class="nav-image illustration" src="https://tensorflow.rstudio.com/learn/images/resources-examples.png" width=250/></a></td><td><a href="https://tensorflow.rstudio.com/learn/examples.html">Examples</a> <br/> Introductory examples of using TensorFlow with R. 
These examples cover the basics of training models with the keras, tfestimators, and tensorflow packages.</td></tr></tbody></table></div><div id="whats-next" class="section level2"><h2>What’s next</h2><p>We’ll be continuing to build packages and tools that make using TensorFlow from R easy to learn, productive, and capable of addressing the most challenging problems in the field. We’ll also be making an ongoing effort to add to our gallery of in-depth examples. To stay up to date on our latest tools and additions to the gallery, you can subscribe to the <a href="https://tensorflow.rstudio.com/blog/">TensorFlow for R Blog</a>.</p><p>While TensorFlow and deep learning have done some impressive things in fields like image classification and speech recognition, their use within other domains like biomedical and time series analysis is more experimental and not yet proven to be of broad benefit. We’re excited to see how the R community will push the frontiers of what’s possible, as well as find entirely new applications. If you are an R user who has been curious about TensorFlow and/or deep learning applications, now is a great time to dive in and learn more!</p><style type="text/css">.illustration {border: solid 1px #cccccc;}.nav-image {margin-top: 8px;}</style></div></description></item><item><title>sparklyr 0.7: Spark Pipelines and Machine Learning</title><link>https://www.rstudio.com/blog/sparklyr-0-7/</link><pubDate>Mon, 29 Jan 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/sparklyr-0-7/</guid><description><p>We are excited to share that <a href="https://cran.r-project.org/web/packages/sparklyr/index.html">sparklyr 0.7</a> is now available on CRAN! Sparklyr provides an R interface to Apache Spark. It supports dplyr syntax for working with Spark DataFrames and exposes the full range of machine learning algorithms available in Spark. 
You can also learn more about Apache Spark and sparklyr at <a href="http://spark.rstudio.com">spark.rstudio.com</a> and in our new <a href="https://www.rstudio.com/resources/webinars/introducing-an-r-interface-for-apache-spark/">webinar series on Apache Spark</a>. Features in this release:</p><ul><li>Adds support for <strong>ML Pipelines</strong>, which provide a uniform set of high-level APIs to help create, tune, and deploy machine learning pipelines at scale.</li><li>Enhances <strong>Machine Learning</strong> capabilities by supporting the full range of ML algorithms and feature transformers.</li><li>Improves <strong>Data Serialization</strong>, specifically by adding support for date columns.</li><li>Adds support for <a href="https://spark.rstudio.com/guides/connections/#cluster-mode">YARN cluster mode</a> connections.</li><li>Adds various other improvements as listed in the <a href="https://spark.rstudio.com/news/">NEWS</a> file.</li></ul><p>In this blog post, we highlight Pipelines, new ML functions, and enhanced support for data serialization. To follow along in the examples below, you can upgrade to the latest stable version from CRAN with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">sparklyr&#34;</span>)</code></pre></div><h2 id="ml-pipelines">ML Pipelines</h2><p>The <a href="https://spark.apache.org/docs/latest/ml-pipeline.html">ML Pipelines</a> API is a high-level interface for building ML workflows in Spark. Pipelines provide a uniform approach to compose feature transformers and ML routines, and are interoperable across the different Spark APIs (R/sparklyr, Scala, and Python).</p><p>First, a quick overview of terminology. 
A <code>Pipeline</code> consists of a sequence of stages&mdash;<code>PipelineStage</code>s&mdash;that act on some data in order. A <code>PipelineStage</code> can be either a <code>Transformer</code> or an <code>Estimator</code>. A <code>Transformer</code> takes a data frame and returns a transformed data frame, whereas an <code>Estimator</code> takes a data frame and returns a <code>Transformer</code>. You can think of an <code>Estimator</code> as an algorithm that can be fit to some data, e.g. the ordinary least squares (OLS) method, and a <code>Transformer</code> as the fitted model, e.g. the linear formula that results from OLS. A <code>Pipeline</code> is itself a <code>PipelineStage</code> and can be an element in another <code>Pipeline</code>. Lastly, a <code>Pipeline</code> is always an <code>Estimator</code>, and its fitted form is called a <code>PipelineModel</code>, which is a <code>Transformer</code>.</p><p>Let&rsquo;s look at some examples of creating pipelines. We establish a connection and copy some data to Spark:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)<span style="color:#06287e">library</span>(dplyr)<span style="color:#60a0b0;font-style:italic"># If needed, install Spark locally via `spark_install()`</span>sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">local&#34;</span>)iris_tbl <span style="color:#666">&lt;-</span> <span style="color:#06287e">copy_to</span>(sc, iris)<span style="color:#60a0b0;font-style:italic"># split the data into train and validation sets</span>iris_data <span style="color:#666">&lt;-</span> iris_tbl <span style="color:#666">%&gt;%</span><span style="color:#06287e">sdf_partition</span>(train <span style="color:#666">=</span> <span 
style="color:#40a070">2</span><span style="color:#666">/</span><span style="color:#40a070">3</span>, validation <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">/</span><span style="color:#40a070">3</span>, seed <span style="color:#666">=</span> <span style="color:#40a070">123</span>)</code></pre></div><p>Then, we can create a new <code>Pipeline</code> with <code>ml_pipeline()</code> and add stages to it via the <code>%&gt;%</code> operator. Here we also define a transformer using dplyr transformations using the newly available <code>ft_dplyr_transformer()</code>.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">pipeline <span style="color:#666">&lt;-</span> <span style="color:#06287e">ml_pipeline</span>(sc) <span style="color:#666">%&gt;%</span><span style="color:#06287e">ft_dplyr_transformer</span>(iris_data<span style="color:#666">$</span>train <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(Sepal_Length <span style="color:#666">=</span> <span style="color:#06287e">log</span>(Sepal_Length),Sepal_Width <span style="color:#666">=</span> Sepal_Width ^ <span style="color:#40a070">2</span>)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">ft_string_indexer</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Species&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">label&#34;</span>)pipeline</code></pre></div><pre><code>## Pipeline (Estimator) with 2 stages## &lt;pipeline_c75757b824f&gt;## Stages## |--1 SQLTransformer (Transformer)## | &lt;dplyr_transformer_c757fa84cca&gt;## | (Parameters -- Column Names)## |--2 StringIndexer (Estimator)## | &lt;string_indexer_c75307cbfec&gt;## | (Parameters -- Column Names)## | input_col: Species## | output_col: label## | (Parameters)## | handle_invalid: error</code></pre><p>Under the hood, 
<code>ft_dplyr_transformer()</code> extracts the SQL statements associated with the input and creates a Spark <code>SQLTransformer</code>, which can then be applied to new datasets with the appropriate columns. We now fit the <code>Pipeline</code> with <code>ml_fit()</code>, then transform some data using the resulting <code>PipelineModel</code> with <code>ml_transform()</code>.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">pipeline_model <span style="color:#666">&lt;-</span> pipeline <span style="color:#666">%&gt;%</span><span style="color:#06287e">ml_fit</span>(iris_data<span style="color:#666">$</span>train)<span style="color:#60a0b0;font-style:italic"># pipeline_model is a transformer</span>pipeline_model <span style="color:#666">%&gt;%</span><span style="color:#06287e">ml_transform</span>(iris_data<span style="color:#666">$</span>validation) <span style="color:#666">%&gt;%</span><span style="color:#06287e">glimpse</span>()</code></pre></div><pre><code>## Observations: ??## Variables: 6## $ Petal_Length &lt;dbl&gt; 1.4, 1.3, 1.3, 1.0, 1.6, 1.9, 3.3, 4.5, 1.6, 1.5,...## $ Petal_Width &lt;dbl&gt; 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 1.0, 1.7, 0.2, 0.2,...## $ Species &lt;chr&gt; &quot;setosa&quot;, &quot;setosa&quot;, &quot;setosa&quot;, &quot;setosa&quot;, &quot;setosa&quot;,...## $ Sepal_Length &lt;dbl&gt; 1.482, 1.482, 1.482, 1.526, 1.548, 1.569, 1.589, ...## $ Sepal_Width &lt;dbl&gt; 8.41, 9.00, 10.24, 12.96, 10.24, 11.56, 5.76, 6.2...## $ label &lt;dbl&gt; 1, 1, 1, 1, 1, 1, 0, 2, 1, 1, 1, 0, 1, 1, 1, 1, 1...</code></pre><h3 id="a-predictive-modeling-pipeline">A predictive modeling pipeline</h3><p>Now, let&rsquo;s try to build a classification pipeline on the <code>iris</code> dataset.</p><p>Spark ML algorithms require that the label column be encoded as numeric and predictor columns be encoded as one vector column. 
We&rsquo;ll build on the pipeline we created in the previous section, where we have already included a <code>StringIndexer</code> stage to encode the label column.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># define stages</span><span style="color:#60a0b0;font-style:italic"># vector_assembler will concatenate the predictor columns into one vector column</span>vector_assembler <span style="color:#666">&lt;-</span> <span style="color:#06287e">ft_vector_assembler</span>(sc,input_cols <span style="color:#666">=</span> <span style="color:#06287e">setdiff</span>(<span style="color:#06287e">colnames</span>(iris_data<span style="color:#666">$</span>train), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Species&#34;</span>),output_col <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">features&#34;</span>)logistic_regression <span style="color:#666">&lt;-</span> <span style="color:#06287e">ml_logistic_regression</span>(sc)<span style="color:#60a0b0;font-style:italic"># obtain the labels from the fitted StringIndexerModel</span>labels <span style="color:#666">&lt;-</span> pipeline_model <span style="color:#666">%&gt;%</span><span style="color:#06287e">ml_stage</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">string_indexer&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">ml_labels</span>()<span style="color:#60a0b0;font-style:italic"># IndexToString will convert the predicted numeric values back to class labels</span>index_to_string <span style="color:#666">&lt;-</span> <span style="color:#06287e">ft_index_to_string</span>(sc, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">prediction&#34;</span>, <span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">predicted_label&#34;</span>,labels <span style="color:#666">=</span> labels)<span style="color:#60a0b0;font-style:italic"># construct a pipeline with these stages</span>prediction_pipeline <span style="color:#666">&lt;-</span> <span style="color:#06287e">ml_pipeline</span>(pipeline, <span style="color:#60a0b0;font-style:italic"># pipeline from previous section</span>vector_assembler,logistic_regression,index_to_string)<span style="color:#60a0b0;font-style:italic"># fit to data and make some predictions</span>prediction_model <span style="color:#666">&lt;-</span> prediction_pipeline <span style="color:#666">%&gt;%</span><span style="color:#06287e">ml_fit</span>(iris_data<span style="color:#666">$</span>train)predictions <span style="color:#666">&lt;-</span> prediction_model <span style="color:#666">%&gt;%</span><span style="color:#06287e">ml_transform</span>(iris_data<span style="color:#666">$</span>validation)predictions <span style="color:#666">%&gt;%</span><span style="color:#06287e">select</span>(Species, label<span style="color:#666">:</span>predicted_label) <span style="color:#666">%&gt;%</span><span style="color:#06287e">glimpse</span>()</code></pre></div><pre><code>## Observations: ??## Variables: 7## $ Species &lt;chr&gt; &quot;setosa&quot;, &quot;setosa&quot;, &quot;setosa&quot;, &quot;setosa&quot;, &quot;setos...## $ label &lt;dbl&gt; 1, 1, 1, 1, 1, 1, 0, 2, 1, 1, 1, 0, 1, 1, 1, 1...## $ features &lt;list&gt; [&lt;1.482, 8.410, 1.400, 0.200&gt;, &lt;1.482, 9.000,...## $ rawPrediction &lt;list&gt; [&lt;-67.48, 2170.98, -2103.49&gt;, &lt;-124.4, 2365.8...## $ probability &lt;list&gt; [&lt;0, 1, 0&gt;, &lt;0, 1, 0&gt;, &lt;0, 1, 0&gt;, &lt;0, 1, 0&gt;, ...## $ prediction &lt;dbl&gt; 1, 1, 1, 1, 1, 1, 0, 2, 1, 1, 1, 0, 1, 1, 1, 1...## $ predicted_label &lt;chr&gt; &quot;setosa&quot;, &quot;setosa&quot;, &quot;setosa&quot;, &quot;setosa&quot;, &quot;setos...</code></pre><h3 id="model-persistence">Model persistence</h3><p>Another benefit 
of pipelines is reusability across programming languages and easy deployment to production. We can save a pipeline from R as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ml_save</span>(prediction_model, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">path/to/prediction_model&#34;</span>)</code></pre></div><p>When you call <code>ml_save()</code> on a <code>Pipeline</code> or <code>PipelineModel</code> object, all of the information required to recreate it will be saved to disk. You can then load it in the future to, in the case of a <code>PipelineModel</code>, make predictions or, in the case of a <code>Pipeline</code>, retrain on new data.</p><h2 id="machine-learning">Machine learning</h2><p>Sparklyr 0.7 introduces more than 20 new feature transformation and machine learning functions to include the full set of <a href="https://spark.rstudio.com/reference/#section-spark-machine-learning">Spark ML</a> algorithms. We highlight just a couple here.</p><h3 id="bisecting-k-means">Bisecting K-means</h3><p>Bisecting k-means is a variant of k-means that can sometimes be much faster to train. 
Here we show how to use <code>ml_bisecting_kmeans()</code> with <code>iris</code> data.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(ggplot2)model <span style="color:#666">&lt;-</span> <span style="color:#06287e">ml_bisecting_kmeans</span>(iris_tbl, Species <span style="color:#666">~</span> Petal_Length <span style="color:#666">+</span> Petal_Width, k <span style="color:#666">=</span> <span style="color:#40a070">3</span>, seed <span style="color:#666">=</span> <span style="color:#40a070">123</span>)predictions <span style="color:#666">&lt;-</span> <span style="color:#06287e">ml_predict</span>(model, iris_tbl) <span style="color:#666">%&gt;%</span><span style="color:#06287e">collect</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(cluster <span style="color:#666">=</span> <span style="color:#06287e">as.factor</span>(prediction))<span style="color:#06287e">ggplot</span>(predictions, <span style="color:#06287e">aes</span>(x <span style="color:#666">=</span> Petal_Length,y <span style="color:#666">=</span> Petal_Width,color <span style="color:#666">=</span> predictions<span style="color:#666">$</span>cluster)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>()</code></pre></div><p><img src="https://www.rstudio.com/post/2018-01-29-sparklyr-0-7_files/figure-html/unnamed-chunk-7-1.png" alt=""></p><h3 id="frequent-pattern-mining">Frequent pattern mining</h3><p><code>ml_fpgrowth()</code> enables <a href="https://en.wikipedia.org/wiki/Association_rule_learning">frequent pattern mining</a> at scale using the FP-Growth algorithm. See the <a href="https://spark.apache.org/docs/2.2.0/ml-frequent-pattern-mining.html">Spark ML documentation</a> for more details. 
Here we briefly showcase the sparklyr API.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># create an item purchase history dataset</span>items <span style="color:#666">&lt;-</span> <span style="color:#06287e">data.frame</span>(items <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1,2,5&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1,2,3,5&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1,2&#34;</span>),stringsAsFactors <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>)<span style="color:#60a0b0;font-style:italic"># parse into vector column</span>items_tbl <span style="color:#666">&lt;-</span> <span style="color:#06287e">copy_to</span>(sc, items) <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(items <span style="color:#666">=</span> <span style="color:#06287e">split</span>(items, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">,&#34;</span>))<span style="color:#60a0b0;font-style:italic"># fit the model</span>fp_model <span style="color:#666">&lt;-</span> items_tbl <span style="color:#666">%&gt;%</span><span style="color:#06287e">ml_fpgrowth</span>(min_support <span style="color:#666">=</span> <span style="color:#40a070">0.5</span>, min_confidence <span style="color:#666">=</span> <span style="color:#40a070">0.6</span>)<span style="color:#60a0b0;font-style:italic"># use the model to predict related items based on</span><span style="color:#60a0b0;font-style:italic"># learned association rules</span>fp_model <span style="color:#666">%&gt;%</span><span style="color:#06287e">ml_transform</span>(items_tbl) <span style="color:#666">%&gt;%</span><span 
style="color:#06287e">collect</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate_all</span>(<span style="color:#06287e">function</span>(x) <span style="color:#06287e">sapply</span>(x, paste0, collapse <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">,&#34;</span>))</code></pre></div><pre><code>## # A tibble: 3 x 2## items prediction## &lt;chr&gt; &lt;chr&gt;## 1 1,2,5 &quot;&quot;## 2 1,2,3,5 &quot;&quot;## 3 1,2 5</code></pre><h2 id="data-serialization">Data serialization</h2><p>Various improvements were made to better support serialization and collection of data frames. Most notably, dates are now supported:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">copy_to</span>(sc, nycflights13<span style="color:#666">::</span>flights) <span style="color:#666">%&gt;%</span><span style="color:#06287e">select</span>(carrier, flight, time_hour)</code></pre></div><pre><code>## # Source: lazy query [?? x 3]## # Database: spark_connection## carrier flight time_hour## &lt;chr&gt; &lt;int&gt; &lt;dttm&gt;## 1 UA 1545 2013-01-01 05:00:00## 2 UA 1714 2013-01-01 05:00:00## 3 AA 1141 2013-01-01 05:00:00## 4 B6 725 2013-01-01 05:00:00## 5 DL 461 2013-01-01 06:00:00## 6 UA 1696 2013-01-01 05:00:00## 7 B6 507 2013-01-01 06:00:00## 8 EV 5708 2013-01-01 06:00:00## 9 B6 79 2013-01-01 06:00:00## 10 AA 301 2013-01-01 06:00:00## # ... with more rows</code></pre><p>We can&rsquo;t wait to see what you&rsquo;ll build with the new features! 
As always, comments, issue reports, and contributions are welcome on the <a href="https://github.com/rstudio/sparklyr">sparklyr GitHub repo</a>.</p></description></item><item><title>RStudio Connect v1.5.12</title><link>https://www.rstudio.com/blog/rstudio-connect-v1-5-12/</link><pubDate>Fri, 12 Jan 2018 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-v1-5-12/</guid><description><p>We’re pleased to announce RStudio Connect 1.5.12. This release includes support for viewing historical content, per-application timeout settings, and important improvements and bug fixes.</p><p><img src="https://www.rstudio.com/blog-images/rsc-1512-historical.png" alt="An example report with historical versions selected"></p><h2 id="historical-content">Historical Content</h2><p>RStudio Connect now retains and displays historical content. By selecting the content’s history, viewers can easily navigate, compare, and email prior versions of content. Historical content is especially valuable for scheduled reports. Previously published documents, plots, and custom versions of parameterized reports are also saved. Administrators <a href="http://docs.rstudio.com/connect/1.5.12/admin/appendix-configuration.html#appendix-configuration-applications">can control</a> how much history is saved by specifying a maximum age and/or a maximum number of historical versions.</p><h2 id="timeout-settings">Timeout Settings</h2><p>Timeout settings can be customized for specific Shiny applications or Plumber APIs. These settings allow publishers to optimize timeouts for specific content. For example, a live-updating dashboard might be kept open without expecting user input, while a resource-intensive, interactive app might be more aggressively shut down when idle. Idle Timeout, Initial Timeout, Connection Timeout, and Read Timeout can all be customized.</p><p>Along with this improvement, be aware of a <strong>BREAKING CHANGE</strong>. 
The <code>Applications.ConnectionTimeout</code> and <code>Application.ReadTimeout</code> settings, which specify server default timeouts for all content, have been deprecated in favor of <code>Scheduler.ConnectionTimeout</code> and <code>Scheduler.ReadTimeout</code>.</p><h2 id="other-improvements">Other Improvements</h2><ul><li>A <em>new security option</em>, “<a href="http://docs.rstudio.com/connect/1.5.12/admin/security-auditing.html#web-sudo-mode">Web Sudo Mode</a>”, is enabled by default to require users to re-enter their password prior to performing sensitive actions like altering users, altering API keys, and linking RStudio to Connect.</li><li>The <em>usermanager</em> command line interface (CLI) can be used to update user first and last name, email, and username in addition to user role. User attributes that are managed by your external authentication provider will continue to be managed externally, but the CLI can be used by administrators to complete other fields in user profiles.</li><li>The Connect dashboard will show administrators and publishers a warning as license expiration nears.</li><li><strong>BREAKING CHANGE</strong> The <code>RequireExternalUsernames</code> option deprecated in 1.5.10 has been removed.</li><li><strong>Known Issue</strong> After installing RStudio Connect 1.5.12, previously deployed content may incorrectly display an error message. Refreshing the browser will fix the error.</li></ul><p>You can see the full release notes for RStudio Connect 1.5.12 <a href="http://docs.rstudio.com/connect/1.5.12/news/">here</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>There are no special precautions to be aware of when upgrading from v1.5.10 apart from the breaking changes and known issue listed above and in the release notes. 
You can expect the installation and startup of v1.5.12 to be complete in under a minute.</p><p>If you’re upgrading from a release older than v1.5.10, be sure to consider the “Upgrade Planning” notes from the intervening releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>Shiny Server (Pro) 1.5.6</title><link>https://www.rstudio.com/blog/shiny-server-pro-1-5-6/</link><pubDate>Mon, 11 Dec 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-server-pro-1-5-6/</guid><description><p><a href="https://www.rstudio.com/products/shiny/shiny-server/">Shiny Server 1.5.6.875 and Shiny Server Pro 1.5.6.902 are now available.</a></p><p>This release of Shiny Server Pro includes floating license support and Shiny Server contains a small enhancement to the way errors are 
displayed. We recommend upgrading at your earliest convenience.</p><h3 id="shiny-server-156875">Shiny Server 1.5.6.875</h3><ul><li>Use HTTPS for Google Fonts on the error page, which resolves insecure content errors on some browsers when run behind SSL. (PR <a href="https://github.com/rstudio/shiny-server/pull/322">#322</a>)</li></ul><h3 id="shiny-server-pro-156902">Shiny Server Pro 1.5.6.902</h3><p>This release adds <strong>floating license</strong> support through the <code>license_type</code> configuration directive. Full documentation can be found at <a href="http://docs.rstudio.com/shiny-server/#floating-licenses">http://docs.rstudio.com/shiny-server/#floating-licenses</a>.</p><p>Floating licensing allows you to run fully licensed copies of Shiny Server Pro easily in ephemeral instances, such as Docker containers, virtual machines, and EC2 instances. Instances don’t have to be individually licensed, and you don’t have to manually activate and deactivate a license in each instance. Instead, a lightweight license server distributes temporary licenses (&ldquo;leases&rdquo;) to each instance, and the instance uses the license only while it’s running.</p><p>This model is perfect for environments in which Shiny Server Pro instances are frequently created and destroyed on demand, and only requires that you purchase a license for the maximum number of concurrent instances you want to run.</p></description></item><item><title>Birds of a Feather sessions at rstudio::conf 2018 and the rstudio::conf app!</title><link>https://www.rstudio.com/blog/birds-of-a-feather-sessions-at-rstudio-conf-2018-and-the-rstudio-conf-app/</link><pubDate>Fri, 01 Dec 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/birds-of-a-feather-sessions-at-rstudio-conf-2018-and-the-rstudio-conf-app/</guid><description><p>RStudio appreciates the hundreds of smart, passionate, data science enthusiasts who have already registered for <a href="https://www.rstudio.com/conference/">rstudio::conf 
2018</a>. We’re looking forward to a fantastic conference, immersing in all things R &amp; RStudio.</p><p>If you haven’t registered yet, please do! Some workshops are now full. We are also over <strong>90% of our registration target</strong> - with more than 2 months to go. It’s safe to say we will sell out. The sooner you are able to register, the better. It’s going to be an amazing time!</p><p><strong><a href="https://www.rstudio.com/conference/rstudioconf-tickets/">REGISTER TODAY</a></strong></p><p>For those who have registered, we’d like to help you connect with others doing similar work.</p><p>Attendees include many kinds of professionals in physical, natural, social and data sciences, statistics, education, engineering, research, BI, IT data infrastructure, finance, marketing, customer support, operations, human resources&hellip;and many more. They are sole proprietors and work for the world’s largest companies. They live in developing and developed countries. They use R and RStudio to explore data, develop polished reports, publish interactive visualizations, or create production code central to the success of their company. They share a common bond - a commitment to R, enthusiasm for RStudio products, and a desire to become better data scientists.</p><p>To foster relationships among people doing similar work, we’ve made time and arranged spaces for 9 total Birds of a Feather (BoF) sessions. They will be held during breakfast and lunch, so you can grab a meal and head to your preferred BoF room!</p><p>Some topics seem obvious to us. For example, we will definitely set aside rooms for Life Sciences, Financial Services, Education, and Training &amp; Consulting Partner BoFs. As we see it, a BoF is just a short unconference session within a conference, organized or left un-organized (mostly) by participants! Topics may be narrow or broad. Some may have agendas and others may be purely for networking. 
At a minimum, each room will have a friendly RStudio proctor, chairs, a screen to present, and flipcharts for those who are inspired to create discussion sub-groups, share material broadly, or collaborate.</p><p><strong>What Birds of a Feather sessions would you like to attend?</strong></p><p>In addition to the 4 BoFs we will set aside rooms for, we’re going to use the new <a href="https://community.rstudio.com/">community.rstudio.com</a> as a place to discuss which 5 additional BoFs should be allocated time and space. If you are registered for rstudio::conf or planning to register, head on over and look for the rstudio::conf category and the &ldquo;your ideas for birds of a feather sessions&rdquo; topic to start proposing and upvoting!</p><p>Once the BoF session topics are decided, we’ll load them into our mobile app for the conference. This, along with community.rstudio.com, will allow for Pre-BoF discussions so you can hit the ground running in San Diego!</p><p><a href="https://get.eventedge.com/rstudioconf/">Download the Conference App Now</a></p><p>rstudio::conf 2018 is the conference for all things R &amp; RStudio. Training Days are on January 31 and February 1. The conference is February 2 and 3.</p><p>Interested in having your company sponsor rstudio::conf 2018? It’s a unique opportunity to share your support for R users. Contact <a href="mailto:anne@rstudio.com">anne@rstudio.com</a> for more information.</p></description></item><item><title>RStudio Connect v1.5.10</title><link>https://www.rstudio.com/blog/rstudio-connect-v1-5-10/</link><pubDate>Fri, 01 Dec 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-v1-5-10/</guid><description><p>We’re pleased to announce version 1.5.10 of RStudio Connect and the general availability of RStudio Connect Execution Servers. Execution Servers enable horizontal scaling and high availability for all the content you develop in R. 
The 1.5.10 release also includes important security improvements and bug fixes.</p><p><strong>RStudio Connect Execution Servers</strong></p><p><img src="https://www.rstudio.com/blog-images/rsc-ha.jpg" alt="HA Architecture"></p><p>Support for high availability and horizontal scaling is now generally available through RStudio Connect Execution Servers. Execution Servers enable RStudio Connect to run across a multi-node cluster.</p><p>Today, Execution Servers act as identically configured Connect instances. Requests for Shiny applications and Plumber APIs are split across nodes by a load balancer. Scheduled R Markdown execution is distributed across the cluster through an internal job scheduler that distributes work evenly across nodes. Over time, more of Connect’s work will be handled by the internal scheduler, giving admins control over which nodes accomplish certain tasks.</p><p>The <a href="https://docs.rstudio.com/connect/admin">admin guide</a> includes configuration instructions. Contact <a href="mailto:sales@rstudio.com">sales</a> for licensing information.</p><p><strong>Other Improvements</strong></p><ul><li><p>For configurations using SQLite, the <strong>SQLite database is automatically backed up</strong> while Connect is running. By default, three backups are retained and a new backup is taken every 24 hours. To disable, set <code>[Sqlite].Backup</code> to false in the server configuration file.</p></li><li><p>RStudio Connect has always isolated user code from the file system. For example, application A cannot access data uploaded with application B. In 1.5.10, <strong>R processes can now read from the <code>/tmp</code> and <code>/var/tmp</code> directories</strong>. This change enables shared files to be stored in <code>/tmp</code> and <code>/var/tmp</code> and helps facilitate Kerberos configurations. 
R processes still have isolated temporary directories provided at runtime and accessible with the <code>tempdir</code> function and <code>TMPDIR</code> environment variable. See <a href="http://docs.rstudio.com/connect/admin">section 12</a> of the admin guide for more details on process sandboxing.</p></li><li><p>Improvements have been made in RStudio Connect and the <code>rsconnect</code> package to <strong>support deployments using proxied authentication</strong>. See the admin guide for details on setting up the proxy. Anonymous viewers and requests authenticated with API keys are also now supported with proxied auth.</p></li><li><p>Scheduled reports are now re-run if execution is interrupted by a server restart. In a cluster, reports are automatically re-run if a node goes down, assuring high availability for scheduled renderings.</p></li><li><p><code>AdminEditableUsernames</code> is disabled by default for compatibility with the <code>RequireExternalUsernames</code> flag introduced in 1.5.8. These changes increase security by preventing changes to data supplied by authentication providers.</p></li><li><p>User session expiration is better enforced. All user browser sessions will need to login after the 1.5.10 upgrade.</p></li><li><p>Runtime environments for <a href="https://rmarkdown.rstudio.com/authoring_shiny.html">Shiny R Markdown Documents</a> have changed to support <code>rmarkdown</code> versions 1.7+.</p></li></ul><p>You can see the full release notes for RStudio Connect 1.5.10 <a href="http://docs.rstudio.com/connect/news">here</a>.</p><blockquote><p><strong>Upgrade Planning</strong> There are no special precautions to be aware of when upgrading from 1.5.8 to 1.5.10. Installation and startup should take less than a minute.</p></blockquote><p>If you haven’t yet had a chance to download and try <a href="https://www.rstudio.com/products/connect/">RStudio Connect</a> we encourage you to do so. 
RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="http://docs.rstudio.com/connect/admin">RStudio Connect Admin Guide</a></li><li><a href="http://docs.rstudio.com/connect/news">Detailed Release Notes</a></li><li><a href="https://rstudio.com/pricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com">Online preview of RStudio Connect</a></li></ul></description></item><item><title>pool package on CRAN</title><link>https://www.rstudio.com/blog/pool-0-1-3/</link><pubDate>Fri, 17 Nov 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/pool-0-1-3/</guid><description><p>The <a href="https://github.com/rstudio/pool">pool package</a> makes it easier for Shiny developers to connect to databases. Up until now, there wasn&rsquo;t a clearly good way to do this. As a Shiny app author, if you connect to a database globally (outside of the server function), your connection won&rsquo;t be robust because all sessions would share that connection (which could leave most users hanging when one of them is using it, or even all of them if the connection breaks). But if you try to connect each time that you need to make a query (e.g. for every reactive you have), your app becomes a lot slower, as it can take on the order of seconds to establish a new connection.
The <code>pool</code> package solves this problem by taking care of when to connect and disconnect, allowing you to write performant code that automatically reconnects to the database only when needed.</p><p>So, if you are a Shiny app author who needs to connect and interact with databases inside your apps, keep reading because this package was created to make your life easier.</p><h2 id="what-the-pool-package-does">What the <code>pool</code> package does</h2><p>The <code>pool</code> package adds a new level of abstraction when connecting to a database: instead of directly fetching a connection from the database, you will create an object (called a &ldquo;pool&rdquo;) with a reference to that database. The pool holds a number of connections to the database. Some of these may be currently in-use and some of these may be idle, waiting for a new query or statement to request them. Each time you make a query, you are querying the pool, rather than the database. Under the hood, the pool will either give you an idle connection that it previously fetched from the database or, if it has no free connections, fetch one and give it to you. You never have to create or close connections directly: the pool knows when it should grow, shrink or keep steady. You only need to close the pool when you’re done.</p><p>Since <code>pool</code> integrates with both <code>DBI</code> and <code>dplyr</code>, there are very few things that will be new to you, if you&rsquo;re already using either of those packages. Essentially, you shouldn&rsquo;t feel the difference, with the exception of creating and closing a &ldquo;Pool&rdquo; object (as opposed to connecting and disconnecting a &ldquo;DBIConnection&rdquo; object). 
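In a Shiny app, that difference boils down to creating the pool once at the top level and closing it when the app stops. A minimal sketch, assuming the <code>shiny</code>, <code>pool</code>, <code>DBI</code>, and <code>RSQLite</code> packages are installed (the UI and query here are illustrative, not from the original post):

```r
library(shiny)
library(pool)

# Create the pool once, outside the server function, so all sessions share it
pool <- dbPool(RSQLite::SQLite(), dbname = ":memory:")
DBI::dbWriteTable(pool, "quakes", quakes)

# Close the pool when the app shuts down
onStop(function() poolClose(pool))

ui <- fluidPage(
  sliderInput("mag", "Minimum magnitude", min = 4, max = 6.5, value = 5),
  tableOutput("tbl")
)

server <- function(input, output, session) {
  output$tbl <- renderTable({
    # Each query checks a connection out of the pool and returns it afterwards
    DBI::dbGetQuery(pool, "SELECT * FROM quakes WHERE mag >= ?",
                    params = list(input$mag))
  })
}

shinyApp(ui, server)
```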
See <a href="https://github.com/rstudio/pool#usage">this copy-pasteable app</a> that uses <code>pool</code> and <code>dplyr</code> to query a MariaDB database (hosted on AWS) inside a Shiny app.</p><p>Very briefly, here&rsquo;s how you&rsquo;d connect to a database, write a table into it using <code>DBI</code>, query it using <code>dplyr</code>, and finally disconnect (you must have <code>DBI</code>, <code>dplyr</code> and <code>pool</code> installed and loaded in order to be able to run this code):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">conn <span style="color:#666">&lt;-</span> <span style="color:#06287e">dbConnect</span>(RSQLite<span style="color:#666">::</span><span style="color:#06287e">SQLite</span>(), dbname <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">:memory:&#34;</span>)
<span style="color:#06287e">dbWriteTable</span>(conn, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">quakes&#34;</span>, quakes)
<span style="color:#06287e">tbl</span>(conn, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">quakes&#34;</span>) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">select</span>(<span style="color:#666">-</span>stations) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">filter</span>(mag <span style="color:#666">&gt;=</span> <span style="color:#40a070">6</span>)
<span style="color:#60a0b0;font-style:italic">## # Source:   lazy query [?? x 4]</span>
<span style="color:#60a0b0;font-style:italic">## # Database: sqlite 3.19.3 [:memory:]</span>
<span style="color:#60a0b0;font-style:italic">##      lat   long depth   mag</span>
<span style="color:#60a0b0;font-style:italic">##    &lt;dbl&gt;  &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt;</span>
<span style="color:#60a0b0;font-style:italic">## 1 -20.70 169.92   139   6.1</span>
<span style="color:#60a0b0;font-style:italic">## 2 -13.64 165.96    50   6.0</span>
<span style="color:#60a0b0;font-style:italic">## 3 -15.56 167.62   127   6.4</span>
<span style="color:#60a0b0;font-style:italic">## 4 -12.23 167.02   242   6.0</span>
<span style="color:#60a0b0;font-style:italic">## 5 -21.59 170.56   165   6.0</span>
<span style="color:#06287e">dbDisconnect</span>(conn)</code></pre></div><p>And here&rsquo;s how you&rsquo;d do the same using <code>pool</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">pool <span style="color:#666">&lt;-</span> <span style="color:#06287e">dbPool</span>(RSQLite<span style="color:#666">::</span><span style="color:#06287e">SQLite</span>(), dbname <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">:memory:&#34;</span>)
<span style="color:#06287e">dbWriteTable</span>(pool, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">quakes&#34;</span>, quakes)
<span style="color:#06287e">tbl</span>(pool, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">quakes&#34;</span>) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">select</span>(<span style="color:#666">-</span>stations) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">filter</span>(mag <span style="color:#666">&gt;=</span> <span style="color:#40a070">6</span>)
<span style="color:#60a0b0;font-style:italic">## # Source:   lazy query [?? x 4]</span>
<span style="color:#60a0b0;font-style:italic">## # Database: sqlite 3.19.3 [:memory:]</span>
<span style="color:#60a0b0;font-style:italic">##      lat   long depth   mag</span>
<span style="color:#60a0b0;font-style:italic">##    &lt;dbl&gt;  &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt;</span>
<span style="color:#60a0b0;font-style:italic">## 1 -20.70 169.92   139   6.1</span>
<span style="color:#60a0b0;font-style:italic">## 2 -13.64 165.96    50   6.0</span>
<span style="color:#60a0b0;font-style:italic">## 3 -15.56 167.62   127   6.4</span>
<span style="color:#60a0b0;font-style:italic">## 4 -12.23 167.02   242   6.0</span>
<span style="color:#60a0b0;font-style:italic">## 5 -21.59 170.56   165   6.0</span>
<span style="color:#06287e">poolClose</span>(pool)</code></pre></div><h2 id="what-problem-pool-was-created-to-solve">What problem <code>pool</code> was created to solve</h2><p>As mentioned before, the goal of the <code>pool</code> package is to abstract away the logic of connection management and the performance cost of fetching a new connection from a remote database. These concerns are especially prominent in interactive contexts, like Shiny apps. (So, while this package is of most practical value to Shiny developers, there is no harm if it is used in other contexts.)</p><p>The rest of this post elaborates some more on the specific problems of connection management inside of Shiny, and how <code>pool</code> addresses them.</p><h3 id="the-connection-management-spectrum">The connection management spectrum</h3><p>When you’re connecting to a database, it&rsquo;s important to manage your connections: when to open them (taking into account that this is a potentially long process for remote databases), how to keep track of them, and when to close them. This is always true, but it becomes especially relevant for Shiny apps, where not following best practices can lead to many slowdowns (from inadvertently opening too many connections) and/or many leaked connections (i.e.
forgetting to close connections once you no longer need them). Over time, leaked connections could accumulate and substantially slow down your app, as well as overwhelming the database itself.</p><p>Oversimplifying a bit, we can think of connection management in Shiny as a spectrum ranging from the extreme of just having one connection per app (potentially serving several sessions of the app) to the extreme of opening (and closing) one connection for each query you make. Neither of these approaches is great: the former is fast, but not robust, and the reverse is true for the latter.</p><p>In particular, opening only one connection per app makes it fast (because, in the whole app, you only fetch one connection) and your code is kept as simple as possible. However:</p><ul><li>it cannot handle simultaneous requests (e.g. two sessions open, both querying the database at the same time);</li><li>if the connection breaks at some point (maybe the database server crashed), you won’t get a new connection (you have to exit the app and re-run it);</li><li>finally, if you are not quite at this extreme, and you use more than one connection per app (but fewer than one connection per query), it can be difficult to keep track of all your connections, since you’ll be opening and closing them in potentially very different places.</li></ul><p>While the other extreme of opening (and closing) one connection for each query you make resolves all of these points, it is terribly slow (each time we need to access the database, we first have to fetch a connection), and you need a lot more (boilerplate) code to connect and disconnect the connection within each reactive/function.</p><p>If you&rsquo;d like to see actual code that illustrates these two approaches, check <a href="https://github.com/rstudio/pool#context-and-motivation">this section of the <code>pool</code> README</a>.</p><h3 id="the-best-of-both-worlds">The best of both worlds</h3><p>The <code>pool</code> package was created so 
that you don&rsquo;t have to worry about this at all. Since <code>pool</code> abstracts away the logic of connection management, for the vast majority of cases, you never have to deal with connections directly. Since the pool “knows” when it should have more connections and how to manage them, you have all the advantages of the second approach (one connection per query), without the disadvantages. You are still using one connection per query, but that connection is always fetched from and returned to the pool, rather than being fetched from the database directly. This is a whole lot faster and more efficient. Finally, the code is kept just as simple as the code in the first approach (only one connection for the entire app), since you don&rsquo;t have to continuously call <code>dbConnect</code> and <code>dbDisconnect</code>.</p><h2 id="feedback">Feedback</h2><p>This package has quietly been around for a year and it&rsquo;s now finally on CRAN, following lots of the changes in the database world (both in <code>DBI</code> and <code>dplyr</code>). All <code>pool</code>-related feedback is welcome. Issues (bugs and feature requests) can be posted to the <a href="https://github.com/rstudio/pool/issues">GitHub tracker</a>. Requests for help with code or other questions can be posted to <a href="https://community.rstudio.com/c/shiny">community.rstudio.com/c/shiny</a>, which I check regularly (they can, of course, also be posted to <a href="https://stackoverflow.com/">Stack Overflow</a>, but I&rsquo;m extremely likely to miss them there).</p></description></item><item><title>rstudio::conf(2018) program now available!</title><link>https://www.rstudio.com/blog/rstudio-conf-2018-program/</link><pubDate>Mon, 06 Nov 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2018-program/</guid><description><p>rstudio::conf 2018, the conference on all things R and RStudio, is only a few months away.
Now is the time to claim your spot or grab one of the few remaining seats at Training Days!</p><p><a href="https://www.rstudio.com/conference/"><strong>REGISTER NOW</strong></a></p><p>Whether you’re already registered or still working on it, we’re delighted today to announce the <a href="https://beta.rstudioconnect.com/content/3105/">full conference schedule</a>, so that you can plan your days in San Diego. rstudio::conf 2018 takes place January 31 - February 3 at the Manchester Grand Hyatt in San Diego, California.</p><p>This year we have over 60 talks:</p><ul><li><p>Keynotes by <a href="http://dicook.org">Dianne Cook</a>, “To the Tidyverse and Beyond: Challenges for the Future in Data Science”, and <a href="https://github.com/jjallaire">JJ Allaire</a>, “Machine Learning with R and TensorFlow”</p></li><li><p><a href="https://blog.rstudio.com/2017/07/12/join-us-at-rstudioconf-2018/">14 invited talks</a> from outstanding speakers, innovators, and data scientists.</p></li><li><p>18 contributed talks from the R community on topics like “Branding and automating your work with R Markdown”, “Reinforcement learning in Minecraft with CNTK-R”, and “Training an army of new data scientists”.</p></li><li><p>And 28 talks by RStudio employees on the latest developments in the <a href="https://tidyverse.org">tidyverse</a>, <a href="https://spark.rstudio.com">spark</a>, profiling, <a href="https://shiny.rstudio.com/">Shiny</a>, <a href="https://rmarkdown.rstudio.com/">R Markdown</a>, <a href="http://db.rstudio.com/">databases</a>, <a href="https://www.rstudio.com/products/connect/">RStudio Connect</a>, and more!</p></li></ul><p>We also have <a href="https://www.rstudio.com/conference/#training">11 two-day workshops</a> (for both beginners and experts!) if you want to go deep into a topic.
We look forward to seeing you there!</p><p><a href="https://www.rstudio.com/conference/"><strong>REGISTER NOW</strong></a></p></description></item><item><title>R-Admins Community</title><link>https://www.rstudio.com/blog/r-admins-community/</link><pubDate>Fri, 03 Nov 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-admins-community/</guid><description><p>Today we&rsquo;re pleased to announce a new category of community.rstudio.com dedicated to R administrators: <a href="https://community.rstudio.com/c/r-admin">https://community.rstudio.com/c/r-admin</a>.</p><p>There are already multiple places where you can get help with R, Shiny, the RStudio IDE, and the tidyverse. There are, however, far fewer resources for R admins: people who work with R in production, in large organizations, and in complex environments. We hope this new category will serve as a useful and friendly place to connect with fellow R admins to discuss the issues they deal with. We expect this category to include:</p><ul><li><p>Discussions about best practices and ideas</p></li><li><p>General questions to fellow admins about RStudio Pro products, designed to ease friction in R administrator workflows</p></li><li><p>An exchange of ideas on domain-specific use cases and configurations</p></li></ul><p>If you’re an existing RStudio customer, this forum is a complement to RStudio’s direct support:</p><ul><li><p>Folks from RStudio will participate, but only lightly moderate topics and discussions.</p></li><li><p>RStudio commercial license holders should still feel free to report Pro product problems to <a href="mailto:support@rstudio.com">support@rstudio.com</a>.</p></li><li><p>If you think a topic needs RStudio support’s attention, please suggest that the poster contact RStudio support directly.
You can also tag @support in a reply.</p></li></ul></description></item><item><title>RStudio Connect v1.5.8</title><link>https://www.rstudio.com/blog/rstudio-connect-v1-5-8/</link><pubDate>Tue, 24 Oct 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-v1-5-8/</guid><description><p>We&rsquo;re pleased to announce RStudio Connect: version 1.5.8. This release enables reconnects for Shiny applications, more consistent and trustworthy editing of user information, and various LDAP improvements.</p><p><img src="https://www.rstudio.com/blog-images/rsc-154-plumber.png" alt="The auto-generated &ldquo;swagger&rdquo; interface for a web API written using Plumber."></p><p>The major changes this release include:</p><ul><li>Enabled <strong>support for Shiny reconnects</strong>. Users of Shiny applications are less likely to be interrupted during brief network hiccups. The <code>Client.ReconnectTimeout</code> property specifies how long that session is maintained when there is connectivity trouble. The default setting is <code>15s</code>. See <a href="https://shiny.rstudio.com/articles/reconnecting.html">https://shiny.rstudio.com/articles/reconnecting.html</a> to learn more about reconnecting to Shiny applications. Disable this feature by giving the <code>Client.ReconnectTimeout</code> property a value of <code>0</code>.</li><li>Greater <strong>consistency around editing user information</strong>. Authentication providers that expect user information to come in externally (like LDAP and OAuth) will by default forbid users from editing their information and will automatically refresh user profile information when the user logs in. Other providers now more consistently allow information that was specified when the user created their account to be edited by the user later.</li><li>The <code>browseURL</code> R function is disabled when executing deployed content. 
Use techniques like the Shiny <code>shiny::tags$a</code> function to expose links to application visitors.</li><li>Support more flexibility when searching for LDAP users and groups with the <code>[LDAP].UserFilterBase</code> and <code>[LDAP].GroupFilterBase</code> settings.</li><li>LDAP configuration&rsquo;s <code>BindDN</code> password can now be stored in an external file using the new <code>BindPasswordFile</code> field. Also made improvements to LDAP group membership lookups.</li><li>Previously, usernames could not be edited when using the LDAP authentication provider by default or if the <code>Authentication.RequireExternalUsernames</code> flag was set to <code>true</code>. Now, user email, first name, and last name are also not editable for this configuration.</li><li>Connect administrators now receive an email as license expiration nears. Email is sent when the license is sixty days from expiring. Disable this behavior through the <code>Licensing.Expiration</code> setting.</li><li>Resolved a bug in the version of the <code>rebuild-packrat</code> command-line tool that was released in v1.5.6. Previously, the migration utility would render static content inaccessible. This release fixes this behavior and adds support for running this CLI tool while the RStudio Connect server is online. However, due to the discovery of new defects, the utility is disabled by default and is not recommended for production use until further notice. Those wishing to attempt to use the utility anyway should do so on a staging server that can be safely lost, and all content should be thoroughly tested after it has completed. 
<a href="http://docs.rstudio.com/connect/1.5.8/admin/cli.html#migration-cli">http://docs.rstudio.com/connect/1.5.8/admin/cli.html#migration-cli</a></li><li>Fixed an issue with account confirmations and password resets for servers using non-UTC time zones.</li><li>LDAP now updates user email, first name, and last name every time a user logs in.</li><li>Fixed an issue when performing the <code>LOGIN</code> SMTP authentication mechanism.</li><li>BREAKING: Changed the default value for <code>PAM.AuthenticatedSessionService</code> to <code>su</code>. Previously, on some distributions of Linux, setting <code>PAM.ForwardPassword</code> to <code>true</code> could present PAM errors to users when running applications as the current user if the <code>AuthenticatedSessionService</code> was not configured. System administrators who had previously edited the <code>rstudio-connect</code> PAM service for use in <code>ForwardPassword</code> mode should update the <code>PAM.AuthenticatedSessionService</code> configuration option. See: <a href="http://docs.rstudio.com/connect/1.5.8/admin/process-management.html#pam-credential-caching-kerberos">http://docs.rstudio.com/connect/1.5.8/admin/process-management.html#pam-credential-caching-kerberos</a></li><li>BREAKING: The format of the RStudio Connect package file names has changed. Debian package file names have the form <code>rstudio-connect_1.2.3-7_amd64.deb</code>. RPM package file names have the form <code>rstudio-connect-1.2.3-7.x86_64.rpm</code>. In addition, the RPM meta-data will have a &ldquo;version&rdquo; of <code>1.2.3</code> and a &ldquo;release&rdquo; of <code>7</code> for this file name. Previously, the RPM would have had a &ldquo;version&rdquo; of <code>1.2.3-7</code>.</li></ul><p>You can see the full release notes for RStudio Connect 1.5.8 <a href="http://docs.rstudio.com/connect/1.5.8/news/">here</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>There are no special precautions to be aware of when upgrading from v1.5.6.
You can expect the installation and startup of v1.5.8 to be complete in under a minute.</p><p>If you’re upgrading from a release older than v1.5.6, be sure to consider the “Upgrade Planning” notes from those other releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>Announcing RStudio Professional Drivers</title><link>https://www.rstudio.com/blog/announcing-rstudio-professional-drivers/</link><pubDate>Mon, 16 Oct 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rstudio-professional-drivers/</guid><description><p>Today we are excited to announce the availability of <a href="https://www.rstudio.com/products/drivers/">RStudio Professional Drivers</a>. There are, of course, many ways to connect to <a href="http://db.rstudio.com">Databases using R</a>.
RStudio Professional Drivers are specifically intended for use with our professional products, including <a href="https://www.rstudio.com/products/rstudio-server-pro/">RStudio Server Pro</a>, <a href="https://www.rstudio.com/products/shiny-server-pro/">Shiny Server Pro</a>, and <a href="https://www.rstudio.com/products/connect/">RStudio Connect</a>. These data connectors combined with enhancements to <a href="http://dplyr.tidyverse.org/">dplyr</a>, the <a href="https://github.com/rstats-db/odbc">odbc</a> package, and the <a href="https://blog.rstudio.com/2017/08/16/rstudio-preview-connections/">RStudio IDE</a> provide a comprehensive suite of tools for accessing and analyzing data with your enterprise systems.</p><h2 id="connect-to-popular-data-sources">Connect to popular data sources</h2><p><a href="https://www.rstudio.com/products/drivers/">RStudio Professional Drivers</a> help you connect to some of the most popular databases. Available for download today are ODBC drivers for Microsoft SQL Server, Oracle, PostgreSQL, Apache Hive, Apache Impala, and Salesforce. We will add several more drivers over the coming months. Don’t see your database listed? Please contact our <a href="https://rstudio.youcanbook.me/">sales team</a> to let us know what you would like us to add.</p><img src="https://www.rstudio.com/blog-images/2017-09-08-driver-logos.png" alt="RStudio Professional Driver Logos" style="width: 70%"/><h2 id="professional-advantages">Professional advantages</h2><p><a href="https://www.rstudio.com/products/drivers/">RStudio Professional Drivers</a> are intended for customers who need standards-based, supported data connectors that are easy to install and work with our professional products. They provide the following advantages:</p><ul><li><strong>Professional</strong>. We deliver professional ODBC drivers that are supported along with our pro products. Use these drivers when you run R and Shiny with your production systems.</li><li><strong>Coverage</strong>. 
We connect you to some of the most popular databases available today, and we are committed to increasing the number of data connectors we support in the future.</li><li><strong>Consistency</strong>. Use the same data connectors everywhere you use RStudio professional products. Develop and publish your content with the same set of data connectors systemwide.</li><li><strong>Convenience</strong>. Our drivers are easy to install and designed to work with our products. This means no more headaches trying to configure third party drivers and packages with your system.</li></ul><h2 id="using-rstudio-server-pro">Using RStudio Server Pro</h2><p>If you are an existing customer, you can use <a href="https://www.rstudio.com/products/drivers/">RStudio Professional Drivers</a> with your current version of <a href="https://www.rstudio.com/products/rstudio-server-pro/">RStudio Server Pro</a>. The latest version of <a href="https://www.rstudio.com/blog/2017-10-09-rstudio-v1-1-release/">RStudio Server Pro (v1.1)</a> has integrated support for using these drivers with <a href="https://www.rstudio.com/blog/rstudio-preview-connections/">Data Connections</a>. When you install the drivers onto your server, <a href="https://www.rstudio.com/products/rstudio-server-pro/">RStudio Server Pro</a> will automatically discover your drivers and populate the Connections wizard. Once you establish a connection you can browse your data source catalog and schema in the Connections tab.</p><img src="https://www.rstudio.com/blog-images/2017-08-16-connection_create.png" alt="RStudio Connection wizard" style="width: 70%"/><h2 id="using-rstudio-connect-and-shiny-server-pro">Using RStudio Connect and Shiny Server Pro</h2><p>Many Shiny applications and R Markdown documents are designed with database backends. With <a href="https://www.rstudio.com/products/drivers/">RStudio Professional Drivers</a> you can develop and publish your content using the same data connectors systemwide. 
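As an illustrative sketch of what such a connection looks like from R once a driver is installed (the driver name, host, database, and credential values below are placeholder assumptions, not values from this post):

```r
library(DBI)

# Connect through an installed ODBC driver via the odbc package;
# all connection details here are placeholders for your environment
con <- dbConnect(
  odbc::odbc(),
  Driver   = "PostgreSQL",      # name registered in odbcinst.ini
  Server   = "db.example.com",
  Database = "analytics",
  UID      = Sys.getenv("DB_USER"),
  PWD      = Sys.getenv("DB_PASS"),
  Port     = 5432
)

dbListTables(con)   # browse what the driver exposes
dbDisconnect(con)
```

Keeping credentials in environment variables (rather than in the deployed source) is the usual pattern for content published to Connect or Shiny Server Pro.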
These drivers also ensure that your production ready applications are backed by professional software and support. Use these drivers when you run Shiny in production with <a href="https://www.rstudio.com/products/shiny-server-pro/">Shiny Server Pro</a> or <a href="https://www.rstudio.com/products/connect/">RStudio Connect</a>.</p><h2 id="using-our-open-source-products">Using our open source products</h2><p><a href="https://www.rstudio.com/products/drivers/">RStudio Professional Drivers</a> are intended for customers who need supported data connectors that are easy to install and work with our professional products. But alternative options exist for various data sources. The <a href="https://blog.rstudio.com/2017/08/16/rstudio-preview-connections/">RStudio IDE</a>, the <a href="https://github.com/rstats-db/odbc">odbc</a> package, and <a href="http://dplyr.tidyverse.org/">dplyr</a> will still work when you bring your own ODBC driver. To learn more about best practices when using data connectors, see our new website, <a href="http://db.rstudio.com">Databases using R</a>.</p><p><em>If you are a current customer or if you are evaluating RStudio professional products, you can download <a href="https://www.rstudio.com/products/drivers/">RStudio Professional Drivers</a> today for no additional charge. 
If you are interested to learn more about how RStudio professional products can help you and your organization, please contact our <a href="https://rstudio.youcanbook.me/">sales team</a> for more information or email us at <a href="mailto:info@rstudio.com">info@rstudio.com</a>.</em></p></description></item><item><title>RStudio v1.1 Released</title><link>https://www.rstudio.com/blog/2017-10-09-rstudio-v1-1-release/</link><pubDate>Mon, 09 Oct 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2017-10-09-rstudio-v1-1-release/</guid><description><p><img src="https://www.rstudio.com/blog-images/2017-10-09-rstudio-v1-1.png" alt="RStudio v1.1"></p><p>We&rsquo;re excited to announce the general availability of RStudio 1.1. Highlights include:</p><ul><li>A <a href="https://www.rstudio.com/blog/rstudio-preview-connections/">connections tab</a> which makes it easy to connect to, explore, and view data in a variety of databases.</li><li>A <a href="https://www.rstudio.com/blog/rstudio-v1-1-preview-terminal/">terminal tab</a> which provides fluid shell integration with the IDE, xterm emulation, and even support for full-screen terminal applications.</li><li>An <a href="https://www.rstudio.com/blog/rstudio-v1-1-preview-object-explorer/">object explorer</a> which can navigate deeply nested R data structures and objects.</li><li>A new, modern <a href="https://www.rstudio.com/blog/rstudio-dark-theme/">dark theme</a> and Retina-quality icons throughout.</li><li>Dozens of other <a href="https://www.rstudio.com/blog/2017-09-13-rstudio-v1-1-little-things/">small improvements</a> and bugfixes.</li></ul><p><a href="https://www.rstudio.com/products/rstudio-server-pro/">RStudio Server Pro 1.1</a> is also now available. 
Some of the <a href="https://www.rstudio.com/blog/rstudio-rsp-1.1-features/">new Pro features</a> include:</p><ul><li>Support for floating licenses, which make it easy to run RStudio Server Pro in Docker containers, virtual machines, and cloud computing instances.</li><li>Improved session management, which allows analysts to label, multi-select, force quit and otherwise self-manage their R sessions.</li><li>Tools for administrators, including the ability to send users notifications and automatically clean up unused sessions, freeing disk space and resources.</li></ul><p>You can <a href="https://www.rstudio.com/products/rstudio/download/">download RStudio v1.1</a> today, and let us know what you think on the <a href="https://community.rstudio.com/c/rstudio-ide">RStudio IDE community forum</a>.</p></description></item><item><title>community.rstudio.com</title><link>https://www.rstudio.com/blog/rstudio-community/</link><pubDate>Thu, 14 Sep 2017 00:00:00
+0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-community/</guid><description><p>We’re excited to announce <a href="http://community.rstudio.com">community.rstudio.com</a>, a new site for discussions about RStudio, the tidyverse, and friends. To begin with, we’re focussing on three main areas:</p><ul><li><a href="http://discourse.rstudio.org/c/tidyverse">The Tidyverse</a></li><li><a href="http://discourse.rstudio.org/c/shiny">Shiny</a></li><li><a href="http://discourse.rstudio.org/c/rstudio-ide">RStudio IDE</a></li></ul><p>In the near future, we expect to launch a category for RStudio admins. This will be a place to coordinate knowledge about best practices for installing, configuring, and managing RStudio products, and for running R in production. Stay tuned for more details!</p><h2 id="what-is-communtyrstudiocom-for">What is community.rstudio.com for?</h2><p>It’s easiest to say what community.rstudio.com is not: it’s not a replacement for Stack Overflow, GitHub, or our premium support services:</p><ul><li><p>If you have a precisely and clearly defined question (and accompanying <a href="https://www.tidyverse.org/help/#reprex">reprex</a>), you should still ask it on <a href="https://stackoverflow.com/questions/tagged/r">Stack Overflow</a>.</p></li><li><p>If you have discovered a bug in an R package, you should still file an issue on <a href="http://github.com/tidyverse">GitHub</a>.</p></li><li><p>If you’re a customer with a commercial license for a Professional product, you should continue to report Pro product issues as described in our <a href="https://www.rstudio.com/about/support-agreement/">support agreement</a>.</p></li></ul><p>The goal of community.rstudio.com is to provide a <a href="http://community.rstudio.com/guidelines">friendly space</a> for discussions that don’t quite fit into the above categories. 
It’s a great place to send your friends who are intimidated by Stack Overflow, or to go to if you’re not sure whether or not you&rsquo;ve found a bug. We expect that it will gradually supplant our existing Google groups like shiny-discuss and ggplot2.</p><p>RStudio employees will frequent the discussions, but we won’t have time to answer every question. We will ensure that discussions remain <a href="http://community.rstudio.com/guidelines">friendly and professional</a>, but our goal is to foster an environment where the RStudio community can help one another.</p><h2 id="getting-the-ball-rolling">Getting the ball rolling</h2><p>The RStudio forums are a community site, and a community site is nothing without a community! In order to generate some initial momentum, we’re going to be running four limited-time promotions:</p><ul><li><p>Joe Cheng (developer of Shiny), Hadley Wickham (me), and Garrett Grolemund (RStudio master instructor) will do <a href="http://community.rstudio.com/t/office-hours-with-hadley-joe-garrett/46">weekly office hours</a>. We’ll each be online for at least one hour per week, and will spend that time answering your questions.</p></li><li><p>If you post in the first month, you’ll get a custom “founding member” badge for your <a href="http://community.rstudio.com">community.rstudio.com</a> profile.</p></li><li><p>Each week for the next month we&rsquo;ll recognize a few of the most helpful participants, and send them a sticker pack and t-shirt. 
Post once during the week to be eligible.</p></li><li><p>Each month for the next six months we’ll select one person who we feel has been particularly helpful to the community, and send them a signed copy of R4DS and an RStudio t-shirt of their choice.</p></li></ul><p>Depending on how things go, we might keep doing the prize packs for longer, but you should participate soon in order to maximise your chances of a sweet prize!</p></description></item><item><title>RStudio v1.1 - The Little Things</title><link>https://www.rstudio.com/blog/2017-09-13-rstudio-v1-1-little-things/</link><pubDate>Wed, 13 Sep 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2017-09-13-rstudio-v1-1-little-things/</guid><description><p><em>Today, we&rsquo;re concluding our blog series on new features in RStudio 1.1. If you&rsquo;d like to try these features out for yourself, you can download a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release of RStudio 1.1</a>.</em></p><h2 id="details-matter">Details matter</h2><p>Throughout this blog series, we&rsquo;ve focused on some of the big features we added in RStudio 1.1. It&rsquo;s not just the big things that matter, though; it&rsquo;s sometimes the little ones that make the most difference in your day-to-day work. Towards that end, we spent a chunk of time during RStudio 1.1&rsquo;s development implementing small but significant improvements to the core IDE features you use every day. Many of these were based on requests and ideas from the R community &ndash; we&rsquo;re very thankful for everyone&rsquo;s input and perspective!</p><h3 id="create-git-branches">Create Git branches</h3><p>We&rsquo;ve significantly improved Git branch management. 
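For readers who also script Git outside the IDE, creating a branch that tracks a remote branch, as the IDE can now do in one step, corresponds roughly to the following command sequence. This is a self-contained sketch: the temporary repositories, the origin remote, and the my-feature branch name are all made up for the demo.

```shell
set -e
# Create a throwaway "remote" and working repository for the demo.
tmp=$(mktemp -d)
git init --bare -q "$tmp/origin.git"
git init -q "$tmp/work"
cd "$tmp/work"
git config user.email demo@example.com
git config user.name "Demo User"
git commit --allow-empty -q -m "initial commit"
git remote add origin "$tmp/origin.git"

# The two steps the new-branch dialog performs:
git checkout -q -b my-feature        # create and switch to the new branch
git push -q -u origin my-feature     # publish it and set upstream tracking

git status -sb                       # prints: ## my-feature...origin/my-feature
```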
Now you can add a new branch right from the IDE, and set it up to track a remote branch at the same time.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-new-branch.png" alt="New branch"></p><p>You can even type to search your branch names, which is helpful if you have a lot of them!</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-search-branches.png" alt="Search branches"></p><h3 id="ctrlr-command-search">Ctrl+R command search</h3><p>Regular RStudio users will know that <em>Ctrl+Up</em> (or <em>Cmd+Up</em> on MacOS) lets you recall a previous R command by typing only a few letters from the beginning of the command. In RStudio 1.1, we&rsquo;ve added <em>Ctrl+R</em>, which&ndash;just like in your favorite shell&ndash;performs an incremental history search. Now you can recall a previous R command based on text anywhere in the command, not just at the beginning.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-ctrl-r.png" alt="Reverse history search"></p><h3 id="change-knit-directory">Change knit directory</h3><p>In RStudio 1.0, knitting R Markdown documents (and executing notebook chunks) was always done in the context of the document&rsquo;s directory. This has some advantages, but many people prefer to use more contextual paths. In RStudio 1.1, you can now choose to evaluate chunks in the current working directory, or in the directory of the document&rsquo;s project.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-knit-directory.png" alt="Choose knit directory"></p><p>This setting can be different for each R Markdown document, and it affects both the execution of notebook chunks and the behavior of the Knit button, so you&rsquo;ll get consistent results no matter how your R chunks are evaluated.</p><h3 id="run-r-code-between-blank-lines">Run R code between blank lines</h3><p>In older versions of RStudio, <em>Ctrl+Enter</em> sent a single line of R code to the console. 
In 1.0, we added the ability to automatically send an entire R statement to the console, even if it was spread over multiple lines. And in 1.1, we&rsquo;ve made it possible to execute consecutive R lines, delimited by blank lines (like &ldquo;paragraphs&rdquo; in <a href="https://ess.r-project.org/">ESS</a>). You can opt into this behavior in Options -&gt; Code -&gt; Editing -&gt; Execution:</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-consecutive-lines.png" alt="Multiple consecutive R lines"></p><p>This behavior is helpful if you use blank lines to delimit sections of your R script, and usually want to execute each section as a unit. For instance, in the example below you would press Ctrl+Enter once to execute the first section, and again to execute the second.</p><pre><code class="language-{r}" data-lang="{r}"># Let's run these two commands to build some data.
categories &lt;- c(&quot;first&quot;, &quot;second&quot;)
data &lt;- data.frame(group = factor(categories),
                   measure = c(40, 60))

# Then, this last one to view it.
ggplot(data = data, aes(group, measure)) +
  geom_bar(stat = &quot;identity&quot;)</code></pre><p>It&rsquo;s also possible to mix and match execution styles, as we&rsquo;ve added new commands that specifically use each style (regardless of the default behavior of Ctrl+Enter). Set Ctrl+Enter to your most-used style and bind keyboard shortcuts to the others you use; you&rsquo;ll rarely find yourself reaching for the mouse to execute R code!</p><h3 id="data-viewer-improvements">Data viewer improvements</h3><p>We have removed the 100-column limit in the data viewer. We&rsquo;ve also made it possible to resize the columns, so you can see more of your data when it contains long text.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-resize-columns.png" alt="Data viewer columns"></p><h3 id="execute-code-from-help">Execute code from Help</h3><p>Ever wanted to execute the example code from a function&rsquo;s Help? 
Now you can just highlight it right in the Help pane and press Ctrl+Enter (Cmd+Enter on macOS) to send it to the console.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-execute-help.png" alt="Help code execution"></p><h3 id="project-templates">Project templates</h3><p>Any R package can now supply RStudio with a template for new projects based on the package. For instance, when you install the <a href="https://bookdown.org/yihui/bookdown/">bookdown</a> package, you&rsquo;ll start seeing an option for new <strong>bookdown</strong> projects in RStudio, which will get you started with the skeleton of a book right away.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-project-templates.png" alt="R project templates"></p><p>This project template isn&rsquo;t hard-coded into RStudio &ndash; it&rsquo;s provided by the R package, so any package can provide a project template. If you&rsquo;re a package author, see our <a href="https://rstudio.github.io/rstudio-extensions/rstudio_project_templates.html">RStudio Project Templates</a> documentation for information on how to add a project template to your package.</p><h3 id="ligature-support">Ligature support</h3><p>In the last few years we&rsquo;ve seen an uptick in fonts designed specifically for code. <a href="https://www.hanselman.com/blog/MonospacedProgrammingFontsWithLigatures.aspx">Many of these use &ldquo;ligatures&rdquo;</a>, which combine two or more individual characters into a single typographical glyph (see <a href="https://github.com/tonsky/FiraCode">Fira Code</a> for examples). 
RStudio Server and RStudio on MacOS have always supported these natively; in RStudio 1.1 we added support for them on Windows and Linux, too.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-ligatures.png" alt="Ligatures in Fira Code on Windows"></p><h3 id="files-pane-ergonomics">Files pane ergonomics</h3><p>The <em>Copy</em> command in the Files pane makes it easy to copy a file from one place to another, but sometimes you want the file to have a different name in the new location. We&rsquo;ve added a new command called <em>Copy To&hellip;</em> which does this: just like <code>cp</code>, you can now specify a destination filename.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-files-pane.png" alt="Copy To"></p><p>We&rsquo;ve also significantly improved <em>Rename</em>: when you use the Files pane to rename a file that you have open, it&rsquo;s no longer necessary to close and re-open the file; the editor buffer adapts immediately to the new name.</p><h3 id="ansi-colors-in-the-r-console">ANSI colors in the R console</h3><p>R packages like <a href="https://cran.r-project.org/web/packages/crayon/index.html">crayon</a> make it possible for R&rsquo;s output to include colors and highlighting, using standard ANSI escape sequences. We&rsquo;ve added support for these to RStudio&rsquo;s R console; packages that use <em>crayon</em>, like <a href="https://cran.r-project.org/web/packages/diffobj/index.html">diffobj</a>, can now emit colored, styled text right inside the R console.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-crayon.png" alt="ANSI colors in R"></p><h2 id="conclusion">Conclusion</h2><p>This wraps up our blog series on the RStudio 1.1 preview. 
If you just can&rsquo;t get enough, there&rsquo;s lots more: see the <a href="https://www.rstudio.com/products/rstudio/download/preview-release-notes/">RStudio 1.1 Preview Release Notes</a> for a full list of the changes in RStudio 1.1.</p><p>If you&rsquo;d like to learn more, we have two upcoming webinars in which we&rsquo;ll be taking a deeper dive:</p><ul><li><strong>New Features of the IDE</strong> on December 6th, 2017: we&rsquo;ll show you what&rsquo;s new in RStudio 1.1 and walk you through how to apply the new features to your workflow.</li><li><strong>Terminal Updates</strong> on December 20th, 2017: we&rsquo;ll get into the details of the new RStudio 1.1 Terminal.</li></ul><p>We hope you&rsquo;ve enjoyed learning about the new features and capabilities ahead of the official release of RStudio 1.1, and we look forward to putting an official release in your hands in the coming months.</p></description></item><item><title>Shiny Server (Pro) 1.5.4</title><link>https://www.rstudio.com/blog/shiny-server-pro-1-5-4/</link><pubDate>Tue, 12 Sep 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-server-pro-1-5-4/</guid><description><p><a href="https://www.rstudio.com/products/shiny/shiny-server/">Shiny Server 1.5.4.869 and Shiny Server Pro 1.5.4.872 are now available.</a></p><p>Both the new Shiny Server and Shiny Server Pro releases include bug fixes and enhancements, and we recommend upgrading at your earliest convenience.</p><h3 id="shiny-server-154869">Shiny Server 1.5.4.869</h3><ul><li><a href="http://docs.rstudio.com/shiny-server/#clickjacking-protection">Clickjacking protection</a> can now be enabled using the <code>frame_options</code> directive. 
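As a sketch of how enabling this might look in /etc/shiny-server/shiny-server.conf (the server/location layout mirrors the default configuration shipped with Shiny Server; check the Shiny Server Administrator's Guide for the exact scopes and values the directive accepts):

```
# /etc/shiny-server/shiny-server.conf (fragment)
server {
  listen 3838;

  location / {
    site_dir /srv/shiny-server;
    log_dir /var/log/shiny-server;

    # Ask browsers not to render apps inside frames, preventing clickjacking.
    # Another documented value is sameorigin.
    frame_options deny;
  }
}
```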
This directive was already available in Shiny Server Pro, but is now available in Shiny Server as well.</li><li>A bug that caused <code>&quot;Error: Can't set headers after they are sent.&quot;</code> to appear in error logs was fixed.</li><li>Several bugs in <code>license-manager</code> were fixed.</li></ul><h3 id="shiny-server-pro-154872">Shiny Server Pro 1.5.4.872</h3><ul><li>The &ldquo;utilization scheduler&rdquo; component, which manages the way requests are routed to R processes, has been significantly improved. It has been made more robust, and applications now respond more efficiently to increased load.</li><li><code>auth_pam</code>: The performance of multiple simultaneous logins was improved.</li><li><code>auth_ldap</code>: A bug was fixed that caused LDAP to return no groups when a username contained a backslash.</li></ul></description></item><item><title>Announcing blogdown: Create Websites with R Markdown</title><link>https://www.rstudio.com/blog/announcing-blogdown/</link><pubDate>Mon, 11 Sep 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-blogdown/</guid><description><p>Today I&rsquo;m excited to announce a new R package, <strong>blogdown</strong>, to help you create general-purpose (static) websites with R Markdown. The first version of <strong>blogdown</strong> is available on CRAN now, and you can install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">blogdown&#34;</span>)</code></pre></div><p>The source package is hosted on Github in the repository <a href="https://github.com/rstudio/blogdown">rstudio/blogdown</a>. 
Since <strong>blogdown</strong> is a new package, you may install and test the development version using <code>devtools::install_github(&quot;rstudio/blogdown&quot;)</code> if you run into problems with the CRAN version.</p><h2 id="introduction">Introduction</h2><p>In a nutshell, <strong>blogdown</strong> is an effort to integrate R Markdown with static website generators, so that you can generate web pages dynamically. For example, you can use R code chunks (or <a href="https://rmarkdown.rstudio.com/authoring_knitr_engines.html">other languages</a> that <strong>knitr</strong> supports) to generate tables and graphics automatically on any web page. Before <strong>blogdown</strong>, you could already do this using:</p><ul><li>the <strong>rmarkdown</strong> package to create single output files from R Markdown documents;</li><li>and the <a href="https://github.com/rstudio/bookdown"><strong>bookdown</strong></a> package to compile multiple R Markdown documents to a book.</li></ul><p>But the structure of a website can be far more complicated than a collection of independent HTML pages or a book. With <strong>blogdown</strong>, the directory structure of your R Markdown files can be arbitrary. You can easily create a project website, or a blog. 
Each page can have its own metadata (such as categories and tags), and you can generate pages of lists of content (such as a list of blog posts or examples).</p><p>Besides the advantage in website structures, another highlight of <strong>blogdown</strong> is that it inherited <strong>bookdown</strong>&rsquo;s Markdown extensions (based on Pandoc&rsquo;s Markdown), which means you can easily write technical content on your website, including everything that Pandoc supports (e.g., headings, lists, footnotes, tables, figures, citations, LaTeX math, and quotes, etc) and <strong>bookdown</strong>&rsquo;s extensions (e.g., figure and table captions, cross-references, theorems, proofs, numbered equations, and HTML widgets, etc).</p><p>There are several popular static site generators, and the main one we support in <strong>blogdown</strong> is <a href="https://gohugo.io">Hugo</a>. Hugo is easy to install (no dependencies), lightning fast (one millisecond per page), and very flexible. We have also provided (limited) support for <a href="https://jekyllrb.com">Jekyll</a> and <a href="https://hexo.io">Hexo</a> (see documentation). The Markdown support in these generators is often poor in terms of functionality (you cannot easily beat Pandoc&rsquo;s Markdown), and sometimes it is painful that they use different flavors of Markdown. With <strong>blogdown</strong>, you can use richer Markdown syntax if you want.</p><h2 id="get-started">Get Started</h2><p>It is extremely easy to get started with a new website. 
After you have installed the <strong>blogdown</strong> package, it only takes one step to create a new website&mdash;just call the function <code>new_site()</code> under an empty directory (or an empty RStudio project):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">blogdown<span style="color:#666">::</span><span style="color:#06287e">new_site</span>()</code></pre></div><p>It will download and install Hugo if it has not been installed, download a default Hugo theme, add some sample posts, build the site, and launch it in your web browser (or RStudio Viewer) so that you can immediately preview the website. Note you only need to use this function once for every new site. For an existing website, you may call <code>blogdown::serve_site()</code> or the RStudio addin &ldquo;Serve Site&rdquo; to preview the site; it will watch changes in your source files continuously and rebuild your site automatically.</p><p>To write new posts, you may use the RStudio addin &ldquo;New Post&rdquo;:</p><p><img src="https://bookdown.org/yihui/blogdown/images/new-post.png" alt="New Post"></p><p>If you are not satisfied with the default theme, you can try to create another new site with <a href="https://bookdown.org/yihui/blogdown/other-themes.html">a different theme</a> till you find a theme that you like.</p><h2 id="documentation">Documentation</h2><p>The comprehensive documentation of this package is a book written in <strong>bookdown</strong>, which is freely available at <a href="https://bookdown.org/yihui/blogdown/">https://bookdown.org/yihui/blogdown/</a> and to be published by Chapman &amp; Hall later this year. The book may seem to be short (about 150 pages), but it contains many external resources, such as examples that we have spent a lot of time on creating. It may take you quite a while to fully digest this book, but perhaps it is not necessary. 
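Putting the pieces above together, a first blogdown session might look like this (a sketch assuming the CRAN package is installed; the post title is a made-up example):

```r
library(blogdown)

new_site()                  # once per site, in an empty directory or project
serve_site()                # live preview; rebuilds whenever you save a file
new_post("My First Post")   # same as the "New Post" addin; then edit and save
```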
For example, you do not have to read Chapter 2 to understand how Hugo works if you can find a theme that you like and don&rsquo;t want to customize too much (hint: this is unlikely&mdash;you will surely be bored by the appearance of your website someday). Anyway, you are expected to read at least Chapter 1 of this book.</p><h2 id="migration">Migration</h2><p>If you don&rsquo;t have a website right now, consider yourself to be lucky. It is much easier to start a new website than converting an existing website. The latter is not impossible, and we have shown in <a href="https://bookdown.org/yihui/blogdown/migration.html">Chapter 4</a> how to convert WordPress and Jekyll websites to Hugo. To give you an idea about how long it takes to convert a website:</p><ul><li><p>I spent a whole week on converting <a href="https://yihui.name">my personal website</a> from Jekyll to Hugo. The complication was that I had a Chinese blog, an English blog, two project websites (<a href="https://yihui.name/knitr/"><strong>knitr</strong></a> and <a href="https://yihui.name/animation/"><strong>animation</strong></a>), and several single-page project websites (such as <a href="https://yihui.name/formatR/"><strong>formatR</strong></a>). Finally I managed to put all of them in <a href="https://github.com/rbind/yihui">one repository</a>.</p></li><li><p><a href="https://robjhyndman.com">Rob Hyndman</a> spent several days on converting his WordPress website to <strong>blogdown</strong> (<a href="https://support.rbind.io/2017/05/15/converting-robjhyndman-to-blogdown/">read more here</a>), when the <strong>blogdown</strong> documentation was far from being complete (Chapter 4 did not exist).</p></li><li><p>It took me four hours to convert <a href="https://simplystatistics.org/">Simply Statistics blog</a> from Jekyll to <strong>blogdown</strong>. 
It had about 1000 posts at that time.</p></li><li><p>I have converted three WordPress sites by myself: <a href="https://rviews.rstudio.com/">the RViews blog</a> took me a few days, <a href="https://blog.rstudio.com/">the RStudio blog</a> took me one day, and <a href="http://kbroman.org/">Karl Broman</a>&rsquo;s blog took me one hour.</p></li></ul><p>I have provided the scripts that I used in Chapter 4, and hopefully you can reuse them to convert your own websites to save you some time.</p><h2 id="conclusion">Conclusion</h2><p>I believe <strong>blogdown</strong> can introduce a highly streamlined experience to create and maintain a website. At least I feel addicted to blogging again after three years. I have been a firm believer in writing, but I hated the fact that I had to log in to an online system to write something (e.g., WordPress), or manually type out all the YAML metadata in a new post (which is why I created the RStudio addin &ldquo;New Post&rdquo;). Now I just need to open my RStudio project, use the &ldquo;Serve Site&rdquo; addin, then create a new post using the addin &ldquo;New Post&rdquo; or revise existing posts, and I can live preview the site immediately in RStudio Viewer when I save the post. Deployment of the website is as simple as pushing to Github, and <a href="https://www.netlify.com">Netlify</a> will do the rest of the work for me.</p><p><img src="https://slides.yihui.name/gif/cat-flow.gif" alt="cats flow"></p><p>There are many advantages of static websites as mentioned in <a href="https://bookdown.org/yihui/blogdown/static-sites.html">Chapter 2</a> of the book. Your whole website is just contained in a folder that you can preview locally (even offline!) or publish to any web server. 
Your posts are plain-text files that you can create or edit at any time, which means finally you have got something more meaningful to do <a href="https://twitter.com/imtaras/status/906392194012999680">on your next flight</a> than having to stare at Sudoku puzzles to kill time (you may teach your neighbors R Markdown and <strong>blogdown</strong> if they feel jealous looking at your screen).</p><h2 id="acknowledgements">Acknowledgements</h2><p>Although <strong>blogdown</strong> is still relatively new, I have received a lot of useful feedback during the development of the package and the book. There have been about 200 <a href="https://github.com/rstudio/blogdown/issues">Github issues</a> (including <a href="https://github.com/rstudio/blogdown/pulls">pull requests</a>) and a few dozen questions on <a href="http://stackoverflow.com/questions/tagged/blogdown">StackOverflow</a>. Some Github issues were really inspiring (e.g., <a href="https://github.com/rstudio/blogdown/issues/40">#40</a> and <a href="https://github.com/rstudio/blogdown/issues/97">#97</a>), and I was very glad that they filed the feature requests. I also want to thank <a href="https://github.com/rstudio/blogdown/graphs/contributors">those who</a> submitted Github pull requests to improve the book. To be honest, this book was quite painful to write, because there are too many technologies potentially related to a website (e.g., JavaScript, domain names, DNS, and continuous deployment). However, I have gained a lot of motivation and inspiration from several early brave users who created their websites and wrote their own <strong>blogdown</strong> tutorials (even before the official documentation existed). 
That is also <a href="https://bookdown.org/yihui/blogdown/author.html">how I found</a> the co-authors of this book, <a href="https://twitter.com/ProQuesAsker">Amber</a> and <a href="https://apreshill.rbind.io">Alison</a>.</p><p>I&rsquo;m particularly grateful for the feedback from beginners (so please don&rsquo;t be shy to ask &ldquo;dumb&rdquo; questions). It is very helpful to see what can be confusing to beginners, so that we can try better explanations or implementations, which can usually benefit users of all levels.</p><p>If you are looking for inspiration from other people&rsquo;s <strong>blogdown</strong>-based websites, you may thumb through the Github organization <a href="https://github.com/rbind">https://github.com/rbind</a>. You are also welcome to move your website over there to share with or inspire more people.</p><p>As usual, please feel free to ask questions on <a href="https://stackoverflow.com/questions/tagged/blogdown">StackOverflow</a> (with at least the tags <code>r</code> and <code>blogdown</code>), and file bug reports and feature requests <a href="https://github.com/rstudio/blogdown">on Github</a>. I recommend spending some time reading the <strong>blogdown</strong> book, but I understand it may not be easy to digest, so it is fine to ask questions before you finish reading the book. I&rsquo;ll be happy to point you to the relevant sections if your questions have been answered in the book. I hope you can enjoy this package, and have fun with your website!</p></description></item><item><title>RStudio 1.1 Preview - New Features in RStudio Server Pro</title><link>https://www.rstudio.com/blog/rstudio-rsp-1.1-features/</link><pubDate>Thu, 07 Sep 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-rsp-1.1-features/</guid><description><p><em>Today, we&rsquo;re continuing our blog series on new features in RStudio 1.1. 
If you’d like to try these features out for yourself, you can download a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release of RStudio Server Pro 1.1</a>.</em></p><h1 id="rstudio-server-pro">RStudio Server Pro</h1><p>Today we are going to be talking about some of the great new features we have added to RStudio Server Pro v1.1, which make users’ and administrators’ workflows more efficient. Let’s begin!</p><h2 id="features-for-users">Features for Users</h2><h3 id="homepage-improvements">Homepage Improvements</h3><p>One of the key features of RStudio Server Pro is the ability for users to run multiple concurrent R sessions. In order to help you keep track of your active sessions, RStudio Server Pro comes with a <strong>home page</strong> that allows you to see the status of your sessions, such as whether they are idle, executing, or suspended. In RStudio Server Pro v1.1, we’ve made a few improvements that will help you manage your sessions.</p><h4 id="managing-multiple-sessions">Managing Multiple Sessions</h4><p>The homepage now allows you to select multiple active sessions and kill them, so you can clean up many sessions with a single action. Simply select the desired sessions by clicking their checkboxes, and click on either the Quit or Force Quit buttons. The Quit button will attempt to gracefully kill your sessions by giving them the opportunity to properly shut down. The Force Quit button immediately kills your sessions by sending them SIGKILL and cleans up all child processes as well. Note that in both cases, any unsaved data may be lost.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-home-page-select.png" alt="Selecting Sessions"></p><h4 id="accessing-the-home-page">Accessing the Home Page</h4><p>We’ve made accessing the home page easier, allowing you to get there without having to first get into a running session. 
To do so, simply type in the address of your company’s RStudio in your browser, and add /home, like so:</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-home-page-url.png" alt="Home Page URL"></p><h4 id="session-labels">Session Labels</h4><p>In the past, it could be difficult to keep track of your sessions and know which was which when looking at them on the home page. Now you can label your sessions so that they show up clearly on the home page. To do that, simply click the <strong>Label Current Session</strong> button from the Sessions dropdown in the top right corner from within an active session and give the session a name.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-label-current-session.png" alt="Labeling the Current Session"></p><p>The label will be displayed on the home page making it easy for you to properly drop back in to the correct session.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-home-page-labels.png" alt="Sessions on Home Page with Labels"></p><h2 id="features-for-admins">Features for Admins</h2><h3 id="floating-licensing">Floating Licensing</h3><p>Floating licensing allows you to run fully licensed copies of RStudio Server Pro easily in ephemeral instances, such as Docker containers, virtual machines, and EC2 instances. Instances don&rsquo;t have to be individually licensed, and you don&rsquo;t have to manually activate and deactivate a license in each instance. 
Instead, a lightweight license server distributes temporary licenses (&ldquo;leases&rdquo;) to each instance, and the instance uses the license only while it&rsquo;s running.</p><p>This model is perfect for environments in which RStudio Server Pro instances are frequently created and destroyed on demand, and only requires that you purchase a license for the maximum number of concurrent instances you want to run.</p><p>Technical details on the floating licensing system are available in the <a href="http://docs.rstudio.com/ide/server-pro/1.1.345/license-management.html#floating-licensing">RStudio Server Pro Administration Guide</a>.</p><h3 id="session-notifications">Session Notifications</h3><p>Administrators can now broadcast notifications to users in real-time with our new notification system. The system is very flexible, so we will just cover the basics here. For more in-depth information, see the <a href="http://docs.rstudio.com/ide/server-pro/1.1.345/r-sessions.html#notifications">RStudio Server Pro Administration Guide</a>, or take a look at the documentation in the <strong>/etc/rstudio/notifications.conf</strong> file.</p><p>To broadcast a notification to all of your users, add lines like the following to your <strong>/etc/rstudio/notifications.conf</strong> file:</p><pre><code class="language-{ini}" data-lang="{ini}">StartTime: 2017-11-06
EndTime: 2017-11-11
Message: Remember that on November 10th after business hours we will be performing a database migration. Please make sure all of your work is saved before going home.</code></pre><p>This will create a notification that starts broadcasting to your users on the 6th of November, and stops on midnight of the 11th. Each user will see the message only until they have acknowledged it; it is shown in the first session they create during the notification window, as well as in any sessions that are already active. 
The notification will look something like:</p><p><img src="https://www.rstudio.com/blog-images/2017-09-13-admin-notification.png" alt="Session Notification"></p><h3 id="automatically-deleting-unused-sessions">Automatically Deleting Unused Sessions</h3><p>As an administrator, you have the ability to automatically suspend sessions to disk after a certain period of inactivity by specifying the <strong>session-timeout-minutes</strong> option in <strong>/etc/rstudio/rsession.conf</strong>. RStudio Server Pro v1.1 adds the ability to also kill and delete these sessions entirely after a certain number of hours, freeing up valuable system resources. Simply add the following line to <strong>/etc/rstudio/rsession.conf</strong>.</p><pre><code class="language-{ini}" data-lang="{ini}">session-timeout-kill-hours=96</code></pre><p>This setting will kill and delete any inactive sessions that have not been used for the specified number of hours. You should set a long timeout period to ensure that only sessions users have forgotten about or no longer need are deleted, as the session’s data is lost forever. Again, for more information, see the RStudio Server Pro Administration Guide.</p><p>Thank you for using RStudio Server Pro. We hope these new features improve both user and administrator workflows!</p></description></item><item><title>RStudio Connect v1.5.6 - Now Supporting Kerberos!</title><link>https://www.rstudio.com/blog/rstudio-connect-v1-5-6-kerberos/</link><pubDate>Wed, 06 Sep 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-v1-5-6-kerberos/</guid><description><p>We&rsquo;re pleased to announce support for Kerberos in <a href="https://www.rstudio.com/products/connect/">RStudio Connect: version 1.5.6</a>. 
Organizations that use Kerberos can now run Shiny applications and Shiny R Markdown documents in tailored processes that have access only to the appropriate resources inside the organization.</p><p><img src="https://www.rstudio.com/blog-images/2017-09-06-rsc-admin-list.png" alt="RStudio Connect administrator globally listing all content on the server"></p><p>The notable changes this release include:</p><ul><li><strong>Full support for Kerberos</strong> across Shiny applications and Shiny R Markdown documents by running R in an authenticated PAM session that uses the cached credentials of the current user. See <a href="http://docs.rstudio.com/connect/1.5.6/admin/process-management.html#pam-credential-caching-kerberos">the documentation</a> for more details.</li><li>Content <strong>deployment no longer requires explicit publishing</strong>. New content is available immediately after it is deployed and visible only to the owner. Enable the <code>[Applications].ExplicitPublishing</code> setting to revert this behavior.</li><li>Heterogeneous <strong>server migrations</strong> are now supported, allowing administrators to upgrade their distribution or change to a different (supported) Linux distribution. See <a href="http://docs.rstudio.com/connect/1.5.6/admin/files-directories.html#server-migrations">the migration documentation</a> for more details.</li><li>Admins are now able to toggle the content filtering settings to <strong>enumerate all content on the server</strong> so that they can manage settings, regardless of whether or not they have visibility into that content. The permissions here are unchanged; the admin will only be able to view the settings for the content, not the content itself. To view the content, the admin would need to add themselves as a viewer or collaborator of the content, which is an audited action.</li><li><strong>Shiny error sanitization</strong> is enabled by default. 
Disable the <code>[Applications].ShinyErrorSanitization</code> setting to revert this behavior. See <a href="https://shiny.rstudio.com/articles/sanitize-errors.html">the Shiny documentation</a> for more information about Shiny error sanitization.</li><li>Improved <strong>LDAP group lookup performance</strong> on large LDAP servers that don&rsquo;t support <code>memberof</code>. Additionally, improved LDAP logging and error handling.</li><li><strong>Security enhancements</strong> around proxy authentication, redirects, and brute-force password attacks. This release adds support for a challenge-response (<strong>CAPTCHA</strong>) to help mitigate brute-force attacks on users&rsquo; passwords. Set <code>[Authentication].ChallengeResponseEnabled</code> to true to enable this feature.</li><li><strong>Customize the subject prefix</strong> for all outgoing emails using the <code>[Server].EmailSubjectPrefix</code> setting. The default is still <code>[RStudio Connect]</code>.</li><li>BREAKING: Running content as the current user is now disabled for content other than Shiny Applications or Shiny R Markdown Documents. Reports will execute as the application <code>RunAs</code>, falling back to the system <code>Applications.RunAs</code> if none is specified.</li></ul><p>You can see the full release notes for RStudio Connect 1.5.6 <a href="http://docs.rstudio.com/connect/1.5.6/news/">here</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>There are no special precautions to be aware of when upgrading from v1.5.4. You can expect the installation and startup of v1.5.6 to be complete in under a minute.</p><p>If you’re upgrading from a release older than v1.5.4, be sure to consider the “Upgrade Planning” notes from those other releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://www.rstudio.com/products/connect/">RStudio Connect</a> we encourage you to do so. 
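</p><p>As an illustration of how several of the settings mentioned above fit together, a Connect configuration file (typically <strong>/etc/rstudio-connect/rstudio-connect.gcfg</strong>) might contain a fragment like the following. This is a hypothetical sketch; consult the Admin Guide for the authoritative section and key names:</p><pre><code class="language-{ini}" data-lang="{ini}">; hypothetical sketch -- verify names against the Admin Guide
[Server]
EmailSubjectPrefix = &quot;[RStudio Connect]&quot;

[Authentication]
ChallengeResponseEnabled = true

[Applications]
ShinyErrorSanitization = true</code></pre><p>Restart RStudio Connect after editing the file for changes to take effect.</p><p>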
RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>Keras for R</title><link>https://www.rstudio.com/blog/keras-for-r/</link><pubDate>Tue, 05 Sep 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/keras-for-r/</guid><description><p>We are excited to announce that the <a href="https://keras.rstudio.com">keras package</a> is now available on CRAN. The package provides an R interface to <a href="https://keras.io">Keras</a>, a high-level neural networks API developed with a focus on enabling fast experimentation. 
Keras has the following key features:</p><ul><li><p>Allows the same code to run on CPU or on GPU, seamlessly.</p></li><li><p>User-friendly API which makes it easy to quickly prototype deep learning models.</p></li><li><p>Built-in support for convolutional networks (for computer vision), recurrent networks (for sequence processing), and any combination of both.</p></li><li><p>Supports arbitrary network architectures: multi-input or multi-output models, layer sharing, model sharing, etc. This means that Keras is appropriate for building essentially any deep learning model, from a memory network to a neural Turing machine.</p></li><li><p>Is capable of running on top of multiple back-ends including <a href="https://github.com/tensorflow/tensorflow">TensorFlow</a>, <a href="https://github.com/Microsoft/cntk">CNTK</a>, or <a href="https://github.com/Theano/Theano">Theano</a>.</p></li></ul><p>If you are already familiar with Keras and want to jump right in, check out <a href="https://keras.rstudio.com">https://keras.rstudio.com</a> which has everything you need to get started including over 20 complete examples to learn from.</p><p>To learn a bit more about Keras and why we&rsquo;re so excited to announce the Keras interface for R, read on!</p><h2 id="keras-and-deep-learning">Keras and Deep Learning</h2><p>Interest in deep learning has been accelerating rapidly over the past few years, and several deep learning frameworks have emerged over the same time frame. Of all the available frameworks, Keras has stood out for its productivity, flexibility and user-friendly API. 
At the same time, TensorFlow has emerged as a next-generation machine learning platform that is both extremely flexible and well-suited to production deployment.</p><p>Not surprisingly, Keras and TensorFlow have of late been pulling away from other deep learning frameworks:</p><blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Google web search interest around deep learning frameworks over time. If you remember Q4 2015 and Q1-2 2016 as confusing, you weren&#39;t alone. <a href="https://t.co/1f1VQVGr8n">pic.twitter.com/1f1VQVGr8n</a></p>&mdash; François Chollet (@fchollet) <a href="https://twitter.com/fchollet/status/871089784898310144">June 3, 2017</a></blockquote><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script><p>The good news about Keras and TensorFlow is that you don&rsquo;t need to choose between them! The default backend for Keras is TensorFlow and Keras can be <a href="https://blog.keras.io/keras-as-a-simplified-interface-to-tensorflow-tutorial.html">integrated seamlessly</a> with TensorFlow workflows. There is also a pure-TensorFlow implementation of Keras with <a href="https://www.youtube.com/watch?v=UeheTiBJ0Io&amp;t=7s&amp;index=8&amp;list=PLOU2XLYxmsIKGc_NBoIhTn2Qhraji53cv">deeper integration</a> on the roadmap for later this year.</p><p>Keras and TensorFlow are the state of the art in deep learning tools and with the keras package you can now access both with a fluent R interface.</p><h2 id="getting-started">Getting Started</h2><h3 id="installation">Installation</h3><p>To begin, install the keras R package from CRAN as follows:</p><pre><code class="language-{r}" data-lang="{r}">install.packages(&quot;keras&quot;)</code></pre><p>The Keras R interface uses the <a href="https://www.tensorflow.org/">TensorFlow</a> backend engine by default. 
To install both the core Keras library as well as the TensorFlow backend use the <code>install_keras()</code> function:</p><pre><code class="language-{r}" data-lang="{r}">library(keras)
install_keras()</code></pre><p>This will provide you with default CPU-based installations of Keras and TensorFlow. If you want a more customized installation, e.g. if you want to take advantage of NVIDIA GPUs, see the documentation for <a href="https://keras.rstudio.com/reference/install_keras.html"><code>install_keras()</code></a>.</p><h3 id="mnist-example">MNIST Example</h3><p>We can learn the basics of Keras by walking through a simple example: recognizing handwritten digits from the <a href="https://en.wikipedia.org/wiki/MNIST_database">MNIST</a> dataset. MNIST consists of 28 x 28 grayscale images of handwritten digits like these:</p><img style="width: 50%;" src="https://www.tensorflow.org/images/MNIST.png"><p>The dataset also includes labels for each image, telling us which digit it is. For example, the labels for the above images are 5, 0, 4, and 1.</p><h4 id="preparing-the-data">Preparing the Data</h4><p>The MNIST dataset is included with Keras and can be accessed using the <code>dataset_mnist()</code> function. Here we load the dataset then create variables for our test and training data:</p><pre><code class="language-{r}" data-lang="{r}">library(keras)
mnist &lt;- dataset_mnist()
x_train &lt;- mnist$train$x
y_train &lt;- mnist$train$y
x_test &lt;- mnist$test$x
y_test &lt;- mnist$test$y</code></pre><p>The <code>x</code> data is a 3-d array <code>(images,width,height)</code> of grayscale values. To prepare the data for training we convert the 3-d arrays into matrices by reshaping width and height into a single dimension (28x28 images are flattened into length 784 vectors). 
Then, we convert the grayscale values from integers ranging from 0 to 255 into floating point values ranging between 0 and 1:</p><pre><code class="language-{r}" data-lang="{r}"># reshape
dim(x_train) &lt;- c(nrow(x_train), 784)
dim(x_test) &lt;- c(nrow(x_test), 784)

# rescale
x_train &lt;- x_train / 255
x_test &lt;- x_test / 255</code></pre><p>The <code>y</code> data is an integer vector with values ranging from 0 to 9. To prepare this data for training we <a href="https://www.quora.com/What-is-one-hot-encoding-and-when-is-it-used-in-data-science">one-hot encode</a> the vectors into binary class matrices using the Keras <code>to_categorical()</code> function:</p><pre><code class="language-{r}" data-lang="{r}">y_train &lt;- to_categorical(y_train, 10)
y_test &lt;- to_categorical(y_test, 10)</code></pre><h4 id="defining-the-model">Defining the Model</h4><p>The core data structure of Keras is a model, a way to organize layers. The simplest type of model is the <a href="https://keras.rstudio.com/articles/sequential_model.html">sequential model</a>, a linear stack of layers.</p><p>We begin by creating a sequential model and then adding layers using the pipe (<code>%&gt;%</code>) operator:</p><pre><code class="language-{r}" data-lang="{r}">model &lt;- keras_model_sequential()
model %&gt;%
  layer_dense(units = 256, activation = &quot;relu&quot;, input_shape = c(784)) %&gt;%
  layer_dropout(rate = 0.4) %&gt;%
  layer_dense(units = 128, activation = &quot;relu&quot;) %&gt;%
  layer_dropout(rate = 0.3) %&gt;%
  layer_dense(units = 10, activation = &quot;softmax&quot;)</code></pre><p>The <code>input_shape</code> argument to the first layer specifies the shape of the input data (a length 784 numeric vector representing a grayscale image). 
The final layer outputs a length 10 numeric vector (probabilities for each digit) using a <a href="https://en.wikipedia.org/wiki/Softmax_function">softmax activation function</a>.</p><p>Use the <code>summary()</code> function to print the details of the model:</p><pre><code class="language-{r}" data-lang="{r}">summary(model)</code></pre><pre style="box-shadow: none;"><code>Model
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #
================================================================================
dense_1 (Dense)                     (None, 256)                     200960
________________________________________________________________________________
dropout_1 (Dropout)                 (None, 256)                     0
________________________________________________________________________________
dense_2 (Dense)                     (None, 128)                     32896
________________________________________________________________________________
dropout_2 (Dropout)                 (None, 128)                     0
________________________________________________________________________________
dense_3 (Dense)                     (None, 10)                      1290
================================================================================
Total params: 235,146
Trainable params: 235,146
Non-trainable params: 0
________________________________________________________________________________</code></pre><p>Next, compile the model with appropriate loss function, optimizer, and metrics:</p><pre><code class="language-{r}" data-lang="{r}">model %&gt;% compile(
  loss = &quot;categorical_crossentropy&quot;,
  optimizer = optimizer_rmsprop(),
  metrics = c(&quot;accuracy&quot;)
)</code></pre><h4 id="training-and-evaluation">Training and Evaluation</h4><p>Use the <code>fit()</code> function to train the model for 30 epochs using batches of 128 images:</p><pre><code class="language-{r}" data-lang="{r}">history &lt;- model %&gt;% fit(
  x_train, y_train,
  epochs = 30, batch_size = 128,
  validation_split = 0.2
)</code></pre><p>The <code>history</code> object returned by <code>fit()</code> includes loss and accuracy 
metrics which we can plot:</p><pre><code class="language-{r}" data-lang="{r}">plot(history)</code></pre><p><img src="https://keras.rstudio.com/images/training_history_ggplot2.png" alt=""></p><p>Evaluate the model&rsquo;s performance on the test data:</p><pre><code class="language-{r}" data-lang="{r}">model %&gt;% evaluate(x_test, y_test, verbose = 0)</code></pre><pre><code>$loss
[1] 0.1149

$acc
[1] 0.9807</code></pre><p>Generate predictions on new data:</p><pre><code class="language-{r}" data-lang="{r}">model %&gt;% predict_classes(x_test)</code></pre><pre><code> [1] 7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4 9 6 6 5 4 0 7 4 0 1 3 1 3 4 7 2 7 1 2
[40] 1 1 7 4 2 3 5 1 2 4 4 6 3 5 5 6 0 4 1 9 5 7 8 9 3 7 4 6 4 3 0 7 0 2 9 1 7 3 2
[79] 9 7 7 6 2 7 8 4 7 3 6 1 3 6 9 3 1 4 1 7 6 9
[ reached getOption(&quot;max.print&quot;) -- omitted 9900 entries ]</code></pre><p>Keras provides a vocabulary for building deep learning models that is simple, elegant, and intuitive. Building a question answering system, an image classification model, a neural Turing machine, or any other model is just as straightforward.</p><p>The <a href="https://keras.rstudio.com/articles/sequential_model.html">Guide to the Sequential Model</a> article describes the basics of Keras sequential models in more depth.</p><h2 id="examples">Examples</h2><p>Over 20 complete examples are available (special thanks to <a href="https://github.com/dfalbel">@dfalbel</a> for his work on these!). 
The examples cover image classification, text generation with stacked LSTMs, question-answering with memory networks, transfer learning, variational encoding, and more.</p><table><thead><tr><th>Example</th><th>Description</th></tr></thead><tbody><tr><td><a href="https://keras.rstudio.com/articles/examples/addition_rnn.html">addition_rnn</a></td><td>Implementation of sequence to sequence learning for performing addition of two numbers (as strings).</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/babi_memnn.html">babi_memnn</a></td><td>Trains a memory network on the bAbI dataset for reading comprehension.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/babi_rnn.html">babi_rnn</a></td><td>Trains a two-branch recurrent network on the bAbI dataset for reading comprehension.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/cifar10_cnn.html">cifar10_cnn</a></td><td>Trains a simple deep CNN on the CIFAR10 small images dataset.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/conv_lstm.html">conv_lstm</a></td><td>Demonstrates the use of a convolutional LSTM network.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/deep_dream.html">deep_dream</a></td><td>Deep Dreams in Keras.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/imdb_bidirectional_lstm.html">imdb_bidirectional_lstm</a></td><td>Trains a Bidirectional LSTM on the IMDB sentiment classification task.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/imdb_cnn.html">imdb_cnn</a></td><td>Demonstrates the use of Convolution1D for text classification.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/imdb_cnn_lstm.html">imdb_cnn_lstm</a></td><td>Trains a convolutional stack followed by a recurrent stack network on the IMDB sentiment classification task.</td></tr><tr><td><a 
href="https://keras.rstudio.com/articles/examples/imdb_fasttext.html">imdb_fasttext</a></td><td>Trains a FastText model on the IMDB sentiment classification task.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/imdb_lstm.html">imdb_lstm</a></td><td>Trains an LSTM on the IMDB sentiment classification task.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/lstm_text_generation.html">lstm_text_generation</a></td><td>Generates text from Nietzsche&rsquo;s writings.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/mnist_acgan.html">mnist_acgan</a></td><td>Implementation of AC-GAN (Auxiliary Classifier GAN) on the MNIST dataset.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/mnist_antirectifier.html">mnist_antirectifier</a></td><td>Demonstrates how to write custom layers for Keras.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/mnist_cnn.html">mnist_cnn</a></td><td>Trains a simple convnet on the MNIST dataset.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/mnist_irnn.html">mnist_irnn</a></td><td>Reproduction of the IRNN experiment with pixel-by-pixel sequential MNIST in &ldquo;A Simple Way to Initialize Recurrent Networks of Rectified Linear Units&rdquo; by Le et al.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/mnist_mlp.html">mnist_mlp</a></td><td>Trains a simple deep multi-layer perceptron on the MNIST dataset.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/mnist_hierarchical_rnn.html">mnist_hierarchical_rnn</a></td><td>Trains a Hierarchical RNN (HRNN) to classify MNIST digits.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/mnist_transfer_cnn.html">mnist_transfer_cnn</a></td><td>Transfer learning toy example.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/neural_style_transfer.html">neural_style_transfer</a></td><td>Neural style transfer (generating 
an image with the same “content” as a base image, but with the “style” of a different picture).</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/reuters_mlp.html">reuters_mlp</a></td><td>Trains and evaluates a simple MLP on the Reuters newswire topic classification task.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/stateful_lstm.html">stateful_lstm</a></td><td>Demonstrates how to use stateful RNNs to model long sequences efficiently.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/variational_autoencoder.html">variational_autoencoder</a></td><td>Demonstrates how to build a variational autoencoder.</td></tr><tr><td><a href="https://keras.rstudio.com/articles/examples/variational_autoencoder_deconv.html">variational_autoencoder_deconv</a></td><td>Demonstrates how to build a variational autoencoder with Keras using deconvolution layers.</td></tr></tbody></table><h2 id="learning-more">Learning More</h2><p>After you&rsquo;ve become familiar with the basics, these articles are a good next step:</p><ul><li><p><a href="https://keras.rstudio.com/articles/sequential_model.html">Guide to the Sequential Model</a>. The sequential model is a linear stack of layers and is the API most users should start with.</p></li><li><p><a href="https://keras.rstudio.com/articles/functional_api.html">Guide to the Functional API</a>. The Keras functional API is the way to go for defining complex models, such as multi-output models, directed acyclic graphs, or models with shared layers.</p></li><li><p><a href="https://keras.rstudio.com/articles/training_visualization.html">Training Visualization</a>. There are a wide variety of tools available for visualizing training. 
These include plotting of training metrics, real time display of metrics within the RStudio IDE, and integration with the TensorBoard visualization tool included with TensorFlow.</p></li><li><p><a href="https://keras.rstudio.com/articles/applications.html">Using Pre-Trained Models</a>. Keras includes a number of deep learning models (Xception, VGG16, VGG19, ResNet50, InceptionV3, and MobileNet) that are made available alongside pre-trained weights. These models can be used for prediction, feature extraction, and fine-tuning.</p></li><li><p><a href="https://keras.rstudio.com/articles/faq.html">Frequently Asked Questions</a>. Covers many additional topics including streaming training data, saving models, training on GPUs, and more.</p></li></ul><p>Keras provides a productive, highly flexible framework for developing deep learning models. We can&rsquo;t wait to see what the R community will do with these tools!</p><style type="text/css">main {hyphens: inherit;}</style></description></item><item><title>RStudio 1.1 Preview - I Only Work in Black</title><link>https://www.rstudio.com/blog/rstudio-dark-theme/</link><pubDate>Wed, 30 Aug 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-dark-theme/</guid><description><p><em>Today, we&rsquo;re continuing our blog series on new features in RStudio 1.1. 
If you&rsquo;d like to try these features out for yourself, you can <a href="https://www.rstudio.com/products/rstudio/download/preview/">download a preview release of RStudio 1.1</a>.</em></p><h2 id="i-only-work-in-black">I Only Work in Black</h2><p>For those of us who like to work in black or very very dark grey, the dark theme can be enabled from the &lsquo;Global Options&rsquo; menu, selecting the &lsquo;Appearance&rsquo; tab and choosing an &lsquo;Editor theme&rsquo; that is dark.</p><p>Icons are now high-DPI, and &lsquo;Modern&rsquo; and &lsquo;Sky&rsquo; themes were also added; read more about them under <a href="https://support.rstudio.com/hc/en-us/articles/115011846747-Using-RStudio-Themes">Using RStudio Themes</a>.</p><img src="https://www.rstudio.com/blog-images/2017-08-30-rstudio-dark-theme.png" style="box-shadow: 0 15px 15px rgba(0, 0, 0, 0.16)" width = "80%"/><p>All panels support themes: Code editor, Console, <a href="https://blog.rstudio.com/2017/08/11/rstudio-v1-1-preview-terminal/">Terminal</a>, Environment, History, Files, <a href="https://blog.rstudio.com/2017/08/16/rstudio-preview-connections/">Connections</a>, Packages, Help, Build and VCS. Other features like <a href="https://blog.rstudio.com/2016/10/05/r-notebooks/">Notebooks</a>, Debugging, Profiling, Menus and the <a href="https://blog.rstudio.com/2017/08/22/rstudio-v1-1-preview-object-explorer/">Object Explorer</a> support this theme as well.</p><img src="https://www.rstudio.com/blog-images/2017-08-30-rstudio-dark-theme-panes.png" style="box-shadow: 0 15px 15px rgba(0, 0, 0, 0.16)" width = "80%"/><p>However, the Plots and Viewer panes render with the default colors of your content and therefore require additional packages to switch to dark themes. 
For instance, <a href="https://rstudio.github.io/shinythemes/">shinythemes</a> provides the <code>darkly</code> theme for Shiny, and <a href="https://cran.r-project.org/web/packages/ggthemes/vignettes/ggthemes.html">ggthemes</a> provides support for <code>light = FALSE</code> under <code>ggplot</code>. If you are a package author, consider using <code>rstudioapi::getThemeInfo()</code> when generating output to these panes.</p><img src="https://www.rstudio.com/blog-images/2017-08-30-rstudio-dark-theme-plots.png" style="box-shadow: 0 15px 15px rgba(0, 0, 0, 0.16)" width = "80%"/><p>Enjoy!</p></description></item><item><title>Shiny Dev Center gets a shiny new update</title><link>https://www.rstudio.com/blog/shiny-dev-center-gets-a-shiny-new-update/</link><pubDate>Tue, 29 Aug 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-dev-center-gets-a-shiny-new-update/</guid><description><p>I am excited to announce the redesign and reorganization of <a href="https://shiny.rstudio.com/">shiny.rstudio.com</a>, also known as the Shiny Dev Center. The Shiny Dev Center is the place to go to <a href="https://shiny.rstudio.com/tutorial/">learn</a> about <a href="https://shiny.rstudio.com/articles/">all things Shiny</a> and to <a href="https://shiny.rstudio.com/reference/shiny/">keep up to date</a> with it as it evolves.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-29-shiny-dev-center-home.png" alt="Shiny Dev Center"></p><p>The goal of this refresh is to provide a clear learning path for those who are just starting off with developing Shiny apps as well as to make advanced Shiny topics easily accessible to those building large and complex apps. 
The <a href="https://shiny.rstudio.com/articles/">articles overview</a> that we designed to help navigate the wealth of information on the Shiny Dev Center aims to achieve this goal.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-29-shiny-articles-overview.png" alt="Articles overview"></p><p>Other highlights of the refresh include:</p><ul><li>A brand new look!</li><li>New articles</li><li>Updated articles with modern Shiny code examples</li><li>Explicit linking, where relevant, to other RStudio resources like webinars, support docs, etc.</li><li>A prominent link to our ever growing <a href="https://www.rstudio.com/products/shiny/shiny-user-showcase/">Shiny User Showcase</a></li><li>A <a href="https://shiny.rstudio.com/contribute/">guide</a> for contributing to Shiny (inspired by the <a href="http://www.tidyverse.org/contribute/">Tidyverse contribute guide</a>)</li></ul><p>Stay tuned for more updates to the Shiny Dev Center in the near future!</p></description></item><item><title>Newer to R? rstudio::conf 2018 is for you! Early bird pricing ends August 31.</title><link>https://www.rstudio.com/blog/rstudio-conf-2018-early-bird-pricing/</link><pubDate>Fri, 25 Aug 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-conf-2018-early-bird-pricing/</guid><description><p>Immersion is among the most effective ways to learn any language. Immersing where new and advanced users come together to improve their use of the R language is a rare opportunity. <a href="https://www.rstudio.com/conference/">rstudio::conf 2018 </a> is that time and place!</p><p><strong><a href="https://www.rstudio.com/conference/rstudioconf-tickets/">REGISTER TODAY</a></strong></p><p><em>Be an Early Bird! <strong>Discounts</strong> for early conference registration <strong>expire August 31</strong>. Immerse as a team! Ask us about group discounts for 5 or more from the same organization.</em></p><p>rstudio::conf 2018 is a two-day conference with optional two-day workshops. 
One of the conference tracks will focus on topics for newer R users. Newer R users will learn about the best ways to use R, to avoid common pitfalls and accelerate proficiency. Several workshops are also designed specifically for those newer to R.</p><h2 id="intro-to-r--rstudio">Intro to R &amp; RStudio</h2><ul><li><p>Are you new to R &amp; RStudio and do you learn best in person? You will learn the basics of R and data science, and practice using the RStudio IDE (integrated development environment) and R Notebooks. We will have a team of TAs on hand to show you the ropes and help you out when you get stuck.</p><p>This course is taught by well-known R educator and friend of RStudio, Amelia McNamara, a Smith College Visiting Assistant Professor of Statistical and Data Sciences &amp; Mass Mutual Faculty Fellow.</p></li></ul><h2 id="data-science-in-the-tidyverse">Data Science in the Tidyverse</h2><ul><li><p>Are you ready to begin applying the book, R for Data Science? Learn how to achieve your data analysis goals the “tidy” way. You will visualize, transform, and model data in R and work with date-times, character strings, and untidy data formats. Along the way, you will learn and use many packages from the tidyverse including ggplot2, dplyr, tidyr, readr, purrr, tibble, stringr, lubridate, and forcats.</p><p>This course is taught by friend of RStudio, Charlotte Wickham, a professor and award winning teacher and data analyst at Oregon State University.</p></li></ul><h2 id="intro-to-shiny--r-markdown">Intro to Shiny &amp; R Markdown</h2><ul><li><p>Do you want to share your data analysis with others in effective ways? For people who know their way around the RStudio IDE and R at least a little, this workshop will help you become proficient in Shiny application development and using R Markdown to communicate insights from data analysis to others.</p><p>This course is taught by Mine Çetinkaya-Rundel, Duke professor and RStudio professional educator. 
Mine is well known for her open education efforts, and her popular data science MOOCs.</p></li></ul><p>Whether you are new to the R language or as advanced as many of our speakers and educators, <a href="https://www.rstudio.com/conference/">rstudio::conf 2018 </a> is the place and time to focus on all things R &amp; RStudio.</p><p>We hope to see you in San Diego!</p></description></item><item><title>RStudio v1.1 Preview - Object Explorer</title><link>https://www.rstudio.com/blog/rstudio-v1-1-preview-object-explorer/</link><pubDate>Tue, 22 Aug 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v1-1-preview-object-explorer/</guid><description><p><em>Today, we&rsquo;re continuing our blog series on new features in RStudio 1.1. If you&rsquo;d like to try these features out for yourself, you can <a href="https://www.rstudio.com/products/rstudio/download/preview/">download a preview release of RStudio 1.1</a>.</em></p><h2 id="object-explorer">Object Explorer</h2><p>You might already be familiar with the <strong>Data Viewer</strong> in RStudio, which allows for the inspection of data frames and other tabular R objects available in your R environment. With RStudio v1.1, it will be possible to inspect hierarchical (list-like) R objects as well, using the <strong>Object Explorer</strong>.</p><h3 id="exploring-an-object">Exploring an Object</h3><p>The same workflows you&rsquo;re familiar with for opening the data viewer apply when opening the object explorer. Let&rsquo;s start by inspecting some data returned by the GitHub API &ndash; we&rsquo;ll inspect the latest commit made on the <a href="https://github.com/tidyverse/dplyr">dplyr</a> repository. 
First, let&rsquo;s start by downloading and reading this data into R:</p><pre><code class="language-{r}" data-lang="{r}"># read from the commits API endpoint
conn &lt;- url(&quot;https://api.github.com/repos/hadley/dplyr/commits&quot;)
content &lt;- readLines(conn, warn = FALSE)
close(conn)

# convert from JSON to R list object
data &lt;- jsonlite::fromJSON(content, simplifyDataFrame = FALSE)

# extract the most recent commit
latest &lt;- data[[1]]</code></pre><p>Within the environment pane, explorable objects will be shown with a magnifying glass, and clicking on this icon will open the associated item in the object explorer. (Alternatively, such objects can also be opened by directly calling the <code>View()</code> function on the object of interest.)</p><p><img src="https://www.rstudio.com/blog-images/2017-08-22-explorer-environment-pane.png" alt="Object Explorer"></p><p>After clicking on this icon, the object explorer will open, and we can begin exploring the latest <code>dplyr</code> commit.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-22-explorer-open.png" alt="Object Explorer"></p><h3 id="interacting-with-the-object-explorer">Interacting with the Object Explorer</h3><p>The object explorer displays information within a tree with three (resizable!) columns:</p><ul><li><strong>Name</strong>: Either the name of the element (when present), or the index of the element in its parent container;</li><li><strong>Type</strong>: The underlying R type (or class) of a particular element, alongside its length;</li><li><strong>Value</strong>: A brief description of the value for a particular element.</li></ul><p>Expandable nodes (e.g. sub-lists) can be expanded by clicking the blue arrow to the left of the expandable field.
In the following image, the <code>commit</code> and <code>tree</code> sub-nodes are opened:</p><p><img src="https://www.rstudio.com/blog-images/2017-08-22-explorer-expanded.png" alt="Object Explorer with Expanded Nodes"></p><p>You might also notice the text at the bottom left of the explorer, indicating the R code that can be used to access this particular object. If you mouse over a particular row in the object explorer, you&rsquo;ll see an icon drawn on the right side of that row &ndash; this icon can be clicked to send that code to the R console.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-22-explorer-send-to-console.png" alt="Object Explorer with Send to Console Icon"></p><h2 id="filtering-with-the-explorer">Filtering with the Explorer</h2><p>All kinds of R objects can be inspected within the object explorer &ndash; environments, S4 objects, R6 objects, R functions, and other base R objects. For example, we can explore the <code>readr</code> namespace, and learn a bit about the functions contained within. We&rsquo;ll use the object explorer to explore the <code>read_csv()</code> function definition.</p><pre><code class="language-{r}" data-lang="{r}">readr &lt;- asNamespace(&quot;readr&quot;)
View(readr)</code></pre><p><img src="https://www.rstudio.com/blog-images/2017-08-22-explorer-readr.png" alt="Object Explorer with readr Namespace"></p><p>There are quite a few top-level objects in the <code>readr</code> namespace (189 in total). Rather than scrolling to find <code>read_csv()</code> in the explorer, we can use the search box at the top-right of the explorer to quickly filter down to entries containing <code>read_csv</code> in their name:</p><p><img src="https://www.rstudio.com/blog-images/2017-08-22-explorer-readr-read-csv.png" alt="Object Explorer readr Namespace Filtered"></p><p>Notice how the object explorer displays the <em>formals</em>, <em>body</em> and <em>environment</em> for an R function definition.
This allows you to explore the &lsquo;guts&rsquo; of an R function &ndash; for example, the expression tree associated with a function&rsquo;s body, and the default parameter values associated with the function arguments. We can expand the <em>formals</em> entry to view the function arguments accepted by <code>read_csv()</code>:</p><p><img src="https://www.rstudio.com/blog-images/2017-08-22-explorer-readr-read-csv-expanded.png" alt="Object Explorer readr Namespace Expanded"></p><h3 id="xml2-integration">xml2 Integration</h3><p>The object explorer also comes with special handling for XML and HTML documents produced by the <code>xml2</code> package:</p><pre><code class="language-{r}" data-lang="{r}">library(xml2)
text &lt;- &quot;&lt;parent&gt;&lt;child id='a'&gt;child 1&lt;/child&gt;&lt;child id='b'&gt;child 2&lt;/child&gt;&lt;/parent&gt;&quot;
xml &lt;- xml2::read_xml(text)
View(xml)</code></pre><p><img src="https://www.rstudio.com/blog-images/2017-08-22-explorer-xml.png" alt="Object Explorer with XML Document"></p><p>Similarly, the generated code uses the <code>xml2</code> package APIs to access nodes from within the XML document.</p><hr><p>We hope you find the <strong>Object Explorer</strong> to be a useful tool in your workflows.
If you&rsquo;re interested in giving it a test drive, please <a href="https://www.rstudio.com/products/rstudio/download/preview/">download the RStudio 1.1 preview</a>.</p><p>If you have any questions or feedback, please get in touch with us at the <a href="http://support.rstudio.com/hc/en-us">support forums</a>.</p></description></item><item><title>RStudio Server Pro is ready for BigQuery on the Google Cloud Platform</title><link>https://www.rstudio.com/blog/google-cloud-platform/</link><pubDate>Fri, 18 Aug 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/google-cloud-platform/</guid><description><p>RStudio is excited to announce the availability of RStudio Server Pro on the Google Cloud Platform.</p><p>RStudio Server Pro GCP is identical to <a href="https://www.rstudio.com/products/rstudio-server-pro/">RStudio Server Pro</a>, but with additional convenience for data scientists, including pre-installation of multiple versions of R, common systems libraries, and the <a href="https://www.rstudio.com/blog/bigrquery-0-4-0/">BigQuery package for R</a>.</p><p>RStudio Server Pro GCP adapts to your unique circumstances. It allows you to choose different GCP computing instances for RStudio Server Pro no matter how large, whenever a project requires it (hourly pricing).</p><p>If the enhanced security, support for multiple R versions and multiple sessions, and commercially licensed and supported features of RStudio Server Pro appeal to you, please give RStudio Server Pro for GCP a try. 
Below are some useful links to get you started:</p><ul><li><a href="https://support.rstudio.com/hc/en-us/articles/115010424448-FAQ-for-RStudio-Server-Pro-GCP">Read the FAQ</a></li><li><a href="https://support.rstudio.com/hc/en-us/articles/115010260627-RStudio-Server-Pro-for-Google-Cloud-Platform">Try RStudio Server Pro GCP</a></li><li><a href="https://console.cloud.google.com/launcher/details/rstudio-launcher-public/rstudio-server-pro-for-gcp?q=rstudio">Launch RStudio Server Pro GCP</a></li></ul></description></item><item><title>RStudio 1.1 Preview - Data Connections</title><link>https://www.rstudio.com/blog/rstudio-preview-connections/</link><pubDate>Wed, 16 Aug 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-preview-connections/</guid><description><p><em>Today, we&rsquo;re continuing our blog series on new features in RStudio 1.1. If you&rsquo;d like to try these features out for yourself, you can <a href="https://www.rstudio.com/products/rstudio/download/preview/">download a preview release of RStudio 1.1</a>.</em></p><h2 id="data-connections">Data Connections</h2><p>Connecting to data sources in R isn&rsquo;t always straightforward; even when you&rsquo;re able to establish a connection, navigating within it and understanding the shape of the data inside can be difficult. We built the Connections tab to make it easy to establish connections to data sources and access the data they contain.</p><h3 id="connecting-to-data">Connecting to Data</h3><h4 id="new-connection-wizard">New Connection Wizard</h4><p>RStudio 1.1 includes a new Connection wizard which makes it easy to connect to any data source on your system.</p><img src="https://www.rstudio.com/blog-images/2017-08-16-connection_create.png" alt="RStudio Connection wizard" style="width: 70%"/><p>Clicking on a connection type shows a form which you can fill to create the appropriate R code for creating a new connection. 
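The code the wizard produces follows the standard DBI connection pattern. As a hedged sketch of what a completed ODBC form might generate (the DSN name and table are illustrative, not part of the wizard's defaults):

```r
# Sketch of wizard-generated connection code; "MyDataSource" is a
# placeholder DSN configured in your system's ODBC driver manager
library(DBI)
con <- dbConnect(odbc::odbc(), dsn = "MyDataSource", timeout = 10)

dbListTables(con)                                   # browse available tables
df <- dbGetQuery(con, "SELECT * FROM flights LIMIT 10")
dbDisconnect(con)
```

Once the connection succeeds, the Connections pane records it so the same code can be re-run later.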
It&rsquo;s now just as easy to connect to a data source as it is to import a CSV file.</p><img src="https://www.rstudio.com/blog-images/2017-08-16-connection_dialog.png" alt="RStudio Connection wizard" style="width: 70%"/><p>RStudio can show you data sources from a variety of places, including:</p><ul><li>ODBC drivers installed on your machine, using the <a href="https://github.com/rstats-db/odbc/">odbc</a> package</li><li>Spark connections, using the <a href="https://spark.rstudio.com/">sparklyr</a> package</li><li>Connections supplied by an administrator, on RStudio Server</li><li>Data sources defined by any installed R packages; see <a href="https://rstudio.github.io/rstudio-extensions/rstudio-connections.html">RStudio Connections</a> for more</li></ul><h4 id="saved-connections">Saved Connections</h4><p>Once you have established a connection, RStudio saves it for future reference (even if you didn&rsquo;t create it using the wizard!). This makes it easy to connect to the same data again later, without having to find the connection command in your R console history or dig up the R script you used to establish the connection.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-16-connection_list.png" alt="Connection List"></p><p>Your connection history is available in all your projects. Never look up a server name again!</p><h2 id="exploring-connections">Exploring Connections</h2><p>When you&rsquo;re connected to a data source, RStudio will show you all of the objects available. You can explore this list, and preview the data, using the tools in the Connections pane.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-16-connection_explorer.png" alt="Connection Explorer"></p><h2 id="professional-odbc-drivers">Professional ODBC Drivers</h2><p>Finally, RStudio will be providing professional ODBC drivers for the most important enterprise databases to our Pro customers. 
We developed the Connections tab to work seamlessly with these drivers; they will give your users easy, consistent access to the data in your company&rsquo;s systems. More news about these ODBC drivers is coming soon!</p><h2 id="further-reading">Further Reading</h2><p>The Connections tab is just one part of a larger effort at RStudio to advance the state of the art in data connectivity with R. Here&rsquo;s some more reading on both the big picture and the technical details:</p><ul><li><a href="http://db.rstudio.com/">Using Databases with R</a> describes best practices for working with data from databases using R and RStudio.</li><li><a href="https://support.rstudio.com/hc/en-us/articles/115010915687-Using-RStudio-Connections">Using RStudio Connections</a> contains more detailed information on the Connections tab.</li><li><a href="https://rstudio.github.io/rstudio-extensions/rstudio-connections.html">Extending RStudio Connections</a> describes how to add new connection types to RStudio and make it possible to browse your connection&rsquo;s data in the pane.</li></ul><p>We hope you <a href="https://www.rstudio.com/products/rstudio/download/preview/">download the RStudio 1.1 preview</a> and <a href="http://support.rstudio.com/hc/en-us">let us know what you think</a>!</p></description></item><item><title>rstudio::conf(2018): Contributed talks, e-posters, and diversity scholarships</title><link>https://www.rstudio.com/blog/contributed-talks-diversity-scholarships/</link><pubDate>Tue, 15 Aug 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/contributed-talks-diversity-scholarships/</guid><description><p><a href="https://www.rstudio.com/conference/">rstudio::conf</a>, the conference on all things R and RStudio, will take place February 2 and 3, 2018 in San Diego, California, preceded by Training Days on January 31 and February 1. We are pleased to announce that this year&rsquo;s conference includes contributed talks and e-posters, and diversity scholarships. 
<!-- more --> More information below!</p><h2 id="contributed-talks-and-e-posters">Contributed talks and e-posters</h2><p>rstudio::conf() is accepting proposals for contributed <strong>talks</strong> and <strong>e-posters</strong> for the first time! Contributed talks are 20 minutes long, and will be scheduled alongside talks by RStudio employees and invited speakers. E-posters will be shown during the opening reception on Thursday evening: we’ll provide a big screen, power, and internet; you’ll provide a laptop with an innovative display or demo.</p><p>We are particularly interested in talks and e-posters that:</p><ul><li><p>Showcase the use of R and RStudio’s tools to solve real problems.</p></li><li><p>Expand the tidyverse to reach new domains and audiences.</p></li><li><p>Communicate using R, whether it’s building on top of RMarkdown, Shiny, ggplot2, or something else altogether.</p></li><li><p>Discuss how to teach R effectively.</p></li></ul><p>To present you’ll also need to register for the conference.</p><p><strong><a href="https://rstudio.typeform.com/to/SUl5Qe">Apply now!</a></strong></p><p>Applications close Sept 15, and you’ll be notified of our decision by Oct 1.</p><h2 id="diversity-scholarships">Diversity scholarships</h2><p>We’re also continuing our tradition of <strong>diversity scholarships</strong>, and this year we’re doubling the program to twenty recipients. 
We will support underrepresented minorities in the R community by covering their registration (including workshops), travel, and accommodation.</p><p><strong><a href="https://rstudio.typeform.com/to/ZavzRM">Apply now!</a></strong></p><p>Applications close Sept 15, and you’ll be notified of our decision by Oct 1.</p></description></item><item><title>Shiny 1.0.4</title><link>https://www.rstudio.com/blog/shiny-1-0-4/</link><pubDate>Tue, 15 Aug 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-1-0-4/</guid><description><script src="https://www.rstudio.com/rmarkdown-libs/header-attrs/header-attrs.js"></script><p>Shiny 1.0.4 is now available on CRAN. To install it, run:</p><pre class="r"><code>install.packages(&quot;shiny&quot;)</code></pre><p>For most Shiny users, the most exciting news is that file inputs now support dragging and dropping:</p><p><img src="https://www.rstudio.com/blog-images/2017-08-15-shiny-1-0-4-drag-drop.gif" style="box-shadow: 0 0 8px #666;" /></p><p>It is now possible to add and remove tabs from a <code>tabPanel</code>, with the new functions <code>insertTab()</code>, <code>appendTab()</code>, <code>prependTab()</code>, and <code>removeTab()</code>. It is also possible to hide and show tabs with <code>hideTab()</code> and <code>showTab()</code>.</p><p>Shiny also has a new function, <code>onStop()</code>, which registers a callback function that will execute when the application exits. (Note that this is different from the existing <code>onSessionEnded()</code>, which registers a callback that executes when a user’s session ends. An application can serve multiple sessions.) This can be useful for cleaning up resources when an application exits, such as database connections.</p><p>This release of Shiny also has many minor new features and bug fixes.
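A minimal app exercising the new tab functions and <code>onStop()</code> might look like this (a hedged sketch; the tab IDs and the cleanup message are illustrative):

```r
library(shiny)

# Runs once when the application (not each session) shuts down
onStop(function() message("Application exiting; close connections here"))

ui <- fluidPage(
  tabsetPanel(id = "tabs",
    tabPanel("Home", "Always present"),
    tabPanel("Data", "A removable tab")
  ),
  actionButton("add", "Add tab"),
  actionButton("drop", "Remove 'Data' tab")
)

server <- function(input, output, session) {
  observeEvent(input$add, {
    # Insert a freshly built tab after the "Home" tab
    insertTab("tabs",
              tabPanel(paste("Extra", input$add), "Inserted at runtime"),
              target = "Home", position = "after")
  })
  observeEvent(input$drop, removeTab("tabs", target = "Data"))
}

shinyApp(ui, server)
```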
For the full set of changes, see the <a href="https://shiny.rstudio.com/reference/shiny/1.0.4/upgrade.html">changelog</a>.</p></description></item><item><title>RStudio v1.1 Preview: Terminal</title><link>https://www.rstudio.com/blog/rstudio-v1-1-preview-terminal/</link><pubDate>Fri, 11 Aug 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v1-1-preview-terminal/</guid><description><p>Today we&rsquo;re excited to announce availability of our first Preview Release for RStudio 1.1, a major new release which includes the following new features:</p><ul><li>A <strong>Connections</strong> tab which makes it easy to connect to, explore, and view data in a variety of databases.</li><li>A <a href="https://support.rstudio.com/hc/en-us/articles/115010737148-Using-the-RStudio-Terminal">Terminal tab</a> which provides fluid shell integration with the IDE, xterm emulation, and even support for full-screen terminal applications.</li><li>An <strong>Object Explorer</strong> which can navigate deeply nested R data structures and objects.</li><li>A new, modern <strong>dark theme</strong> and Retina-quality icons throughout.</li><li>Improvements to the <a href="https://github.com/rstudio/rstudioapi">RStudio API</a> which add power and flexibility to RStudio add-ins and packages.</li><li>RStudio Server Pro support for floating licensing, notifications, self-service session management, a library of professional ODBC drivers, and more.</li><li>Dozens of other small improvements and bugfixes.</li></ul><p>You can try out these new features in the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>.</p><h2 id="terminal">Terminal</h2><p>Over the next few weeks we&rsquo;ll be blogging about each of these new features.
We start today with an overview of the integrated support for full-featured system terminals via the <strong>Terminal tab</strong>.</p><p>The Terminal tab provides access to the system shell within the RStudio IDE. Potential uses include advanced source-control operations, execution of long-running jobs, remote logins, and interactive full-screen terminal applications (e.g. text editors, terminal multiplexers).</p><h3 id="opening-terminals">Opening Terminals</h3><p>The Terminal tab is next to the Console tab. Switch to the Terminal tab to automatically create a new terminal, ready to accept commands. If the tab isn&rsquo;t visible, show it via <strong>Shift+Alt+T</strong> or the <strong>Tools -&gt; Terminal -&gt; Move Focus to Terminal</strong> menu. Here&rsquo;s a terminal with the output of some simple commands:</p><p><img src="https://www.rstudio.com/blog-images/2017-08-07-1_1_term-simple.png" alt="Basic terminal example"></p><p>Support for xterm enables use of full-screen programs:</p><p><img src="https://www.rstudio.com/blog-images/2017-08-07-1_1_term_vim.png" alt="Terminal running vim"></p><p>Additional terminal sessions can be started using the <strong>New Terminal</strong> command on the terminal drop-down menu, or via <strong>Shift+Alt+R</strong>.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-07-1_1_term_dropdown.png" alt="Terminal dropdown menu"></p><p>Each terminal session is independent, with its own system shell and buffer. Switch between them using the arrows next to the drop-down menu or by clicking on the terminal&rsquo;s name in that menu.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-07-1_1_term_multiple.png" alt="Multiple terminal example"></p><p>Programs running in a terminal do not block the rest of the RStudio user-interface, so you can continue working in RStudio even when the terminal is busy. 
On Mac, Linux, or Server, a busy terminal will have <strong>(busy)</strong> next to its name, and the close [x] changes to a <strong>stop</strong> button:</p><p><img src="https://www.rstudio.com/blog-images/2017-08-07-1_1_term_busy.png" alt="Busy terminal example"></p><p>If there is a busy terminal (Mac, Linux, or Server), exiting RStudio (or performing any other operation that will stop the current R session) will give a warning. Proceeding will kill the running programs.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-07-1_1_term_warn.png" alt="Busy terminal example"></p><h3 id="run-in-terminal">Run in Terminal</h3><p>When editing a shell script in RStudio, the <strong>Run Selected Line(s)</strong> command (<strong>Cmd+Enter</strong> on Mac / <strong>Ctrl+Enter</strong> on others) executes the current line, or selection, in the current terminal. This can be used to step through a shell script line-by-line and observe the results in the terminal.</p><p>Here&rsquo;s an example where Cmd+Enter was hit three times, with focus on the editor and the cursor starting on the first line.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-07-1_1_term_run.png" alt="Run a shell script line-by-line"></p><p>In other text file types, including R source files, the new <strong>Send to Terminal</strong> command (<strong>Cmd+Alt+Enter</strong> on Mac, <strong>Ctrl+Alt+Enter</strong> on others) may be invoked to send the current selection to the current terminal. This can be handy for other languages with a command-line interpreter.
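The line-by-line stepping described above can be tried with a few harmless commands; the script contents here are purely illustrative:

```shell
# Step through with Cmd+Enter (Mac) / Ctrl+Enter (others), one line at a time
msg="hello from the Terminal tab"
echo "$msg" | tr '[:lower:]' '[:upper:]'
date +%Y-%m-%d
```

Each line's output appears in the current terminal as it runs, exactly as if you had typed it there.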
Below, Python was started in the current terminal, then <strong>Cmd+Alt+Enter</strong> was used to step through each line of the Python source file.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-07-1_1_term_python.png" alt="Python example"></p><h3 id="closing-terminals">Closing Terminals</h3><p>To close a terminal session, use the Close Terminal command on the Terminal dropdown menu, click the [x] on the far-right of the Terminal pane toolbar, or exit from within the shell itself.</p><p>If the Terminal tab is not useful to your workflows, simply click the [x] on the tab itself to close it, and it will not reappear in future RStudio sessions. To restore the tab, start a terminal via the <strong>Tools/Terminal/New Terminal</strong> menu command.</p><h3 id="terminal-options">Terminal Options</h3><p>Various aspects of the terminal can be configured with the new Terminal Options pane. Invoke with Tools/Global Options&hellip; and click on the Terminal icon.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-07-1_1_term_options.png" alt="Terminal Options"></p><h3 id="windows-specific-shell-options">Windows-Specific Shell Options</h3><p>On the RStudio IDE for Microsoft Windows, you can select between Git-Bash, Command Prompt, Windows PowerShell, or the Windows Subsystem for Linux. The choices available depend on what is installed on the system.</p><p><img src="https://www.rstudio.com/blog-images/2017-08-07-1_1_term_winshell.png" alt="Windows shell options"></p><p>We look forward to seeing how people use the <a href="https://support.rstudio.com/hc/en-us/articles/115010737148-Using-the-RStudio-Terminal">Terminal tab</a> in RStudio 1.1. 
If you want to give it a test drive, please download the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>.</p><p>We hope you try out the preview and <a href="https://support.rstudio.com/">let us know</a> how we can make it better.</p></description></item><item><title>Building tidy tools workshop</title><link>https://www.rstudio.com/blog/upcoming-workshops/</link><pubDate>Thu, 10 Aug 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/upcoming-workshops/</guid><description><p>Have you embraced the tidyverse? Do you now want to expand it to meet your needs? Then this is a NEW two-day hands-on workshop designed for you! The goal of this workshop is to take you from someone who uses tidyverse functions to someone who can extend the tidyverse by:</p><ul><li>Writing expressive code using advanced functional programming techniques</li><li>Designing consistent APIs using analogies to existing tools</li><li>Using the S3 object system to make user-friendly values</li><li>Bundling functions with documentation and tests into a package to share with others.</li></ul><p>The class is taught by Hadley Wickham, Chief Scientist at RStudio, a member of the R Foundation, and Adjunct Professor at Stanford University and the University of Auckland. He builds tools (both computational and cognitive) to make data science easier, faster, and more fun. Much of the material for the course is drawn from two of his existing books, <a href="http://adv-r.hadley.nz/">Advanced R</a> and <a href="http://r-pkgs.had.co.nz">R Packages</a>, but the course also includes a lot of new material that will eventually become a book called &ldquo;Tidy tools&rdquo;.</p><p>Register here: <a href="https://www.rstudio.com/workshops/extending-the-tidyverse/">https://www.rstudio.com/workshops/extending-the-tidyverse/</a></p><p>As of today, there are just 30+ seats left.
Discounts are still available for academics (students or faculty) and for 5 or more attendees from any organization. Email <a href="mailto:training@rstudio.com">training@rstudio.com</a> if you have any questions about the workshop that you don&rsquo;t find answered on the registration page.</p></description></item><item><title>RStudio Connect v1.5.4 - Now Supporting Plumber!</title><link>https://www.rstudio.com/blog/rstudio-connect-v1-5-4-plumber/</link><pubDate>Thu, 03 Aug 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-v1-5-4-plumber/</guid><description><p>We&rsquo;re thrilled to announce support for hosting <a href="https://www.rplumber.io/">Plumber APIs</a> in <a href="https://www.rstudio.com/products/connect/">RStudio Connect: version 1.5.4</a>. Plumber is an R package that allows you to define web APIs by adding special annotations to your existing R code &ndash; allowing you to make your R functions accessible to other systems.</p><p>Below you can see the auto-generated &ldquo;swagger&rdquo; interface for a web API written using Plumber.</p><p><img src="https://www.rstudio.com/blog-images/rsc-154-plumber.png" alt="The auto-generated &ldquo;swagger&rdquo; interface for a web API written using Plumber."></p><h3 id="develop-web-apis-using-plumber">Develop Web APIs using Plumber</h3><p>The open-source <a href="https://rplumber.io/">Plumber R package</a> enables you to create web APIs by merely adding special comments to your existing functions. These APIs can then be leveraged from other systems in your organization. For instance, you could query some functions written in R from a Java or Python application. Or you could develop a client for your API in JavaScript and allow users to interact with your R functions from a web browser.</p><p>Like Shiny applications, RStudio Connect supports one-step publishing, access controls, logging, and scaling for Plumber APIs. 
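The annotation style works by pairing <code>#*</code> comment blocks with ordinary R functions; this is a minimal sketch in the spirit of the plumber documentation (routes and parameters are illustrative):

```r
# plumber.R -- each #* block annotates the function below it as an endpoint

#* Echo a message back to the caller
#* @param msg The message to echo
#* @get /echo
function(msg = "") {
  list(msg = paste0("The message is: '", msg, "'"))
}

#* Sum two numbers
#* @param a First number
#* @param b Second number
#* @post /sum
function(a, b) {
  as.numeric(a) + as.numeric(b)
}
```

Locally, `plumber::plumb("plumber.R")$run(port = 8000)` serves these endpoints; on RStudio Connect the same file is deployed with one-step publishing.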
Visit <a href="http://docs.rstudio.com/connect/1.5.4/user/publishing.html#publishing-plumber-apis">the documentation</a> for guidance on publishing APIs to RStudio Connect.</p><p>Users may now create and manage personal API keys that will allow them to programmatically access APIs that require authentication; see <a href="http://docs.rstudio.com/connect/1.5.4/user/api-keys.html">the user guide</a> for more details.</p><p>Other notable changes this release:</p><ul><li><strong>Content search</strong> - On the content listing page, you can now search across all deployed content by title.</li><li>Official support for using <strong>PostgreSQL</strong> databases instead of SQLite. When configured appropriately, PostgreSQL can offer better performance. You can find documentation on configuration and the built-in database migration tool <a href="http://docs.rstudio.com/connect/1.5.4/admin/database-provider.html#changing-database-provider">here</a>.</li><li><strong>No customization of external usernames</strong> - When a username is obtained from an external authentication provider like LDAP, RStudio Connect will no longer offer the user an opportunity to customize the associated internal username. Previously this situation could occur if the username obtained from the external provider included a special character that RStudio Connect didn’t allow in usernames. Now, whatever username is provided from the external provider will be used without complaint. See <a href="http://docs.rstudio.com/connect/1.5.4/admin/authentication.html#auth-username-requirements">the admin guide</a> for more details.</li><li><strong>Upgraded our licensing software</strong> - 1.5.4 includes new licensing software that will minimize user issues and report errors more clearly when they do occur. This release also includes experimental support for floating licenses which can be used to support transient servers that might be running in Docker or another virtualized environment. 
Please contact <a href="mailto:support@rstudio.com">support@rstudio.com</a> if you&rsquo;re interested in helping test this feature.</li><li>Added a <strong>health check endpoint</strong> to make monitoring easier. See <a href="http://docs.rstudio.com/connect/1.5.4/admin/server-management.html#health-check">the admin guide</a> for more details.</li><li>Added support for <strong>Shiny reconnects</strong>. This enables users to <a href="https://shiny.rstudio.com/articles/reconnecting.html">reconnect to existing Shiny sessions</a> after brief network interruptions. This feature is not yet enabled by default but you can turn it on by setting <code>[Client].ReconnectTimeout</code> to something like <code>15s</code>.</li><li>The <code>[Authentication].Inactivity</code> setting can now be used to <strong>log users out after a period of inactivity</strong>. By default this feature is disabled, meaning users will remain logged in until their session expires, as controlled by the <code>[Authentication].Lifetime</code> setting. Additionally, we now do a better job of detecting when the user is logged out and immediately send them to the login page.</li><li>Support <strong>external R packages</strong>. This allows you to install an R package in the global system library and have deployed content use that package rather than trying to rebuild the package itself. This can be used as a workaround for packages that can&rsquo;t be installed correctly using Packrat, but should be viewed as a last resort, since this practice decreases the reproducibility and isolation of your content. See <a href="http://docs.rstudio.com/connect/1.5.4/admin/package-management.html#external-package-installation">the admin guide</a> for more details.</li><li>If they exist, inject <code>http_proxy</code> and <code>https_proxy</code> environment variables into all child R processes. 
More documentation available <a href="http://docs.rstudio.com/connect/1.5.4/admin/package-management.html#proxy-configuration">here</a>.</li><li>RStudio Connect now <strong>presents a warning when it is not using HTTPS</strong>. This is to remind users and administrators that it is insecure to send sensitive information like usernames and passwords over a non-secured connection. See <a href="http://docs.rstudio.com/connect/1.5.4/admin/appendix-configuration.html#appendix-configuration-https">the admin guide</a> for more information on how to configure your server to use HTTPS. Alternatively, if you’re handling SSL termination outside of Connect and want to disable this warning, you can set <code>[Http].NoWarning = true</code>.</li><li>RStudio Connect no longer leaves any R processes running when you stop the service. When the <code>rstudio-connect</code> service is restarted or stopped, all running R jobs are immediately interrupted.</li><li><strong>LDAP group queries are now cached</strong> for approximately ten seconds. This can significantly improve the load time of Shiny applications and other resources when using an LDAP server that contains many users or groups. Additionally, LDAP user searching has been improved to better handle certain configurations.</li></ul><p>You can see the full release notes for RStudio Connect 1.5.4 <a href="http://docs.rstudio.com/connect/1.5.4/news/">here</a>.</p><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>You can expect the installation and startup of v1.5.4 to be completed in under a minute. Previously authenticated users will need to log in again when they next visit the server.</p><p>If your server is not using Connect’s HTTPS capabilities, your users will see a warning about using an insecure configuration.
If you’re doing SSL termination outside of Connect, you should configure <code>[Http].NoWarning=true</code> to remove this warning.</p><p>If you’re upgrading from a release older than 1.5.0, be sure to consider the “Upgrade Planning” notes from those other releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect">RStudio Connect</a> we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, Plumber APIs, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45 day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></li><li><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></li><li><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></li><li><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></li><li><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></li><li><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></li></ul></description></item><item><title>sparklyr 0.6: Distributed R and external sources</title><link>https://www.rstudio.com/blog/sparklyr-0-6/</link><pubDate>Mon, 31 Jul 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/sparklyr-0-6/</guid><description><p>We&rsquo;re excited to announce a new release of the <a href="http://github.com/rstudio/sparklyr/">sparklyr</a> package, available in <a href="https://cran.r-project.org/package=sparklyr">CRAN</a> today! 
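</p><p>Since the release is on CRAN, you can install or upgrade in the usual way:</p><pre><code class="language-r" data-lang="r">install.packages(&quot;sparklyr&quot;)</code></pre><p>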
<code>sparklyr 0.6</code> introduces new features to:</p><ul><li><strong>Distribute R</strong> computations using <code>spark_apply()</code> to execute arbitrary R code across your Spark cluster. You can now use all of your favorite R packages and functions in a distributed context.</li><li>Connect to <strong>External Data Sources</strong> using <code>spark_read_source()</code>, <code>spark_write_source()</code>, <code>spark_read_jdbc()</code> and <code>spark_write_jdbc()</code>.</li><li><strong>Use the Latest Frameworks</strong> including <a href="https://blog.rstudio.com/2017/06/13/dplyr-0-7-0/">dplyr 0.7</a>, <a href="https://cran.r-project.org/package=DBI">DBI 0.7</a>, <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio 1.1</a> and <a href="https://databricks.com/blog/2017/07/11/introducing-apache-spark-2-2.html">Spark 2.2</a>.</li></ul><p>and several improvements across:</p><ul><li><strong>Spark Connections</strong> add a new <a href="https://databricks.com/product/databricks">Databricks</a> connection that enables <a href="https://databricks.com/blog/2017/05/25/using-sparklyr-databricks.html">using sparklyr in Databricks</a> through <code>mode=&quot;databricks&quot;</code>, add support for <a href="https://spark.apache.org/docs/latest/running-on-yarn.html">Yarn Cluster</a> through <code>master=&quot;yarn-cluster&quot;</code> and connection speed was also improved.</li><li><strong>Dataframes</strong> add support for <code>sdf_pivot()</code>, <code>sdf_broadcast()</code>, <code>cbind()</code>, <code>rbind()</code>, <code>sdf_separate_column()</code>, <code>sdf_bind_cols()</code>, <code>sdf_bind_rows()</code>, <code>sdf_repartition()</code> and <code>sdf_num_partitions()</code>.</li><li><strong>Machine Learning</strong> adds support for multinomial regression in <code>ml_logistic_regression()</code>, <code>weights.column</code> for GLM, <code>ml_model_data()</code> and a new <code>ft_count_vectorizer()</code> function for 
<code>ml_lda()</code>.</li><li>Many other improvements: initial support for <strong>broom</strong> over <code>ml_linear_regression()</code> and <code>ml_generalized_linear_regression()</code>; <strong>dplyr</strong> support for <code>%like%</code>, <code>%rlike%</code> and <code>%regexp%</code>; sparklyr <strong>extensions</strong> now support <code>download_scalac()</code> to help you install the required Scala compilers while developing extensions; and Hive <strong>database</strong> management was simplified with <code>tbl_change_db()</code> and <code>src_databases()</code> to query and switch between Hive databases. RStudio started a joint effort with <a href="https://www.microsoft.com">Microsoft</a> to support a <strong>cross-platform Spark installer</strong> under <a href="https://github.com/rstudio/spark-install">github.com/rstudio/spark-install</a>.</li></ul><p>Additional changes and improvements can be found in the <a href="https://github.com/rstudio/sparklyr/blob/master/NEWS.md">sparklyr NEWS</a> file.</p><p>Updated documentation and examples are available at <a href="http://spark.rstudio.com">spark.rstudio.com</a>. For questions or feedback, please feel free to open a <a href="https://github.com/rstudio/sparklyr/issues">sparklyr GitHub issue</a> or a <a href="http://stackoverflow.com/questions/tagged/sparklyr">sparklyr Stack Overflow question</a>.</p><h2 id="distributed-r">Distributed R</h2><p><code>sparklyr 0.6</code> provides support for executing distributed R code through <code>spark_apply()</code>. 
For instance, after connecting and copying some data:</p><pre><code class="language-r" data-lang="r">library(sparklyr)
sc &lt;- spark_connect(master = &quot;local&quot;)
iris_tbl &lt;- sdf_copy_to(sc, iris)</code></pre><p>We can apply an arbitrary R function, say <code>jitter()</code>, to each column over each row as follows:</p><pre><code class="language-r" data-lang="r">iris_tbl %&gt;% spark_apply(function(e) sapply(e[,1:4], jitter))</code></pre><pre><code># Source: spark&lt;?&gt; [?? x 4]
   Sepal_Length Sepal_Width Petal_Length Petal_Width
          &lt;dbl&gt;       &lt;dbl&gt;        &lt;dbl&gt;       &lt;dbl&gt;
 1         5.10        3.49         1.39       0.208
 2         4.89        2.99         1.40       0.206
 3         4.69        3.21         1.31       0.211
 4         4.61        3.10         1.48       0.181
 5         5.01        3.62         1.39       0.190
 6         5.39        3.88         1.71       0.398
 7         4.60        3.41         1.39       0.318
 8         4.99        3.41         1.48       0.194
 9         4.38        2.89         1.42       0.186
10         4.88        3.10         1.51       0.106
# … with more rows</code></pre><p>One can also group by columns to apply an operation over each group of rows, say, to perform linear regression over each group as follows:</p><pre><code class="language-r" data-lang="r">spark_apply(
  iris_tbl,
  function(e) broom::tidy(lm(Petal_Width ~ Petal_Length, e)),
  names = c(&quot;term&quot;, &quot;estimate&quot;, &quot;std.error&quot;, &quot;statistic&quot;, &quot;p.value&quot;),
  group_by = &quot;Species&quot;)</code></pre><pre><code># Source: spark&lt;?&gt; [?? x 6]
  Species    term         estimate std.error statistic  p.value
  &lt;chr&gt;      &lt;chr&gt;           &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;
1 versicolor (Intercept)   -0.0843    0.161     -0.525 6.02e- 1
2 versicolor Petal_Length   0.331     0.0375     8.83  1.27e-11
3 virginica  (Intercept)    1.14      0.379      2.99  4.34e- 3
4 virginica  Petal_Length   0.160     0.0680     2.36  2.25e- 2
5 setosa     (Intercept)   -0.0482    0.122     -0.396 6.94e- 1
6 setosa     Petal_Length   0.201     0.0826     2.44  1.86e- 2</code></pre><p>Packages can be used since they are automatically distributed to the worker nodes; however, using <code>spark_apply()</code> requires R to be installed on each worker node. 
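</p><p>To make the package point concrete, here is a small illustrative sketch (not from the original post; it assumes the connection above and that dplyr is installed on every worker node):</p><pre><code class="language-r" data-lang="r"># The function below executes on the workers, so dplyr must be
# installed there as well as on the driver.
iris_tbl %&gt;%
  spark_apply(function(e) {
    dplyr::mutate(e, Petal_Ratio = Petal_Length / Petal_Width)
  })</code></pre><p>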
Please refer to <a href="https://spark.rstudio.com/articles/guides-distributed-r.html">Distributing R Computations</a> for additional information and examples.</p><h2 id="external-data-sources-">External Data Sources</h2><p><code>sparklyr 0.6</code> adds support for connecting Spark to databases. This feature is useful if your Spark environment is separate from your data environment, or if you use Spark to access multiple data sources. You can use <code>spark_read_source()</code> and <code>spark_write_source()</code> with any data connector available in <a href="https://spark-packages.org">Spark Packages</a>. Alternatively, you can use <code>spark_read_jdbc()</code> and <code>spark_write_jdbc()</code> and a JDBC driver with almost any data source.</p><p>For example, you can connect to Cassandra using <code>spark_read_source()</code>. Notice that the Cassandra connector version needs to match the Spark version as defined in their <a href="https://github.com/datastax/spark-cassandra-connector#version-compatibility">version compatibility</a> section.</p><pre><code class="language-r" data-lang="r">config &lt;- spark_config()
config[[&quot;sparklyr.defaultPackages&quot;]] &lt;- c(
  &quot;datastax:spark-cassandra-connector:2.0.0-RC1-s_2.11&quot;)

sc &lt;- spark_connect(master = &quot;local&quot;, config = config)
spark_read_source(sc, &quot;emp&quot;,
                  &quot;org.apache.spark.sql.cassandra&quot;,
                  list(keyspace = &quot;dev&quot;, table = &quot;emp&quot;))</code></pre><p>To connect to MySQL, one can <a href="http://dev.mysql.com/downloads/connector/j/">download the MySQL connector</a> and use <code>spark_read_jdbc()</code> as follows:</p><pre><code class="language-r" data-lang="r">config &lt;- spark_config()
config$`sparklyr.shell.driver-class-path` &lt;-
  &quot;~/Downloads/mysql-connector-java-5.1.41/mysql-connector-java-5.1.41-bin.jar&quot;

sc &lt;- spark_connect(master = &quot;local&quot;, config = config)
spark_read_jdbc(sc, &quot;person_jdbc&quot;, options = list(url =
  &quot;jdbc:mysql://localhost:3306/sparklyr&quot;,
  user = &quot;root&quot;, password = &quot;&lt;password&gt;&quot;,
  dbtable = &quot;person&quot;))</code></pre><p>See also <a href="https://github.com/AkhilNairAmey/crassy">crassy</a>, a <code>sparklyr</code> extension being developed to read data from Cassandra with ease.</p><h2 id="dataframe-functions-">Dataframe Functions</h2><p><code>sparklyr 0.6</code> includes many improvements for working with DataFrames. Here are some additional highlights.</p><pre><code class="language-r" data-lang="r">x_tbl &lt;- sdf_copy_to(sc, data.frame(a = c(1,2,3), b = c(2,3,4)))
y_tbl &lt;- sdf_copy_to(sc, data.frame(b = c(3,4,5), c = c(&quot;A&quot;,&quot;B&quot;,&quot;C&quot;)))</code></pre><h3 id="pivoting-dataframes">Pivoting DataFrames</h3><p>It is now possible to pivot (i.e. cross tabulate) one or more columns using <code>sdf_pivot()</code>.</p><pre><code class="language-r" data-lang="r">sdf_pivot(y_tbl, b ~ c, list(b = &quot;count&quot;))</code></pre><pre><code># Source: spark&lt;?&gt; [?? x 4]
      b     A     B     C
  &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
1     4   NaN     1   NaN
2     3     1   NaN   NaN
3     5   NaN   NaN     1</code></pre><h3 id="binding-rows-and-columns">Binding Rows and Columns</h3><p>Binding DataFrames by rows and columns is supported through <code>sdf_bind_rows()</code> and <code>sdf_bind_cols()</code>:</p><pre><code class="language-r" data-lang="r">sdf_bind_rows(x_tbl, y_tbl)</code></pre><pre><code># Source: spark&lt;?&gt; [?? x 3]
      a     b c
  &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt;
1     1     2 NA
2     2     3 NA
3     3     4 NA
4   NaN     3 A
5   NaN     4 B
6   NaN     5 C</code></pre><pre><code class="language-r" data-lang="r">sdf_bind_cols(x_tbl, y_tbl)</code></pre><pre><code># Source: spark&lt;?&gt; [?? x 4]
      a   b.x   b.y c
  &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt;
1     1     2     3 A
2     3     4     5 C
3     2     3     4 B</code></pre><h3 id="separating-columns">Separating Columns</h3><p>Separate lists into columns with ease. This is especially useful when working with model predictions that are returned as lists instead of scalars. In this example, each element in the probability column contains two items. We can use <code>sdf_separate_column()</code> to isolate the item that corresponds to the probability that <code>vs</code> equals one.</p><pre><code class="language-r" data-lang="r">cars &lt;- copy_to(sc, mtcars)

ml_logistic_regression(cars, vs ~ mpg) %&gt;%
  ml_predict(cars) %&gt;%
  sdf_separate_column(&quot;probability&quot;, list(&quot;P[vs=1]&quot; = 2)) %&gt;%
  dplyr::select(`P[vs=1]`)</code></pre><pre><code># Source: spark&lt;?&gt; [?? x 1]
   `P[vs=1]`
       &lt;dbl&gt;
 1    0.551
 2    0.551
 3    0.727
 4    0.593
 5    0.313
 6    0.261
 7    0.0643
 8    0.841
 9    0.727
10    0.361
# … with more rows</code></pre><h2 id="machine-learning-">Machine Learning</h2><h3 id="multinomial-regression">Multinomial Regression</h3><p><code>sparklyr 0.6</code> adds support for multinomial regression for Spark 2.1.0 or higher:</p><pre><code class="language-r" data-lang="r">iris_tbl %&gt;%
  ml_logistic_regression(Species ~ Sepal_Length + Sepal_Width)</code></pre><pre><code>Formula: Species ~ Sepal_Length + Sepal_Width

Coefficients:
           (Intercept) Sepal_Length Sepal_Width
versicolor      -201.6        73.19      -59.84
virginica       -214.6        75.10      -59.43
setosa           416.2      -148.29      119.27</code></pre><h3 id="improved-text-mining-with-lda">Improved Text Mining with LDA</h3><p><code>ft_tokenizer()</code> was introduced in <code>sparklyr 0.5</code> but <code>sparklyr 0.6</code> introduces <code>ft_count_vectorizer()</code> to simplify LDA:</p><pre><code class="language-r" data-lang="r">library(janeaustenr)
lines_tbl &lt;- sdf_copy_to(sc, austen_books()[c(1,3),])

lines_tbl %&gt;%
  ft_tokenizer(&quot;text&quot;, &quot;tokens&quot;) %&gt;%
  ft_count_vectorizer(&quot;tokens&quot;, &quot;features&quot;) 
%&gt;%
  ml_lda(features_col = &quot;features&quot;, k = 4)</code></pre><p>The vocabulary can be printed with:</p><pre><code class="language-r" data-lang="r">ml_lda(lines_tbl, ~text, k = 4)$vocabulary</code></pre><pre><code>[1] &quot;jane&quot; &quot;sense&quot; &quot;austen&quot; &quot;sensibility&quot;</code></pre><p>That&rsquo;s all for now, disconnecting:</p><pre><code class="language-r" data-lang="r">spark_disconnect(sc)</code></pre></description></item><item><title>haven 1.1.0</title><link>https://www.rstudio.com/blog/haven-1-1-0/</link><pubDate>Thu, 13 Jul 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/haven-1-1-0/</guid><description><p>I&rsquo;m pleased to announce the release of haven 1.1.0. Haven is designed to facilitate the transfer of data between R and SAS, SPSS, and Stata. It makes it easy to read SAS, SPSS, and Stata file formats into R data frames, and to save your R data frames into SPSS and Stata formats if you need to collaborate with others using closed-source statistical software.</p><p>Install the latest version by running:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-R" data-lang="R"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">haven&#34;</span>)</code></pre></div><p>haven 1.1.0 is a small release that fixes a bunch of smaller issues. See the <a href="http://haven.tidyverse.org/news/index.html#haven-1-1-0">release notes</a> for full details. The most important bug fix ensures that <code>read_sav()</code> once again correctly returns system-defined missings as NA (rather than NaN). 
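</p><p>By way of illustration (not from the release notes; the file names below are hypothetical), a typical round trip looks like:</p><pre><code class="language-r" data-lang="r">library(haven)

# System-missing values come back as NA rather than NaN:
gss &lt;- read_sav(&quot;gss.sav&quot;)

# Stricter checks mean an unrepresentable frame errors here, in R,
# instead of silently producing an invalid .dta file:
write_dta(gss, &quot;gss.dta&quot;)</code></pre><p>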
Other highlights include:</p><ul><li><p>Preliminary support for reading and writing SAS transport files with <code>read_xpt()</code> and <code>write_xpt()</code>.</p></li><li><p>An experimental <code>cols_only</code> argument in <code>read_sas()</code> that allows you to read only selected columns.</p></li><li><p>An update to the bundled ReadStat code, fixing many smaller reading and writing bugs.</p></li><li><p>Better checks in <code>write_sav()</code> and <code>write_dta()</code> making it more likely that you&rsquo;ll get a clear error in R instead of producing an invalid file.</p></li></ul><p>A big thanks goes to community members <a href="https://github.com/ecortens">Evan Cortens</a> and <a href="https://github.com/pkq">Patrick Kennedy</a> who helped make this release possible with their code contributions.</p></description></item><item><title>Registration open for rstudio::conf 2018!</title><link>https://www.rstudio.com/blog/join-us-at-rstudioconf-2018/</link><pubDate>Wed, 12 Jul 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/join-us-at-rstudioconf-2018/</guid><description><p>RStudio is very excited to announce that <strong>rstudio::conf 2018 is open for registration!</strong></p><p><a href="https://www.rstudio.com/conference/"><strong>rstudio::conf</strong></a>, the conference on all things R and RStudio, will take place February 2 and 3, 2018, in San Diego, California, preceded by Training Days on January 31 and February 1.</p><p>This year’s conference will feature keynotes from <strong>Di Cook</strong>, Monash University Professor and Iowa State University Emeritus Faculty; and <strong>J.J. 
Allaire</strong>, RStudio Founder, CEO &amp; Principal Developer, along with talks from Shiny creator <strong>Joe Cheng</strong> and (no-introduction-necessary) <strong>Hadley Wickham</strong>.</p><p>We are especially happy to announce that the following outstanding R innovators have already accepted invitations to speak:</p><table><thead><tr><th>Speaker</th><th>Role</th><th>Affiliation</th></tr></thead><tbody><tr><td>Mara Averick</td><td>Research Analyst, Web Developer, Data Nerd</td><td>TCB Analytics</td></tr><tr><td>Nick Carchedi</td><td>Director of Course Development</td><td>Datacamp</td></tr><tr><td>Tanya Cashorali</td><td>Data Entrepreneur</td><td>TCB Analytics</td></tr><tr><td>Eric Colson</td><td>Chief Algorithms Officer</td><td>Stitch Fix</td></tr><tr><td>Sandra Griffith</td><td>Senior Methodologist</td><td>Flatiron Health</td></tr><tr><td>Aaron Horowitz</td><td>Analytics Team Leader</td><td>McKinsey</td></tr><tr><td>JD Long</td><td>Economist</td><td>Cocktail Party Host</td></tr><tr><td>Elaine McVey</td><td>Data Science Lead</td><td>TransLoc</td></tr><tr><td>Logan Meltabarger</td><td>Information Science &amp; Analytics</td><td>Slalom Consulting</td></tr><tr><td>Ian Lyttle</td><td>Senior Staff Engineer</td><td>Schneider Electric</td></tr><tr><td>Edzer Pebesma</td><td>Institute for Geoinformatics</td><td>University of Muenster</td></tr><tr><td>Thomas Peterson</td><td>Analytics Programmer</td><td>SKAT</td></tr><tr><td>Olga Pierce</td><td>Deputy Data Editor, Reporter</td><td>ProPublica</td></tr><tr><td>David Robinson</td><td>Data Scientist</td><td>Stack Overflow</td></tr></tbody></table><p>Attendees will also hear from a rare assembly of popular RStudio data scientists and developers like Yihui Xie, Winston Chang, Garrett Grolemund, Jenny Bryan, Max Kuhn, Kevin Ushey, Gabor Csardi, Amanda Gadrow, Jeff Allen, Jonathan McPherson, Javier Luraschi, Lionel Henry, Jim Hester, Bárbara Borges Ribeiro, Nathan Stephens, Sean Lopp, Edgar Ruiz, Phil Bowsher, Jonathan 
Regenstein, Mine Çetinkaya-Rundel and Joseph Rickert.</p><p>Additional speakers will be added soon.</p><p>The conference will feature more than 60 sessions, with three tracks specifically designed for people newer to R, advanced R users, and those looking for solutions to an interesting problem or industry application.</p><p>Preceding the conference, on January 31 and February 1, RStudio will offer two days of optional in-person training. This year, workshop choices include:</p><table><thead><tr><th>3 workshops for those newer to R</th><th>Instructor</th></tr></thead><tbody><tr><td>Intro to R and RStudio (2 days)</td><td>RStudio Instructor TBD</td></tr><tr><td>Data Science in the Tidyverse (2 days)</td><td>Charlotte Wickham</td></tr><tr><td>Intro to Shiny and RMarkdown (2 days)</td><td>Mine Çetinkaya-Rundel</td></tr></tbody></table><table><thead><tr><th>5 workshops for those familiar with R</th><th>Instructor</th></tr></thead><tbody><tr><td>Applied machine learning (2 days)</td><td>Max Kuhn</td></tr><tr><td>Intermediate Shiny (2 days)</td><td>Joe Cheng</td></tr><tr><td>Extending the tidyverse (2 days)</td><td>Hadley Wickham</td></tr><tr><td>What they forgot to teach you about R (2 days)</td><td>Jenny Bryan</td></tr><tr><td>Big Data with R (2 days)</td><td>Edgar Ruiz</td></tr></tbody></table><table><thead><tr><th>2 workshops for RStudio Partners and Administrators</th><th>Instructor</th></tr></thead><tbody><tr><td>Tidyverse Trainer Certification (2 days)</td><td>Garrett Grolemund</td></tr><tr><td>RStudio Connect Administrator Certification (1 day)</td><td>Sean Lopp</td></tr></tbody></table><h3 id="who-should-go">Who should go?</h3><p>rstudio::conf is for RStudio users, R administrators, and RStudio partners who want to learn how to write better Shiny applications, explore all the new capabilities of R Markdown, apply R to big data and work effectively with Spark, understand the tidyverse of tools for data science with R, discover best practices and tips for coding 
with RStudio, and investigate enterprise scale development &amp; deployment practices and tools.</p><h3 id="why-do-people-go-to-rstudioconf">Why do people go to rstudio::conf?</h3><p>Because there is simply no better way to immerse yourself in all things R &amp; RStudio.</p><p>rstudio::conf 2017 | <strong>Average Satisfaction Rating 8.7 out of 10</strong></p><ul><li>“Overall, one of the best conferences I&rsquo;ve attended on any subject.”</li><li>“I had a blast and learned a lot. Glad I came.”</li><li>“There were real takeaways I could immediately apply to my work. It was a very effective way to be exposed to all of the new tools and ideas in the R community. I would have to read hundreds of blog posts and tweets.”</li><li>“It was just a really fun and collegial conference.”</li></ul><p>Also, it’s in San Diego.</p><h3 id="what-should-i-do-now">What should I do now?</h3><p>Be an early bird! Attendance is limited. All seats are available on a first-come, first-served basis. Early Bird registration discounts are available (Conference only) and a capped number of Academic discounts are also available for eligible students and faculty. If all tickets available for a particular workshop are sold out before you are able to purchase, we apologize in advance!</p><p>Please go to <a href="https://www.rstudio.com/conference/"><strong>www.rstudio.com/conference</strong></a> to purchase.</p><p>We hope to see you in San Diego at <strong>rstudio::conf 2018</strong>!</p><p>For questions or issues registering, please email <a href="mailto:conf@rstudio.com">conf@rstudio.com</a>. 
If you would like to sponsor rstudio::conf 2018 please email <a href="mailto:anne@rstudio.com">anne@rstudio.com</a>.</p><p><img src="https://manchester.grand.hyatt.com/content/dam/PropertyWebsites/grandhyatt/sanrs/Media/All/Manchester-Grand-Hyatt-San-Diego-P166-Exterior-1280x427.jpg" alt="Manchester Grand Hyatt, San Diego"></p></description></item><item><title>Introducing learnr</title><link>https://www.rstudio.com/blog/introducing-learnr/</link><pubDate>Tue, 11 Jul 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/introducing-learnr/</guid><description><p>We&rsquo;re pleased to introduce the <a href="https://rstudio.github.io/learnr/">learnr</a> package, now available on CRAN. The learnr package makes it easy to turn any R Markdown document into an interactive tutorial. Tutorials consist of content along with interactive components for checking and reinforcing understanding. Tutorials can include any or all of the following:</p><ul><li><p>Narrative, figures, illustrations, and equations.</p></li><li><p>Code exercises (R code chunks that users can edit and execute directly).</p></li><li><p>Multiple choice quizzes.</p></li><li><p>Videos (supported services include YouTube and Vimeo).</p></li><li><p>Interactive Shiny components.</p></li></ul><p><a href="https://tutorials.shinyapps.io/04-Programming-Basics/"><img src="https://rstudioblog.files.wordpress.com/2017/06/learnr-blog-1.png" alt="learnr-blog-1"></a></p><p>Each learnr tutorial is a Shiny interactive document, which means that tutorials can be deployed all of the same ways that Shiny applications can, including locally on an end-user’s machine, on a Shiny or RStudio Connect Server, or on a hosting service like <a href="http://shinyapps.io">shinyapps.io</a>.</p><h3 id="getting-started">Getting Started</h3><p>To create a learnr tutorial, install the learnr package with</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" 
data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">learnr&#34;</span>)</code></pre></div><p>Then select the Interactive Tutorial template from the <strong>New R Markdown</strong> dialog in the RStudio IDE (v1.0.136 or later).</p><p><img src="https://rstudioblog.files.wordpress.com/2017/06/learnr-blog-2.png" alt="learnr-blog-2"></p><h3 id="exercises">Exercises</h3><p>Exercises are interactive R code chunks that allow readers to directly execute R code and see its results:</p><p>To add an exercise, add <code>exercise = TRUE</code> to the chunk options of an R Markdown code chunk. R Markdown will preload the chunk with the code that you supply.</p><pre><code>```{r ex1, exercise = TRUE}
head(mtcars, n = 5)
```</code></pre><p>becomes</p><p><img src="https://rstudioblog.files.wordpress.com/2017/06/learnr-blog-3.png" alt="learnr-blog-3"></p><p>Exercises can include hints or solutions as well as custom checking code to provide feedback on user answers. The <a href="https://rstudio.github.io/learnr/exercises.html">learnr Exercises page</a> includes a more in-depth discussion of exercises and their various available options and behaviors.</p><h3 id="questions">Questions</h3><p>You can include one or more <a href="https://rstudio.github.io/learnr/questions.html">multiple-choice quiz questions</a> within a tutorial to help verify that readers understand the concepts presented. 
Questions can have a single or multiple correct answers.</p><p>Include a question by calling the <code>question()</code> function within an R code chunk:</p><pre><code>```{r letter-a, echo=FALSE}
question(&quot;What number is the letter A in the English alphabet?&quot;,
  answer(&quot;8&quot;),
  answer(&quot;14&quot;),
  answer(&quot;1&quot;, correct = TRUE),
  answer(&quot;23&quot;))
```</code></pre><p><img src="https://rstudioblog.files.wordpress.com/2017/06/learnr-blog-4.png" alt="learnr-blog-4"></p><h3 id="videos">Videos</h3><p>You can include videos published on either YouTube or Vimeo within a tutorial using the standard markdown image syntax. Note that any valid YouTube or Vimeo URL will work.</p><h3 id="code-checking">Code checking</h3><p>learnr works with external code-checking packages to let you evaluate student answers and provide targeted, automated feedback, like the message below.</p><p><img src="https://rstudioblog.files.wordpress.com/2017/06/learnr-blog-5.png" alt="learnr-blog-5"></p><p>You can use any package that provides a learnr-compatible <a href="https://rstudio.github.io/learnr/exercises.html#exercise_checking">checker function</a> to do code checking (the <a href="https://github.com/dtkaplan/checkr">checkr</a> package provides a working prototype of a compatible code checker).</p><h3 id="navigation-and-progress-tracking">Navigation and progress tracking</h3><p>Each learnr tutorial includes a Table of Contents that tracks student progress. learnr remembers which sections of a tutorial a student completes, and returns a student to where they left off when they reopen a tutorial.</p><p><img src="https://rstudioblog.files.wordpress.com/2017/06/learnr-blog-6.png" alt="learnr-blog-6"></p><h3 id="progressive-reveal">Progressive Reveal</h3><p>learnr optionally reveals content one sub-section at a time. 
You can use this feature to let students set their own pace, or to hide information that would spoil an exercise or question that appears just before it.</p><p>To use progressive reveal, set the <code>progressive</code> field to true in the yaml header.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-yaml" data-lang="yaml">---<span style="color:#bbb"></span><span style="color:#bbb"></span><span style="color:#007020;font-weight:bold">title</span>:<span style="color:#bbb"> </span><span style="color:#4070a0">&#34;Programming basics&#34;</span><span style="color:#bbb"></span><span style="color:#bbb"></span><span style="color:#007020;font-weight:bold">output</span>:<span style="color:#bbb"></span><span style="color:#bbb"></span><span style="color:#bbb"> </span><span style="color:#007020;font-weight:bold">learnr::tutorial</span>:<span style="color:#bbb"></span><span style="color:#bbb"></span><span style="color:#bbb"> </span><span style="color:#007020;font-weight:bold">progressive</span>:<span style="color:#bbb"> </span><span style="color:#007020;font-weight:bold">true</span><span style="color:#bbb"></span><span style="color:#bbb"> </span><span style="color:#007020;font-weight:bold">allow_skip</span>:<span style="color:#bbb"> </span><span style="color:#007020;font-weight:bold">true</span><span style="color:#bbb"></span><span style="color:#bbb"></span><span style="color:#007020;font-weight:bold">runtime</span>:<span style="color:#bbb"> </span>shiny_prerendered<span style="color:#bbb"></span><span style="color:#bbb"></span>---<span style="color:#bbb"></span></code></pre></div><p>Visit <a href="https://rstudio.github.io/learnr/">rstudio.github.io/learnr/</a> to learn more about creating interactive tutorials with learnr.</p></description></item><item><title>dbplyr 1.1.0</title><link>https://www.rstudio.com/blog/dbplyr-1-1-0/</link><pubDate>Tue, 27 Jun 2017 00:00:00 
+0000</pubDate><guid>https://www.rstudio.com/blog/dbplyr-1-1-0/</guid><description><p>I&rsquo;m pleased to announce the release of the <a href="http://github.com/hadley/dbplyr/">dbplyr</a> package, which now contains all dplyr code related to connecting to databases. This shouldn&rsquo;t affect you-as-a-user much, but it makes dplyr simpler, and makes it easier to release improvements just for database related code.</p><p>You can install the latest version of dbplyr with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">dbplyr&#34;</span>)</code></pre></div><h2 id="dbi-and-dplyr-alignment">DBI and dplyr alignment</h2><p>The biggest change in this release is that dplyr/dbplyr works much more directly with DBI database connections. This makes it much easier to switch between low-level queries written in SQL, and high-level data manipulation functions written with dplyr verbs.</p><p>To connect to a database, first use <code>DBI::dbConnect()</code> to create a database connection. 
For example, the following code connects to a temporary, in-memory, SQLite database, then uses DBI to copy over some data.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">con <span style="color:#666">&lt;-</span> DBI<span style="color:#666">::</span><span style="color:#06287e">dbConnect</span>(RSQLite<span style="color:#666">::</span><span style="color:#06287e">SQLite</span>(), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">:memory:&#34;</span>)DBI<span style="color:#666">::</span><span style="color:#06287e">dbWriteTable</span>(con, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">iris&#34;</span>, iris)<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span>DBI<span style="color:#666">::</span><span style="color:#06287e">dbWriteTable</span>(con, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mtcars&#34;</span>, mtcars)<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span></code></pre></div><p>With this connection in hand, you can execute hand-written SQL queries:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">DBI<span style="color:#666">::</span><span style="color:#06287e">dbGetQuery</span>(con, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">SELECT count() FROM iris&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; count()</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 150</span></code></pre></div><p>Or you can let dplyr generate the SQL for you:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">iris2 <span style="color:#666">&lt;-</span> <span style="color:#06287e">tbl</span>(con, <span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">iris&#34;</span>)species_mean <span style="color:#666">&lt;-</span> iris2 <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(Species) <span style="color:#666">%&gt;%</span><span style="color:#06287e">summarise_all</span>(mean)species_mean <span style="color:#666">%&gt;%</span> <span style="color:#06287e">show_query</span>()<span style="color:#60a0b0;font-style:italic">#&gt; &lt;SQL&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; SELECT `Species`, AVG(`Sepal.Length`) AS `Sepal.Length`, AVG(`Sepal.Width`) AS `Sepal.Width`, AVG(`Petal.Length`) AS `Petal.Length`, AVG(`Petal.Width`) AS `Petal.Width`</span><span style="color:#60a0b0;font-style:italic">#&gt; FROM `iris`</span><span style="color:#60a0b0;font-style:italic">#&gt; GROUP BY `Species`</span>species_mean<span style="color:#60a0b0;font-style:italic">#&gt; # Source: lazy query [?? x 5]</span><span style="color:#60a0b0;font-style:italic">#&gt; # Database: sqlite 3.11.1 [:memory:]</span><span style="color:#60a0b0;font-style:italic">#&gt; Species Sepal.Length Sepal.Width Petal.Length Petal.Width</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 setosa 5.006 3.428 1.462 0.246</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 versicolor 5.936 2.770 4.260 1.326</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 virginica 6.588 2.974 5.552 2.026</span></code></pre></div><p>This alignment is made possible thanks to the hard work of <a href="https://github.com/krlmlr">Kirill Muller</a> who has been working to make <a href="https://www.r-consortium.org/blog/2017/05/15/improving-dbi-a-retrospect">DBI backends</a> more consistent, comprehensive, and easier to use. 
This work has been funded by the <a href="https://www.r-consortium.org">R Consortium</a> and will <a href="https://www.r-consortium.org/blog/2017/04/03/q1-2017-isc-grants">continue this year</a> with improvements to backends for the two major open source databases MySQL/MariaDB and PostgreSQL.</p><p>(You can continue to use the old style <code>src_mysql()</code>, <code>src_postgres()</code>, and <code>src_sqlite()</code> functions, which still live in dplyr, but I recommend that you switch to the new style for new code.)</p><h2 id="sql-translation">SQL translation</h2><p>We&rsquo;ve also worked to improve the translation of R code to SQL. Thanks to <a href="https://github.com/hhoeflin">@hhoeflin</a>, dbplyr now has a basic SQL optimiser that considerably reduces the number of subqueries needed in many expressions. For example, the following code used to generate three subqueries, but now generates idiomatic SQL:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">con <span style="color:#666">%&gt;%</span><span style="color:#06287e">tbl</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mtcars&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">filter</span>(cyl <span style="color:#666">&gt;</span> <span style="color:#40a070">2</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">select</span>(mpg<span style="color:#666">:</span>hp) <span style="color:#666">%&gt;%</span><span style="color:#06287e">head</span>(<span style="color:#40a070">10</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">show_query</span>()<span style="color:#60a0b0;font-style:italic">#&gt; &lt;SQL&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; SELECT `mpg` AS `mpg`, `cyl` AS `cyl`, `disp` AS `disp`, `hp` AS `hp`</span><span style="color:#60a0b0;font-style:italic">#&gt; FROM `mtcars`</span><span 
style="color:#60a0b0;font-style:italic">#&gt; WHERE (`cyl` &gt; 2.0)</span><span style="color:#60a0b0;font-style:italic">#&gt; LIMIT 10</span></code></pre></div><p>At a lower-level, dplyr now:</p><ul><li>Can translate <code>case_when()</code>:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(dbplyr)<span style="color:#06287e">translate_sql</span>(<span style="color:#06287e">case_when</span>(x <span style="color:#666">&gt;</span> <span style="color:#40a070">1</span> <span style="color:#666">~</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">big&#34;</span>, y <span style="color:#666">&lt;</span> <span style="color:#40a070">2</span> <span style="color:#666">~</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">small&#34;</span>), con <span style="color:#666">=</span> con)<span style="color:#60a0b0;font-style:italic">#&gt; &lt;SQL&gt; CASE</span><span style="color:#60a0b0;font-style:italic">#&gt; WHEN (`x` &gt; 1.0) THEN (&#39;big&#39;)</span><span style="color:#60a0b0;font-style:italic">#&gt; WHEN (`y` &lt; 2.0) THEN (&#39;small&#39;)</span><span style="color:#60a0b0;font-style:italic">#&gt; END</span></code></pre></div><ul><li>Has better support for type coercions:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">translate_sql</span>(<span style="color:#06287e">as.character</span>(cyl), con <span style="color:#666">=</span> con)<span style="color:#60a0b0;font-style:italic">#&gt; &lt;SQL&gt; CAST(`cyl` AS TEXT)</span><span style="color:#06287e">translate_sql</span>(<span style="color:#06287e">as.integer</span>(cyl), con <span style="color:#666">=</span> con)<span style="color:#60a0b0;font-style:italic">#&gt; &lt;SQL&gt; CAST(`cyl` AS 
INTEGER)</span><span style="color:#06287e">translate_sql</span>(<span style="color:#06287e">as.double</span>(cyl), con <span style="color:#666">=</span> con)<span style="color:#60a0b0;font-style:italic">#&gt; &lt;SQL&gt; CAST(`cyl` AS NUMERIC)</span></code></pre></div><ul><li>Can more reliably translate <code>%in%</code>:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">translate_sql</span>(x <span style="color:#666">%in%</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">5</span>, con <span style="color:#666">=</span> con)<span style="color:#60a0b0;font-style:italic">#&gt; &lt;SQL&gt; `x` IN (1, 2, 3, 4, 5)</span><span style="color:#06287e">translate_sql</span>(x <span style="color:#666">%in%</span> <span style="color:#40a070">1L</span>, con <span style="color:#666">=</span> con)<span style="color:#60a0b0;font-style:italic">#&gt; &lt;SQL&gt; `x` IN (1)</span><span style="color:#06287e">translate_sql</span>(x <span style="color:#666">%in%</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">1L</span>), con <span style="color:#666">=</span> con)<span style="color:#60a0b0;font-style:italic">#&gt; &lt;SQL&gt; `x` IN (1)</span></code></pre></div><p>You can now use <code>in_schema()</code> to refer to tables in a schema: <code>in_schema(&quot;my_schema_name&quot;, &quot;my_table_name&quot;)</code>. You can use the result of this function anywhere you could previously use a table name.</p><p>We&rsquo;ve also included better translations for Oracle, MS SQL Server, Hive and Impala. We&rsquo;re working to add support for more databases over time, but adding support on your own is surprisingly easy. 
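<p>For example, here is a sketch of how <code>in_schema()</code> slots into a dplyr pipeline; the connection details and the <code>warehouse.orders</code> names below are purely illustrative, not from a real database:</p>

```r
library(dplyr)
library(dbplyr)

# Illustrative only: any DBI connection to a database that has schemas
# would work here; "warehouse_dsn" is a made-up data source name.
con <- DBI::dbConnect(odbc::odbc(), dsn = "warehouse_dsn")

# in_schema() can be used anywhere a plain table name was accepted:
orders <- tbl(con, in_schema("warehouse", "orders"))

orders %>%
  filter(amount > 100) %>%
  show_query()
# The generated SQL refers to the qualified table, e.g. `warehouse`.`orders`
```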
Submit an issue to <a href="https://github.com/tidyverse/dplyr/issues">dplyr</a> and we&rsquo;ll help you get started.</p><p>These are just the highlights: you can see the full set of improvements and bug fixes in the <a href="https://github.com/tidyverse/dbplyr/releases/tag/v1.0.0">release notes</a>.</p><h2 id="contributors">Contributors</h2><p>As with all R packages, this is truly a community effort. A big thanks goes to all those who contributed code or documentation to this release: <a href="https://github.com/austenhead">Austen Head</a>, <a href="https://github.com/edgararuiz">Edgar Ruiz</a>, <a href="https://github.com/gergness">Greg Freedman Ellis</a>, <a href="https://github.com/hannesmuehleisen">Hannes Mühleisen</a>, <a href="https://github.com/ianmcook">Ian Cook</a>, <a href="https://github.com/karldw">Karl Dunkle Werner</a>, <a href="https://github.com/mdsumner">Michael Sumner</a>, <a href="https://github.com/mine-cetinkaya-rundel">Mine Cetinkaya-Rundel</a>, <a href="https://github.com/shabbybanks">@shabbybanks</a> and <a href="https://github.com/zeehio">Sergio Oller</a>.</p><h2 id="vision">Vision</h2><p>Since you&rsquo;ve read this far, I also wanted to touch on RStudio&rsquo;s vision for databases. Many analysts have most of their data in databases, and making it as easy as possible to get data out of the database and into R makes a huge difference. Thanks to the community, R already has strong tools for talking to the popular open source databases. But support for connecting to enterprise databases and solving enterprise challenges has lagged somewhat. At RStudio we are actively working to solve these problems.</p><p>As well as dbplyr and DBI, we are working on many other pain points in the database ecosystem. You&rsquo;ll hear much more about these packages in the future, but I wanted to touch on the highlights so you can see where we are heading. 
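<p>To make that direction concrete, here is a sketch of the kind of connection code these pieces aim to enable; the <code>config</code> keys and the <code>keyring</code> service name below are illustrative, not a shipped API:</p>

```r
library(DBI)

# config::get() reads per-environment settings (e.g. from a config.yml),
# and keyring::key_get() pulls the password from the system keychain, so
# no credentials live in the script itself. All of the names used here
# ("datawarehouse", dw$driver, dw$server, dw$user) are made up.
dw  <- config::get("datawarehouse")
con <- dbConnect(
  odbc::odbc(),
  driver = dw$driver,
  server = dw$server,
  uid    = dw$user,
  pwd    = keyring::key_get("datawarehouse", dw$user)
)
```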
These pieces are not yet as integrated as they should be, but they are valuable by themselves, and we will continue to work to make a seamless database experience that is as good as (or better than!) any other environment.</p><ul><li><p>The <a href="https://github.com/rstats-db/odbc">odbc</a> package provides a DBI-compliant backend for any database with an ODBC driver. Compared to the existing RODBC package, odbc is faster (~3x for reading, ~2x for writing), translates date/time data types, and is under active development. RStudio is also planning on providing best-of-breed ODBC drivers for the most important enterprise databases to our Pro customers. If you&rsquo;ve felt the pain of connecting to your enterprise database and would like to learn more, please schedule a meeting with our <a href="https://rstudio.youcanbook.me/">sales team</a>.</p></li><li><p>You should never record database credentials in your R scripts, so we are working on safer ways to store them that don&rsquo;t add a lot of extra hassle. One piece of the puzzle is the <a href="https://github.com/gaborcsardi/keyring">keyring</a> package, which allows you to securely store information in your system keychain, and only decrypt it when needed. Another piece of the puzzle is the <a href="https://github.com/rstudio/config">config</a> package, which makes it easy to parameterise your database connection credentials so that you can connect to your testing database when experimenting locally, and your production database when you <a href="https://www.rstudio.com/products/connect/">deploy your code</a>.</p></li><li><p>Connecting to databases from Shiny can be challenging because you don&rsquo;t want a fresh connection for every user action (because that&rsquo;s slow), and you don&rsquo;t want one connection per app (because that&rsquo;s unreliable). 
The <a href="https://github.com/rstudio/pool">pool</a> package allows you to manage a shared pool of connections for your app, giving you both speed and reliability.</p></li><li><p>We&rsquo;re also working to make sure all of these pieces are easily used from the IDE and inside R Markdown. One neat feature that you might not have heard about is support for <a href="https://rmarkdown.rstudio.com/authoring_knitr_engines.html#sql">SQL chunks</a> in R Markdown.</p></li></ul><p>If any of these pieces sound interesting, please stay tuned to the blog for more upcoming announcements. Please also check out our new database website: <a href="https://db.rstudio.com">https://db.rstudio.com</a>. Over time, this website will expand to document all database best practices, so you can find everything you need in one place.</p></description></item><item><title>bigrquery 0.4.0</title><link>https://www.rstudio.com/blog/bigrquery-0-4-0/</link><pubDate>Mon, 26 Jun 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/bigrquery-0-4-0/</guid><description><p>I&rsquo;m pleased to announce that bigrquery 0.4.0 is now on CRAN. bigrquery makes it possible to talk to <a href="https://cloud.google.com/bigquery/">Google&rsquo;s BigQuery</a> cloud database. 
It provides both <a href="https://cloud.google.com/bigquery/">DBI</a> and <a href="http://dplyr.tidyverse.org/">dplyr</a> backends so you can interact with BigQuery using either low-level SQL or high-level dplyr verbs.</p><p>Install the latest version of bigrquery with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">bigrquery&#34;</span>)</code></pre></div><h2 id="basic-usage">Basic usage</h2><p>Connect to a bigquery database using DBI:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(dplyr)con <span style="color:#666">&lt;-</span> DBI<span style="color:#666">::</span><span style="color:#06287e">dbConnect</span>(<span style="color:#06287e">dbi_driver</span>(),project <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">publicdata&#34;</span>,dataset <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">samples&#34;</span>,billing <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">887175176791&#34;</span>)DBI<span style="color:#666">::</span><span style="color:#06287e">dbListTables</span>(con)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;github_nested&#34; &#34;github_timeline&#34; &#34;gsod&#34; &#34;natality&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [5] &#34;shakespeare&#34; &#34;trigrams&#34; &#34;wikipedia&#34;</span></code></pre></div><p>(You&rsquo;ll be prompted to authenticate interactively, or you can use a service token with <code>set_service_token()</code>.)</p><p>Then you can either submit your own SQL queries or use dplyr to 
write them for you:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">shakespeare <span style="color:#666">&lt;-</span> con <span style="color:#666">%&gt;%</span> <span style="color:#06287e">tbl</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">shakespeare&#34;</span>)shakespeare <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(word) <span style="color:#666">%&gt;%</span><span style="color:#06287e">summarise</span>(n <span style="color:#666">=</span> <span style="color:#06287e">sum</span>(word_count))<span style="color:#60a0b0;font-style:italic">#&gt; 0 bytes processed</span><span style="color:#60a0b0;font-style:italic">#&gt; # Source: lazy query [?? x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt; # Database: BigQueryConnection</span><span style="color:#60a0b0;font-style:italic">#&gt; word n</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;chr&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 profession 20</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 augury 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 undertakings 3</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 surmise 8</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 religion 14</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 advanced 16</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 Wormwood 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 parchment 8</span><span style="color:#60a0b0;font-style:italic">#&gt; 9 villany 49</span><span style="color:#60a0b0;font-style:italic">#&gt; 10 digs 3</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... with more rows</span></code></pre></div><h2 id="new-features">New features</h2><ul><li><p>dplyr support has been updated to require dplyr 0.7.0 and use dbplyr. 
This means that you can now more naturally work directly with DBI connections. dplyr now translates to modern BigQuery SQL, which supports a broader set of translations. Along the way I also fixed a variety of SQL generation bugs.</p></li><li><p>The new function <code>insert_extract_job()</code> makes it possible to extract data and save it in Google Cloud Storage, and <code>insert_table()</code> allows you to insert empty tables into a dataset.</p></li><li><p>All POST requests (inserts, updates, copies and <code>query_exec</code>) now take <code>...</code>. This allows you to add arbitrary additional data to the request body, making it possible to use parts of the BigQuery API that are otherwise not exposed. <code>snake_case</code> argument names are automatically converted to <code>camelCase</code> so you can stick consistently to snake case in your R code.</p></li><li><p>Full support for DATE, TIME, and DATETIME types (#128).</p></li></ul><p>There were a variety of bug fixes and other minor improvements: see the <a href="https://github.com/rstats-db/bigrquery/releases/tag/v0.4.0">release notes</a> for full details.</p><h2 id="contributors">Contributors</h2><p>bigrquery is a community effort: a big thanks goes to <a href="https://github.com/backlin">Christofer Bäcklin</a>, <a href="https://github.com/jarodmeng">Jarod G.R. Meng</a> and <a href="https://github.com/realAkhmed">Akhmed Umyarov</a> for their pull requests. Thank you all for your contributions!</p></description></item><item><title>dplyr 0.7.0</title><link>https://www.rstudio.com/blog/dplyr-0-7-0/</link><pubDate>Tue, 13 Jun 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dplyr-0-7-0/</guid><description><p>I&rsquo;m pleased to announce that dplyr 0.7.0 is now on CRAN! (This was dplyr 0.6.0 previously; more on that below.) dplyr provides a &ldquo;grammar&rdquo; of data transformation, making it easy and elegant to solve the most common data manipulation challenges. 
dplyr supports multiple backends: as well as in-memory data frames, you can also use it with remote SQL databases. If you haven&rsquo;t heard of dplyr before, the best place to start is the <a href="http://r4ds.had.co.nz/transform.html">Data transformation</a> chapter in <a href="http://r4ds.had.co.nz">R for Data Science</a>.</p><p>You can install the latest version of dplyr with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">dplyr&#34;</span>)</code></pre></div><h2 id="features">Features</h2><p>dplyr 0.7.0 is a major release including over 100 improvements and bug fixes, as described in the <a href="https://github.com/tidyverse/dplyr/releases/tag/v0.7.0">release notes</a>. In this blog post, I want to discuss one big change and a handful of smaller updates. This version of dplyr also saw a major revamp of database connections. That&rsquo;s a big topic, so it&rsquo;ll get its own blog post next week.</p><h3 id="tidy-evaluation">Tidy evaluation</h3><p>The biggest change is a new system for programming with dplyr, called <strong>tidy evaluation</strong>, or tidy eval for short. Tidy eval is a system for capturing expressions and later evaluating them in the correct context. 
It is important because it allows you to interpolate values in contexts where dplyr usually works with expressions:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">my_var <span style="color:#666">&lt;-</span> <span style="color:#06287e">quo</span>(homeworld)starwars <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(<span style="color:#666">!</span><span style="color:#666">!</span>my_var) <span style="color:#666">%&gt;%</span><span style="color:#06287e">summarise_at</span>(<span style="color:#06287e">vars</span>(height<span style="color:#666">:</span>mass), mean, na.rm <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 49 x 3</span><span style="color:#60a0b0;font-style:italic">#&gt; homeworld height mass</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Alderaan 176.3333 64.0</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Aleen Minor 79.0000 15.0</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Bespin 175.0000 79.0</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Bestine IV 180.0000 110.0</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Cato Neimoidia 191.0000 90.0</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Cerea 198.0000 82.0</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 Champala 196.0000 NaN</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 Chandrila 150.0000 NaN</span><span style="color:#60a0b0;font-style:italic">#&gt; 9 Concord Dawn 183.0000 79.0</span><span style="color:#60a0b0;font-style:italic">#&gt; 10 Corellia 175.0000 78.5</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... 
with 39 more rows</span></code></pre></div><p>This makes it possible to write your own functions that work like dplyr functions, reducing the amount of copy-and-paste in your code:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">starwars_mean <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>(my_var) {my_var <span style="color:#666">&lt;-</span> <span style="color:#06287e">enquo</span>(my_var)starwars <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(<span style="color:#666">!</span><span style="color:#666">!</span>my_var) <span style="color:#666">%&gt;%</span><span style="color:#06287e">summarise_at</span>(<span style="color:#06287e">vars</span>(height<span style="color:#666">:</span>mass), mean, na.rm <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)}<span style="color:#06287e">starwars_mean</span>(homeworld)</code></pre></div><p>You can also use the new <code>.data</code> pronoun to refer to variables with strings:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">my_var <span style="color:#666">&lt;-</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">homeworld&#34;</span>starwars <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(.data[[my_var]]) <span style="color:#666">%&gt;%</span><span style="color:#06287e">summarise_at</span>(<span style="color:#06287e">vars</span>(height<span style="color:#666">:</span>mass), mean, na.rm <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)</code></pre></div><p>This is useful when you&rsquo;re writing packages that use dplyr code because it avoids an annoying note from <code>R CMD check</code>.</p><p>To learn more about how tidy eval helps solve 
data analysis challenges, please read the new <a href="http://dplyr.tidyverse.org/articles/programming.html">programming with dplyr</a> vignette. Tidy evaluation is implemented in the <a href="http://rlang.tidyverse.org">rlang</a> package, which also provides a vignette on the <a href="http://rlang.tidyverse.org/articles/tidy-evaluation.html">theoretical underpinnings</a>. Tidy eval is a rich system and takes a while to get your head around, but we are confident that learning tidy eval will pay off, especially as it rolls out to other packages in the tidyverse (tidyr and ggplot2 are next on the todo list).</p><p>The introduction of tidy evaluation means that the standard evaluation (underscored) version of each main verb (<code>filter_()</code>, <code>select_()</code> etc) is no longer needed, and so these functions have been deprecated (but remain around for backward compatibility).</p><h3 id="character-encoding">Character encoding</h3><p>We have done a lot of work to ensure that dplyr works with encodings other than Latin1 on Windows. This is most likely to affect you if you work with data that contains Chinese, Japanese, or Korean (CJK) characters. dplyr should now just work with such data. Please let us know if you have problems!</p><h3 id="new-datasets">New datasets</h3><p>dplyr has some new datasets that will help you write more interesting examples:</p><ul><li><code>starwars</code>, shown above, contains information about characters from the Star Wars movies, sourced from the <a href="https://swapi.co">Star Wars API</a>. 
It contains a number of list-columns.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">starwars<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 87 x 13</span><span style="color:#60a0b0;font-style:italic">#&gt; name height mass hair_color skin_color eye_color</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Luke Skywalker 172 77 blond fair blue</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 C-3PO 167 75 gold yellow</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 R2-D2 96 32 white, blue red</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Darth Vader 202 136 none white yellow</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Leia Organa 150 49 brown light brown</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Owen Lars 178 120 brown, grey light blue</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 Beru Whitesun lars 165 75 brown light blue</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 R5-D4 97 32 white, red red</span><span style="color:#60a0b0;font-style:italic">#&gt; 9 Biggs Darklighter 183 84 black light brown</span><span style="color:#60a0b0;font-style:italic">#&gt; 10 Obi-Wan Kenobi 182 77 auburn, white fair blue-gray</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... with 77 more rows, and 7 more variables: birth_year ,</span><span style="color:#60a0b0;font-style:italic">#&gt; # gender , homeworld , species , films ,</span><span style="color:#60a0b0;font-style:italic">#&gt; # vehicles , starships</span></code></pre></div><ul><li><code>storms</code> has the trajectories of ~200 tropical storms. 
It contains a strong grouping structure.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">storms<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 10,010 x 13</span><span style="color:#60a0b0;font-style:italic">#&gt; name year month day hour lat long status category</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Amy 1975 6 27 0 27.5 -79.0 tropical depression -1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Amy 1975 6 27 6 28.5 -79.0 tropical depression -1</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Amy 1975 6 27 12 29.5 -79.0 tropical depression -1</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Amy 1975 6 27 18 30.5 -79.0 tropical depression -1</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Amy 1975 6 28 0 31.5 -78.8 tropical depression -1</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Amy 1975 6 28 6 32.4 -78.7 tropical depression -1</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 Amy 1975 6 28 12 33.3 -78.0 tropical depression -1</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 Amy 1975 6 28 18 34.0 -77.0 tropical depression -1</span><span style="color:#60a0b0;font-style:italic">#&gt; 9 Amy 1975 6 29 0 34.4 -75.8 tropical storm 0</span><span style="color:#60a0b0;font-style:italic">#&gt; 10 Amy 1975 6 29 6 34.0 -74.8 tropical storm 0</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... with 10,000 more rows, and 4 more variables: wind ,</span><span style="color:#60a0b0;font-style:italic">#&gt; # pressure , ts_diameter , hu_diameter</span></code></pre></div><ul><li><code>band_members</code>, <code>band_instruments</code> and <code>band_instruments2</code> have a tiny amount of data about bands. 
It&rsquo;s designed to be very simple so you can illustrate how joins work without getting distracted by the details of the data.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">band_members<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 3 x 2</span><span style="color:#60a0b0;font-style:italic">#&gt; name band</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Mick Stones</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 John Beatles</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Paul Beatles</span>band_instruments<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 3 x 2</span><span style="color:#60a0b0;font-style:italic">#&gt; name plays</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 John guitar</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Paul bass</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Keith guitar</span></code></pre></div><h3 id="new-and-improved-verbs">New and improved verbs</h3><ul><li>The <code>pull()</code> generic allows you to extract a single column either by name or position. 
It&rsquo;s similar to <code>select()</code> but returns a vector, rather than a smaller tibble.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mtcars <span style="color:#666">%&gt;%</span> <span style="color:#06287e">pull</span>(<span style="color:#40a070">-1</span>) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; num [1:32] 4 4 1 1 2 1 4 2 2 4 ...</span>mtcars <span style="color:#666">%&gt;%</span> <span style="color:#06287e">pull</span>(cyl) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; num [1:32] 6 6 4 6 8 6 8 4 4 6 ...</span></code></pre></div><p>Thanks to <a href="https://github.com/paulponcet">Paul Poncet</a> for the idea!</p><ul><li><p><code>arrange()</code> for grouped data frames gains a <code>.by_group</code> argument so you can choose to sort by groups if you want to (defaults to <code>FALSE</code>).</p></li><li><p>All single table verbs now have scoped variants suffixed with <code>_if()</code>, <code>_at()</code> and <code>_all()</code>. 
Use these if you want to do something to every variable (<code>_all</code>), variables selected by their names (<code>_at</code>), or variables that satisfy some predicate (<code>_if</code>).</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">iris <span style="color:#666">%&gt;%</span> <span style="color:#06287e">summarise_if</span>(is.numeric, mean)starwars <span style="color:#666">%&gt;%</span> <span style="color:#06287e">select_if</span>(<span style="color:#06287e">Negate</span>(is.list))storms <span style="color:#666">%&gt;%</span> <span style="color:#06287e">group_by_at</span>(<span style="color:#06287e">vars</span>(month<span style="color:#666">:</span>hour))</code></pre></div><h3 id="other-important-changes">Other important changes</h3><ul><li>Local join functions can now control how missing values are matched. The default value is <code>na_matches = &quot;na&quot;</code>, which treats two missing values as equal. To prevent missing values from matching, use <code>na_matches = &quot;never&quot;</code>.</li></ul><p>You can change the default behaviour by calling <code>pkgconfig::set_config(&quot;dplyr::na_matches&quot;, &quot;never&quot;)</code>.</p><ul><li><code>bind_rows()</code> and <code>combine()</code> are more strict when coercing. Logical values are no longer coerced to integer and numeric. Date, POSIXct and other integer or double-based classes are no longer coerced to integer or double to avoid dropping important metadata. We plan to continue improving this interface in the future.</li></ul><h2 id="breaking-changes">Breaking changes</h2><p>From time-to-time I discover that I made a mistake in an older version of dplyr and developed what is now a clearly suboptimal API. If the problem isn&rsquo;t too big, I try to just leave it - the cost of making small improvements is not worth it when compared to the cost of breaking existing code. 
However, there are bigger improvements where I believe the short-term pain of breaking code is worth the long-term payoff of a better API.</p><p>Regardless, it&rsquo;s still frustrating when an update to dplyr breaks your code. To minimise this pain, I plan to do two things going forward:</p><ul><li><p>Adopt an odd-even release cycle so that API-breaking changes only occur in odd-numbered releases. Even-numbered releases will only contain bug fixes and new features. This is why I&rsquo;ve skipped dplyr 0.6.0 and gone directly to dplyr 0.7.0.</p></li><li><p>Invest time in developing better tools for isolating packages across projects so that you can choose when to upgrade a package on a project-by-project basis, and if something goes wrong, easily roll back to a version that worked. Look for news about this later in the year.</p></li></ul><h2 id="contributors">Contributors</h2><p>dplyr is truly a community effort. Apart from the dplyr team (myself, <a href="https://github.com/krlmlr">Kirill Müller</a>, and <a href="https://github.com/lionel-">Lionel Henry</a>), this release wouldn&rsquo;t have been possible without patches from <a href="https://github.com/cderv">Christophe Dervieux</a>, <a href="https://github.com/daattali">Dean Attali</a>, <a href="https://github.com/ianmcook">Ian Cook</a>, <a href="https://github.com/ijlyttle">Ian Lyttle</a>, <a href="https://github.com/JakeRuss">Jake Russ</a>, <a href="https://github.com/jayhesselberth">Jay Hesselberth</a>, <a href="https://github.com/jennybc">Jennifer (Jenny) Bryan</a>, <a href="https://github.com/lindbrook">@lindbrook</a>, <a href="https://github.com/maurolepore">Mauro Lepore</a>, <a href="https://github.com/npjc">Nicolas Coutin</a>, <a href="https://github.com/strengejacke">Daniel</a>, <a href="https://github.com/tonyfischetti">Tony Fischetti</a>, <a href="https://github.com/yutannihilation">Hiroaki Yutani</a> and <a href="https://github.com/zeehio">Sergio Oller</a>. 
Thank you all for your contributions!</p></description></item><item><title>RStudio Server Pro is now available on AWS Marketplace</title><link>https://www.rstudio.com/blog/rstudio-server-pro-is-now-available-on-aws-marketplace/</link><pubDate>Wed, 31 May 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-server-pro-is-now-available-on-aws-marketplace/</guid><description><p>RStudio is excited to announce the availability of its flagship enterprise-ready integrated development environment for R in AWS Marketplace.</p><p><a href="https://aws.amazon.com/marketplace/pp/B06W2G9PRY/?ref=_ptnr_devblg_"><img src="https://rstudioblog.files.wordpress.com/2017/05/rsp-aws.png" alt="RSP AWS"></a></p><p><a href="https://aws.amazon.com/marketplace/pp/B06W2G9PRY/?ref=_ptnr_devblg_"><strong>RStudio Server Pro AWS</strong> </a>is identical to <a href="https://www.rstudio.com/products/rstudio-server-pro/">RStudio Server Pro</a>, but with turnkey convenience. It comes pre-configured with multiple versions of R, common system libraries, and the most popular R packages.</p><p>RStudio Server Pro AWS helps you adapt to your unique circumstances. It allows you to choose AWS compute instances of any size whenever a project requires them (flat hourly pricing). Or you can set up a persistent instance of RStudio Server Pro ready to be used anytime you need it (annual pricing), avoiding the sometimes complicated processes for procuring on-premises software.</p><p>If the enhanced security, elegant support for multiple R versions and multiple sessions, and commercially licensed and supported features of RStudio Server Pro appeal to you and your enterprise, consider RStudio Server Pro for AWS. 
It&rsquo;s ready to go!</p><p><a href="https://support.rstudio.com/hc/en-us/articles/115007144848-FAQ-for-RStudio-Server-Pro-AWS">Read the FAQ</a> <a href="https://aws.amazon.com/marketplace/pp/B06W2G9PRY/?ref=_ptnr_devblg_">Try RStudio Server Pro AWS</a></p></description></item><item><title>RStudio Connect 1.5.0 - Introducing Tags!</title><link>https://www.rstudio.com/blog/rstudio-connect-1-5-0-introducing-tags/</link><pubDate>Tue, 23 May 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-5-0-introducing-tags/</guid><description><p>We&rsquo;re excited to announce a powerful new ability to organize content in <a href="https://www.rstudio.com/products/connect/">RStudio Connect: version 1.5.0</a>. Tags allow publishers to arrange what they&rsquo;ve published and enable users to find and discover the content most relevant to them. The release also includes a newly designed (and customizable!) landing page and multiple important security enhancements.</p><p><img src="https://rstudioblog.files.wordpress.com/2017/05/screen-shot-2017-05-23-at-11-58-21-am.png" alt="New landing page in RStudio Connect v1.5.0"></p><h3 id="tagging-content-with-a-custom-tag-schema">Tagging Content with a Custom Tag Schema</h3><p>Tags can be used to manage and organize servers that have hundreds or even thousands of pieces of content published to them. Administrators can define a custom tag schema tailored to their organization. Publishers can then organize their content using tags, allowing all users to find the content they want by navigating through the tag schema.</p><p>See more details in the video below:</p><p><a href="http://fantastic.wistia.com/medias/m1onaqyodg?embedType=async&amp;videoFoam=true&amp;videoWidth=640">http://fantastic.wistia.com/medias/m1onaqyodg?embedType=async&amp;videoFoam=true&amp;videoWidth=640</a></p><h3 id="new-landing-page">New Landing Page</h3><p>The default landing page has been given a fresh look. 
Even better, administrators can now customize the landing page that logged-out users will see when they visit the server. More details <a href="http://docs.rstudio.com/connect/1.5.0/admin/custom-landing.html">here</a>.</p><h3 id="security-enhancements">Security Enhancements</h3><p>This release includes multiple important security enhancements, so we recommend deploying this update as soon as possible. Specifically, this release adds protection against cross-site request forgery (CSRF) attacks and fixes two bugs around account management that could have been used to grant an account more permissions than it should have. These bugs were identified internally and we are not aware of any instances of these issues being exploited against a customer&rsquo;s server.</p><p>Other notable changes in this release:</p><ul><li><p><code>[Authentication].Lifetime</code> can be used to define the duration of a user&rsquo;s session (the lifetime of their cookie) when they log in via a web browser. It still defaults to 30 days.</p></li><li><p>Servers configured to use password authentication can now choose to disable user self-registration using the <code>[Password].SelfRegistration</code> setting. By default, this feature is still enabled.</p></li><li><p>Added experimental support for using PostgreSQL instead of SQLite as Connect&rsquo;s database. If you&rsquo;re interested in helping to test this feature, please contact <a href="mailto:support@rstudio.com">support@rstudio.com</a>.</p></li><li><p>Allowed user and group names to contain periods.</p></li><li><p>Added support for the <a href="https://github.com/rstudio/config">config</a> package. More details <a href="http://docs.rstudio.com/connect/1.5.0/admin/process-management.html#using-the-config-package">here</a>.</p></li><li><p>Formally documented the configuration settings that support being reloaded via a <code>HUP</code> signal. 
Settings now mention &ldquo;Reloadable: true&rdquo; in the documentation if they support reloading.</p></li><li><p>Renamed the &ldquo;Performance&rdquo; tab for Shiny applications to &ldquo;Runtime.&rdquo;</p></li><li><p>Further improved database performance in high-traffic environments.</p></li></ul><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>You can expect the installation and startup of v1.5.0 to be completed in under a minute. Previously authenticated users will need to log in again the next time they visit the server.</p><p>If you’re upgrading from a release older than 1.4.6, be sure to consider the “Upgrade Planning” notes from those other releases, as well.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect">RStudio Connect</a>, we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. 
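<p>As a reference point, settings such as <code>[Authentication].Lifetime</code> and <code>[Password].SelfRegistration</code> live in Connect&rsquo;s server configuration file. The excerpt below is a sketch only; the values are examples, so consult the Admin Guide for the exact value syntax on your system:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-ini" data-lang="ini">; Illustrative excerpt of /etc/rstudio-connect/rstudio-connect.gcfg
; (values shown are examples, not recommendations)

[Authentication]
; Session cookie lifetime; the default is 30 days
Lifetime = 720h

[Password]
; Disable user self-registration (enabled by default)
SelfRegistration = false
</code></pre></div><p>Remember that Connect must be restarted (or sent a <code>HUP</code> signal, for settings documented as reloadable) before configuration changes take effect.</p>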
Additional resources can be found below.</p><ul><li><p><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></p></li><li><p><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></p></li><li><p><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></p></li><li><p><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></p></li><li><p><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></p></li><li><p><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></p></li></ul></description></item><item><title>shinydashboard 0.6.0</title><link>https://www.rstudio.com/blog/shinydashboard-0-6-0/</link><pubDate>Thu, 18 May 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shinydashboard-0-6-0/</guid><description><p>Shinydashboard 0.6.0 is now on CRAN! This release of shinydashboard was aimed at fixing bugs and bringing the package up to speed with users&rsquo; requests and Shiny itself (especially fully bringing <a href="https://shiny.rstudio.com/articles/bookmarking-state.html">bookmarkable state</a> to shinydashboard&rsquo;s sidebar). In addition to bug fixes and new features, we also added a <a href="https://rstudio.github.io/shinydashboard/behavior.html">new &ldquo;Behavior&rdquo; section</a> to the <a href="https://rstudio.github.io/shinydashboard/">shinydashboard website</a> to explain this release&rsquo;s two biggest new features, and also to provide users with more material about shinydashboard-specific behavior.</p><h2 id="sidebar">Sidebar</h2><p>This release introduces two new sidebar inputs. One of these inputs reports whether the sidebar is collapsed or expanded, and the other input reports which (if any) menu item in the sidebar is expanded. 
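<p>As a minimal sketch (not taken from the release notes), a server function could react to these two inputs like so; the <code>message()</code> calls are placeholders for whatever your app actually needs to do:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(shiny)
library(shinydashboard)

server &lt;- function(input, output, session) {
  # input$sidebarCollapsed is TRUE when the sidebar is collapsed, FALSE otherwise
  observeEvent(input$sidebarCollapsed, {
    message(&#34;Sidebar collapsed: &#34;, input$sidebarCollapsed)
  })

  # input$sidebarItemExpanded holds the expandedName of the open menuItem()
  # (observeEvent ignores the NULL value seen when nothing is expanded)
  observeEvent(input$sidebarItemExpanded, {
    message(&#34;Expanded menu item: &#34;, input$sidebarItemExpanded)
  })
}
</code></pre></div>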
In the screenshot below, the Charts tab is expanded.</p><p><img src="https://rstudioblog.files.wordpress.com/2017/05/sidebar-expanded.png" alt=""></p><p>These inputs are unusual since they&rsquo;re automatically available without you needing to declare them, and they have a fixed name. The first input is accessible via <code>input$sidebarCollapsed</code> and can have only two values: <code>TRUE</code>, which indicates that the sidebar is collapsed, and <code>FALSE</code>, which indicates that it is expanded (default).</p><p>The second input is accessible via <code>input$sidebarItemExpanded</code>. If no <code>menuItem()</code> in the sidebar is currently expanded, the value of this input is <code>NULL</code>. Otherwise, <code>input$sidebarItemExpanded</code> holds the value of the <code>expandedName</code> of whichever <code>menuItem()</code> is currently expanded (<code>expandedName</code> is a new argument to <code>menuItem()</code>; if none is provided, shinydashboard creates a sensible default).</p><h2 id="full-changes">Full changes</h2><p>As usual, you can view the full changelog for shinydashboard in the <a href="https://github.com/rstudio/shinydashboard/blob/v0.6.0/NEWS.md">NEWS</a> file.</p></description></item><item><title>Come see RStudio at an event near you next week!</title><link>https://www.rstudio.com/blog/come-see-rstudio-at-an-event-near-you-next-week/</link><pubDate>Fri, 12 May 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/come-see-rstudio-at-an-event-near-you-next-week/</guid><description><p>We love to engage with R and RStudio users online in webinars and communities because it is so efficient for everyone. But sometimes it&rsquo;s great to meet in person, too!</p><p>Next week RStudio will be in Miami, Baltimore and Chicago. 
We wanted to let you know in case you&rsquo;ll be there at the same time and want to &ldquo;Connect&rdquo; (yes, we said it :)) with us.</p><p>At each of these events we&rsquo;ll have the latest books signed by RStudio authors, t-shirts to win, demonstrations of RStudio Connect and RStudio Server Pro and, of course, stickers and cheatsheets. Share with us what you&rsquo;re doing with RStudio and get your product and company questions answered!</p><p><strong>Apache Big Data - Miami</strong></p><p>If big data is your thing, you use R, and you&rsquo;re headed to <a href="http://events.linuxfoundation.org/events/apache-big-data-north-america">Apache Big Data</a> in Miami May 15th through the 18th, you can find out in person how easy and practical it is to analyze big data with R and Spark.</p><p>While you&rsquo;re at the conference be sure to look us up at booth number 104 during the Expo Hall hours.</p><p><strong>PharmaSUG - Baltimore</strong></p><p>If you&rsquo;re in the Pharma industry, you use R, and you&rsquo;re headed to <a href="https://pharmasug.org/us/index.html">PharmaSUG</a> in Baltimore May 14th through the 17th, we hope you&rsquo;ll look us up. PharmaSUG is a not-to-be-missed event for programmers, statisticians, data managers, and others in the pharmaceutical, healthcare, and related industries.</p><p>Phil Bowsher from RStudio will be presenting <em>An Introduction to Shiny, R Markdown, and HTML Widgets for R with Applications in Drug Development</em> at 8am on Sunday, May 14th.</p><p>We will be in booth number 204 during the Expo Hall hours.</p><p><strong>R/Finance - Chicago</strong></p><p>Every year, new and interesting ways R is used in the financial industry surface at <a href="http://www.rinfinance.com/">R/Finance</a>. If you&rsquo;re going to Chicago May 19th and the 20th, we hope you&rsquo;ll come talk to us. 
You can&rsquo;t miss us at R/Finance!</p><p>Jonathan Regenstein from RStudio will be presenting <em>Reproducible Finance: A Global ETF Map and Shiny App</em> at 2pm on Saturday, May 20th.</p><p>Otherwise, if those aren&rsquo;t places you&rsquo;ll be next week, look for us in London, San Francisco, Brussels, or one of the many other <a href="https://www.rstudio.com/about/news-events/">events coming soon</a>!</p></description></item><item><title>readxl 1.0.0</title><link>https://www.rstudio.com/blog/readxl-1-0-0/</link><pubDate>Wed, 19 Apr 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/readxl-1-0-0/</guid><description><p>I&rsquo;m pleased to announce that <a href="http://readxl.tidyverse.org">readxl</a> 1.0.0 is available on CRAN. <a href="http://readxl.tidyverse.org">readxl</a> makes it easy to bring tabular data out of Excel and into R, for modern <code>.xlsx</code> files and the legacy <code>.xls</code> format. <a href="http://readxl.tidyverse.org">readxl</a> does not have any tricky external dependencies, such as Java or Perl, and is easy to install and use on Mac, Windows, and Linux.</p><p>You can install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">readxl&#34;</span>)</code></pre></div><p>As well as fixing many bugs, this release:</p><ul><li><p>Allows you to target specific cells for reading, in a variety of ways</p></li><li><p>Adds two new column types: <code>&quot;logical&quot;</code> and <code>&quot;list&quot;</code>, for data of disparate type</p></li><li><p>Is more resilient to the wondrous diversity in spreadsheets, e.g., those written by 3rd party tools</p></li></ul><p>You can see a full list of changes in the <a href="http://readxl.tidyverse.org/news/index.html">release notes</a>. 
This is the first release maintained by Jenny Bryan.</p><h2 id="specifying-the-data-rectangle">Specifying the data rectangle</h2><p>In an ideal world, data would live in a neat rectangle in the upper left corner of a spreadsheet. But spreadsheets often serve multiple purposes for users with different priorities. It is common to encounter several rows of notes above or below the data, for example. The new <code>range</code> argument provides a flexible interface for describing the data rectangle, including Excel-style ranges and row- or column-only ranges.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(readxl)<span style="color:#06287e">read_excel</span>(<span style="color:#06287e">readxl_example</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">deaths.xlsx&#34;</span>),range <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">arts!A5:F15&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 10 × 6</span><span style="color:#60a0b0;font-style:italic">#&gt; Name Profession Age `Has kids` `Date of birth`</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 David Bowie musician 69 TRUE 1947-01-08</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Carrie Fisher actor 60 TRUE 1956-10-21</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Chuck Berry musician 90 TRUE 1926-10-18</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Bill Paxton actor 61 TRUE 1955-05-17</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... 
with 6 more rows, and 1 more variables: `Date of death`</span><span style="color:#06287e">read_excel</span>(<span style="color:#06287e">readxl_example</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">deaths.xlsx&#34;</span>),sheet <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">other&#34;</span>,range <span style="color:#666">=</span> <span style="color:#06287e">cell_rows</span>(<span style="color:#40a070">5</span><span style="color:#666">:</span><span style="color:#40a070">15</span>))<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 10 × 6</span><span style="color:#60a0b0;font-style:italic">#&gt; Name Profession Age `Has kids` `Date of birth`</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Vera Rubin scientist 88 TRUE 1928-07-23</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Mohamed Ali athlete 74 TRUE 1942-01-17</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Morley Safer journalist 84 TRUE 1931-11-08</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Fidel Castro politician 90 TRUE 1926-08-13</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... with 6 more rows, and 1 more variables: `Date of death`</span></code></pre></div><p>There is also a new argument <code>n_max</code> that limits the number of data rows read from the sheet. It is an example of <a href="http://readxl.tidyverse.org">readxl</a>&lsquo;s evolution towards a <a href="http://readr.tidyverse.org">readr</a>-like interface. The <a href="http://readxl.tidyverse.org/articles/sheet-geometry.html">Sheet Geometry vignette</a> goes over all the options.</p><h2 id="column-typing">Column typing</h2><p>The new ability to target cells for reading means that <a href="http://readxl.tidyverse.org">readxl</a>&lsquo;s automatic column typing will &ldquo;just work&rdquo; for most sheets, most of the time. 
Above, the <code>Has kids</code> column is automatically detected as <code>logical</code>, which is a new column type for <a href="http://readxl.tidyverse.org">readxl</a>.</p><p>You can still specify column type explicitly via <code>col_types</code>, which gets a couple new features. If you provide exactly one type, it is recycled to the necessary length. The new type <code>&quot;guess&quot;</code> can be mixed with explicit types to specify some types, while leaving others to be guessed.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">read_excel</span>(<span style="color:#06287e">readxl_example</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">deaths.xlsx&#34;</span>),range <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">arts!A5:C15&#34;</span>,col_types <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">guess&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">skip&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">numeric&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 10 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; Name Age</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 David Bowie 69</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Carrie Fisher 60</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Chuck Berry 90</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Bill Paxton 61</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... with 6 more rows</span></code></pre></div><p>The new argument <code>guess_max</code> limits the rows used for type guessing. 
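<p>For instance, recycling and <code>guess_max</code> can be tried out together with the example workbook that ships with the package (a sketch; the range mirrors the examples above):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(readxl)
path &lt;- readxl_example(&#34;deaths.xlsx&#34;)

# A single col_type is recycled across all six columns
all_text &lt;- read_excel(path, range = &#34;arts!A5:F15&#34;, col_types = &#34;text&#34;)

# Guess column types from the first five data rows only
guessed &lt;- read_excel(path, range = &#34;arts!A5:F15&#34;, guess_max = 5)
</code></pre></div>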
Leading and trailing whitespace is trimmed when the new <code>trim_ws</code> argument is <code>TRUE</code>, which is the default. Finally, thanks to <a href="https://github.com/jmarshallnz">Jonathan Marshall</a>, multiple <code>na</code> values are accepted. The <a href="http://readxl.tidyverse.org/articles/cell-and-column-types.html">Cell and Column Types vignette</a> has more detail.</p><h3 id="list-columns"><code>&quot;list&quot;</code> columns</h3><p>Thanks to <a href="https://github.com/gergness">Greg Freedman Ellis</a> we now have a <code>&quot;list&quot;</code> column type. This is useful if you want to bring truly disparate data into R without the coercion required by atomic vector types.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">(df <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_excel</span>(<span style="color:#06287e">readxl_example</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">clippy.xlsx&#34;</span>),col_types <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">text&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">list&#34;</span>)))<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 4 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; name value</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;chr&gt; &lt;list&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Name &lt;chr [1]&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Species &lt;chr [1]&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Approx date of death &lt;dttm [1]&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Weight in grams &lt;dbl [1]&gt;</span>tibble<span style="color:#666">::</span><span 
style="color:#06287e">deframe</span>(df)<span style="color:#60a0b0;font-style:italic">#&gt; $Name</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;Clippy&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; $Species</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;paperclip&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; $`Approx date of death`</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;2007-01-01 UTC&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; $`Weight in grams`</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 0.9</span></code></pre></div><h2 id="everything-else">Everything else</h2><p>To learn more, read the <a href="http://readxl.tidyverse.org/articles/index.html">vignettes and articles</a> or <a href="http://readxl.tidyverse.org/news/index.html">release notes</a>. Highlights include:</p><ul><li><p>General rationalization of sheet geometry, including detection and treatment of empty rows and columns.</p></li><li><p>Improved behavior and messaging around coercion and mismatched cell and column types.</p></li><li><p>Improved handling of datetimes with respect to 3rd party software, rounding, and the <a href="https://support.microsoft.com/en-us/help/214326/excel-incorrectly-assumes-that-the-year-1900-is-a-leap-year">Lotus 1-2-3 leap year bug</a>.</p></li><li><p><code>read_xls()</code> and <code>read_xlsx()</code> are now exposed, so that files without an <code>.xls</code> or <code>.xlsx</code> extension can be read. 
Thanks <a href="https://github.com/jirkalewandowski">Jirka Lewandowski</a>!</p></li><li><p><a href="http://readxl.tidyverse.org/articles/articles/readxl-workflows.html">readxl Workflows</a> showcases patterns that reduce tedium and increase reproducibility when raw data arrives in a spreadsheet.</p></li></ul></description></item><item><title>dplyr 0.6.0 coming soon!</title><link>https://www.rstudio.com/blog/dplyr-0-6-0-coming-soon/</link><pubDate>Thu, 13 Apr 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dplyr-0-6-0-coming-soon/</guid><description><p>I&rsquo;m planning to submit dplyr 0.6.0 to CRAN on May 11 (in four weeks&rsquo; time). In preparation, I&rsquo;d like to announce that the release candidate, dplyr 0.5.0.9002, is now available. I would really appreciate it if you&rsquo;d try it out and report any problems. This will ensure that the official release has as few bugs as possible.</p><h2 id="installation">Installation</h2><p>Install the pre-release version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># install.packages(&#34;devtools&#34;)</span>devtools<span style="color:#666">::</span><span style="color:#06287e">install_github</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tidyverse/dplyr&#34;</span>)</code></pre></div><p>If you discover any problems, please file a minimal <a href="http://github.com/jennybc/reprex#readme">reprex</a> on <a href="https://github.com/tidyverse/dplyr/issues">GitHub</a>. 
You can roll back to the released version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">dplyr&#34;</span>)</code></pre></div><h2 id="features">Features</h2><p>dplyr 0.6.0 is a major release including over 100 bug fixes and improvements. There are three big changes that I want to touch on here:</p><ul><li><p>Databases</p></li><li><p>Improved encoding support (particularly for CJK on windows)</p></li><li><p>Tidyeval, a new framework for programming with dplyr</p></li></ul><p>You can see a complete list of changes in the draft <a href="https://github.com/tidyverse/dplyr/releases/tag/v0.6.0-rc">release notes</a>.</p><h3 id="databases">Databases</h3><p>Almost all database related code has been moved out of dplyr and into a new package, <a href="http://github.com/hadley/dbplyr/">dbplyr</a>. This makes dplyr simpler, and will make it easier to release fixes for bugs that only affect databases.</p><p>To install the development version of dbplyr so you can try it out, run:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">devtools<span style="color:#666">::</span><span style="color:#06287e">install_github</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">hadley/dbplyr&#34;</span>)</code></pre></div><p>There&rsquo;s one major change, as well as a whole heap of bug fixes and minor improvements. It is now no longer necessary to create a remote &ldquo;src&rdquo;. Instead you can work directly with the database connection returned by DBI, reflecting the robustness of the DBI ecosystem. 
Thanks largely to the work of <a href="https://github.com/krlmlr">Kirill Muller</a> (funded by the <a href="https://www.r-consortium.org">R Consortium</a>) DBI backends are now much more consistent, comprehensive, and easier to use. That means that there&rsquo;s no longer a need for a layer between you and DBI.</p><p>You can continue to use <code>src_mysql()</code>, <code>src_postgres()</code>, and <code>src_sqlite()</code> (which still live in dplyr), but I recommend a new style that makes the connection to DBI more clear:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">con <span style="color:#666">&lt;-</span> DBI<span style="color:#666">::</span><span style="color:#06287e">dbConnect</span>(RSQLite<span style="color:#666">::</span><span style="color:#06287e">SQLite</span>(), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">:memory:&#34;</span>)DBI<span style="color:#666">::</span><span style="color:#06287e">dbWriteTable</span>(con, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">iris&#34;</span>, iris)<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span>iris2 <span style="color:#666">&lt;-</span> <span style="color:#06287e">tbl</span>(con, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">iris&#34;</span>)iris2<span style="color:#60a0b0;font-style:italic">#&gt; Source: table&lt;iris&gt; [?? 
x 5]</span><span style="color:#60a0b0;font-style:italic">#&gt; Database: sqlite 3.11.1 [:memory:]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; Sepal.Length Sepal.Width Petal.Length Petal.Width Species</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 5.1 3.5 1.4 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 4.9 3.0 1.4 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 4.7 3.2 1.3 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 4.6 3.1 1.5 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 5.0 3.6 1.4 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 5.4 3.9 1.7 0.4 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 4.6 3.4 1.4 0.3 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 5.0 3.4 1.5 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 9 4.4 2.9 1.4 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 10 4.9 3.1 1.5 0.1 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... with more rows</span></code></pre></div><p>This is particularly useful if you want to perform non-SELECT queries as you can do whatever you want with <code>DBI::dbGetQuery()</code> and <code>DBI::dbExecute()</code>.</p><p>If you&rsquo;ve implemented a database backend for dplyr, please read the <a href="https://github.com/hadley/dbplyr/blob/master/NEWS.md#backends">backend news</a> to see what&rsquo;s changed from your perspective (not much). 
If you want to ensure your package works with both the current and previous version of dplyr, see <code>wrap_dbplyr_obj()</code> for helpers.</p><h3 id="character-encoding">Character encoding</h3><p>We have done a lot of work to ensure that dplyr works with encodings other than Latin1 on Windows. This is most likely to affect you if you work with data that contains Chinese, Japanese, or Korean (CJK) characters. dplyr should now just work with such data.</p><h3 id="tidyeval">Tidyeval</h3><p>dplyr has a new approach to non-standard evaluation (NSE) called tidyeval. Tidyeval is described in detail in a new <a href="http://dplyr.tidyverse.org/articles/programming.html">vignette about programming with dplyr</a> but, in brief, it gives you the ability to interpolate values in contexts where dplyr usually works with expressions:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">my_var &lt;- quo(homeworld)

starwars %&gt;%
  group_by(!!my_var) %&gt;%
  summarise_at(vars(height:mass), mean, na.rm = TRUE)
#&gt; # A tibble: 49 × 3
#&gt;         homeworld   height  mass
#&gt;             &lt;chr&gt;    &lt;dbl&gt; &lt;dbl&gt;
#&gt; 1        Alderaan 176.3333  64.0
#&gt; 2     Aleen Minor  79.0000  15.0
#&gt; 3          Bespin 175.0000  79.0
#&gt; 4      Bestine IV 180.0000 110.0
#&gt; 5  Cato Neimoidia 191.0000  90.0
#&gt; 6           Cerea 198.0000  82.0
#&gt; 7        Champala 196.0000   NaN
#&gt; 8       Chandrila 150.0000   NaN
#&gt; 9    Concord Dawn 183.0000  79.0
#&gt; 10       Corellia 175.0000  78.5
#&gt; # ... with 39 more rows</code></pre></div><p>This will make it much easier to eliminate copy-and-pasted dplyr code by extracting repeated code into a function.</p><p>This also means that the underscored version of each main verb (<code>filter_()</code>, <code>select_()</code>, etc.) is no longer needed, and so these functions have been deprecated (but remain around for backward compatibility).</p></description></item><item><title>tidyverse updates</title><link>https://www.rstudio.com/blog/tidyverse-updates/</link><pubDate>Wed, 12 Apr 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/tidyverse-updates/</guid><description><p>Over the past couple of months there have been a bunch of smaller releases to packages in the <a href="http://tidyverse.org">tidyverse</a>.
This includes:</p><ul><li><p><a href="http://forcats.tidyverse.org">forcats</a> 0.2.0, for working with factors.</p></li><li><p><a href="http://readr.tidyverse.org">readr</a> 1.1.0, for reading flat-files from disk.</p></li><li><p><a href="http://stringr.tidyverse.org">stringr</a> 1.2.0, for manipulating strings.</p></li><li><p><a href="http://tibble.tidyverse.org">tibble</a> 1.3.0, a modern re-imagining of the data frame.</p></li></ul><p>This blog post summarises the most important new features, and points to the full release notes where you can learn more.</p><p>(If you&rsquo;ve never heard of the tidyverse before, it&rsquo;s a set of packages that are designed to work together to help you do data science. The best place to learn all about it is <a href="http://r4ds.had.co.nz">R for Data Science</a>.)</p><h2 id="forcats-020">forcats 0.2.0</h2><p>forcats has three new functions:</p><ul><li><p><code>as_factor()</code> is a generic version of <code>as.factor()</code>, which creates factors from character vectors ordered by appearance, rather than alphabetically. This ensures that <code>as_factor(x)</code> will always return the same result, regardless of the current locale.</p></li><li><p><code>fct_other()</code> makes it easier to convert selected levels to &ldquo;other&rdquo;:</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x &lt;- factor(rep(LETTERS[1:6], times = c(10, 5, 1, 1, 1, 1)))

x %&gt;%
  fct_other(keep = c(&#34;A&#34;, &#34;B&#34;)) %&gt;%
  fct_count()
#&gt; # A tibble: 3 × 2
#&gt;       f     n
#&gt;
#&gt; 1     A    10
#&gt; 2     B     5
#&gt; 3 Other     4

x %&gt;%
  fct_other(drop = c(&#34;A&#34;, &#34;B&#34;)) %&gt;%
  fct_count()
#&gt; # A tibble: 5 × 2
#&gt;       f     n
#&gt;
#&gt; 1     C     1
#&gt; 2     D     1
#&gt; 3     E     1
#&gt; 4     F     1
#&gt; 5 Other    15</code></pre></div><ul><li><code>fct_relabel()</code> allows programmatic relabeling of levels:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x &lt;- factor(letters[1:3])
x
#&gt; [1] a b c
#&gt; Levels: a b c

x %&gt;% fct_relabel(function(x) paste0(&#34;-&#34;, x, &#34;-&#34;))
#&gt; [1] -a- -b- -c-
#&gt; Levels: -a- -b- -c-</code></pre></div><p>See the full list of other changes in the <a href="https://github.com/tidyverse/forcats/releases/tag/v0.2.0">release notes</a>.</p><h2 id="stringr-120">stringr 1.2.0</h2><p>This release includes
a change to the API: <code>str_match_all()</code> now returns NA if an optional group doesn&rsquo;t match (previously it returned <code>&quot;&quot;</code>). This is more consistent with <code>str_match()</code> and other match failures.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x &lt;- c(&#34;a=1,b=2&#34;, &#34;c=3&#34;, &#34;d=&#34;)

x %&gt;% str_match(&#34;(.)=(\\d)?&#34;)
#&gt;      [,1]  [,2] [,3]
#&gt; [1,] &#34;a=1&#34; &#34;a&#34;  &#34;1&#34;
#&gt; [2,] &#34;c=3&#34; &#34;c&#34;  &#34;3&#34;
#&gt; [3,] &#34;d=&#34;  &#34;d&#34;  NA

x %&gt;% str_match_all(&#34;(.)=(\\d)?,?&#34;)
#&gt; [[1]]
#&gt;      [,1]   [,2] [,3]
#&gt; [1,] &#34;a=1,&#34; &#34;a&#34;  &#34;1&#34;
#&gt; [2,] &#34;b=2&#34;  &#34;b&#34;  &#34;2&#34;
#&gt;
#&gt; [[2]]
#&gt;      [,1]  [,2] [,3]
#&gt; [1,] &#34;c=3&#34; &#34;c&#34;  &#34;3&#34;
#&gt;
#&gt; [[3]]
#&gt;      [,1] [,2] [,3]
#&gt; [1,] &#34;d=&#34; &#34;d&#34;  NA</code></pre></div><p>There are three new features:</p><ul><li>In <code>str_replace()</code>, <code>replacement</code> can now be a function. The function is called once for each match and its return value will be used as the replacement.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">redact &lt;- function(x) {
  str_dup(&#34;-&#34;, str_length(x))
}

x &lt;- c(&#34;It cost $500&#34;, &#34;We spent $1,200 on stickers&#34;)
x %&gt;% str_replace_all(&#34;\\$[0-9,]+&#34;, redact)
#&gt; [1] &#34;It cost ----&#34;                &#34;We spent ------ on stickers&#34;</code></pre></div><ul><li>New <code>str_which()</code> mimics <code>grep()</code>:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">fruit &lt;- c(&#34;apple&#34;, &#34;banana&#34;, &#34;pear&#34;, &#34;pinapple&#34;)

# Matching positions
str_which(fruit, &#34;p&#34;)
#&gt; [1] 1 3 4

# Matching values
str_subset(fruit, &#34;p&#34;)
#&gt; [1] &#34;apple&#34;    &#34;pear&#34;     &#34;pinapple&#34;</code></pre></div><ul><li>A new vignette (<a href="http://stringr.tidyverse.org/articles/regular-expressions.html"><code>vignette(&quot;regular-expressions&quot;)</code></a>) describes the details of the regular expressions supported by stringr. The main vignette (<a href="http://stringr.tidyverse.org/articles/stringr.html"><code>vignette(&quot;stringr&quot;)</code></a>) has been updated to give a high-level overview of the package.</li></ul><p>See the full list of other changes in the <a href="https://github.com/tidyverse/stringr/releases/tag/v1.2.0">release notes</a>.</p><h2 id="readr-110">readr 1.1.0</h2><p>readr gains two new features:</p><ul><li>All <code>write_*()</code> functions now support connections.
This means that you can write directly to compressed formats such as <code>.gz</code>, <code>.bz2</code>, or <code>.xz</code> (and readr will automatically do so if you use one of those suffixes).</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">write_csv(iris, &#34;iris.csv.bz2&#34;)</code></pre></div><ul><li><code>parse_factor(levels = NULL)</code> and <code>col_factor(levels = NULL)</code> will produce a factor column based on the levels in the data, mimicking factor parsing in base R (with the exception that levels are created in the order seen).</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">iris2 &lt;- read_csv(&#34;iris.csv.bz2&#34;, col_types = cols(Species = col_factor(levels = NULL)))</code></pre></div><p>See the full list of other changes in the <a href="https://github.com/tidyverse/readr/releases/tag/v1.1.0">release notes</a>.</p><h2 id="tibble-130">tibble 1.3.0</h2><p>tibble has one handy new function: <code>deframe()</code> is the opposite of <code>enframe()</code>: it turns a two-column data frame into a named vector.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df &lt;- tibble(x = c(&#34;a&#34;, &#34;b&#34;, &#34;c&#34;), y = 1:3)
deframe(df)
#&gt; a b c
#&gt; 1 2 3</code></pre></div><p>See the full list of other changes in the <a href="https://github.com/tidyverse/tibble/releases/tag/v1.3.0">release notes</a>.</p></description></item><item><title>RStudio Connect 1.4.6</title><link>https://www.rstudio.com/blog/rstudio-connect-1-4-6/</link><pubDate>Tue, 11 Apr 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-4-6/</guid><description><p>We&rsquo;re excited to announce the release of <a href="https://www.rstudio.com/products/connect/">RStudio Connect: version 1.4.6</a>. This is an incremental release which features significantly improved startup time and support for server-side Shiny bookmarks.</p><p><img src="https://rstudioblog.files.wordpress.com/2017/04/media-20170407.png" alt="Creating a server-side Shiny bookmark in RStudio Connect"></p><p><strong>Improved Startup &amp; Job Listing Time</strong></p><p>We now track R process jobs in the database, which allows us to list and query jobs much more quickly. This decreases the startup time of the RStudio Connect service &ndash; allowing even the busiest of servers to spin up in a matter of seconds.
Additionally, operations that involve listing jobs such as viewing process logs for a particular application should be noticeably faster.</p><p><strong>Server-Side Shiny Bookmarks</strong></p><p>Shiny v0.14 introduced a feature by which users could <a href="https://shiny.rstudio.com/articles/bookmarking-state.html">bookmark the current state of the application</a> by either encoding the state in the URL or saving the state to the server. As of this release, RStudio Connect now supports server-side bookmarking of Shiny applications.</p><p>Other notable changes this release:</p><ul><li><p>BREAKING: Changed the default for <code>Authorization.DefaultUserRole</code> from <code>publisher</code> to <code>viewer</code>. New users will now be created with a <code>viewer</code> account until promoted. The <a href="http://docs.rstudio.com/connect/1.4.5/admin/user-management.html#user-roles">user roles</a> documentation explains the differences. To restore the previous behavior, set <code>DefaultUserRole = publisher</code>. Because viewer users cannot be added as collaborators on content, this means that in order to add a remote user as a collaborator on content you must first create their account, then promote them to a publisher account.</p></li><li><p>Fixed a bug in the previous release that had broken <code>Applications.ViewerOnDemandReports</code> and <code>Applications.ViewerCustomizedReports</code>. These settings are again functional and allow you to manage the capabilities of a viewer of a parameterized report on the server.</p></li><li><p>Tune the number of concurrent processes to use when building R packages. This is controlled with the <code>Server.CompilationConcurrency</code> setting and passed as the value to the make flag <code>-jNUM</code>. The default is to permit four concurrent processes. 
Decrease this setting in low memory environments.</p></li><li><p>The <code>/etc/rstudio-connect/rstudio-connect.gcfg</code> file is installed with more restrictive permissions.</p></li><li><p>Log file downloads include a more descriptive file name by default. Previously, we used the naming convention <code>&lt;jobId&gt;.log</code>, which resulted in file names like <code>GBFCaiPE6tegbrEM.log</code>. Now, we use the naming convention <code>rstudio-connect.&lt;appId&gt;.&lt;reportId&gt;.&lt;bundleId&gt;.&lt;jobType&gt;.&lt;jobId&gt;.log</code>, which results in file names like <code>rstudio-connect.34.259.15.packrat_restore.GBFCaiPE6tegbrEM.log</code>.</p></li><li><p>Bundle the admin guide and user guide in the product. You can access both from the Documentation tab.</p></li><li><p>Implemented improved, pop-out filtering panel when filtering content, which offers a better experience on small/mobile screens.</p></li><li><p>Improvements to the parameterized report pane when the viewer does not have the authority to render custom versions of the document.</p></li><li><p>Database performance improvements which should improve performance in high-traffic environments.</p></li></ul><blockquote><h4 id="upgrade-planning">Upgrade Planning</h4><p>The migration of jobs from disk to the database may take a few minutes. The server will be unavailable during this migration which will be performed the first time RStudio Connect v1.4.6 starts. Even on the busiest of servers we would expect this migration to complete in under 5 minutes.</p></blockquote><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect">RStudio Connect</a> we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, etc.) 
with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45 day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><p><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></p></li><li><p><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></p></li><li><p><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></p></li><li><p><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></p></li><li><p><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></p></li><li><p><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></p></li></ul></description></item><item><title>Shiny 1.0.1</title><link>https://www.rstudio.com/blog/shiny-1-0-1/</link><pubDate>Wed, 05 Apr 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-1-0-1/</guid><description><p>Shiny 1.0.1 is now available on CRAN! This release primarily includes bug fixes and minor new features.</p><p>The most notable additions in this version of Shiny are the introduction of the <code>reactiveVal()</code> function (like <code>reactiveValues()</code>, but it only stores a single value), and that the choices of <code>radioButtons()</code> and <code>checkboxGroupInput()</code> can now contain HTML content instead of just plain text. We&rsquo;ve also added compatibility for the development version of <code>ggplot2</code>.</p><h2 id="breaking-changes">Breaking changes</h2><p>We unintentionally introduced a minor breaking change in that <code>checkboxGroupInput</code> used to accept <code>choices = NULL</code> to create an empty input. 
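</p><p>For example, an app may have created an empty group to be populated later (a minimal sketch; the input ID and label here are invented):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"># An empty checkbox group, intended to be filled in later,
# e.g. with updateCheckboxGroupInput()
checkboxGroupInput(&#34;letters&#34;, &#34;Choose letters:&#34;, choices = NULL)</code></pre></div><p>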
With Shiny 1.0.1, this throws an error; using <code>choices = character(0)</code> works. We intend to eliminate this breakage in Shiny 1.0.2.</p><p><strong>Update</strong> (4/20/2017): This has now been fixed in Shiny 1.0.2, currently available on CRAN.</p><p>Also, the <code>selected</code> argument for <code>radioButtons</code>, <code>checkboxGroupInput</code>, and <code>selectInput</code> once upon a time accepted the name of a choice, instead of the value of a choice; this behavior has been deprecated with a warning for several years now, and in Shiny 1.0.1 it is no longer supported at all.</p><h2 id="storing-single-reactive-values-with-reactiveval">Storing single reactive values with <code>reactiveVal</code></h2><p>The <code>reactiveValues</code> object has been a part of Shiny since the earliest betas. It acts like a reactive version of an environment or named list, in that you can store and retrieve values using names:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">rv &lt;- reactiveValues(clicks = 0)

observeEvent(input$button, {
  currentValue &lt;- rv$clicks
  rv$clicks &lt;- currentValue + 1
})</code></pre></div><p>If you only have a single value to store, though, it&rsquo;s a little awkward that you have to use a data structure designed for multiple values.</p><p>With the new <code>reactiveVal</code> function, you can now create a reactive object for a single variable:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">clicks &lt;- reactiveVal(0)

observeEvent(input$button, {
  currentValue &lt;- clicks()
  clicks(currentValue + 1)
})</code></pre></div><p>As you can see in this example, you can read the value by calling it like a function with no arguments, and you set the value by calling it with one argument.</p><p>This has the added benefit that you can easily pass the <code>clicks</code> object to another function or module (no need to wrap it in a <code>reactive()</code>).</p><h2 id="more-flexible-radiobuttons-and-checkboxgroupinput">More flexible <code>radioButtons</code> and <code>checkboxGroupInput</code></h2><p>It&rsquo;s now possible to create radio button and checkbox inputs with arbitrary HTML as labels. To do so, however, you need to pass different arguments to the functions. Now, when creating (or updating) either of <code>radioButtons()</code> or <code>checkboxGroupInput()</code>, you can specify the options in one of two (mutually exclusive) ways:</p><ul><li><p><strong>What we&rsquo;ve always had</strong>: Use the <code>choices</code> argument, which must be a vector or list. The names of each element are displayed in the app UI as labels (i.e. what the user sees in your app), and the values are used for computation (i.e. the value is what&rsquo;s returned by <code>input$rd</code>, where <code>rd</code> is a <code>radioButtons()</code> input).
If the vector (or list) is unnamed, the values provided are used for both the UI labels and the server values.</p></li><li><p><strong>What&rsquo;s new and allows HTML</strong>: Use both the <code>choiceNames</code> and the <code>choiceValues</code> arguments, each of which must be an <em>unnamed</em> vector or list (and both must have the same length). The elements in <code>choiceValues</code> must still be plain text (these are the values used for computation). But the elements in <code>choiceNames</code> (the UI labels) can be constructed out of HTML, either using the <code>HTML()</code> function, or an HTML tag generation function, like <code>tags$img()</code> and <code>icon()</code>.</p></li></ul><p><a href="https://gist.github.com/bborgesr/f2c865556af3b92e6991e1a34ced2a4a">Here&rsquo;s an example app</a> that demos the new functionality (in this case, we have a <code>checkboxGroupInput()</code> whose labels include the flag of the country they correspond to):</p><p><img src="https://rstudioblog.files.wordpress.com/2017/04/countries-shadow.png" alt=""></p><h2 id="ggplot2--221-compatibility"><code>ggplot2</code> &gt; 2.2.1 compatibility</h2><p>The development version of <code>ggplot2</code> has some changes that break compatibility with earlier versions of Shiny. The fixes in Shiny 1.0.1 will allow it to work with any version of <code>ggplot2</code>.</p><h2 id="a-note-on-shiny-v100">A note on Shiny v1.0.0</h2><p>In January of this year, we quietly released Shiny 1.0.0 to CRAN. A lot of work went into that release, but other than minor bug fixes and features, it was mostly laying the foundation for some important features that will arrive in the coming months. So if you&rsquo;re wondering if you missed the blog post for Shiny 1.0.0, you didn&rsquo;t.</p><h2 id="full-changes">Full changes</h2><p>As always, you can view the full changelog for Shiny 1.0.1 (and 1.0.0!)
in our <a href="https://github.com/rstudio/shiny/blob/master/NEWS.md">NEWS.md</a> file.</p></description></item><item><title>RStudio Connect 1.4.4.1</title><link>https://www.rstudio.com/blog/rstudio-connect-1-4-4-1/</link><pubDate>Tue, 28 Mar 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-4-4-1/</guid><description><p>We&rsquo;re excited to announce the release of <a href="https://www.rstudio.com/products/connect/">RStudio Connect: version 1.4.4.1</a>. This release includes the ability to manage different versions of your work on RStudio Connect.</p><p><img src="https://rstudioblog.files.wordpress.com/2017/03/screen-shot-2017-03-27-at-4-55-53-pm.png" alt="Managing old versions of deployed content in RStudio Connect"></p><p><strong>Rollback / Roll Forward</strong></p><p>The most notable feature of this release is the ability to &ldquo;rollback&rdquo; to a previously deployed version of your work or &ldquo;roll forward&rdquo; to a more recent version of your work.</p><p>You can also download a particular version, perhaps as a starting place for a new report or application, and delete old versions that you want to remove from the server.</p><p>Other important features allow you to:</p><ul><li><p>Specify the number of versions to retain. You can alter the setting <code>Applications.BundleRetentionLimit</code> to specify how many versions of your applications you want to keep on disk. By default, we retain all bundles eternally.</p></li><li><p>Limit the number of scheduled reports that will be run concurrently using the <code>Applications.ScheduleConcurrency</code> setting. This setting will help ensure that your server isn&rsquo;t overwhelmed by too many reports all scheduled to run at the same time of day.
The default is set to 2.</p></li><li><p>Create a printable view of your content with a new &ldquo;Print&rdquo; menu option.</p></li><li><p>Notify users of unsaved changes before they take an action in parameterized reports.</p></li></ul><p>The release also includes numerous security and stability improvements.</p><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect">RStudio Connect</a> we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45 day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><p><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></p></li><li><p><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></p></li><li><p><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></p></li><li><p><a href="http://docs.rstudio.com/connect/news/">Detailed news and changes between each version</a></p></li><li><p><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></p></li><li><p><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></p></li></ul></description></item><item><title>leaflet 1.1.0</title><link>https://www.rstudio.com/blog/leaflet-1-1-0/</link><pubDate>Wed, 22 Feb 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/leaflet-1-1-0/</guid><description><p>Leaflet 1.1.0 is now available on CRAN! 
The <a href="https://rstudio.github.io/leaflet/">Leaflet package</a> is a tidy wrapper for the <a href="http://leafletjs.com/">Leaflet.js</a> mapping library, and makes it incredibly easy to generate interactive maps based on spatial data you have in R.</p><p><a href="http://rstudio.github.io/leaflet/choropleths.html"><img src="https://rstudioblog.files.wordpress.com/2017/02/leaflet-choro.png" alt="leaflet-choro"></a></p><p>This release was nearly a year in the making, and includes many important new features.</p><ul><li><p>Easily add textual <a href="http://rstudio.github.io/leaflet/popups.html#labels">labels</a> on markers, polygons, etc., either on hover or statically</p></li><li><p><a href="http://rstudio.github.io/leaflet/shapes.html#highlighting-shapes">Highlight</a> polygons, lines, circles, and rectangles on hover</p></li><li><p>Markers can now be <a href="http://rstudio.github.io/leaflet/markers.html#awesome-icons">configured</a> with a variety of colors and icons, via integration with <a href="https://github.com/lvoogdt/Leaflet.awesome-markers">Leaflet.awesome-markers</a></p></li><li><p>Built-in support for many types of objects from <a href="https://cran.r-project.org/web/packages/sf/index.html"><code>sf</code></a>, a new way of representing spatial data in R (all basic <code>sf</code>/<code>sfc</code>/<code>sfg</code> types except <code>MULTIPOINT</code> and <code>GEOMETRYCOLLECTION</code> are directly supported)</p></li><li><p>Projections other than Web Mercator are <a href="http://rstudio.github.io/leaflet/projections.html">now supported</a> via <a href="https://github.com/kartena/Proj4Leaflet">Proj4Leaflet</a></p></li><li><p>Color palette functions now natively support <a href="https://bids.github.io/colormap/">viridis</a> palettes; use <code>&quot;viridis&quot;</code>, <code>&quot;magma&quot;</code>, <code>&quot;inferno&quot;</code>, or <code>&quot;plasma&quot;</code> as the palette argument</p></li><li><p>Discrete color palette functions 
(<code>colorBin</code>, <code>colorQuantile</code>, and <code>colorFactor</code>) work much better with <a href="http://colorbrewer2.org/">color brewer</a> palettes</p></li><li><p>Integration with <a href="http://rstudio.github.io/leaflet/morefeatures.html">several Leaflet.js utility plugins</a></p></li><li><p>Data with <code>NA</code> points or zero rows no longer causes errors</p></li><li><p>Support for linked brushing and filtering, via <a href="https://rstudio.github.io/crosstalk/">Crosstalk</a> (more about this to come in another blog post)</p></li></ul><p>Many thanks to <a href="https://github.com/bhaskarvk">@bhaskarvk</a> who contributed much of the code for this release.</p><p>Going forward, our intention is to prevent any more Leaflet.js plugins from accreting in the core leaflet package. Instead, we have made it possible to write 3rd party R packages that extend leaflet (though the process to do this is not documented yet). In the meantime, Bhaskar has started developing his own <a href="https://github.com/bhaskarvk/leaflet.extras">leaflet.extras</a> package; it already supports several plugins, for everything from animated markers to heatmaps.</p></description></item><item><title>rstudio::conf 2017 session recordings are now available</title><link>https://www.rstudio.com/blog/rstudioconf-2017-session-recordings-are-now-available/</link><pubDate>Wed, 15 Feb 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudioconf-2017-session-recordings-are-now-available/</guid><description><p>Couldn&rsquo;t make it to Orlando in January? We&rsquo;re excited to bring you the next best thing.</p><p>Whether you missed the conference, missed a talk, or just want to refresh your memory, you can find all the recordings from the first ever conference about All Things R &amp; RStudio at <a href="https://www.rstudio.com/resources/webinars/#rstudioconf">https://www.rstudio.com/resources/webinars/#rstudioconf</a>. 
Just click on +rstudio::conf 2017 when you get there to expand the list.</p><p>Of course, the session recordings can&rsquo;t capture the complete &ldquo;in person&rdquo; experience. You had to be there to immerse yourself in the 2-day workshops led by Hadley Wickham, Garrett Grolemund, and Joe Cheng; join in cocktails and conversation at the Datacamp Happy Hour; attend the exclusive Friday evening access to Universal Studio&rsquo;s Wizarding World of Harry Potter under a full moon; hang out at the RStudio Connect lounge and with our friends from Cloudera, Mango Solutions, MetrumRG, Quantide, and Supstat; or make and renew friendships with more than 400 attendees from every line of work and the entire RStudio team. We hope you&rsquo;ll consider joining us in January 2018 - location and exact dates still to be announced.</p><p>In the meantime, the recordings showcase some of the most interesting uses of R and RStudio for everyone.</p><p>Enjoy and share!</p></description></item><item><title>See RStudio + sparklyr for big data at Strata + Hadoop World</title><link>https://www.rstudio.com/blog/see-rstudio-sparklyr-for-big-data-at-strata-hadoop-world/</link><pubDate>Mon, 13 Feb 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/see-rstudio-sparklyr-for-big-data-at-strata-hadoop-world/</guid><description><p>If big data is your thing, you use R, and you&rsquo;re headed to Strata + Hadoop World in San Jose March 14 &amp; 15th, you can experience in person how easy and practical it is to analyze big data with R and Spark.</p><p>In a beginner-level talk by RStudio&rsquo;s Edgar Ruiz and an intermediate-level workshop by Win-Vector&rsquo;s John Mount, we cover the spectrum: what R is, what Spark is, how sparklyr works, and what is required to set up and tune a Spark cluster. 
You&rsquo;ll also learn practical applications including: how to quickly set up a local Spark instance, store big data in Spark and then connect to the data with R, use R to apply machine-learning algorithms to big data stored in Spark, and filter and aggregate big data stored in Spark and then import the results into R for analysis and visualization.</p><p>2:40pm–3:20pm Wednesday, March 15, 2017<br>Sparklyr: An R interface for Apache Spark<br>Edgar Ruiz (RStudio)<br>Primary topic: Spark &amp; beyond<br>Location: LL21 C/D<br>Level: Beginner<br>Secondary topics: R</p><p>1:30pm–5:00pm Tuesday, March 14, 2017<br>Modeling big data with R, sparklyr, and Apache Spark<br>John Mount (Win-Vector LLC)<br>Primary topic: Data science &amp; advanced analytics<br>Location: LL21 C/D<br>Level: Intermediate<br>Secondary topics: R</p><p>While you&rsquo;re at the conference be sure to look us up in the Innovator&rsquo;s Pavilion - booth number P8 during the Expo Hall hours. We&rsquo;ll have the latest books from RStudio authors, t-shirts to win, demonstrations of RStudio Connect and RStudio Server Pro and, of course, stickers and cheatsheets. Share with us what you&rsquo;re doing with RStudio and get your product and company questions answered by RStudio employees.</p><p>See you in San Jose! (<a href="https://conferences.oreilly.com/strata/strata-ca">https://conferences.oreilly.com/strata/strata-ca</a>)</p></description></item><item><title>RStudio Connect 1.4.2</title><link>https://www.rstudio.com/blog/rstudio-connect-1-4-2/</link><pubDate>Thu, 09 Feb 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-connect-1-4-2/</guid><description><p>We&rsquo;re excited to announce the latest release of <a href="https://www.rstudio.com/products/connect/">RStudio Connect: version 1.4.2</a>. 
This release includes a number of notable features including an overhauled interface for parameterized R Markdown reports.</p><p><img src="https://rstudioblog.files.wordpress.com/2017/02/screen-shot-2017-02-07-at-9-06-03-am.png" alt="Enhanced Parameterized R Markdown Reports"></p><p class="caption">Enhanced Parameterized R Markdown Reports</p><p>The most notable feature in this release is the ability to publish <a href="https://rmarkdown.rstudio.com/developer_parameterized_reports.html">parameterized R Markdown</a> reports that are easier for anyone to customize. If you&rsquo;re unfamiliar, parameterized R Markdown reports allow you to inject input parameters into your R Markdown document to alter what analysis the report performs. The parameters of your R Markdown report are now visible on the left-hand sidebar, allowing users to easily tweak the inputs to the document and quickly view the output in the browser.</p><p>Users even have the opportunity to create private versions of the report which they can schedule to run again, email, or save and revisit in the browser. Of course, you can continue to use the wide variety of output formats (notebooks, dashboards, books, and others) while using parameterized R Markdown.</p><p>In addition to the parameterized report overhaul, there are some other notable features included in this release.</p><ul><li><p><strong>Content private by default</strong> - Content is set to private (&ldquo;Just Me&rdquo;) by default. Users can still change the visibility of their content before publishing, as before.</p></li><li><p><strong>Execute R as the authenticated viewer</strong> - You can now choose to have some applications execute their underlying R process as the authenticated viewer currently looking at the app. This allows applications to access any data or resource that the associated user has access to on the server. Requires PAM authentication. 
<a href="http://docs.rstudio.com/connect/1.4.2/admin/process-management.html#process-management-runas-current">More details here</a>.</p></li></ul><p>Other important features include:</p><ul><li><p>Show progress indicator when updating a report.</p></li><li><p>Users can now filter content to include only items that they can edit or view.</p></li><li><p>Users now only count against the named user license limit after they log in for the first time.</p></li><li><p>Added support for global &ldquo;System Messages&rdquo; that can display an HTML message to your users on the landing pages. <a href="http://docs.rstudio.com/connect/1.4.2/admin/server-management.html#system-messages">Details here</a>.</p></li><li><p>Updated packrat to gain more transparency on package build errors.</p></li><li><p>Updated the list of SSL ciphers to correspond with modern best-practices.</p></li></ul><p>If you haven&rsquo;t yet had a chance to download and try <a href="https://rstudio.com/products/connect">RStudio Connect</a> we encourage you to do so. RStudio Connect is the best way to share all the work that you do in R (Shiny apps, R Markdown documents, plots, dashboards, etc.) with collaborators, colleagues, or customers.</p><p>You can find more details or download a 45 day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. 
Additional resources can be found below.</p><ul><li><p><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></p></li><li><p><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></p></li><li><p><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></p></li><li><p><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></p></li><li><p><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></p></li></ul></description></item><item><title>roxygen2 6.0.0</title><link>https://www.rstudio.com/blog/roxygen2-6-0-0/</link><pubDate>Wed, 01 Feb 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/roxygen2-6-0-0/</guid><description><p>roxygen2 6.0.0 is now available on CRAN. roxygen2 helps you document your packages by turning specially formatted inline comments into R&rsquo;s standard Rd format. It automates everything that can be automated, and provides helpers for sharing documentation between topics. Learn more at <a href="http://r-pkgs.had.co.nz/man.html">http://r-pkgs.had.co.nz/man.html</a>. Install the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">roxygen2&#34;</span>)</code></pre></div><p>There are two headline features in this version of roxygen2:</p><ul><li><p>Markdown support.</p></li><li><p>Improved documentation inheritance.</p></li></ul><p>These are described in detail below.</p><p>This release also included many minor improvements and bug fixes. For a full list of changes, please see <a href="https://github.com/klutometis/roxygen/releases/tag/v6.0.0">release notes</a>. 
A big thanks to all the contributors to this release: <a href="https://github.com/dlebauer">@dlebauer</a>, <a href="https://github.com/fmichonneau">@fmichonneau</a>, <a href="https://github.com/gaborcsardi">@gaborcsardi</a>, <a href="https://github.com/HenrikBengtsson">@HenrikBengtsson</a>, <a href="https://github.com/jefferis">@jefferis</a>, <a href="https://github.com/jeroenooms">@jeroenooms</a>, <a href="https://github.com/jimhester">@jimhester</a>, <a href="https://github.com/kevinushey">@kevinushey</a>, <a href="https://github.com/krlmlr">@krlmlr</a>, <a href="https://github.com/LiNk-NY">@LiNk-NY</a>, <a href="https://github.com/lorenzwalthert">@lorenzwalthert</a>, <a href="https://github.com/maxheld83">@maxheld83</a>, <a href="https://github.com/nteetor">@nteetor</a>, <a href="https://github.com/shrektan">@shrektan</a>, <a href="https://github.com/yutannihilation">@yutannihilation</a></p><h2 id="markdown">Markdown</h2><p>Thanks to the hard work of <a href="http://gaborcsardi.org/">Gabor Csardi</a> you can now write roxygen2 comments in markdown. While we have tried to make markdown mode as backward compatible as possible, there are a few cases where you will need to make some minor changes. For this reason, you&rsquo;ll need to explicitly opt-in to markdown support. 
There are two ways to do so:</p><ul><li><p>Add <code>Roxygen: list(markdown = TRUE)</code> to your <code>DESCRIPTION</code> to turn it on everywhere.</p></li><li><p>Add <code>@md</code> to individual roxygen blocks to enable it for selected topics.</p></li></ul><p>roxygen2&rsquo;s markdown dialect supports inline formatting (bold, italics, code), lists (numbered and bulleted), and a number of helpful link shortcuts:</p><ul><li><p><code>[func()]</code>: links to a function in the current package, and is translated to <code>\code{\link[=func]{func()}}</code></p></li><li><p><code>[object]</code>: links to an object in the current package, and is translated to <code>\link{object}</code></p></li><li><p><code>[link text][object]</code>: links to an object with custom text, and is translated to <code>\link[=object]{link text}</code></p></li></ul><p>Similarly, you can link to functions and objects in other packages with <code>[pkg::func()]</code>, <code>[pkg::object]</code>, and <code>[link text][pkg::object]</code>. For a complete list of syntax, and how to handle common problems, please see <code>vignette(&quot;markdown&quot;)</code> for more details.</p><p>To convert an existing roxygen2 package to use markdown, try <a href="https://github.com/r-pkgs/roxygen2md">https://github.com/r-pkgs/roxygen2md</a>. Happy markdown-ing!</p><h2 id="improved-inheritance">Improved inheritance</h2><p>Writing documentation is challenging because you want to reduce duplication as much as possible (so you don&rsquo;t accidentally end up with inconsistent documentation) but you don&rsquo;t want the user to have to follow a spider&rsquo;s web of cross-references. This version of roxygen2 provides more support for writing documentation in one place and then reusing it in multiple topics.</p><p>The new <code>@inherit</code> tag allows you to inherit parameters, return, references, title, description, details, sections, and seealso from another topic. 
<code>@inherit my_fun</code> will inherit everything; <code>@inherit my_fun return params</code> will inherit only the specified components. <code>@inherit fun sections</code> will inherit all sections; if you&rsquo;d like to inherit a single section, you can use <code>@inheritSection fun title</code>. You can also inherit from a topic in another package with <code>@inherit pkg::fun</code>.</p><p>Another new tag is <code>@inheritDotParams</code>, which allows you to automatically generate parameter documentation for <code>...</code> for the common case where you pass <code>...</code> on to another function. The documentation generated is similar to the style used in <code>?plot</code> and will eventually be incorporated into RStudio&rsquo;s autocomplete. When you pass along <code>...</code> you often override some arguments, so the tag has a flexible specification:</p><ul><li><p><code>@inheritDotParams foo</code> takes all parameters from <code>foo()</code>.</p></li><li><p><code>@inheritDotParams foo a b e:h</code> takes parameters <code>a</code>, <code>b</code>, and all parameters between <code>e</code> and <code>h</code>.</p></li><li><p><code>@inheritDotParams foo -x -y</code> takes all parameters except for <code>x</code> and <code>y</code>.</p></li></ul><p>All the <code>@inherit</code> tags (including the existing <code>@inheritParams</code>) now work recursively, so you can inherit from a function that inherited from elsewhere.</p><p>If you want to generate a basic package documentation page (accessible from <code>package?packagename</code> and <code>?packagename</code>), you can document the special sentinel value <code>&quot;_PACKAGE&quot;</code>. It automatically uses the title, description, authors, url and bug reports fields from the <code>DESCRIPTION</code>. 
The simplest approach is to do this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#&#39; @keywords internal</span>
<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">_PACKAGE&#34;</span></code></pre></div><p>It only includes what&rsquo;s already in the DESCRIPTION, but it will typically be easier for R users to access.</p></description></item><item><title>sparklyr 0.5: Livy and dplyr improvements</title><link>https://www.rstudio.com/blog/sparklyr-0-5/</link><pubDate>Tue, 24 Jan 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/sparklyr-0-5/</guid><description><p>We&rsquo;re happy to announce that version 0.5 of the <a href="https://cran.rstudio.com/package=sparklyr">sparklyr package</a> is now available on CRAN. The new version comes with many improvements over the first release, including:</p><ul><li><p><strong>Extended dplyr</strong> support by implementing <code>do()</code> and <code>n_distinct()</code>.</p></li><li><p><strong>New functions</strong> including <code>sdf_quantile()</code>, <code>ft_tokenizer()</code> and <code>ft_regex_tokenizer()</code>.</p></li><li><p><strong>Improved compatibility</strong>: sparklyr now respects the value of the &lsquo;na.action&rsquo; R option and supports <code>dim()</code>, <code>nrow()</code> and <code>ncol()</code>.</p></li><li><p><strong>Experimental</strong> support for <a href="http://livy.io/">Livy</a> to enable clients, including <a href="https://www.rstudio.com/products/rstudio/">RStudio</a>, to connect remotely to <a href="http://spark.apache.org/">Apache Spark</a>.</p></li><li><p><strong>Improved connections</strong> by simplifying initialization and providing error diagnostics.</p></li><li><p><strong>Certified</strong> sparklyr, <a href="https://www.rstudio.com/products/rstudio-server-pro2/">RStudio Server Pro</a> and <a 
href="https://www.rstudio.com/products/shiny-server-pro/">ShinyServer Pro</a> with <a href="http://www.cloudera.com/">Cloudera</a>.</p></li><li><p><strong>Updated</strong> <a href="http://spark.rstudio.com">spark.rstudio.com</a> with new <a href="https://spark.rstudio.com/deployment_examples.html">deployment examples</a> and a sparklyr <a href="https://spark.rstudio.com/images/sparklyr-cheatsheet.pdf">cheatsheet</a>.</p></li></ul><p>Additional changes and improvements can be found in the <a href="https://github.com/rstudio/sparklyr/blob/master/NEWS.md">sparklyr NEWS</a> file.</p><p>For questions or feedback, please feel free to open a <a href="https://github.com/rstudio/sparklyr/issues">sparklyr github issue</a> or a <a href="http://stackoverflow.com/questions/tagged/sparklyr">sparklyr stackoverflow question</a>.</p><h2 id="extended-dplyr-support">Extended dplyr support</h2><p><code>sparklyr 0.5</code> adds supports for <code>n_distinct()</code> as a faster and more concise equivalent of <code>length(unique(x))</code> and also adds support for <code>do()</code> as a convenient way to perform multiple serial computations over a <code>group_by()</code> operation:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">local&#34;</span>)mtcars_tbl <span style="color:#666">&lt;-</span> <span style="color:#06287e">copy_to</span>(sc, mtcars, overwrite <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)by_cyl <span style="color:#666">&lt;-</span> <span style="color:#06287e">group_by</span>(mtcars_tbl, cyl)fit_sparklyr <span style="color:#666">&lt;-</span> by_cyl <span 
style="color:#666">%&gt;%</span><span style="color:#06287e">do</span>(mod <span style="color:#666">=</span> <span style="color:#06287e">ml_linear_regression</span>(mpg <span style="color:#666">~</span> disp, data <span style="color:#666">=</span> .))<span style="color:#60a0b0;font-style:italic"># display results</span>fit_sparklyr<span style="color:#666">$</span>mod</code></pre></div><p>In this case, <code>.</code> represents a Spark DataFrame, which allows us to perform operations at scale (like this linear regression) for a small set of groups. However, since each group operation is performed sequentially, it is not recommended to use <code>do()</code> with a large number of groups. The code above performs multiple linear regressions with the following output:</p><pre><code>[[1]]Call: ml_linear_regression(mpg ~ disp, data = .)Coefficients:(Intercept) disp19.081987419 0.003605119[[2]]Call: ml_linear_regression(mpg ~ disp, data = .)Coefficients:(Intercept) disp40.8719553 -0.1351418[[3]]Call: ml_linear_regression(mpg ~ disp, data = .)Coefficients:(Intercept) disp22.03279891 -0.01963409</code></pre><p>It&rsquo;s worth mentioning that while <code>sparklyr</code> provides comprehensive support for <code>dplyr</code>, <code>dplyr</code> is not strictly required while using <code>sparklyr</code>. 
For instance, one can make use of <code>DBI</code> without <code>dplyr</code> as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)
<span style="color:#06287e">library</span>(DBI)
sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">local&#34;</span>)
<span style="color:#06287e">sdf_copy_to</span>(sc, iris)
<span style="color:#06287e">dbGetQuery</span>(sc, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">SELECT * FROM iris LIMIT 4&#34;</span>)</code></pre></div><pre><code>  Sepal_Length Sepal_Width Petal_Length Petal_Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa</code></pre><h2 id="new-functions">New functions</h2><p>The new <code>sdf_quantile()</code> function computes approximate quantiles (to some relative error), while the new <code>ft_tokenizer()</code> and <code>ft_regex_tokenizer()</code> functions split a string by white spaces or regex patterns.</p><p>For example, <code>ft_tokenizer()</code> can be used as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)
<span style="color:#06287e">library</span>(janeaustenr)
<span style="color:#06287e">library</span>(dplyr)
sc <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">spark_dataframe</span>() <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">na.omit</span>() <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">ft_tokenizer</span>(input.col <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">text&#34;</span>, output.col <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tokens&#34;</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">head</span>(<span style="color:#40a070">4</span>)</code></pre></div><p>Which produces the following output:</p><pre><code>                   text                book     tokens
                  &lt;chr&gt;               &lt;chr&gt;     &lt;list&gt;
1 SENSE AND SENSIBILITY Sense &amp; Sensibility &lt;list [3]&gt;
2                       Sense &amp; Sensibility &lt;list [1]&gt;
3        by Jane Austen Sense &amp; Sensibility &lt;list [3]&gt;
4                       Sense &amp; Sensibility &lt;list [1]&gt;</code></pre><p>Tokens can be further processed through, for instance, <a href="http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.feature.HashingTF">HashingTF</a>.</p><h2 id="improved-compatibility">Improved compatibility</h2><p>&lsquo;na.action&rsquo; is a parameter accepted as part of the &lsquo;ml.options&rsquo; argument, which defaults to <code>getOption(&quot;na.action&quot;, &quot;na.omit&quot;)</code>. 
This allows <code>sparklyr</code> to match the behavior of R while processing NA records. For instance, the following linear model drops NA records appropriately:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)
<span style="color:#06287e">library</span>(dplyr)
<span style="color:#06287e">library</span>(nycflights13)
sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">local&#34;</span>)
flights_tbl <span style="color:#666">&lt;-</span> <span style="color:#06287e">na.omit</span>(<span style="color:#06287e">copy_to</span>(sc, flights))
<span style="color:#06287e">ml_linear_regression</span>(flights_tbl,
  response <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">dep_delay&#34;</span>,
  features <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">arr_delay&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">arr_time&#34;</span>))</code></pre></div><pre><code>* Dropped 9430 rows with 'na.omit' (336776 =&gt; 327346)

Call: ml_linear_regression(flights_tbl, response = &quot;dep_delay&quot;,
    features = c(&quot;arr_delay&quot;, &quot;arr_time&quot;))

Coefficients:
 (Intercept)    arr_delay     arr_time
6.1001212994 0.8210307947 0.0005284729</code></pre><p>In addition, <code>dim()</code>, <code>nrow()</code> and <code>ncol()</code> are now supported against Spark DataFrames.</p><h2 id="livy-connections">Livy connections</h2><p><a href="http://livy.io/">Livy</a>, <em>&ldquo;An Open Source REST Service for Apache Spark (Apache License)&rdquo;</em>, is now available in <code>sparklyr 0.5</code> as an <strong>experimental</strong> feature. 
Among many scenarios, this enables connections from the RStudio desktop to Apache Spark when Livy is available and correctly configured in the remote cluster.</p><h2 id="livy-running-locally">Livy running locally</h2><p>To work with Livy locally, <code>sparklyr</code> supports <code>livy_install()</code>, which installs Livy in your local environment; this is similar to <code>spark_install()</code>. Since Livy is a service to enable remote connections into Apache Spark, the service needs to be started with <code>livy_service_start()</code>. Once the service is running, <code>spark_connect()</code> needs to reference the running service and use <code>method = &quot;livy&quot;</code>; then <code>sparklyr</code> can be used as usual. A short example follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">livy_install</span>()
<span style="color:#06287e">livy_service_start</span>()
sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">http://localhost:8998&#34;</span>,
  method <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">livy&#34;</span>)
<span style="color:#06287e">copy_to</span>(sc, iris)
<span style="color:#06287e">spark_disconnect</span>(sc)
<span style="color:#06287e">livy_service_stop</span>()</code></pre></div><h2 id="livy-running-in-hdinsight">Livy running in HDInsight</h2><p><a href="https://azure.microsoft.com/">Microsoft Azure</a> supports Apache Spark clusters configured with Livy and protected with basic authentication in <a href="https://azure.microsoft.com/en-us/services/hdinsight/">HDInsight clusters</a>. 
To use <code>sparklyr</code> with HDInsight clusters through Livy, first create the HDInsight cluster with Spark support:</p><p><img src="https://rstudioblog.files.wordpress.com/2017/01/hdinsight-azure.png" alt="hdinsight-azure">Creating Spark Cluster in Microsoft Azure HDInsight</p><p>Once the cluster is created, you can connect with <code>sparklyr</code> as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)<span style="color:#06287e">library</span>(dplyr)config <span style="color:#666">&lt;-</span> <span style="color:#06287e">livy_config</span>(user <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">admin&#34;</span>, password <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">password&#34;</span>)sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">https://dm.azurehdinsight.net/livy/&#34;</span>,method <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">livy&#34;</span>,config <span style="color:#666">=</span> config)<span style="color:#06287e">copy_to</span>(sc, iris)</code></pre></div><p>From a desktop running RStudio, the remote connection looks like this:</p><p><img src="https://rstudioblog.files.wordpress.com/2017/01/rstudio-hdinsight-azure.png" alt="rstudio-hdinsight-azure.png"></p><h2 id="improved-connections">Improved connections</h2><p><code>sparklyr 0.5</code> no longer requires internet connectivity to download additional <a href="https://spark-packages.org/">Apache Spark packages</a>. 
This enables connections in secure clusters that do not have internet access, or while on the go.</p><p>Some community members reported a generic <em>&ldquo;Ports file does not exists&rdquo;</em> error while connecting with <code>sparklyr 0.4</code>. In <code>0.5</code>, we&rsquo;ve deprecated the ports file and improved error reporting. For instance, the following invalid connection example throws a descriptive error along with the <code>spark-submit</code> parameters and logging information that help us troubleshoot connection issues.</p><pre><code>&gt; library(sparklyr)
&gt; sc &lt;- spark_connect(master = &quot;local&quot;,
    config = list(&quot;sparklyr.gateway.port&quot; = &quot;0&quot;))
Error in force(code) :
Failed while connecting to sparklyr to port (0) for sessionid (5305):
Gateway in port (0) did not respond.
Path: /spark-1.6.2-bin-hadoop2.6/bin/spark-submit
Parameters: --class, sparklyr.Backend, 'sparklyr-1.6-2.10.jar', 0, 5305

---- Output Log ----
16/12/12 12:42:35 INFO sparklyr: Session (5305) starting

---- Error Log ----</code></pre><p>Additional technical details can be found in the <a href="https://github.com/rstudio/sparklyr/pull/238">sparklyr gateway socket</a> pull request.</p><h2 id="cloudera-certification">Cloudera certification</h2><p><a href="https://cran.rstudio.com/package=sparklyr">sparklyr</a> 0.4, sparklyr 0.5, <a href="https://www.rstudio.com/products/rstudio-server-pro2/">RStudio Server Pro 1.0</a> and <a href="https://www.rstudio.com/products/shiny-server-pro/">Shiny Server Pro 1.5</a> went through <a href="http://www.cloudera.com/partners/certified-technology.html">Cloudera&rsquo;s certification</a> and are now certified with <a href="http://www.cloudera.com/">Cloudera</a>. 
Among various benefits, authentication features like <a href="https://en.wikipedia.org/wiki/Kerberos_(protocol)">Kerberos</a> have been tested and validated against secured clusters.</p><p>For more information see <a href="http://www.cloudera.com/partners/partners-listing.html?q=rstudio">Cloudera&rsquo;s partner listings</a>.</p></description></item><item><title>xml2 1.1.1</title><link>https://www.rstudio.com/blog/xml-1-1-1/</link><pubDate>Tue, 24 Jan 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/xml-1-1-1/</guid><description><p>Today we are pleased to release version 1.1.1 of xml2. xml2 makes it easy to read, create, and modify XML with R. You can install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">xml2&#34;</span>)</code></pre></div><p>As well as fixing many bugs, this release:</p><ul><li><p>Makes it easier to create and modify XML</p></li><li><p>Improves roundtrip support between XML and lists</p></li><li><p>Adds support for XML validation and XSLT transformations.</p></li></ul><p>You can see a full list of changes in the <a href="https://github.com/hadley/xml2/releases/tag/v1.1.1">release notes</a>. 
This is the first release maintained by <a href="https://github.com/jimhester">Jim Hester</a>.</p><h2 id="creating-and-modifying-xml">Creating and modifying XML</h2><p>xml2 has been overhauled with a set of methods to make generating and modifying XML easier:</p><ul><li><code>xml_new_root()</code> can be used to create a new document and root node simultaneously.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">xml_new_root</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">xml_add_child</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">y&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">xml_root</span>()<span style="color:#60a0b0;font-style:italic">#&gt; {xml_document}</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;x&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] &lt;y/&gt;</span></code></pre></div><ul><li>New <code>xml_set_text()</code>, <code>xml_set_name()</code>, <code>xml_set_attr()</code>, and <code>xml_set_attrs()</code> make it easy to modify nodes within a pipeline.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_xml</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">&lt;a&gt;</span><span style="color:#4070a0"> &lt;b /&gt;</span><span style="color:#4070a0"> &lt;c&gt;&lt;b/&gt;&lt;/c&gt;</span><span style="color:#4070a0"> &lt;/a&gt;&#34;</span>)x<span style="color:#60a0b0;font-style:italic">#&gt; {xml_document}</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;a&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 
&lt;b/&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [2] &lt;c&gt;\n &lt;b/&gt;\n&lt;/c&gt;</span>x <span style="color:#666">%&gt;%</span><span style="color:#06287e">xml_find_all</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">.//b&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">xml_set_name</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">banana&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">xml_set_attr</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">oldname&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>)x<span style="color:#60a0b0;font-style:italic">#&gt; {xml_document}</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;a&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] &lt;banana oldname=&#34;b&#34;/&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [2] &lt;c&gt;\n &lt;banana oldname=&#34;b&#34;/&gt;\n&lt;/c&gt;</span></code></pre></div><ul><li><p>New <code>xml_add_parent()</code> makes it easy to insert a node as the parent of an existing node.</p></li><li><p>You can create more esoteric node types with <code>xml_comment()</code> (comments), <code>xml_cdata()</code> (CDATA nodes), and <code>xml_dtd()</code> (DTDs).</p></li></ul><h2 id="coercion-to-and-from-r-lists">Coercion to and from R Lists</h2><p>xml2 1.1.1 improves support for converting to and from R lists, thanks in part to work by <a href="https://github.com/peterfoley">Peter Foley</a> and <a href="https://github.com/jennybc">Jenny Bryan</a>. In particular xml2 now supports preserving the root node name as well as saving all xml2 attributes as R attributes. 
These changes allow you to convert most XML documents to and from R lists with <code>as_list()</code> and <code>as_xml_document()</code> without loss of data.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_xml</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">&lt;fruits&gt;&lt;apple color = &#39;red&#39; /&gt;&lt;/fruits&gt;&#34;</span>)x<span style="color:#60a0b0;font-style:italic">#&gt; {xml_document}</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;fruits&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] &lt;apple color=&#34;red&#34;/&gt;</span><span style="color:#06287e">as_list</span>(x)<span style="color:#60a0b0;font-style:italic">#&gt; $apple</span><span style="color:#60a0b0;font-style:italic">#&gt; list()</span><span style="color:#60a0b0;font-style:italic">#&gt; attr(,&#34;color&#34;)</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;red&#34;</span><span style="color:#06287e">as_xml_document</span>(<span style="color:#06287e">as_list</span>(x))<span style="color:#60a0b0;font-style:italic">#&gt; {xml_document}</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;apple color=&#34;red&#34;&gt;</span></code></pre></div><h2 id="xml-validation-and-xslt">XML validation and xslt</h2><p>xml2 1.1.1 also adds support for XML validation, thanks to <a href="https://github.com/jeroenooms">Jeroen Ooms</a>. 
Simply read the document and schema files and call <code>xml_validate()</code>.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">doc <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_xml</span>(<span style="color:#06287e">system.file</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">extdata/order-doc.xml&#34;</span>, package <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">xml2&#34;</span>))schema <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_xml</span>(<span style="color:#06287e">system.file</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">extdata/order-schema.xml&#34;</span>, package <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">xml2&#34;</span>))<span style="color:#06287e">xml_validate</span>(doc, schema)<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span><span style="color:#60a0b0;font-style:italic">#&gt; attr(,&#34;errors&#34;)</span><span style="color:#60a0b0;font-style:italic">#&gt; character(0)</span></code></pre></div><p>Jeroen also released the first xml2 extension package in conjunction with xml2 1.1.1, <a href="https://cran.r-project.org/package=xslt">xslt</a>. 
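</p><p>As a quick sketch, a transformation with xslt looks like the following (the file names here are hypothetical):</p><pre><code>library(xml2)
library(xslt)
doc   &lt;- read_xml(&quot;input.xml&quot;)
style &lt;- read_xml(&quot;style.xsl&quot;)
# apply the stylesheet to the document
out   &lt;- xml_xslt(doc, style)
write_xml(out, &quot;output.html&quot;)</code></pre><p>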
xslt allows one to apply <a href="https://en.wikipedia.org/wiki/XSLT">XSLT (Extensible Stylesheet Language Transformations)</a> to XML documents, which are great for transforming XML data into other formats such as HTML.</p></description></item><item><title>Announcing RStudio Connect - For all the work your teams do in R</title><link>https://www.rstudio.com/blog/announcing-rstudio-connect-for-all-the-work-your-teams-do-in-r/</link><pubDate>Tue, 10 Jan 2017 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rstudio-connect-for-all-the-work-your-teams-do-in-r/</guid><description><p>We&rsquo;re thrilled to officially introduce the newest product in RStudio&rsquo;s product lineup: <a href="https://www.rstudio.com/products/connect/">RStudio Connect</a>.</p><p>You can download a free 45-day trial of it <a href="https://www.rstudio.com/products/connect/">here</a>.</p><p>RStudio Connect is a new publishing platform for all the work your teams do in R. It provides a single destination for your Shiny applications, R Markdown documents, interactive HTML widgets, static plots, and more.</p><p><img src="https://rstudioblog.files.wordpress.com/2017/01/settings-panel-shiny.png" alt="RStudio Connect Settings"></p><p>RStudio Connect isn&rsquo;t just for R users. Now anyone can interact with custom built analytical data products developed by R users without having to program in R themselves. Team members can receive updated reports built on the same models/forecasts which can be configured to be rebuilt and distributed on a scheduled basis. RStudio Connect is designed to bring the power of data science to your entire enterprise.</p><p>RStudio Connect empowers analysts to share and manage the content they&rsquo;ve created in R. 
Users of the RStudio IDE can publish content to RStudio Connect with the click of a button and immediately be able to manage that content from a user-friendly web application: setting access controls and performance settings and viewing the logs of the associated R processes on the server.</p><p><img src="https://rstudioblog.files.wordpress.com/2017/01/combined-optim2.gif" alt="Deploying content from the RStudio IDE into RStudio Connect"></p><p>RStudio Connect is on-premises software that you can install on a server behind your firewall, ensuring that your data and R applications never have to leave your organization&rsquo;s control. We integrate with many enterprise authentication platforms, including LDAP/Active Directory, Google OAuth, PAM, and proxied authentication. We also provide an option to use an internal username/password system complete with user self-sign-up.</p><p><img src="https://rstudioblog.files.wordpress.com/2017/01/screen-shot-2017-01-10-at-10-05-03-am-e1484064347603.png" alt="RStudio Connect Admin Metrics"></p><p>RStudio Connect has been in Beta for almost a year. We&rsquo;ve had hundreds of customers validate and help us improve the software in that time. In November, we made RStudio Connect generally available without significant fanfare and began to work with Beta participants and existing RStudio customers eager to move it into their production environments. We are pleased that innovative early customers, like AdRoll, have already successfully introduced RStudio Connect into their data science process.</p><blockquote>"At AdRoll, we have used the open source version of Shiny Server for years to great success but deploying apps always served as a barrier for new users. With RStudio Connect's push button deployment from the RStudio IDE, the number of shiny devs has grown tremendously both in engineering and across teams completely new to shiny like finance and marketing. 
It's been really powerful for those just getting started to be able to go from developing locally to sharing apps with others in just seconds."<ul><li>Bryan Galvin, Senior Data Scientist, AdRoll</li></ul></blockquote><p>We invite you to take a look at RStudio Connect today, too!</p><p>You can find more details or download a 45-day evaluation of the product at <a href="https://www.rstudio.com/products/connect/">https://www.rstudio.com/products/connect/</a>. Additional resources can be found below.</p><ul><li><p><a href="https://www.rstudio.com/products/connect/">RStudio Connect home page &amp; downloads</a></p></li><li><p><a href="http://docs.rstudio.com/connect/admin/">RStudio Connect Admin Guide</a></p></li><li><p><a href="https://www.rstudio.com/wp-content/uploads/2016/01/RSC-IT-Q-and-A.pdf">What IT needs to know about RStudio Connect</a></p></li><li><p><a href="https://www.rstudio.com/pricing/#ConnectPricing">Pricing</a></p></li><li><p><a href="https://beta.rstudioconnect.com/connect/">An online preview of RStudio Connect</a></p></li></ul></description></item><item><title>Register today for rstudio::conf 2017!</title><link>https://www.rstudio.com/blog/register-today-for-rstudioconf-2017/</link><pubDate>Thu, 15 Dec 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/register-today-for-rstudioconf-2017/</guid><description><p>On the other side of the Holidays are sunny January days in Florida, home of <a href="https://www.rstudio.com/conference/">rstudio::conf 2017</a> and the exclusive Wizarding World of Harry Potter experience reserved for our guests!</p><p>The <a href="https://www.rstudio.com/conference/#speakers">speakers</a> and <a href="https://www.rstudio.com/conference/#sponsors">sponsors</a> are set and the Gaylord Resort is ready. The entire RStudio team, led by masters of ceremony Hadley Wickham and Joe Cheng, look forward to seeing you.</p><p><strong>There are fewer than 80 seats remaining</strong>. 
We hope you&rsquo;ll <a href="https://www.eventbrite.com/e/rstudioconf-registration-25131753752">register</a> soon to join hundreds of the most passionate data scientists in the world from January 12 to 14, each looking for the latest and best information on all things R and RStudio.</p><p>Note: Are you registered but don&rsquo;t have a ticket for Friday evening&rsquo;s exclusive access to Universal&rsquo;s Wizarding World of Harry Potter? We&rsquo;ve heard a few people say they missed it. If you&rsquo;re already registered and would like a ticket for you or a guest, please go <a href="https://www.eventbrite.com/e/rstudioconf-registration-25131753752?access=HarryPotter">here</a> and select the Register button.</p></description></item><item><title>Announcing bookdown: Authoring Books and Technical Documents with R Markdown</title><link>https://www.rstudio.com/blog/announcing-bookdown/</link><pubDate>Fri, 02 Dec 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-bookdown/</guid><description><p>We have released the R package <strong>bookdown</strong> (v0.3) to <a href="https://cran.rstudio.com/package=bookdown">CRAN</a>. It may be old news to some users, but we are happy to make an official announcement today. To install the package from CRAN, you can</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">bookdown&#34;</span>)</code></pre></div><p>The <strong>bookdown</strong> package provides an easier way to write books and technical publications than traditional tools such as LaTeX and Word. 
It inherits the simplicity of syntax and flexibility for data analysis from R Markdown, and extends R Markdown for technical writing, so that you can make better use of document elements such as figures, tables, equations, theorems, citations, and references, etc. Similar to LaTeX, you can number and cross-reference these elements with <strong>bookdown</strong>. <!-- more -->Below are some screenshots to show what <strong>bookdown</strong> can produce:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/12/screen-shot-2016-12-01-at-2-39-28-pm.png" alt="An example of figure numbers and captions"></p><p><img src="https://rstudioblog.files.wordpress.com/2016/12/screen-shot-2016-12-01-at-2-40-20-pm.png" alt="Examples of table numbers and captions"></p><p><img src="https://rstudioblog.files.wordpress.com/2016/12/screen-shot-2016-12-01-at-2-41-18-pm.png" alt="Math equations"></p><p>Your document can even include live examples (e.g. <a href="http://htmlwidgets.org">HTML widgets</a> and <a href="https://shiny.rstudio.com">Shiny apps</a>) so readers can interact with them while reading the book. The book can be rendered to multiple output formats, including LaTeX/PDF, HTML, EPUB, and Word, thus making it easy to put your documents online. The style and theme of these output formats can be customized. Most features apply to all output formats, e.g., you can also number equations and theorems in HTML output.</p><p><a href="https://bookdown.org/yihui/bookdown"><img src="https://bookdown.org/yihui/bookdown/images/cover.jpg" alt="The bookdown book"></a></p><p>You can find the full documentation at <a href="https://bookdown.org/yihui/bookdown">https://bookdown.org/yihui/bookdown</a>. As a matter of fact, the documentation was written using <strong>bookdown</strong> (of course!), and its source is fully available <a href="https://github.com/rstudio/bookdown/tree/master/inst/examples">on GitHub</a>. 
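</p><p>Building the book itself is one function call; for example (a minimal sketch, assuming your book&rsquo;s main file is <code>index.Rmd</code> in the working directory):</p><pre><code>library(bookdown)
# render the whole book as an HTML (GitBook-style) website
render_book(&quot;index.Rmd&quot;, &quot;bookdown::gitbook&quot;)
# or as a PDF via LaTeX
render_book(&quot;index.Rmd&quot;, &quot;bookdown::pdf_book&quot;)</code></pre><p>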
The book is to be published by <a href="http://crcpress.com/product/isbn/9781138700109">Chapman &amp; Hall</a> by the end of this month (pre-order also available on <a href="http://a.co/0uHNbno">Amazon</a>). We used books and R primarily for examples in this book, but <strong>bookdown</strong> is not only for books or R. Most features introduced in this book also apply to other types of publications: journal papers, reports, <a href="https://github.com/cpsievert/phd-thesis">dissertations</a>, <a href="https://geanders.github.io/RProgrammingForResearch/">course handouts</a>, study notes, and even novels. You do not have to use R, either. Other choices of computing languages include Python, C, C++, SQL, Bash, Stan, JavaScript, and so on, although R is best supported. You can also leave out computing, for example, to write a novel.</p><p>There have been a large number of books published on <a href="https://bookdown.org">https://bookdown.org</a>, and we hope you can find some inspiration there to start your own book.</p><p><img src="https://rstudioblog.files.wordpress.com/2016/12/mimic.gif" alt="mimic.gif"></p><p>To be clear, the goal of <strong>bookdown</strong> is definitely not to replace sophisticated typesetting tools like LaTeX, but to help authors focus on content (instead of appearance), and present common components of a technical document more easily using the Markdown syntax (such as sections, quotes, figures, tables, and so on). To some degree, <strong>bookdown</strong> reinvented a small part of LaTeX in other formats (HTML, EPUB, Word). 
There are surely features of other typesetting tools that are unavailable in <strong>bookdown</strong>, in which case we encourage you to either submit a feature request with justifications, or take a deep breath and <a href="https://twitter.com/kwbroman/status/798938827876421633">say <strong>no</strong> to new features</a> to keep things simple (for your reference, the <strong>bookdown</strong> package and book didn&rsquo;t exist about <a href="https://github.com/rstudio/bookdown/graphs/contributors">a year ago</a>).</p><p>Writing books can be highly addictive: it helps you organize your (random) thoughts and content into chapters and sections, and it is very rewarding to see the number of pages grow each day like a little baby. You can do things that you normally cannot/won&rsquo;t do in journal papers. For example, you can thank your kids in the preface (without whom you should have finished the book two years earlier). Choose a fresh and crispy font, and you simply cannot stop writing! With one click of a button, you can go directly from R Markdown documents to <a href="https://bookdown.org/yihui/bookdown/bookdown.pdf">a PDF</a> that is ready to be printed by your publisher.</p><p>We hope you will enjoy <strong>bookdown</strong>. Please feel free to <a href="https://github.com/rstudio/bookdown/issues">let us know</a> if you have any feedback, or ask technical questions on <a href="http://stackoverflow.com/questions/tagged/bookdown">StackOverflow</a>.</p></description></item><item><title>Time is running out - register for Hadley Wickham's Master R in Melbourne!</title><link>https://www.rstudio.com/blog/time-is-running-out-register-for-hadley-wickhams-master-r-in-melbourne/</link><pubDate>Wed, 23 Nov 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/time-is-running-out-register-for-hadley-wickhams-master-r-in-melbourne/</guid><description><p>Want to Master R? 
There&rsquo;s no better time or place than Hadley Wickham&rsquo;s workshop on December 12th and 13th at the Cliftons in Melbourne, VIC, Australia.</p><p>Register here: <a href="https://www.eventbrite.com/e/master-r-developer-workshop-melbourne-tickets-22546200292">https://www.eventbrite.com/e/master-r-developer-workshop-melbourne-tickets-22546200292</a> (Note: Prices are in $US and VAT is not collected)</p><p>Discounts are still available for academics (students or faculty) and for 5 or more attendees from any organization. Email <a href="mailto:training@rstudio.com">training@rstudio.com</a> if you have any questions about the workshop that you don&rsquo;t find answered on the registration page.</p><p>Hadley has no Master R Workshops planned in the region for 2017 and his next one with availability won&rsquo;t be until September in San Francisco. If you&rsquo;ve always wanted to take Master R but haven&rsquo;t found the time, Melbourne, <a href="http://www.timeout.com/london/things-to-do/eighteen-cities-ranked-best-for-fun">the second most fun city in the world</a>, is the place to go!</p><p>P.S. We&rsquo;ve arranged a &ldquo;happy hour&rdquo; reception after class on Monday the 12th. Be sure to set aside an hour or so after the first day to talk to your classmates and Hadley about what&rsquo;s happening in R.</p></description></item><item><title>ggplot2 2.2.0</title><link>https://www.rstudio.com/blog/ggplot2-2-2-0/</link><pubDate>Mon, 14 Nov 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/ggplot2-2-2-0/</guid><description><p>I&rsquo;m very pleased to announce ggplot2 2.2.0. 
It includes four major new features:</p><ul><li><p>Subtitles and captions.</p></li><li><p>A large rewrite of the facetting system.</p></li><li><p>Improved theme options.</p></li><li><p>Better stacking.</p></li></ul><p>It also includes numerous bug fixes and minor improvements, as described in the <a href="http://github.com/hadley/ggplot2/releases/tag/v2.2.0">release notes</a>.</p><p>The majority of this work was carried out by <a href="https://github.com/thomasp85">Thomas Pedersen</a>, who I was lucky to have as my &ldquo;ggplot2 intern&rdquo; this summer. Make sure to check out his other visualisation packages: <a href="https://github.com/thomasp85/ggraph">ggraph</a>, <a href="https://github.com/thomasp85/ggforce">ggforce</a>, and <a href="https://github.com/thomasp85/tweenr">tweenr</a>.</p><p>Install ggplot2 with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">ggplot2&#34;</span>)</code></pre></div><h2 id="subtitles-and-captions">Subtitles and captions</h2><p>Thanks to <a href="https://rud.is/">Bob Rudis</a>, you can now add subtitles and captions to your plots:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>(<span style="color:#06287e">aes</span>(color <span style="color:#666">=</span> class)) <span style="color:#666">+</span><span style="color:#06287e">geom_smooth</span>(se <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>, method <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">loess&#34;</span>) <span style="color:#666">+</span><span style="color:#06287e">labs</span>(title <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Fuel efficiency generally decreases with engine size&#34;</span>,subtitle <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Two seaters (sports cars) are an exception because of their light weight&#34;</span>,caption <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Data from fueleconomy.gov&#34;</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/11/subtitle-11.png" alt="subtitle-1"></p><p>These are controlled by the theme settings <code>plot.subtitle</code> and <code>plot.caption</code>.</p><p>The plot title is now aligned to the left by default. To return to the previous centered alignment, use <code>theme(plot.title = element_text(hjust = 0.5))</code>.</p><h2 id="facets">Facets</h2><p>The facet and layout implementation has been moved to ggproto and received a large rewrite and refactoring. This will allow others to create their own facetting systems, as described in the <code>vignette(&quot;extending-ggplot2&quot;)</code>. 
Along with the rewrite, a number of features and improvements have been added, most notably:</p><ul><li>You can now use functions in facetting formulas, thanks to <a href="https://github.com/DanRuderman">Dan Ruderman</a>.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(diamonds, <span style="color:#06287e">aes</span>(carat, price)) <span style="color:#666">+</span><span style="color:#06287e">geom_hex</span>(bins <span style="color:#666">=</span> <span style="color:#40a070">20</span>) <span style="color:#666">+</span><span style="color:#06287e">facet_wrap</span>(<span style="color:#666">~</span><span style="color:#06287e">cut_number</span>(depth, <span style="color:#40a070">6</span>))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/11/facet-1-1.png" alt="facet-1-1"></p><ul><li>Axes are now drawn under the panels in <code>facet_wrap()</code> when the rectangle is not completely filled.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">facet_wrap</span>(<span style="color:#666">~</span>class)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/11/facet-2-1.png" alt="facet-2-1"></p><ul><li>You can set the position of the axes with the <code>position</code> argument.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy)) <span 
style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">scale_x_continuous</span>(position <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">top&#34;</span>) <span style="color:#666">+</span><span style="color:#06287e">scale_y_continuous</span>(position <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">right&#34;</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/11/facet-3-1.png" alt="facet-3-1"></p><ul><li>You can display a secondary axis that is a one-to-one transformation of the primary axis with <code>sec.axis</code>.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">scale_y_continuous</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mpg (US)&#34;</span>,sec.axis <span style="color:#666">=</span> <span style="color:#06287e">sec_axis</span>(<span style="color:#666">~</span> . 
<span style="color:#666">*</span> <span style="color:#40a070">1.20</span>, name <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mpg (UK)&#34;</span>))</code></pre></div><ul><li>Strips can be placed on any side, and the placement with respect to axes can be controlled with the <code>strip.placement</code> theme option.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">facet_wrap</span>(<span style="color:#666">~</span> drv, strip.position <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">bottom&#34;</span>) <span style="color:#666">+</span><span style="color:#06287e">theme</span>(strip.placement <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">outside&#34;</span>,strip.background <span style="color:#666">=</span> <span style="color:#06287e">element_blank</span>(),strip.text <span style="color:#666">=</span> <span style="color:#06287e">element_text</span>(face <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">bold&#34;</span>)) <span style="color:#666">+</span><span style="color:#06287e">xlab</span>(<span style="color:#007020;font-weight:bold">NULL</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/11/facet-5-1.png" alt="facet-5-1"></p><h2 id="theming">Theming</h2><ul><li><p>The <code>theme()</code> function now has named arguments so autocomplete and documentation suggestions are vastly improved.</p></li><li><p>Blank elements can now be overridden again so you get the expected behavior when 
setting e.g. <code>axis.line.x</code>.</p></li><li><p><code>element_line()</code> gets an <code>arrow</code> argument that lets you put arrows on axes.</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">arrow <span style="color:#666">&lt;-</span> <span style="color:#06287e">arrow</span>(length <span style="color:#666">=</span> <span style="color:#06287e">unit</span>(<span style="color:#40a070">0.4</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">cm&#34;</span>), type <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">closed&#34;</span>)<span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">theme_minimal</span>() <span style="color:#666">+</span><span style="color:#06287e">theme</span>(axis.line <span style="color:#666">=</span> <span style="color:#06287e">element_line</span>(arrow <span style="color:#666">=</span> arrow))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/11/theme-1-1.png" alt="theme-1-1"></p><ul><li>Control of legend styling has been improved. 
The whole legend area can be aligned with the plot area and a box can be drawn around all legends:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy, shape <span style="color:#666">=</span> drv, colour <span style="color:#666">=</span> fl)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">theme</span>(legend.justification <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">top&#34;</span>,legend.box <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">horizontal&#34;</span>,legend.box.margin <span style="color:#666">=</span> <span style="color:#06287e">margin</span>(<span style="color:#40a070">3</span>, <span style="color:#40a070">3</span>, <span style="color:#40a070">3</span>, <span style="color:#40a070">3</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mm&#34;</span>),legend.margin <span style="color:#666">=</span> <span style="color:#06287e">margin</span>(),legend.box.background <span style="color:#666">=</span> <span style="color:#06287e">element_rect</span>(colour <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">grey50&#34;</span>))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/11/theme-2-1.png" alt="theme-2-1"></p><ul><li><p><code>panel.margin</code> and <code>legend.margin</code> have been renamed to <code>panel.spacing</code> and <code>legend.spacing</code> respectively, as this better indicates their roles. 
A new <code>legend.margin</code> actually controls the margin around each legend.</p></li><li><p>When computing the height of titles, ggplot2 now includes the height of the descenders (i.e. the bits of <code>g</code> and <code>y</code> that hang underneath). This improves the margins around titles, particularly the y axis label. I have also very slightly increased the inner margins of axis titles, and removed the outer margins.</p></li><li><p>The default themes have been tweaked by <a href="http://www.obs-vlfr.fr/~irisson/">Jean-Olivier Irisson</a>, making them better match <code>theme_grey()</code>.</p></li></ul><h2 id="stacking-bars">Stacking bars</h2><p><code>position_stack()</code> and <code>position_fill()</code> now stack values in the reverse order of the grouping, which makes the default stack order match the legend.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">avg_price <span style="color:#666">&lt;-</span> diamonds <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(cut, color) <span style="color:#666">%&gt;%</span><span style="color:#06287e">summarise</span>(price <span style="color:#666">=</span> <span style="color:#06287e">mean</span>(price)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">ungroup</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(price_rel <span style="color:#666">=</span> price <span style="color:#666">-</span> <span style="color:#06287e">mean</span>(price))<span style="color:#06287e">ggplot</span>(avg_price) <span style="color:#666">+</span><span style="color:#06287e">geom_col</span>(<span style="color:#06287e">aes</span>(x <span style="color:#666">=</span> cut, y <span style="color:#666">=</span> price, fill <span style="color:#666">=</span> color))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/11/stack-1-1.png" 
alt="stack-1-1"></p><p>(Note also the new <code>geom_col()</code> which is short-hand for <code>geom_bar(stat = &quot;identity&quot;)</code>, contributed by Bob Rudis.)</p><p>If you want to stack in the opposite order, try <a href="http://forcats.tidyverse.org/reference/fct_rev.html"><code>forcats::fct_rev()</code></a>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(avg_price) <span style="color:#666">+</span><span style="color:#06287e">geom_col</span>(<span style="color:#06287e">aes</span>(x <span style="color:#666">=</span> cut, y <span style="color:#666">=</span> price, fill <span style="color:#666">=</span> <span style="color:#06287e">fct_rev</span>(color)))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/11/stack-2-1.png" alt="stack-2-1"></p><p>Additionally, you can now stack negative values:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(avg_price) <span style="color:#666">+</span><span style="color:#06287e">geom_col</span>(<span style="color:#06287e">aes</span>(x <span style="color:#666">=</span> cut, y <span style="color:#666">=</span> price_rel, fill <span style="color:#666">=</span> color))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/11/stack-3-1.png" alt="stack-3-1"></p><p>The overall ordering cannot necessarily be matched in the presence of negative values, but the ordering on either side of the x-axis will match.</p><p>Labels can also be stacked, but the default position is suboptimal:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">series <span style="color:#666">&lt;-</span> <span style="color:#06287e">data.frame</span>(time 
<span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#06287e">rep</span>(<span style="color:#40a070">1</span>, <span style="color:#40a070">4</span>),<span style="color:#06287e">rep</span>(<span style="color:#40a070">2</span>, <span style="color:#40a070">4</span>), <span style="color:#06287e">rep</span>(<span style="color:#40a070">3</span>, <span style="color:#40a070">4</span>), <span style="color:#06287e">rep</span>(<span style="color:#40a070">4</span>, <span style="color:#40a070">4</span>)),type <span style="color:#666">=</span> <span style="color:#06287e">rep</span>(<span style="color:#06287e">c</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">a&#39;</span>, <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">b&#39;</span>, <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">c&#39;</span>, <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">d&#39;</span>), <span style="color:#40a070">4</span>),value <span style="color:#666">=</span> <span style="color:#06287e">rpois</span>(<span style="color:#40a070">16</span>, <span style="color:#40a070">10</span>))<span style="color:#06287e">ggplot</span>(series, <span style="color:#06287e">aes</span>(time, value, group <span style="color:#666">=</span> type)) <span style="color:#666">+</span><span style="color:#06287e">geom_area</span>(<span style="color:#06287e">aes</span>(fill <span style="color:#666">=</span> type)) <span style="color:#666">+</span><span style="color:#06287e">geom_text</span>(<span style="color:#06287e">aes</span>(label <span style="color:#666">=</span> type), position <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">stack&#34;</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/11/stack-4-1.png" alt="stack-4-1"></p><p>You can improve the position with the <code>vjust</code> parameter. 
A <code>vjust</code> of 0.5 will center the labels inside the corresponding area:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(series, <span style="color:#06287e">aes</span>(time, value, group <span style="color:#666">=</span> type)) <span style="color:#666">+</span><span style="color:#06287e">geom_area</span>(<span style="color:#06287e">aes</span>(fill <span style="color:#666">=</span> type)) <span style="color:#666">+</span><span style="color:#06287e">geom_text</span>(<span style="color:#06287e">aes</span>(label <span style="color:#666">=</span> type), position <span style="color:#666">=</span> <span style="color:#06287e">position_stack</span>(vjust <span style="color:#666">=</span> <span style="color:#40a070">0.5</span>))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/11/stack-5-1.png" alt="stack-5-1"></p></description></item><item><title>svglite 1.2.0</title><link>https://www.rstudio.com/blog/svglite-1-2-0/</link><pubDate>Mon, 14 Nov 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/svglite-1-2-0/</guid><description><p>Today we are pleased to release a new version of svglite. This release fixes many bugs, includes new documentation vignettes, and improves font support.</p><p>You can install svglite with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">svglite&#34;</span>)</code></pre></div><h2 id="font-handling">Font handling</h2><p>Fonts are tricky with SVG because they are needed at two stages:</p><ul><li><p>When creating the SVG file, the fonts are needed in order to correctly measure the amount of space each character occupies. 
This is particularly important for plots that use <code>plotmath</code>.</p></li><li><p>When drawing the SVG file on screen, the fonts are needed to draw each character correctly.</p></li></ul><p>For the best display, that means you need to have the same fonts installed on both the computer that generates the SVG file and the computer that draws it. By default, svglite uses fonts that are installed on pretty much every computer. svglite&rsquo;s font support is now much more flexible thanks to two new arguments: <code>system_fonts</code> and <code>user_fonts</code>.</p><ol><li><code>system_fonts</code> allows you to specify the name of a font installed on your computer. This is useful, for example, if you&rsquo;d like to use a font with better CJK support:</li></ol><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">svglite</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Rplots.svg&#34;</span>, system_fonts <span style="color:#666">=</span> <span style="color:#06287e">list</span>(sans <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Arial Unicode MS&#34;</span>))<span style="color:#06287e">plot.new</span>()<span style="color:#06287e">text</span>(<span style="color:#40a070">0.5</span>, <span style="color:#40a070">0.5</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">正規分布&#34;</span>)<span style="color:#06287e">dev.off</span>()</code></pre></div><ol start="2"><li><code>user_fonts</code> allows you to specify a font installed in an R package (like <a href="https://github.com/lionel-/fontquiver">fontquiver</a>). 
This is needed if you want to generate identical plots across different operating systems, and is used in the upcoming <a href="https://github.com/lionel-/vdiffr">vdiffr package</a>, which provides graphical unit tests.</li></ol><p>For more details, see <code>vignette(&quot;fonts&quot;)</code>.</p><h2 id="text-scaling">Text scaling</h2><p>This update also fixes many bugs. The most important is that text is now properly scaled within the plot, and we provide a vignette that describes the details: <code>vignette(&quot;scaling&quot;)</code>. It documents, for instance, how to include an svglite graphic in a web page with the figure text consistently scaled with the surrounding text.</p><p>Find a full list of changes in the <a href="https://github.com/hadley/svglite/releases/tag/v1.2.0">release notes</a>.</p></description></item><item><title>R Views - a new perspective on R and RStudio</title><link>https://www.rstudio.com/blog/r-views-a-new-perspective-on-r-and-rstudio/</link><pubDate>Tue, 08 Nov 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-views-a-new-perspective-on-r-and-rstudio/</guid><description><p>On October 12, RStudio launched <a href="https://rviews.rstudio.com/">R Views</a> with great enthusiasm. R Views is a new blog for R users <a href="https://rviews.rstudio.com/about/">about the R Community and the R Language</a>. Under the care of editor-in-chief and new RStudio ambassador-at-large, Joseph Rickert, R Views provides a new perspective on R and RStudio that we like to think will become essential reading for you.</p><p>You may have read an R Views post already. In the first, widely syndicated, post, <a href="https://rviews.rstudio.com/2016/10/12/interview-with-j-j-allaire/">Joseph interviewed J.J. Allaire</a>, RStudio&rsquo;s founder, CEO and most prolific software developer. 
Later posts by <a href="https://rviews.rstudio.com/2016/10/19/creating-interactive-plots-with-r-and-highcharts/">Mine Cetinkaya-Rundel on Highcharts</a> and thoughtful book reviews, new R package picks, and a primer on Naive Bayes from Joseph rounded out the first month. Each post was entirely different from anything you could have read here, on what we now call our Developer Blog at rstudio.org.</p><p>Fortunately, you don&rsquo;t have to choose. Each has its purpose. Our Developer Blog is the place to go for RStudio news. You&rsquo;ll find product announcements, events, and company happenings - like the announcement of a new blog - right here. R Views is about R in action. You&rsquo;ll find stories and solutions and opinions that we hope will educate and challenge you.</p><p>Subscribe to each and stay up to date on all things R and RStudio!</p><p>Thanks for making R and RStudio part of your data science experience and for supporting our work.</p></description></item><item><title>Shiny Server (Pro) 1.5</title><link>https://www.rstudio.com/blog/shiny-server-pro-1-5/</link><pubDate>Fri, 04 Nov 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-server-pro-1-5/</guid><description><p><a href="https://www.rstudio.com/products/shiny/shiny-server/">Shiny Server 1.5.1.834 and Shiny Server Pro 1.5.1.760 are now available.</a></p><p>The Shiny Server 1.5.x release family upgrades our underlying Node.js engine from 0.10.47 to 6.9.1. The impetus for this change was not stability or performance, but because the 0.10.x release family has reached the end of its life.</p><p>We highly recommend that you test on a staging server before upgrading production Shiny Server 1.4.x machines to 1.5. You should always do this for any production-critical software, but it&rsquo;s particularly important for this release, due to the magnitude of changes to Node.js that we&rsquo;ve absorbed in one big gulp. 
(We&rsquo;ve done thorough end-to-end testing of this release, but there&rsquo;s no substitute for testing with your own apps, on your own servers.)</p><p>Some small bug fixes are also included in this release. See the <a href="https://support.rstudio.com/hc/en-us/articles/215642837-Shiny-Server-Pro-Release-History">release notes</a> for more details.</p><h4 id="the-beginning-of-the-end-for-ubuntu-1204-and-red-hat-5">The beginning of the end for Ubuntu 12.04 and Red Hat 5</h4><p>While we still support Ubuntu 12.04 and Red Hat 5 today, we&rsquo;ll be moving on from these very old releases in a few months. Both of these distributions will end-of-life in April 2017, and will stop receiving bug fixes and security fixes from their vendors at that time. If you&rsquo;re using Shiny Server with one of these platforms, we recommend that you start planning your upgrade.</p></description></item><item><title>Announcing RStudio v1.0!</title><link>https://www.rstudio.com/blog/announcing-rstudio-v1-0/</link><pubDate>Tue, 01 Nov 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rstudio-v1-0/</guid><description><p>Today we&rsquo;re very pleased to announce the availability of RStudio Version 1.0! Version 1.0 is our 10th major release since the initial launch in February 2011 (see the full release history below), and our biggest ever! 
Highlights include:</p><ul><li><p>Authoring tools for <a href="https://rmarkdown.rstudio.com/r_notebooks.html">R Notebooks</a>.</p></li><li><p>Integrated support for the <a href="http://spark.rstudio.com">sparklyr</a> package (R interface to Spark).</p></li><li><p>Performance profiling via integration with the <a href="https://github.com/rstudio/profvis">profvis</a> package.</p></li><li><p>Enhanced data import tools based on the <a href="https://github.com/hadley/readr">readr</a>, <a href="https://github.com/hadley/readxl">readxl</a> and <a href="https://github.com/hadley/haven">haven</a> packages.</p></li><li><p>Authoring tools for R Markdown <a href="https://rmarkdown.rstudio.com/rmarkdown_websites.html">websites</a> and the <a href="https://bookdown.org/">bookdown</a> package.</p></li><li><p>Many other miscellaneous enhancements and bug fixes.</p></li></ul><p>We hope you <a href="https://www.rstudio.com/products/rstudio/download/">download version 1.0</a> now and as always <a href="https://support.rstudio.com/hc/en-us">let us know</a> what you think.</p><h2 id="r-notebooks">R Notebooks</h2><p><a href="https://rmarkdown.rstudio.com/r_notebooks.html">R Notebooks</a> add a powerful notebook authoring engine to <a href="https://rmarkdown.rstudio.com/">R Markdown</a>. Notebook interfaces for data analysis have compelling advantages including the close association of code and output and the ability to intersperse narrative with computation. Notebooks are also an excellent tool for teaching and a convenient way to share analyses.</p><p><img src="https://rmarkdown.rstudio.com/images/notebook-demo.png" alt=""></p><h3 id="interactive-r-markdown">Interactive R Markdown</h3><p>As an authoring format, R Markdown bears many similarities to traditional notebooks like <a href="https://jupyter.org/">Jupyter</a> and <a href="http://beakernotebook.com/">Beaker</a>. 
However, code in notebooks is typically executed interactively, one cell at a time, whereas code in R Markdown documents is typically executed in batch.</p><p>R Notebooks bring the interactive model of execution to your R Markdown documents, giving you the capability to work quickly and iteratively in a notebook interface without leaving behind the plain-text tools, compatibility with version control, and production-quality output you&rsquo;ve come to rely on from R Markdown.</p><h3 id="iterate-quickly">Iterate Quickly</h3><p>In a typical R Markdown document, you must re-knit the document to see your changes, which can take some time if it contains non-trivial computations. R Notebooks, however, let you run code and see the results in the document immediately. They can include just about any kind of content R produces, including console output, plots, data frames, and interactive <a href="http://www.htmlwidgets.org/">HTML widgets</a>.</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/screen-shot-2016-09-20-at-4-16-47-pm.png" alt="screen-shot-2016-09-20-at-4-16-47-pm"></p><p>You can see the progress of the code as it runs:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/screen-shot-2016-09-21-at-10-52-02-am.png" alt="screen-shot-2016-09-21-at-10-52-02-am"></p><p>You can preview the results of individual inline expressions, too:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/notebook-inline-output.png" alt="notebook-inline-output"></p><p>Even your LaTeX equations render in real-time as you type:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/notebook-mathjax.png" alt="notebook-mathjax"></p><p>This focused mode of interaction doesn&rsquo;t require you to keep the console, viewer, or output panes open. Everything you need is at your fingertips in the editor, reducing distractions and helping you concentrate on your analysis. 
When you&rsquo;re done, you&rsquo;ll have a formatted, reproducible record of what you&rsquo;ve accomplished, with plenty of context, perfect for your own records or sharing with others.</p><h2 id="spark-with-sparklyr">Spark with sparklyr</h2><p>The <a href="https://spark.rstudio.com/">sparklyr package</a> is a new R interface for Apache Spark. RStudio now includes integrated support for Spark and the sparklyr package, including tools for:</p><ul><li><p>Creating and managing Spark connections</p></li><li><p>Browsing the tables and columns of Spark DataFrames</p></li><li><p>Previewing the first 1,000 rows of Spark DataFrames</p></li></ul><p>Once you&rsquo;ve installed the sparklyr package, you should find a new <strong>Spark</strong> pane within the IDE. This pane includes a <strong>New Connection</strong> dialog which can be used to make connections to local or remote Spark instances:</p><p><img src="https://spark.rstudio.com/images/spark-connect.png" alt=""></p><p>Once you&rsquo;ve connected to Spark you&rsquo;ll be able to browse the tables contained within the Spark cluster:</p><p><img src="https://spark.rstudio.com/images/spark-tab.png" alt=""></p><p>The Spark DataFrame preview uses the standard RStudio data viewer:</p><p><img src="https://spark.rstudio.com/images/spark-dataview.png" alt=""></p><h2 id="profiling-with-profvis">Profiling with profvis</h2><p>&ldquo;How can I make my code faster?&rdquo;</p><p>If you write R code, then you&rsquo;ve probably asked yourself this question. A profiler is an important tool for doing this: it records how the computer spends its time, and once you know that, you can focus on the slow parts to make them faster.</p><p>RStudio now includes integrated support for profiling R code and for visualizing profiling data. 
R itself has long had a built-in profiler, and now it&rsquo;s easier than ever to use the profiler and interpret the results.</p><p>To profile code with RStudio, select it in the editor, and then click on <strong>Profile -&gt; Profile Selected Line(s)</strong>. R will run that code with the profiler turned on, and then open up an interactive visualization.</p><p><a href="https://rstudioblog.files.wordpress.com/2016/05/profile1.gif"><img src="https://rstudioblog.files.wordpress.com/2016/05/profile1.gif&amp;h=844" alt=""></a></p><p>In the visualization, there are two main parts: on top, there is the code with information about the amount of time spent executing each line, and on the bottom there is a <em>flame graph</em>, which shows what R was doing over time. In the flame graph, the horizontal direction represents time, moving from left to right, and the vertical direction represents the <em>call stack</em>, which are the functions that are currently being called. (Each time a function calls another function, it goes on top of the stack, and when a function exits, it is removed from the stack.)</p><p><img src="https://rstudioblog.files.wordpress.com/2016/05/profile.png&amp;h=388" alt="profile.png"></p><p>The <strong>Data</strong> tab contains a call tree, showing which function calls are most expensive:</p><p><a href="https://rstudioblog.files.wordpress.com/2016/05/data1.png"><img src="https://rstudioblog.files.wordpress.com/2016/05/data1.png&amp;h=270" alt="Profiling data pane"></a></p><p>Armed with this information, you&rsquo;ll know what parts of your code to focus on to speed things up!</p><h2 id="data-import">Data Import</h2><p>RStudio now integrates with the <a href="http://readr.tidyverse.org/">readr</a>, <a href="https://cran.r-project.org/web/packages/readxl/index.html">readxl</a>, and <a href="http://haven.tidyverse.org/">haven</a> packages to provide comprehensive tools for importing data from many text file formats, Excel worksheets, as well as SAS, 
Stata, and SPSS data files. The tools are focused on interactively refining an import and then providing the code required to reproduce the import on new datasets.</p><p>For example, here&rsquo;s the workflow we would use to import the Excel worksheet at <a href="http://www.fns.usda.gov/sites/default/files/pd/slsummar.xls">http://www.fns.usda.gov/sites/default/files/pd/slsummar.xls</a>.</p><p>First provide the dataset URL and review the import in preview mode (notice that this file contains two tables and as a result requires the first few rows to be removed):</p><p><img src="https://support.rstudio.com/hc/en-us/article_attachments/206278038/Screen_Shot_2016-04-08_at_3.12.13_PM.png" alt=""></p><p>We can clean this up by skipping 6 rows from this file and unchecking the &ldquo;First Row as Names&rdquo; checkbox:</p><p><img src="https://support.rstudio.com/hc/en-us/article_attachments/206278068/Screen_Shot_2016-04-08_at_3.12.21_PM.png" alt=""></p><p>The file is looking better but some columns are being displayed as strings when they are clearly numerical data. We can fix this by selecting &ldquo;numeric&rdquo; from the column drop-down:</p><p><img src="https://support.rstudio.com/hc/en-us/article_attachments/206278098/Screen_Shot_2016-04-08_at_3.12.26_PM.png" alt=""></p><p>The final step is to click &ldquo;Import&rdquo; to run the code displayed under &ldquo;Code Preview&rdquo; and import the data into R. The code is executed within the console and the imported dataset is displayed automatically:</p><p><img src="https://support.rstudio.com/hc/en-us/article_attachments/206328087/Screen_Shot_2016-04-08_at_3.12.31_PM.png" alt=""></p><p>Note that rather than executing the import we could have just copied and pasted the import code and included it within any R script.</p><h2 id="rstudio-release-history">RStudio Release History</h2><p>We started working on RStudio in November of 2008 (8 years ago!) and had our first public release in February of 2011. 
Here are highlights of the various releases through the years:</p><table><thead><tr><th align="left">Version</th><th align="left">Date</th><th align="left">Highlights</th></tr></thead><tbody><tr><td align="left">0.92</td><td align="left">Feb 2011</td><td align="left">* Initial public release</td></tr><tr><td align="left">0.93</td><td align="left">Apr 2011</td><td align="left">* Interactive plotting with manipulate<br/> * Source editor themes<br/> * Configurable workspace layout</td></tr><tr><td align="left">0.94</td><td align="left">Jun 2011</td><td align="left">* Enhanced plot export<br/> * Enhanced package installation and management<br/> * Enhanced history management</td></tr><tr><td align="left">0.95</td><td align="left">Jan 2012</td><td align="left">* RStudio project system<br/> * Code navigation (typeahead search, go to definition)<br/> * Version control integration (Git and Subversion)</td></tr><tr><td align="left">0.96</td><td align="left">May 2012</td><td align="left">* Enhanced authoring for Sweave<br/> * Web publishing with R Markdown<br/> * Code folding and many other editing enhancements</td></tr><tr><td align="left">0.97</td><td align="left">Oct 2012</td><td align="left">* Package development tools<br/> * Vim editing mode<br/> * More intelligent R auto-indentation</td></tr><tr><td align="left">0.98</td><td align="left">Dec 2013</td><td align="left">* Interactive debugging tools<br/> * Enhanced environment pane<br/> * Viewer pane for web content / htmlwidgets</td></tr><tr><td align="left">0.98b</td><td align="left">Jun 2014</td><td align="left">* R Markdown v2 (publish to PDF, Word, and more)<br/> * Integrated tools for Shiny application development<br/> * Editor support for XML, SQL, Python, and Bash</td></tr><tr><td align="left">0.99</td><td align="left">May 2015</td><td align="left">* Data viewer with support for large datasets, filtering, searching, and sorting<br/> * Major enhancements to R and C/C++ code completion and inline code 
diagnostics<br/> * Multiple cursors, tab re-ordering, enhanced Vim mode</td></tr><tr><td align="left">0.99b</td><td align="left">Feb 2016</td><td align="left">* Emacs editing mode<br/> * Multi-window source editing<br/> * Customizable keyboard shortcuts<br/> * RStudio Addins</td></tr><tr><td align="left">1.0</td><td align="left">Nov 2016</td><td align="left">* Authoring tools for R Notebooks<br/> * Integrated support for sparklyr (R interface to Spark)<br/> * Enhanced data import tools<br/> * Performance profiling via integration with profvis</td></tr></tbody></table><p>The <a href="https://support.rstudio.com/hc/en-us/articles/200716783-RStudio-Release-History">RStudio Release History</a> page on our support website provides a complete history of all major and minor point releases.</p></description></item><item><title>Join Hadley Wickham's Master R Workshop in Melbourne, Australia December 12 & 13</title><link>https://www.rstudio.com/blog/join-hadley-wickhams-master-r-workshop-in-melbourne-australia-december-12-13/</link><pubDate>Fri, 28 Oct 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/join-hadley-wickhams-master-r-workshop-in-melbourne-australia-december-12-13/</guid><description><p>It&rsquo;s nearly summeRtime in Australia! Join RStudio Chief Data Scientist Hadley Wickham for his popular Master R workshop in Melbourne.</p><p>Register here: <a href="https://www.eventbrite.com/e/master-r-developer-workshop-melbourne-tickets-22546200292">https://www.eventbrite.com/e/master-r-developer-workshop-melbourne-tickets-22546200292</a></p><p>Melbourne will be Hadley&rsquo;s first and only scheduled Master R workshop in Australia. Whether you live or work nearby or you just need one more good reason to visit Melbourne in the Southern Hemisphere spring, consider joining him at the Cliftons Melbourne on December 12th and 13th. 
It&rsquo;s a rare opportunity to learn from one of the R community&rsquo;s most popular and innovative authors and package developers.</p><p>Hadley&rsquo;s workshops usually sell out. This is his final Master R in 2016 and he has no plans to offer another in the area in 2017. If you&rsquo;re an active R user and have been meaning to take this class, now is the perfect time to do it!</p><p>We look forward to seeing you in Melbourne!</p></description></item><item><title>Call for rstudio::conf lightning talks</title><link>https://www.rstudio.com/blog/call-for-rstudioconf-lightning-talks/</link><pubDate>Tue, 18 Oct 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/call-for-rstudioconf-lightning-talks/</guid><description><p>We are excited to announce that submissions for lightning talks at rstudio::conf are now open! Lightning talks are short (5 minute) high energy presentations that give you the chance to talk about an interesting project that you&rsquo;ve tackled with R. Short talks, or demos of your R code, R packages, and shiny apps are great options. See some of the great lightning talks from the <a href="https://www.rstudio.com/resources/webinars/shiny-developer-conference/">Shiny Developer Conference</a> (scroll down to user talks).</p><p><a href="https://rstudio.typeform.com/to/npmvlz">Submit your lightning talk proposal here!</a></p><p><strong>Submissions are due December 1, 2016.</strong> We&rsquo;ll announce the accepted talks on December 15. 
(You must be a registered attendee of rstudio::conf to present a lightning talk.)</p></description></item><item><title>Shiny Server (Pro) 1.4.7</title><link>https://www.rstudio.com/blog/shiny-server-pro-1-4-7/</link><pubDate>Fri, 14 Oct 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-server-pro-1-4-7/</guid><description><p><a href="https://www.rstudio.com/products/shiny/shiny-server/">Shiny Server 1.4.7.815 and Shiny Server Pro 1.4.7.736 are now available!</a> This release includes new features to support Shiny 0.14. It also updates our Node.js to 0.10.47, which includes important security fixes for SSL/TLS.</p><h3 id="connection-robustness-aka-grey-outs">Connection robustness (a.k.a. grey-outs)</h3><p>Shiny&rsquo;s architecture is built on top of websockets, which are long-lived network connections between the browser and an R session on the server. If this connection is broken for any reason, the browser is no longer able to communicate with its R session on the server. Shiny indicates this to the user by turning the page background grey and fading out the page contents.</p><p>In Shiny 0.14 and Shiny Server 1.4.7, we&rsquo;ve done work at both the server and package levels to minimize the number of grey-outs users will see. Simply by upgrading Shiny Server, transient (&lt;15sec) network interruptions should no longer disrupt Shiny apps. And for many Shiny apps, a secondary, opt-in reconnection mechanism should all but eliminate grey-outs. <a href="https://shiny.rstudio.com/articles/reconnecting.html">This article on shiny.rstudio.com</a> has all the details.</p><h3 id="bookmarkable-state">Bookmarkable state</h3><p>Shiny 0.14 introduced a <a href="https://shiny.rstudio.com/articles/bookmarking-state.html">&ldquo;bookmarkable state&rdquo; feature</a> that made it possible to snapshot the state of a running Shiny app, and send it to someone as a URL to try in their own browser. 
At the app author&rsquo;s option, the app state could either be fully encoded in the URL, or written to disk and referred to by a short ID. This latter approach requires support from the server, and that support is now officially provided by Shiny Server and Shiny Server Pro 1.4.7. (This functionality is not yet available for ShinyApps.io, however.)</p><h2 id="coming-soon-shiny-server-150">Coming soon: Shiny Server 1.5.0</h2><p>Just a heads up: Shiny Server (Pro) 1.5.0 is coming in a few weeks. Shiny Server was originally written using Node.js 0.10, which is nearing the end of its lifespan. This release will move to Node.js 6.x.</p><p>Due to the complexity of this upgrade, Shiny Server 1.5.0 will not add any new features, except for supporting <a href="https://en.wikipedia.org/wiki/Transport_Layer_Security#Forward_secrecy">perfect forward secrecy</a> for SSL/TLS connections. The focus will be entirely on ensuring a smooth and stable release.</p></description></item><item><title>shinythemes 1.1.1</title><link>https://www.rstudio.com/blog/shinythemes-1-1-1/</link><pubDate>Thu, 13 Oct 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shinythemes-1-1-1/</guid><description><p>If there&rsquo;s one word that could describe the default styling of Shiny applications, it might be &ldquo;minimalist.&rdquo; Shiny&rsquo;s UI components are built using the Bootstrap web framework, and unless the appearance is customized, the application will be mostly white and light gray.</p><p>Fortunately, it&rsquo;s easy to add a bit of flavor to your Shiny application with the <a href="https://rstudio.github.io/shinythemes/">shinythemes</a> package. 
We&rsquo;ve just released version 1.1.1 of shinythemes, which includes many new themes from <a href="http://bootswatch.com/">bootswatch.com</a>, as well as a theme selector which you can use to test out different themes on a live Shiny application.</p><p>Here&rsquo;s an example of the theme selector in use (try out the app <a href="https://gallery.shinyapps.io/117-shinythemes/">here</a>):</p><p><img src="https://rstudioblog.files.wordpress.com/2016/10/theme-selector.gif" alt="theme-selector"></p><hr><p>To install the latest version of shinythemes, run:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">shinythemes&#34;</span>)</code></pre></div><p>To use the theme selector, all you need to do is add this somewhere in your app&rsquo;s UI code:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">shinythemes<span style="color:#666">::</span><span style="color:#06287e">themeSelector</span>()</code></pre></div><p>Once you&rsquo;ve chosen which theme you want, all you need to do is use the <code>theme</code> argument of the <code>bootstrapPage</code>, <code>fluidPage</code>, <code>navbarPage</code>, or <code>fixedPage</code> functions. 
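Putting the pieces together, a complete app skeleton that combines the theme argument with the selector might look like this (the inputs and plot are illustrative):

```r
library(shiny)
library(shinythemes)

ui <- fluidPage(
  theme = shinytheme("united"),  # starting theme; any Bootswatch theme name works
  themeSelector(),               # drop this line once you settle on a theme
  titlePanel("Theme demo"),
  sliderInput("obs", "Observations", min = 10, max = 500, value = 100),
  plotOutput("hist")
)

server <- function(input, output) {
  output$hist <- renderPlot(hist(rnorm(input$obs)))
}

app <- shinyApp(ui, server)
# runApp(app) to launch
```

With the selector in place, switching themes restyles the running app immediately, which makes it easy to compare options before committing to one.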
If you want to use &ldquo;cerulean&rdquo;, you would do this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">fluidPage</span>(theme <span style="color:#666">=</span> <span style="color:#06287e">shinytheme</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">cerulean&#34;</span>),<span style="color:#007020;font-weight:bold">...</span>)</code></pre></div><p>To learn more and see screenshots of the different themes, visit the <a href="https://rstudio.github.io/shinythemes/">shinythemes web page</a>. Enjoy!</p></description></item><item><title>The schedule is out - rstudio::conf 2017</title><link>https://www.rstudio.com/blog/the-schedule-is-out-rstudioconf-2017/</link><pubDate>Thu, 13 Oct 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/the-schedule-is-out-rstudioconf-2017/</guid><description><p><a href="https://www.rstudio.com/conference/">rstudio::conf 2017</a>, the conference on all things R and RStudio, is only 90 days away. Now is the time to claim your spot or grab one of the few remaining seats at Training Days - including the new Tidyverse workshop.</p><p><strong><a href="https://www.eventbrite.com/e/rstudioconf-registration-25131753752">REGISTER NOW</a></strong></p><p>Whether you&rsquo;re already registered or still working on it, we&rsquo;re delighted today to announce the <a href="https://www.rstudio.com/conference/#topics">full conference schedule</a>, so that you can plan your days in Florida.</p><p>rstudio::conf 2017 takes place January 12-14 at the Gaylord Resorts in Kissimmee, Florida. There are over 30 talks and tutorials to choose from that are sure to accelerate your productivity in R and RStudio. 
In addition to the highlights below, topics include the latest news on R Notebooks, sparklyr, profiling, the tidyverse, Shiny, R Markdown, HTML widgets, data access and the new enterprise-scale publishing capabilities of RStudio Connect.</p><p><strong>Schedule Highlights</strong></p><p>Keynotes</p><ul><li>Hadley Wickham, Chief Scientist, RStudio: <em>Data Science in the Tidyverse</em></li><li>Andrew Flowers, Economics Writer, FiveThirtyEight: <em>Finding and Telling Stories with R</em></li><li>J.J. Allaire, Software Engineer, CEO &amp; Founder: <em>RStudio Past, Present and Future</em></li></ul><p>Tutorials</p><ul><li>Winston Chang, Software Engineer, RStudio: <em>Building Dashboards with Shiny</em></li><li>Charlotte Wickham, Oregon State University: <em>Happy R Users Purrr</em></li><li>Yihui Xie, Software Engineer, RStudio: <em>Advanced R Markdown</em></li><li>Jenny Bryan, University of British Columbia: <em>Happy Git and GitHub for the UseR</em></li></ul><p>Featured Speakers</p><ul><li>Max Kuhn, Senior Director Non-Clinical Statistics, Pfizer</li><li>Dirk Eddelbuettel, Ketchum Trading: <em>Extending R with C++: A Brief Introduction to Rcpp</em></li><li>Hilary Parker, Stitch Fix: <em>Opinionated Analysis Development</em></li><li>Bryan Lewis, Paradigm4: <em>Fun with htmlwidgets</em></li><li>Ryan Hafen, Hafen Consulting: <em>Interactive plotting with rbokeh and crosstalk</em></li><li>Julia Silge, Datassist: <em>Text mining, the tidy way</em></li><li>Bob Rudis, Rapid7: <em>Writing readable code with pipes</em></li></ul><p>Featured Talk</p><ul><li>Joseph Rickert, R Ambassador, RStudio: <em>R&rsquo;s Role in Data Science</em></li></ul><p>Be sure to visit <a href="https://www.rstudio.com/conference/">https://www.rstudio.com/conference/</a> for the full schedule and latest updates and don&rsquo;t forget to download the <a href="https://attendify.com/app/6qvu8i/">RStudio conference app</a> to help you plan your days 
in detail.</p><p><strong>Special Reminder:</strong> When you register, make sure you purchase your ticket for Friday evening at Universal&rsquo;s Wizarding World of Harry Potter. The park is reserved exclusively for rstudio::conf attendees. It&rsquo;s an extraordinary experience we&rsquo;re sure you&rsquo;ll enjoy!</p><p>We appreciate our sponsors and exhibitors!</p></description></item><item><title>R Notebooks</title><link>https://www.rstudio.com/blog/r-notebooks/</link><pubDate>Wed, 05 Oct 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-notebooks/</guid><description><p>Today we&rsquo;re excited to announce <a href="https://rmarkdown.rstudio.com/r_notebooks.html">R Notebooks</a>, which add a powerful notebook authoring engine to <a href="https://rmarkdown.rstudio.com/">R Markdown</a>. Notebook interfaces for data analysis have compelling advantages including the close association of code and output and the ability to intersperse narrative with computation. Notebooks are also an excellent tool for teaching and a convenient way to share analyses.</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/screen-shot-2016-09-21-at-3-42-44-pm.png" alt="screen-shot-2016-09-21-at-3-42-44-pm"></p><p>You can try out R Notebooks today in the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>.</p><h2 id="interactive-r-markdown">Interactive R Markdown</h2><p>As an authoring format, R Markdown bears many similarities to traditional notebooks like <a href="https://jupyter.org/">Jupyter</a> and <a href="http://beakernotebook.com/">Beaker</a>. 
However, code in notebooks is typically executed interactively, one cell at a time, whereas code in R Markdown documents is typically executed in batch.</p><p>R Notebooks bring the interactive model of execution to your R Markdown documents, giving you the capability to work quickly and iteratively in a notebook interface without leaving behind the plain-text tools and production-quality output you&rsquo;ve come to rely on from R Markdown.</p><table style="color:#808080;border:1px solid #d0d0d0;padding:2px;margin-top:15px;margin-bottom:15px;" ><tbody ><tr ><td style="text-align:left;" ></td><td style="color:#000000;text-align:center;" >R Markdown Notebooks</td><td style="color:#000000;text-align:center;" >Traditional Notebooks</td></tr><tr ><td style="text-align:left;" >Plain text representation</td><td style="color:#000000;text-align:center;" >✓</td><td ></td></tr><tr ><td style="text-align:left;" >Same editor/tools used for R scripts</td><td style="color:#000000;text-align:center;" >✓</td><td ></td></tr><tr ><td style="text-align:left;" >Works well with version control</td><td style="color:#000000;text-align:center;" >✓</td><td ></td></tr><tr ><td style="text-align:left;" >Focus on production output</td><td style="color:#000000;text-align:center;" >✓</td><td ></td></tr><tr ><td style="text-align:left;" >Output inline with code</td><td style="color:#000000;text-align:center;" >✓</td><td style="color:#000000;text-align:center;" >✓</td></tr><tr ><td style="text-align:left;" >Output cached across sessions</td><td style="color:#000000;text-align:center;" >✓</td><td style="color:#000000;text-align:center;" >✓</td></tr><tr ><td style="text-align:left;" >Share code and output in a single file</td><td style="color:#000000;text-align:center;" >✓</td><td style="color:#000000;text-align:center;" >✓</td></tr><tr ><td style="text-align:left;" >Emphasized execution model</td><td style="color:#000000;text-align:center;" >Interactive &amp; Batch</td><td style="color:#000000;text-align:center;" >Interactive</td></tr></tbody></table><p>This video provides a bit more background and a demonstration of notebooks in 
action:</p><p><a href="https://www.youtube.com/watch?v=zNzZ1PfUDNk">Watch the R Notebooks demonstration video on YouTube</a>.</p><h2 id="iterate-quickly">Iterate Quickly</h2><p>In a typical R Markdown document, you must re-knit the document to see your changes, which can take some time if it contains non-trivial computations. R Notebooks, however, let you run code and see the results in the document immediately. They can include just about any kind of content R produces, including console output, plots, data frames, and interactive <a href="http://www.htmlwidgets.org/">HTML widgets</a>.</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/screen-shot-2016-09-20-at-4-16-47-pm.png" alt="screen-shot-2016-09-20-at-4-16-47-pm"></p><p>You can see the progress of the code as it runs:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/screen-shot-2016-09-21-at-10-52-02-am.png" alt="screen-shot-2016-09-21-at-10-52-02-am"></p><p>You can preview the results of individual inline expressions, too:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/notebook-inline-output.png" alt="notebook-inline-output"></p><p>Even your LaTeX equations render in real-time as you type:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/notebook-mathjax.png" alt="notebook-mathjax"></p><p>This focused mode of interaction doesn&rsquo;t require you to keep the console, viewer, or output panes open. Everything you need is at your fingertips in the editor, reducing distractions and helping you concentrate on your analysis. When you&rsquo;re done, you&rsquo;ll have a formatted, reproducible record of what you&rsquo;ve accomplished, with plenty of context, perfect for your own records or sharing with others.</p><h2 id="batteries-included">Batteries Included</h2><p>R Notebooks can run more than just R code. 
You can run chunks <a href="https://rmarkdown.rstudio.com/authoring_knitr_engines.html">written in other languages</a>, like Python, Bash, or C++ (Rcpp).</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/screen-shot-2016-09-20-at-4-25-48-pm.png" alt="screen-shot-2016-09-20-at-4-25-48-pm"></p><p>It&rsquo;s even possible to run SQL directly:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/notebook-sql.png" alt="notebook-sql"></p><p>This makes an R Notebook an excellent tool for orchestrating a reproducible, end-to-end data analysis workflow; you can easily ingest data using your tool of choice, and share data among languages by using packages like <a href="https://cran.r-project.org/web/packages/feather/index.html">feather</a>, or ordinary CSV files.</p><h2 id="reproducible-notebooks">Reproducible Notebooks</h2><p>While you can run chunks (and even individual lines of R code!) in any order you like, a fully reproducible document must be able to be re-executed start-to-finish in a clean environment. There&rsquo;s a built-in command to do this, too, so it&rsquo;s easy to test your notebooks for reproducibility.</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/screen-shot-2016-09-21-at-3-52-34-pm.png" alt="screen-shot-2016-09-21-at-3-52-34-pm"></p><h2 id="rich-output-formats">Rich Output Formats</h2><p>Since they&rsquo;re built on R Markdown, R Notebooks work seamlessly with other R Markdown output types. 
You can use any existing R Markdown document as a notebook, or render (knit) a notebook to any R Markdown output type.</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/notebook-yaml.png" alt="notebook-yaml"></p><p>The same document can be used as a notebook when you&rsquo;re quickly iterating on ideas and later rendered to a wholly different format for publication – no duplication of code, data, or output required.</p><h2 id="share-and-publish">Share and Publish</h2><p>R Notebooks are easy to share with collaborators. Because they&rsquo;re plain-text files, they work well with version control systems like Git. Your collaborators don&rsquo;t even need RStudio to edit them, since notebooks can be <a href="https://rmarkdown.rstudio.com/r_notebook_format.html">rendered in the R console</a> using the open source <a href="https://cran.r-project.org/web/packages/rmarkdown/index.html">rmarkdown</a> package.</p><p>Rendered notebooks can be previewed right inside RStudio:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/notebook-preview.png" alt="notebook-preview"></p><p>While the notebook preview looks similar to a rendered R Markdown document, the notebook preview does not execute any of your R code chunks; it simply shows you a rendered copy of the markdown in your document along with the most recent chunk output. Because it&rsquo;s very fast to generate this preview (again, no R code is executed), it&rsquo;s generated every time you save the R Markdown document.</p><p>The generated HTML file has the special extension <em>.nb.html</em>. 
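As a sketch, producing such a file from the R console might look like this (the file name and contents are illustrative; rendering requires the rmarkdown package and pandoc):

```r
library(rmarkdown)

# Write a tiny notebook source file (the chunk fence is built with strrep()
# to avoid embedding literal backticks in this example)
fence <- strrep("`", 3)
writeLines(c(
  "---",
  "title: Example notebook",
  "output: html_notebook",
  "---",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), "example.Rmd")

# Produces example.nb.html alongside the source file
render("example.Rmd")
```

The same call works in any R session, so collaborators without RStudio can still regenerate the notebook output.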
It is self-contained, free of dependencies, and can be viewed locally or published to any static web hosting service.</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/screen-shot-2016-09-14-at-12-12-35-pm.png" alt="screen-shot-2016-09-14-at-12-12-35-pm"></p><p>It also includes a bundled copy of the R Markdown source file, so it can be seamlessly opened in RStudio to resume work on the notebook with all output intact.</p><h2 id="try-it-out">Try It Out</h2><p>To try out R Notebooks, you&rsquo;ll need to download the latest <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>.</p><p>You can find documentation on notebook features on the <a href="https://rmarkdown.rstudio.com/r_notebooks.html">R Notebooks</a> page on the R Markdown website, and we&rsquo;ve also published a video tutorial in our <a href="https://www.rstudio.com/resources/webinars/introducing-notebooks-with-r-markdown/">R Notebooks Webinar</a>.</p><p>We believe the R Notebook will become a powerful new addition to your toolkit. Give it a spin and let us know what you think!</p></description></item><item><title>haven 1.0.0</title><link>https://www.rstudio.com/blog/haven-1-0-0/</link><pubDate>Tue, 04 Oct 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/haven-1-0-0/</guid><description><p>I&rsquo;m pleased to announce the release of haven. Haven is designed to facilitate the transfer of data between R and SAS, SPSS, and Stata. It makes it easy to read SAS, SPSS, and Stata file formats into R data frames, and makes it easy to save your R data frames into SAS, SPSS, and Stata if you need to collaborate with others using closed source statistical software. 
Install haven by running:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">haven&#34;</span>)</code></pre></div><p>haven 1.0.0 is a major release, and indicates that haven is now largely feature complete and has been tested on many real world datasets. There are four major changes in this version of haven:</p><ol><li><p>Improvements to the underlying ReadStat library</p></li><li><p>Better handling of &ldquo;special&rdquo; missing values</p></li><li><p>Improved date/time support</p></li><li><p>Support for other file metadata.</p></li></ol><p>There were also a whole bunch of other minor improvements and bug fixes: you can see the complete list in the <a href="http://haven.tidyverse.org/news/index.html#haven-1.0.0">release notes</a>.</p><h2 id="readstat">ReadStat</h2><p>Haven builds on top of the <a href="http://github.com/WizardMac/ReadStat/issues">ReadStat</a> C library by <a href="http://www.evanmiller.org/">Evan Miller</a>. This version of haven includes many improvements thanks to Evan&rsquo;s hard work on ReadStat:</p><ul><li><p>Can read binary/Ross compressed SAS files.</p></li><li><p>Support for reading and writing Stata 14 data files.</p></li><li><p>New <code>write_sas()</code> allows you to write data frames out to <code>sas7bdat</code> files. This is still somewhat experimental.</p></li><li><p><code>read_por()</code> now actually works.</p></li><li><p>Many other bug fixes and minor improvements.</p></li></ul><h2 id="missing-values">Missing values</h2><p>haven 1.0.0 includes comprehensive support for the &ldquo;special&rdquo; types of missing values found in SAS, SPSS, and Stata. All three tools provide a global &ldquo;system missing value&rdquo;, displayed as <code>.</code>. 
This is roughly equivalent to R&rsquo;s <code>NA</code>, although neither Stata nor SAS propagate missingness in numeric comparisons (SAS treats the missing value as the smallest possible number and Stata treats it as the largest possible number).</p><p>Each tool also provides a mechanism for recording multiple types of missingness:</p><ul><li><p>Stata has &ldquo;extended&rdquo; missing values, <code>.A</code> through <code>.Z</code>.</p></li><li><p>SAS has &ldquo;special&rdquo; missing values, <code>.A</code> through <code>.Z</code> plus <code>._</code>.</p></li><li><p>SPSS has per-column &ldquo;user&rdquo; missing values. Each column can declare up to three distinct values or a range of values (plus one distinct value) that should be treated as missing.</p></li></ul><p>Stata and SAS only support tagged missing values for numeric columns. SPSS supports up to three distinct values for character columns. Generally, operations involving a user-missing type return a system missing value.</p><p>Haven models these missing values in two different ways:</p><ul><li><p>For SAS and Stata, haven provides <code>tagged_na()</code> which extend R&rsquo;s regular <code>NA</code> to add a single character label.</p></li><li><p>For SPSS, haven provides <code>labelled_spss()</code> that also models user defined values and ranges.</p></li></ul><p>Use <code>zap_missing()</code> if you just want to convert to R&rsquo;s regular <code>NA</code>s.</p><p>You can get more details in the <a href="http://haven.tidyverse.org/articles/semantics.html">semantics vignette</a>.</p><h2 id="datetimes">Date/times</h2><p>Support for date/times has substantially improved:</p><ul><li><p><code>read_dta()</code> now recognises &ldquo;%d&rdquo; and custom date types.</p></li><li><p><code>read_sav()</code> now correctly recognises EDATE and JDATE formats as dates. 
Variables with format DATE, ADATE, EDATE, JDATE or SDATE are imported as <code>Date</code> variables instead of <code>POSIXct</code>.</p></li><li><p><code>write_dta()</code> and <code>write_sav()</code> support writing date/times.</p></li><li><p>Support for <code>hms()</code> has been moved into the <a href="https://github.com/rstats-db/hms">hms</a> package. Time variables now have class <code>c(&quot;hms&quot;, &quot;difftime&quot;)</code> and a <code>units</code> attribute with value &ldquo;secs&rdquo;.</p></li></ul><h2 id="other-metadata">Other metadata</h2><p>Haven is slowly adding support for other types of metadata:</p><ul><li><p>Variable formats can be read and written. Similarly to variable labels, formats are stored as an attribute on the vector. Use <code>zap_formats()</code> if you want to remove these attributes.</p></li><li><p>Added support for reading file &ldquo;label&rdquo; and &ldquo;notes&rdquo;. These are not currently printed, but are stored in the attributes if you need to access them.</p></li></ul></description></item><item><title>ggplot2 2.2.0 coming soon!</title><link>https://www.rstudio.com/blog/ggplot2-2-2-0-coming-soon/</link><pubDate>Fri, 30 Sep 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/ggplot2-2-2-0-coming-soon/</guid><description><p>I&rsquo;m planning to release ggplot2 2.2.0 in early November. In preparation, I&rsquo;d like to announce that a release candidate is now available: version 2.1.0.9001. Please try it out, and file an <a href="https://github.com/hadley/ggplot2/issues">issue on GitHub</a> if you discover any problems. 
I hope we can find and fix any major issues before the official release.</p><p>Install the pre-release version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># install.packages(&#34;devtools&#34;)</span>
devtools<span style="color:#666">::</span><span style="color:#06287e">install_github</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">hadley/ggplot2&#34;</span>)</code></pre></div><p>If you discover a major bug that breaks your plots, please <a href="https://github.com/hadley/ggplot2/issues">file a minimal reprex</a>, and then roll back to the released version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">ggplot2&#34;</span>)</code></pre></div><p>ggplot2 2.2.0 will be a relatively major release including:</p><ul><li><p>Subtitles and captions.</p></li><li><p>A large rewrite of the facetting system.</p></li><li><p>Improved theme options.</p></li><li><p>Better stacking.</p></li><li><p><a href="https://github.com/hadley/ggplot2/blob/master/NEWS.md">Numerous bug fixes and minor improvements</a>.</p></li></ul><p>The majority of this work was carried out by <a href="https://github.com/thomasp85">Thomas Pedersen</a>, who I was lucky to have as my &ldquo;ggplot2 intern&rdquo; this summer. 
Make sure to check out other visualisation packages: <a href="https://github.com/thomasp85/ggraph">ggraph</a>, <a href="https://github.com/thomasp85/ggforce">ggforce</a>, and <a href="https://github.com/thomasp85/tweenr">tweenr</a>.</p><h2 id="subtitles-and-captions">Subtitles and captions</h2><p>Thanks to <a href="https://rud.is/">Bob Rudis</a>, you can now add subtitles and captions:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>(<span style="color:#06287e">aes</span>(color <span style="color:#666">=</span> class)) <span style="color:#666">+</span><span style="color:#06287e">geom_smooth</span>(se <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>, method <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">loess&#34;</span>) <span style="color:#666">+</span><span style="color:#06287e">labs</span>(title <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Fuel efficiency generally decreases with engine size&#34;</span>,subtitle <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Two seaters (sports cars) are an exception because of their light weight&#34;</span>,caption <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Data from fueleconomy.gov&#34;</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/09/unnamed-chunk-3-1.png" alt="unnamed-chunk-3-1"></p><p>These are controlled by the theme settings <code>plot.subtitle</code> and <code>plot.caption</code>.</p><p>The plot title is now aligned to the left by default. 
To return to the previous centering, use <code>theme(plot.title = element_text(hjust = 0.5))</code>.</p><h2 id="facets">Facets</h2><p>The facet and layout implementation has been moved to ggproto and received a large rewrite and refactoring. This will allow others to create their own facetting systems, as described in the <em>Extending ggplot2</em> vignette. Along with the rewrite a number of features and improvements have been added, most notably:</p><ul><li>Functions in facetting formulas, thanks to <a href="https://github.com/DanRuderman">Dan Ruderman</a>.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(diamonds, <span style="color:#06287e">aes</span>(carat, price)) <span style="color:#666">+</span><span style="color:#06287e">geom_hex</span>(bins <span style="color:#666">=</span> <span style="color:#40a070">20</span>) <span style="color:#666">+</span><span style="color:#06287e">facet_wrap</span>(<span style="color:#666">~</span><span style="color:#06287e">cut_number</span>(depth, <span style="color:#40a070">6</span>))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/09/unnamed-chunk-4-1.png" alt="unnamed-chunk-4-1"></p><ul><li>Axes were dropped when the panels in <code>facet_wrap()</code> did not completely fill the rectangle. 
Now, an axis is drawn underneath the hanging panels:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">facet_wrap</span>(<span style="color:#666">~</span>class)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/09/unnamed-chunk-5-1.png" alt="unnamed-chunk-5-1"></p><ul><li>It is now possible to set the position of the axes through the <code>position</code> argument in the scale constructor:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">scale_x_continuous</span>(position <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">top&#34;</span>) <span style="color:#666">+</span><span style="color:#06287e">scale_y_continuous</span>(position <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">right&#34;</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/09/unnamed-chunk-6-1.png" alt="unnamed-chunk-6-1"></p><ul><li>You can display a secondary axis that is a one-to-one transformation of the primary axis with the <code>sec.axis</code> argument:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span 
style="color:#06287e">aes</span>(displ, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">scale_y_continuous</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mpg (US)&#34;</span>,sec.axis <span style="color:#666">=</span> <span style="color:#06287e">sec_axis</span>(<span style="color:#666">~</span> . <span style="color:#666">*</span> <span style="color:#40a070">1.20</span>, name <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mpg (UK)&#34;</span>))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/09/unnamed-chunk-7-1.png" alt="unnamed-chunk-7-1"></p><ul><li>Strips can be placed on any side, and the placement with respect to axes can be controlled with the <code>strip.placement</code> theme option.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">facet_wrap</span>(<span style="color:#666">~</span> drv, strip.position <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">bottom&#34;</span>) <span style="color:#666">+</span><span style="color:#06287e">theme</span>(strip.placement <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">outside&#34;</span>,strip.background <span style="color:#666">=</span> <span style="color:#06287e">element_blank</span>(),strip.text <span style="color:#666">=</span> <span style="color:#06287e">element_text</span>(face <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">bold&#34;</span>)) <span style="color:#666">+</span><span style="color:#06287e">xlab</span>(<span style="color:#007020;font-weight:bold">NULL</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/09/unnamed-chunk-8-1.png" alt="unnamed-chunk-8-1"></p><h2 id="theming">Theming</h2><ul><li><p>Blank elements can now be overridden again so you get the expected behavior when setting e.g. <code>axis.line.x</code>.</p></li><li><p><code>element_line()</code> gets an <code>arrow</code> argument that lets you put arrows on axes.</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">arrow <span style="color:#666">&lt;-</span> <span style="color:#06287e">arrow</span>(length <span style="color:#666">=</span> <span style="color:#06287e">unit</span>(<span style="color:#40a070">0.4</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">cm&#34;</span>), type <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">closed&#34;</span>)
<span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">theme_minimal</span>() <span style="color:#666">+</span><span style="color:#06287e">theme</span>(axis.line <span style="color:#666">=</span> <span style="color:#06287e">element_line</span>(arrow <span style="color:#666">=</span> arrow))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/09/unnamed-chunk-9-1.png" alt="unnamed-chunk-9-1"></p><ul><li>Control of legend styling has been improved. 
The whole legend area can be aligned according to the plot area and a box can be drawn around all legends:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, hwy, shape <span style="color:#666">=</span> drv, colour <span style="color:#666">=</span> fl)) <span style="color:#666">+</span>
  <span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span>
  <span style="color:#06287e">theme</span>(legend.justification <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">top&#34;</span>,
    legend.box.margin <span style="color:#666">=</span> <span style="color:#06287e">margin</span>(<span style="color:#40a070">3</span>, <span style="color:#40a070">3</span>, <span style="color:#40a070">3</span>, <span style="color:#40a070">3</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mm&#34;</span>),
    legend.box.background <span style="color:#666">=</span> <span style="color:#06287e">element_rect</span>(colour <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">grey50&#34;</span>))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/09/unnamed-chunk-10-1.png" alt="unnamed-chunk-10-1"></p><ul><li><p><code>panel.margin</code> and <code>legend.margin</code> have been renamed to <code>panel.spacing</code> and <code>legend.spacing</code> respectively, as this better indicates their roles. A new <code>legend.margin</code> now controls the margin around each legend.</p></li><li><p>When computing the height of titles, ggplot2 now includes the height of the descenders (i.e. the bits of <code>g</code> and <code>y</code> that hang underneath). This improves the margins around titles, particularly the y axis label. 
I have also very slightly increased the inner margins of axis titles, and removed the outer margins.</p></li><li><p>The default themes have been tweaked by <a href="http://www.obs-vlfr.fr/~irisson/">Jean-Olivier Irisson</a>, making them better match <code>theme_grey()</code>.</p></li><li><p>Lastly, the <code>theme()</code> function now has named arguments so autocomplete and documentation suggestions are vastly improved.</p></li></ul><h2 id="stacking-bars">Stacking bars</h2><p><code>position_stack()</code> and <code>position_fill()</code> now stack values in the reverse order of the grouping, which makes the default stack order match the legend.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">avg_price <span style="color:#666">&lt;-</span> diamonds <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">group_by</span>(cut, color) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">summarise</span>(price <span style="color:#666">=</span> <span style="color:#06287e">mean</span>(price)) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">ungroup</span>() <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">mutate</span>(price_rel <span style="color:#666">=</span> price <span style="color:#666">-</span> <span style="color:#06287e">mean</span>(price))

<span style="color:#06287e">ggplot</span>(avg_price) <span style="color:#666">+</span>
  <span style="color:#06287e">geom_col</span>(<span style="color:#06287e">aes</span>(x <span style="color:#666">=</span> cut, y <span style="color:#666">=</span> price, fill <span style="color:#666">=</span> color))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/09/unnamed-chunk-11-1.png" alt="unnamed-chunk-11-1"></p><p>(Note also the new <code>geom_col()</code>, which is short-hand for <code>geom_bar(stat = &quot;identity&quot;)</code>, contributed by Bob 
Rudis.)</p><p>Additionally, you can now stack negative values:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(avg_price) <span style="color:#666">+</span>
  <span style="color:#06287e">geom_col</span>(<span style="color:#06287e">aes</span>(x <span style="color:#666">=</span> cut, y <span style="color:#666">=</span> price_rel, fill <span style="color:#666">=</span> color))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/09/unnamed-chunk-12-1.png" alt="unnamed-chunk-12-1"></p><p>The overall ordering cannot necessarily be matched in the presence of negative values, but the ordering on either side of the x-axis will match.</p><p>If you want to stack in the opposite order, try <a href="http://forcats.tidyverse.org/reference/fct_rev.html"><code>forcats::fct_rev()</code></a>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(avg_price) <span style="color:#666">+</span>
  <span style="color:#06287e">geom_col</span>(<span style="color:#06287e">aes</span>(x <span style="color:#666">=</span> cut, y <span style="color:#666">=</span> price, fill <span style="color:#666">=</span> <span style="color:#06287e">fct_rev</span>(color)))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/09/unnamed-chunk-13-1.png" alt="unnamed-chunk-13-1"></p></description></item><item><title>sparklyr — R interface for Apache Spark</title><link>https://www.rstudio.com/blog/sparklyr-r-interface-for-apache-spark/</link><pubDate>Tue, 27 Sep 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/sparklyr-r-interface-for-apache-spark/</guid><description><p>We&rsquo;re excited today to announce <a href="http://spark.rstudio.com">sparklyr</a>, a new package that provides an interface 
between R and <a href="http://spark.apache.org/">Apache Spark</a>.</p><p><img src="https://www.rstudio.com/blog-images/2016-09-27-sparklyr-illustration.png" alt=""></p><p>Over the past couple of years we&rsquo;ve heard time and time again that people want a native <a href="https://github.com/hadley/dplyr">dplyr</a> interface to Spark, so we built one! sparklyr also provides interfaces to Spark&rsquo;s distributed machine learning algorithms and much more. Highlights include:</p><ul><li><p>Interactively manipulate Spark data using both <a href="https://github.com/hadley/dplyr">dplyr</a> and SQL (via DBI).</p></li><li><p>Filter and aggregate Spark datasets, then bring them into R for analysis and visualization.</p></li><li><p>Orchestrate distributed machine learning from R using either <a href="https://spark.rstudio.com/mllib.html">Spark MLlib</a> or <a href="https://spark.rstudio.com/guides/h2o.html">H2O Sparkling Water</a>.</p></li><li><p>Create <a href="https://spark.rstudio.com/extensions.html">extensions</a> that call the full Spark API and provide interfaces to Spark packages.</p></li><li><p>Integrated support for establishing Spark connections and browsing Spark data frames within the RStudio IDE.</p></li></ul><p>We&rsquo;re also excited to be working with several industry partners. 
<a href="http://www.ibm.com/">IBM</a> is incorporating sparklyr into their Data Science Experience, <a href="http://www.cloudera.com/">Cloudera</a> is working with us to ensure that sparklyr meets the requirements of their enterprise customers, and <a href="http://www.h2o.ai/">H2O</a> has provided an integration between sparklyr and H2O Sparkling Water.</p><h2 id="getting-started">Getting Started</h2><p>You can install sparklyr from CRAN as follows:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">sparklyr&#34;</span>)</code></pre></div><p>You should also install a local version of Spark for development purposes:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)
<span style="color:#06287e">spark_install</span>(version <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1.6.2&#34;</span>)</code></pre></div><p>If you use the RStudio IDE, you should also download the latest <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a> of the IDE, which includes several enhancements for interacting with Spark.</p><p>Extensive documentation and examples are available at <a href="http://spark.rstudio.com">http://spark.rstudio.com</a>.</p><h2 id="connecting-to-spark">Connecting to Spark</h2><p>You can connect to both local instances of Spark as well as remote Spark clusters. 
Here we&rsquo;ll connect to a local instance of Spark:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(sparklyr)
sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">spark_connect</span>(master <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">local&#34;</span>)</code></pre></div><p>The returned Spark connection (<code>sc</code>) provides a remote dplyr data source to the Spark cluster.</p><h2 id="reading-data">Reading Data</h2><p>You can copy R data frames into Spark using the dplyr copy_to function (more typically though you&rsquo;ll read data within the Spark cluster using the <a href="https://spark.rstudio.com/dplyr.html#reading_data">spark_read</a> family of functions). For the examples below we&rsquo;ll copy some datasets from R into Spark (note that you may need to install the nycflights13 and Lahman packages in order to execute this code):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(dplyr)
iris_tbl <span style="color:#666">&lt;-</span> <span style="color:#06287e">copy_to</span>(sc, iris)
flights_tbl <span style="color:#666">&lt;-</span> <span style="color:#06287e">copy_to</span>(sc, nycflights13<span style="color:#666">::</span>flights, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">flights&#34;</span>)
batting_tbl <span style="color:#666">&lt;-</span> <span style="color:#06287e">copy_to</span>(sc, Lahman<span style="color:#666">::</span>Batting, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">batting&#34;</span>)</code></pre></div><h2 id="using-dplyr">Using dplyr</h2><p>We can now use all of the available dplyr verbs against the tables within the cluster. 
Here&rsquo;s a simple filtering example:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># filter by departure delay</span>
flights_tbl <span style="color:#666">%&gt;%</span> <span style="color:#06287e">filter</span>(dep_delay <span style="color:#666">==</span> <span style="color:#40a070">2</span>)</code></pre></div><p><a href="https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html">Introduction to dplyr</a> provides additional dplyr examples you can try. For example, consider the last example from the tutorial, which plots data on flight delays:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">delay <span style="color:#666">&lt;-</span> flights_tbl <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">group_by</span>(tailnum) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">summarise</span>(count <span style="color:#666">=</span> <span style="color:#06287e">n</span>(), dist <span style="color:#666">=</span> <span style="color:#06287e">mean</span>(distance), delay <span style="color:#666">=</span> <span style="color:#06287e">mean</span>(arr_delay)) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">filter</span>(count <span style="color:#666">&gt;</span> <span style="color:#40a070">20</span>, dist <span style="color:#666">&lt;</span> <span style="color:#40a070">2000</span>, <span style="color:#666">!</span><span style="color:#06287e">is.na</span>(delay)) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">collect</span>()

<span style="color:#60a0b0;font-style:italic"># plot delays</span>
<span style="color:#06287e">library</span>(ggplot2)
<span style="color:#06287e">ggplot</span>(delay, <span style="color:#06287e">aes</span>(dist, delay)) <span style="color:#666">+</span>
  <span style="color:#06287e">geom_point</span>(<span style="color:#06287e">aes</span>(size <span style="color:#666">=</span> count), alpha <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">/</span><span style="color:#40a070">2</span>) <span style="color:#666">+</span>
  <span style="color:#06287e">geom_smooth</span>() <span style="color:#666">+</span>
  <span style="color:#06287e">scale_size_area</span>(max_size <span style="color:#666">=</span> <span style="color:#40a070">2</span>)</code></pre></div><p><img src="https://spark.rstudio.com/images/ggplot2-flights.png" alt=""></p><p>Note that while the dplyr functions shown above look identical to the ones you use with R data frames, with sparklyr they use Spark as their back end and execute remotely in the cluster.</p><h3 id="window-functions">Window Functions</h3><p>dplyr <a href="https://cran.r-project.org/web/packages/dplyr/vignettes/window-functions.html">window functions</a> are also supported, for example:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">batting_tbl <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">select</span>(playerID, yearID, teamID, G, AB<span style="color:#666">:</span>H) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">arrange</span>(playerID, yearID, teamID) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">group_by</span>(playerID) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">filter</span>(<span style="color:#06287e">min_rank</span>(<span style="color:#06287e">desc</span>(H)) <span style="color:#666">&lt;=</span> <span style="color:#40a070">2</span> <span style="color:#666">&amp;</span> H <span style="color:#666">&gt;</span> <span style="color:#40a070">0</span>)</code></pre></div><p>For additional documentation on using dplyr with Spark see the <a href="https://spark.rstudio.com/dplyr.html">dplyr</a> section of the sparklyr website.</p><h2 id="using-sql">Using SQL</h2><p>It&rsquo;s also possible to execute SQL queries directly against tables within a Spark cluster. The <code>spark_connection</code> object implements a <a href="https://github.com/rstats-db/DBI">DBI</a> interface for Spark, so you can use <code>dbGetQuery</code> to execute SQL and return the result as an R data frame:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(DBI)
iris_preview <span style="color:#666">&lt;-</span> <span style="color:#06287e">dbGetQuery</span>(sc, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">SELECT * FROM iris LIMIT 10&#34;</span>)</code></pre></div><h2 id="machine-learning">Machine Learning</h2><p>You can orchestrate machine learning algorithms in a Spark cluster via either <a href="http://spark.rstudio.org/mllib.html">Spark MLlib</a> or the <a href="http://spark.rstudio.org/h2o.html">H2O Sparkling Water</a> extension package. Both provide a set of high-level APIs built on top of DataFrames that help you create and tune machine learning workflows.</p><h3 id="spark-mllib">Spark MLlib</h3><p>In this example we&rsquo;ll use <code>ml_linear_regression</code> to fit a linear regression model. We&rsquo;ll use the built-in <code>mtcars</code> dataset, and see if we can predict a car&rsquo;s fuel consumption (<code>mpg</code>) based on its weight (<code>wt</code>) and the number of cylinders the engine contains (<code>cyl</code>). 
We&rsquo;ll assume in each case that the relationship between <code>mpg</code> and each of our features is linear.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># copy mtcars into spark</span>
mtcars_tbl <span style="color:#666">&lt;-</span> <span style="color:#06287e">copy_to</span>(sc, mtcars)

<span style="color:#60a0b0;font-style:italic"># transform our data set, and then partition into &#39;training&#39;, &#39;test&#39;</span>
partitions <span style="color:#666">&lt;-</span> mtcars_tbl <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">filter</span>(hp <span style="color:#666">&gt;=</span> <span style="color:#40a070">100</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">mutate</span>(cyl8 <span style="color:#666">=</span> cyl <span style="color:#666">==</span> <span style="color:#40a070">8</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">sdf_partition</span>(training <span style="color:#666">=</span> <span style="color:#40a070">0.5</span>, test <span style="color:#666">=</span> <span style="color:#40a070">0.5</span>, seed <span style="color:#666">=</span> <span style="color:#40a070">1099</span>)

<span style="color:#60a0b0;font-style:italic"># fit a linear model to the training dataset</span>
fit <span style="color:#666">&lt;-</span> partitions<span style="color:#666">$</span>training <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">ml_linear_regression</span>(response <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mpg&#34;</span>, features <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">wt&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">cyl&#34;</span>))</code></pre></div><p>For linear regression models produced by Spark, we can use <code>summary()</code> to learn a bit more about the quality of our fit, and the statistical significance of each of our predictors.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">summary</span>(fit)</code></pre></div><p>Spark machine learning supports a wide array of algorithms and feature transformations, and as illustrated above it&rsquo;s easy to chain these functions together with dplyr pipelines. To learn more see the <a href="https://spark.rstudio.com/mllib.html">Spark MLlib</a> section of the sparklyr website.</p><h3 id="h2o-sparkling-water">H2O Sparkling Water</h3><p>Let&rsquo;s walk through the same <code>mtcars</code> example, but in this case use H2O&rsquo;s machine learning algorithms via the <a href="https://spark.rstudio.com/guides/h2o.html">H2O Sparkling Water</a> extension. 
The dplyr code used to prepare the data is the same, but after partitioning into test and training data we call <code>h2o.glm</code> rather than <code>ml_linear_regression</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># convert to h2o_frame (uses the same underlying rdd)</span>
training <span style="color:#666">&lt;-</span> <span style="color:#06287e">as_h2o_frame</span>(partitions<span style="color:#666">$</span>training)
test <span style="color:#666">&lt;-</span> <span style="color:#06287e">as_h2o_frame</span>(partitions<span style="color:#666">$</span>test)

<span style="color:#60a0b0;font-style:italic"># fit a linear model to the training dataset</span>
fit <span style="color:#666">&lt;-</span> <span style="color:#06287e">h2o.glm</span>(x <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">wt&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">cyl&#34;</span>),
               y <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mpg&#34;</span>,
               training_frame <span style="color:#666">=</span> training,
               lambda_search <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)

<span style="color:#60a0b0;font-style:italic"># inspect the model</span>
<span style="color:#06287e">print</span>(fit)</code></pre></div><p>For linear regression models produced by H2O, we can use either <code>print()</code> or <code>summary()</code> to learn a bit more about the quality of our fit. 
The <code>summary()</code> method returns some extra information about scoring history and variable importance.</p><p>To learn more see the <a href="https://spark.rstudio.com/guides/h2o.html">H2O Sparkling Water</a> section of the sparklyr website.</p><h2 id="extensions">Extensions</h2><p>The facilities used internally by sparklyr for its dplyr and machine learning interfaces are available to extension packages. Since Spark is a general purpose cluster computing system, there are many potential applications for extensions (e.g. interfaces to custom machine learning pipelines, interfaces to 3rd party Spark packages, etc.).</p><p>The <a href="https://github.com/bnosac/spark.sas7bdat">sas7bdat</a> extension enables parallel reading of SAS datasets in the sas7bdat format into Spark data frames. The <a href="https://spark.rstudio.com/guides/h2o.html">rsparkling</a> extension provides a bridge between sparklyr and H2O&rsquo;s <a href="http://www.h2o.ai/product/sparkling-water/">Sparkling Water</a>.</p><p>We&rsquo;re excited to see what other sparklyr extensions the R community creates. To learn more see the <a href="https://spark.rstudio.com/extensions.html">Extensions</a> section of the sparklyr website.</p><h2 id="rstudio-ide">RStudio IDE</h2><p>The latest RStudio <a href="https://www.rstudio.com/products/rstudio/download/preview/">Preview Release</a> of the RStudio IDE includes integrated support for Spark and the sparklyr package, including tools for:</p><ul><li><p>Creating and managing Spark connections</p></li><li><p>Browsing the tables and columns of Spark DataFrames</p></li><li><p>Previewing the first 1,000 rows of Spark DataFrames</p></li></ul><p>Once you&rsquo;ve installed the sparklyr package, you should find a new <strong>Spark</strong> pane within the IDE. 
This pane includes a <strong>New Connection</strong> dialog which can be used to make connections to local or remote Spark instances:</p><p><img src="https://www.rstudio.com/blog-images/2016-09-27-spark-connect.png" alt=""></p><p>Once you&rsquo;ve connected to Spark you&rsquo;ll be able to browse the tables contained within the Spark cluster:</p><p><img src="https://www.rstudio.com/blog-images/2016-09-27-spark-tab.png" alt=""></p><p>The Spark DataFrame preview uses the standard RStudio data viewer:</p><p><img src="https://www.rstudio.com/blog-images/2016-09-27-spark-dataview.png" alt=""></p><p>The RStudio IDE features for sparklyr are available now as part of the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>. The final version of RStudio IDE that includes integrated support for sparklyr will ship within the next few weeks.</p><h2 id="partners">Partners</h2><p>We&rsquo;re very pleased to be joined in this announcement by IBM, Cloudera, and H2O, who are working with us to ensure that sparklyr meets the requirements of enterprise customers and is easy to integrate with current and future deployments of Spark.</p><h3 id="ibm">IBM</h3><p>&ldquo;With our latest contributions to Apache Spark and the release of sparklyr, we continue to emphasize R as a primary data science language within the Spark community. 
Additionally, we are making plans to include sparklyr in Data Science Experience to provide the tools data scientists are comfortable with to help them bring business-changing insights to their companies faster,&rdquo; said Ritika Gunnar, vice president of Offering Management, IBM Analytics.</p><h3 id="cloudera">Cloudera</h3><p>&ldquo;At Cloudera, data science is one of the most popular use cases we see for Apache Spark as a core part of the Apache Hadoop ecosystem, yet the lack of a compelling R experience has limited data scientists&rsquo; access to available data and compute,&rdquo; said Charles Zedlewski, vice president, Products at Cloudera. &ldquo;We are excited to partner with RStudio to help bring sparklyr to the enterprise, so that data scientists and IT teams alike can get more value from their existing skills and infrastructure, all with the security, governance, and management our customers expect.&rdquo;</p><h3 id="h2o">H2O</h3><p>&ldquo;At H2O.ai, we&rsquo;ve been focused on bringing the best of breed open source machine learning to data scientists working in R &amp; Python. However, the lack of robust tooling in the R ecosystem for interfacing with Apache Spark has made it difficult for the R community to take advantage of the distributed data processing capabilities of Apache Spark.</p><p>We&rsquo;re excited to work with RStudio to bring the ease of use of dplyr and the distributed machine learning algorithms from H2O&rsquo;s Sparkling Water to the R community via the sparklyr &amp; rsparkling packages&rdquo;</p></description></item><item><title>Shiny Server (Pro) 1.4.6</title><link>https://www.rstudio.com/blog/shiny-server-1-4-6/</link><pubDate>Thu, 22 Sep 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-server-1-4-6/</guid><description><p>We&rsquo;ve just released Shiny Server and Shiny Server Pro 1.4.6. 
Relative to 1.4.2, our previously blogged-about version, the 1.4.6 release primarily includes bug fixes, and mitigations for low-severity security issues found by penetration testing. The full list of changes is after the jump.</p><p>If you&rsquo;re running a Shiny Server Pro release that is older than 1.4.3 <em>and</em> are configured to use SSL/TLS, it&rsquo;s especially important that you upgrade, as the versions of Node.js that are bundled with Shiny Server Pro 1.4.3 and earlier include vulnerable versions of OpenSSL.</p><p><strong>Shiny Server (Open Source):</strong> <a href="https://www.rstudio.com/products/shiny/download-server/">Download now</a></p><p><strong>Shiny Server Pro:</strong> If you already have a license or evaluation key, please <a href="https://www.rstudio.com/products/shiny/download-commercial/">upgrade now</a>. Otherwise, you can <a href="https://www.rstudio.com/products/shiny-server-pro/evaluation/">start a free 45-day evaluation</a>.</p><!-- more --><h3 id="shiny-server-pro-146">Shiny Server Pro 1.4.6</h3><p>Bug fix release.</p><ul><li>Fix a bug where a 404 response on some URLs could cause the server to exit with an unhandled exception.</li></ul><h3 id="shiny-server-pro-145">Shiny Server Pro 1.4.5</h3><p>Security release to fix minor issues raised in penetration test results.</p><ul><li><p>Add <code>disable_login_autocomplete</code> directive that can be used to instruct browsers not to attempt to autocomplete on the login screen. Note that servers can only suggest this behavior to browsers (and in particular, Google Chrome chooses not to comply, as its developers argue that disabling autocomplete decreases security rather than increasing it).</p></li><li><p>Add opt-in clickjacking protection via <code>frame_options</code> directive. 
Login and /admin URLs are now served with <code>X-Frame-Options: DENY</code> (the former can be opted out with an <code>auth_frame_options allow;</code> directive).</p></li><li><p>Fix open redirection on <strong>login</strong>. Previously, a URL created with malicious intent could cause you to go to an arbitrary URL after successful login. Now, it is only possible to be redirected to a path on Shiny Server.</p></li><li><p>Add Cross-Site Request Forgery (CSRF) protection to login and other POST operations.</p></li></ul><h3 id="shiny-server-pro-144">Shiny Server Pro 1.4.4</h3><ul><li><p>Fix fatal EBADF error that could cause server crashes.</p></li><li><p>Updated PAM integration to resolve a bug with asynchronous PAM modules like pam_ldap, pam_vas, and nss_ldap.</p></li><li><p>Upgrade to Node.js v0.10.46 (security patches).</p></li></ul><h3 id="shiny-server-pro-143">Shiny Server Pro 1.4.3</h3><ul><li><p>Added proxied authentication mechanism via the <code>auth_proxy</code> option.</p></li><li><p>Upgrade to Node.js v0.10.45 (primarily for updated OpenSSL).</p></li></ul></description></item><item><title>lubridate 1.6.0</title><link>https://www.rstudio.com/blog/lubridate-1-6-0/</link><pubDate>Thu, 15 Sep 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/lubridate-1-6-0/</guid><description><p>I am pleased to announce lubridate 1.6.0. Lubridate is designed to make working with dates and times as pleasant as possible, and is maintained by <a href="http://vitalie.spinu.info/">Vitalie Spinu</a>. You can install the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">lubridate&#34;</span>)</code></pre></div><p>This release includes a range of bug fixes and minor improvements. 
Some highlights from this release include:</p><ul><li><code>period()</code> and <code>duration()</code> constructors now accept character strings and allow a very flexible specification of timespans:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">period</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">3H 2M 1S&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;3H 2M 1S&#34;</span>

<span style="color:#06287e">duration</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">3 hours, 2 mins, 1 secs&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;10921s (~3.03 hours)&#34;</span>

<span style="color:#60a0b0;font-style:italic"># Missing numerals default to 1.</span>
<span style="color:#60a0b0;font-style:italic"># Repeated units are summed</span>
<span style="color:#06287e">period</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">hour minute minute&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;1H 2M 0S&#34;</span></code></pre></div><p>Period and duration parsing allows for arbitrary abbreviations of time units as long as the specification is unambiguous. 
For single letter specs, <code>strptime()</code> rules are followed, so <code>m</code> stands for <code>months</code> and <code>M</code> for <code>minutes</code>.</p><p>These same rules allow you to compare strings and durations/periods:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2mins 1 sec&#34;</span> <span style="color:#666">&gt;</span> <span style="color:#06287e">period</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2mins&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span></code></pre></div><ul><li>Date time rounding (with <code>round_date()</code>, <code>floor_date()</code> and <code>ceiling_date()</code>) now supports unit multipliers, like &ldquo;3 days&rdquo; or &ldquo;2 months&rdquo;:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ceiling_date</span>(<span style="color:#06287e">ymd_hms</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2016-09-12 17:10:00&#34;</span>), unit <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">5 minutes&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;2016-09-12 17:10:00 UTC&#34;</span></code></pre></div><ul><li><p>The behavior of <code>ceiling_date</code> for <code>Date</code> objects is now more intuitive. In short, dates are now interpreted as time intervals that are physically part of longer unit intervals:</p><pre>|day1| &hellip; |day31|day1| &hellip; |day28| &hellip;
|    January     |   February    | &hellip;</pre></li></ul><p>That means that rounding up <code>2000-01-01</code> by a month is done to the boundary between January and February, i.e. 
<code>2000-02-01</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ceiling_date</span>(<span style="color:#06287e">ymd</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2000-01-01&#34;</span>), unit <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">month&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;2000-02-01&#34;</span></code></pre></div><p>This behavior is controlled by the <code>change_on_boundary</code> argument.</p><ul><li>It is now possible to compare <code>POSIXct</code> and <code>Date</code> objects:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ymd_hms</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2000-01-01 00:00:01&#34;</span>) <span style="color:#666">&gt;</span> <span style="color:#06287e">ymd</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2000-01-01&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span></code></pre></div><ul><li><p>C-level parsing now handles English months and AM/PM indicator regardless of your locale. 
This means that English date-times are now always handled by lubridate C-level parsing and you don&rsquo;t need to explicitly switch the locale.</p></li><li><p>New parsing function <code>yq()</code> allows you to parse a year + quarter:</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">yq</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2016-02&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;2016-04-01&#34;</span></code></pre></div><p>The new <code>q</code> format is available in all lubridate parsing functions.</p><p>See the <a href="https://github.com/hadley/lubridate/releases/tag/v1.6.0">release notes</a> for the full list of changes. A big thanks goes to everyone who contributed: @<a href="https://github.com/arneschillert">arneschillert</a>, @<a href="https://github.com/cderv">cderv</a>, @<a href="https://github.com/ijlyttle">ijlyttle</a>, @<a href="https://github.com/jasonelaw">jasonelaw</a>, @<a href="https://github.com/jonboiser">jonboiser</a>, and @<a href="https://github.com/krlmlr">krlmlr</a>.</p></description></item><item><title>tidyverse 1.0.0</title><link>https://www.rstudio.com/blog/tidyverse-1-0-0/</link><pubDate>Thu, 15 Sep 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/tidyverse-1-0-0/</guid><description><p>The tidyverse is a set of packages that work in harmony because they share common data representations and API design. The <strong>tidyverse</strong> package is designed to make it easy to install and load core packages from the tidyverse in a single command.</p><p>The best place to learn about all the packages in the tidyverse and how they fit together is <a href="http://r4ds.had.co.nz/">R for Data Science</a>. 
Expect to hear more about the tidyverse in the coming months as I work on improved package websites, making <a href="http://joss.theoj.org/">citation easier</a>, and providing a common home for discussions about data analysis with the tidyverse.</p><h2 id="installation">Installation</h2><p>You can install tidyverse with</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tidyverse&#34;</span>)</code></pre></div><p>This will install the core tidyverse packages that you are likely to use in almost every analysis:</p><ul><li><p><a href="http://ggplot2.org/">ggplot2</a>, for data visualisation.</p></li><li><p><a href="https://github.com/hadley/dplyr">dplyr</a>, for data manipulation.</p></li><li><p><a href="https://github.com/hadley/tidyr">tidyr</a>, for data tidying.</p></li><li><p><a href="https://github.com/hadley/readr">readr</a>, for data import.</p></li><li><p><a href="https://github.com/hadley/purrr">purrr</a>, for functional programming.</p></li><li><p><a href="https://github.com/hadley/tibble">tibble</a>, for tibbles, a modern re-imagining of data frames.</p></li></ul><p>It also installs a selection of other tidyverse packages that you&rsquo;re likely to use frequently, but probably not in every analysis. 
This includes packages for data manipulation:</p><ul><li><p><a href="https://github.com/krlmlr/hms">hms</a>, for times.</p></li><li><p><a href="https://github.com/hadley/stringr">stringr</a>, for strings.</p></li><li><p><a href="https://github.com/hadley/lubridate">lubridate</a>, for date/times.</p></li><li><p><a href="https://hadley.github.io/forcats/">forcats</a>, for factors.</p></li></ul><p>Data import:</p><ul><li><p><a href="https://github.com/rstats-db/DBI">DBI</a>, for databases.</p></li><li><p><a href="https://github.com/hadley/haven/">haven</a>, for SPSS, SAS and Stata files.</p></li><li><p><a href="https://github.com/hadley/httr/">httr</a>, for web APIs.</p></li><li><p><a href="https://github.com/jeroenooms/jsonlite">jsonlite</a>, for JSON.</p></li><li><p><a href="https://github.com/hadley/readxl">readxl</a>, for <code>.xls</code> and <code>.xlsx</code> files.</p></li><li><p><a href="https://github.com/hadley/rvest">rvest</a>, for web scraping.</p></li><li><p><a href="https://github.com/hadley/xml2">xml2</a>, for XML.</p></li></ul><p>And modelling:</p><ul><li><p><a href="https://github.com/hadley/modelr">modelr</a>, for simple modelling within a pipeline.</p></li><li><p><a href="https://github.com/dgrtwo/broom">broom</a>, for turning models into tidy data.</p></li></ul><p>These packages will be installed along with tidyverse, but you&rsquo;ll load them explicitly with <code>library()</code>.</p><h2 id="usage">Usage</h2><p><code>library(tidyverse)</code> will load the core tidyverse packages: ggplot2, tibble, tidyr, readr, purrr, and dplyr. 
You also get a condensed summary of conflicts with other packages you have loaded:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(tidyverse)<span style="color:#60a0b0;font-style:italic">#&gt; Loading tidyverse: ggplot2</span><span style="color:#60a0b0;font-style:italic">#&gt; Loading tidyverse: tibble</span><span style="color:#60a0b0;font-style:italic">#&gt; Loading tidyverse: tidyr</span><span style="color:#60a0b0;font-style:italic">#&gt; Loading tidyverse: readr</span><span style="color:#60a0b0;font-style:italic">#&gt; Loading tidyverse: purrr</span><span style="color:#60a0b0;font-style:italic">#&gt; Loading tidyverse: dplyr</span><span style="color:#60a0b0;font-style:italic">#&gt; Conflicts with tidy packages ---------------------------------------</span><span style="color:#60a0b0;font-style:italic">#&gt; filter(): dplyr, stats</span><span style="color:#60a0b0;font-style:italic">#&gt; lag(): dplyr, stats</span></code></pre></div><p>You can see conflicts created later with <code>tidyverse_conflicts()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(MASS)<span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; Attaching package: &#39;MASS&#39;</span><span style="color:#60a0b0;font-style:italic">#&gt; The following object is masked from &#39;package:dplyr&#39;:</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; select</span><span style="color:#06287e">tidyverse_conflicts</span>()<span style="color:#60a0b0;font-style:italic">#&gt; Conflicts with tidy packages --------------------------------------</span><span style="color:#60a0b0;font-style:italic">#&gt; filter(): 
dplyr, stats</span><span style="color:#60a0b0;font-style:italic">#&gt; lag(): dplyr, stats</span><span style="color:#60a0b0;font-style:italic">#&gt; select(): dplyr, MASS</span></code></pre></div><p>And you can check that all tidyverse packages are up-to-date with <code>tidyverse_update()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">tidyverse_update</span>()<span style="color:#60a0b0;font-style:italic">#&gt; The following packages are out of date:</span><span style="color:#60a0b0;font-style:italic">#&gt; * broom (0.4.0 -&gt; 0.4.1)</span><span style="color:#60a0b0;font-style:italic">#&gt; * DBI (0.4.1 -&gt; 0.5)</span><span style="color:#60a0b0;font-style:italic">#&gt; * Rcpp (0.12.6 -&gt; 0.12.7)</span><span style="color:#60a0b0;font-style:italic">#&gt; Update now?</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1: Yes</span><span style="color:#60a0b0;font-style:italic">#&gt; 2: No</span></code></pre></div></description></item><item><title>Shiny 0.14</title><link>https://www.rstudio.com/blog/shiny-0-14/</link><pubDate>Mon, 12 Sep 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-0-14/</guid><description><p>A new Shiny release is upon us! There are many new exciting features, bug fixes, and library updates. We&rsquo;ll just highlight the most important changes here, but you can browse through the <a href="https://shiny.rstudio.com/articles/upgrade-0.14.html#full-changelog">full changelog</a> for details. 
This will likely be the last release before shiny 1.0, so get out your party hats!</p><p>To install it, you can just run:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">shiny&#34;</span>)</code></pre></div><h2 id="bookmarkable-state">Bookmarkable state</h2><p>Shiny now supports bookmarkable state: users can save the state of an application and get a URL which will restore the application with that state. There are two types of bookmarking: encoding the state in a URL, and saving the state to the server. With an encoded state, the entire state of the application is contained in the URL&rsquo;s query string. You can see this in action with this app: <a href="https://gallery.shinyapps.io/113-bookmarking-url/">https://gallery.shinyapps.io/113-bookmarking-url/</a>. An example of a bookmark URL for this app is <a href="https://gallery.shinyapps.io/113-bookmarking-url/?_inputs_&amp;n=200">https://gallery.shinyapps.io/113-bookmarking-url/?_inputs_&amp;n=200</a>. When the state is saved to the server, the URL might look something like: <a href="https://gallery.shinyapps.io/bookmark-saved/?state_id=d80625dc681e913a">https://gallery.shinyapps.io/bookmark-saved/?state_id=d80625dc681e913a</a> (note that this URL is not for an active app).</p><p><strong><em>Important note</em>:</strong> Saved-to-server bookmarking currently works with Shiny Server Open Source. Support on Shiny Server Pro, RStudio Connect, and <a href="http://shinyapps.io">shinyapps.io</a> is under development and testing. However, URL-encoded bookmarking works on all hosting platforms.</p><p>See <a href="https://shiny.rstudio.com/articles/bookmarking-state.html">this article</a> to get started with bookmarkable state. 
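</p><p>As a rough sketch of how URL bookmarking fits together (the input name and slider values here are made up for illustration), an app opts in by writing its UI as a function of <code>request</code> and passing <code>enableBookmarking</code> to <code>shinyApp()</code>:</p>

```r
library(shiny)

# For bookmarking, the UI must be a function that takes `request`
ui = function(request) {
  fluidPage(
    sliderInput("n", "Value of n", min = 1, max = 1000, value = 200),
    bookmarkButton()
  )
}

server = function(input, output, session) {}

# "url" encodes the state in the query string;
# "server" saves the state to disk and returns a short state_id URL
shinyApp(ui, server, enableBookmarking = "url")
```

<p>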
There is also an <a href="https://shiny.rstudio.com/articles/advanced-bookmarking.html">advanced-level article</a>, and <a href="https://shiny.rstudio.com/articles/bookmarking-modules.html">a modules article</a> that details how to use bookmarking in conjunction with modules.</p><h2 id="notifications">Notifications</h2><p>Shiny can now display notifications on the client browser by using the <code>showNotification()</code> function. Use <a href="https://gallery.shinyapps.io/116-notifications/">this demo app</a> to play around with the notification API. For more, see our <a href="https://shiny.rstudio.com/articles/notifications.html">article</a> about notifications.</p><h2 id="progress-indicators">Progress indicators</h2><p>If your Shiny app contains computations that take a long time to complete, a progress bar can improve the user experience by communicating how far along the computation is, and how much is left. Progress bars were added in Shiny 0.10.2. In Shiny 0.14, we&rsquo;ve changed them to use the notifications system, which gives them a different look.</p><p><strong><em>Important note</em>:</strong> If you were already using progress bars and had customized them with your own CSS, you can add the <code>style = &quot;old&quot;</code> argument to your <code>withProgress()</code> call (or <code>Progress$new()</code>). This will result in the same appearance as before. You can also call <code>shinyOptions(progress.style = &quot;old&quot;)</code> in your app&rsquo;s server function to make all progress indicators use the old styling.</p><p>To see new progress bars in action, see <a href="https://gallery.shinyapps.io/085-progress/">this app</a> in the gallery. 
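</p><p>A minimal sketch of both features (assuming a standard server function; the input and output names are hypothetical), showing a notification and the new notification-styled progress bar:</p>

```r
# Inside a Shiny server function:
observeEvent(input$save, {
  showNotification("Results saved.", duration = 5, type = "message")
})

output$plot = renderPlot({
  withProgress(message = "Computing", value = 0, {
    for (i in 1:10) {
      incProgress(1 / 10)   # advance the bar by one tenth
      Sys.sleep(0.1)        # stand-in for a slow computation step
    }
  })
  plot(rnorm(100))
})
```

<p>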
You can also learn more about them <a href="https://shiny.rstudio.com/articles/progress.html">here</a>.</p><h2 id="modal-windows">Modal windows</h2><p>Shiny now has built-in support for displaying modal dialogs like the one below (<a href="https://gallery.shinyapps.io/114-modal-dialog/">live app here</a>):</p><p><img src="https://rstudioblog.files.wordpress.com/2016/09/modal-dialog.png" alt="Modal dialog"></p><p>To learn more about modal dialogs in Shiny, read the <a href="https://shiny.rstudio.com/articles/modal-dialogs.html">article</a> about them.</p><h2 id="insertui-and-removeui"><code>insertUI</code> and <code>removeUI</code></h2><p>Sometimes in a Shiny app, arbitrary HTML UI may need to be created on-the-fly in response to user input. The existing <code>uiOutput</code> and <code>renderUI</code> functions let you continue using reactive logic to call UI functions and make the results appear in a predetermined place in the UI. The <code>insertUI</code> and <code>removeUI</code> functions, which are used in the server code, allow you to use imperative logic to add and remove arbitrary chunks of HTML (all independent from one another), as many times as you want, whenever you want, wherever you want. This option may be more convenient when you want to, for example, add a new model to your app each time the user selects a different option (and leave previous models unchanged, rather than substitute the previous one for the latest one).</p><p>See <a href="https://gallery.shinyapps.io/111-insert-ui/">this simple demo app</a> showing how one could use <code>insertUI</code> and <code>removeUI</code> to insert and remove text elements using a queue. Also see <a href="https://gallery.shinyapps.io/insertUI/">this other app</a> that demonstrates how to insert and remove a few common Shiny input objects. 
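</p><p>As a small sketch (the input IDs are invented for illustration), each click of an <code>add</code> button below appends a fresh text input right after the button:</p>

```r
# Inside a server function:
observeEvent(input$add, {
  insertUI(
    selector = "#add",        # CSS selector for the anchor element
    where = "afterEnd",       # insert just after that element
    ui = textInput(paste0("txt", input$add), "Insert some text")
  )
})
```

<p>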
Finally, <a href="https://gallery.shinyapps.io/insertUI-modules/">this app</a> shows how to dynamically insert modules using <code>insertUI</code>.</p><p>For more, read <a href="https://shiny.rstudio.com/articles/dynamic-ui.html">our article</a> about dynamic UI generation and the reference documentation about <a href="https://shiny.rstudio.com/reference/shiny/latest/insertUI.html"><code>insertUI</code></a> and <a href="https://shiny.rstudio.com/reference/shiny/latest/removeUI.html"><code>removeUI</code></a>.</p><h2 id="documentation-for-connecting-to-an-external-database">Documentation for connecting to an external database</h2><p>Many Shiny users have asked about best practices for accessing external databases from their Shiny applications. Although database access has long been possible using various database connector packages in R, it can be challenging to use them robustly in the dynamic environment that Shiny provides. So far, it has been mostly up to application authors to find the appropriate database drivers and to discover how to manage the database connections within an application. In order to demystify this process, we wrote a series of articles (<a href="https://shiny.rstudio.com/articles/overview.html">first one here</a>) that covers the basics of connecting to an external database, as well as some security precautions to keep in mind (e.g. <a href="https://shiny.rstudio.com/articles/sql-injections.html">how to avoid SQL injection attacks</a>).</p><p>There are a few packages that you should look at if you&rsquo;re using a relational database in a Shiny app: the <code>dplyr</code> and <code>DBI</code> packages (both featured in the article linked to above), and the brand new <code>pool</code> package, which provides a further layer of abstraction to make it easier and safer to use either <code>DBI</code> or <code>dplyr</code>. <code>pool</code> is not yet on CRAN. 
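</p><p>A minimal sketch of the <code>pool</code> workflow (the driver choice, connection details, and table name are all hypothetical):</p>

```r
library(shiny)
library(DBI)
library(pool)

# Create one pool of connections when the app starts
pool = dbPool(
  RMySQL::MySQL(),
  dbname = "mydb",
  host = "localhost",
  username = "user",
  password = "pass"
)

server = function(input, output, session) {
  output$tbl = renderTable({
    # Each query checks a connection out of the pool and returns it
    dbGetQuery(pool, "SELECT * FROM mytable LIMIT 5")
  })
}

# Close the pool when the app shuts down
onStop(function() poolClose(pool))
```

<p>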
In particular, <code>pool</code> will take care of managing connections, preventing memory leaks, and ensuring the best performance. See this <a href="https://shiny.rstudio.com/articles/pool-basics.html"><code>pool</code> basics article</a> and the <a href="https://shiny.rstudio.com/articles/pool-advanced.html">more advanced-level article</a> if you&rsquo;re feeling adventurous! (Both of these articles contain Shiny app examples that use <code>DBI</code> to connect to an external MySQL database.) If you are more comfortable with <code>dplyr</code> than <code>DBI</code>, don&rsquo;t miss the article about the <a href="https://shiny.rstudio.com/articles/pool-dplyr.html">integration of <code>pool</code> and <code>dplyr</code></a>.</p><p>If you&rsquo;re new to databases in the Shiny world, we recommend using <code>dplyr</code> and <code>pool</code> if possible. If you need greater control than <code>dplyr</code> offers (for example, if you need to modify data in the database or use transactions), then use <code>DBI</code> and <code>pool</code>. The <code>pool</code> package was introduced to make your life easier, but in no way constrains you, so we don&rsquo;t envision any situation in which you&rsquo;d be better off <em>not</em> using it. The only caveat is that <code>pool</code> is not yet on CRAN, so you may prefer to wait for that.</p><h2 id="others">Others</h2><p>There are many more minor features, small improvements, and bug fixes than we can cover here, so we&rsquo;ll just mention a few of the more noteworthy ones. (For more, you can see the <a href="https://shiny.rstudio.com/articles/upgrade-0.14.html#full-changelog">full changelog</a>.)</p><ul><li><p><strong>Error Sanitization</strong>: you now have the option to sanitize error messages; in other words, the content of the original error message can be suppressed so that it doesn&rsquo;t leak any sensitive information. 
To sanitize errors everywhere in your app, just add <code>options(shiny.sanitize.errors = TRUE)</code> somewhere in your app. Read <a href="https://shiny.rstudio.com/articles/sanitize-errors.html">this article</a> for more, or play with the <a href="https://gallery.shinyapps.io/110-error-sanitization/">demo app</a>.</p></li><li><p><strong>Code Diagnostics</strong>: if there is an error parsing <code>ui.R</code>, <code>server.R</code>, <code>app.R</code>, or <code>global.R</code>, Shiny will search the code for missing commas, extra commas, and unmatched braces, parens, and brackets, and will print out messages pointing out those problems. (<a href="https://github.com/rstudio/shiny/pull/1126">#1126</a>)</p></li><li><p><strong>Reactlog visualization</strong>: by default, the <a href="https://shiny.rstudio.com/reference/shiny/latest/showReactLog.html"><code>showReactLog()</code> function</a> (which brings up the reactive graph) also displays the time that each reactive and observer were active for:</p></li></ul><p><img src="https://rstudioblog.files.wordpress.com/2016/09/reactlog.png" alt="reactlog"></p><p>Additionally, to organize the graph, you can now drag any of the nodes to a specific position and leave it there.</p><ul><li><strong>Nicer-looking tables</strong>: we&rsquo;ve made tables generated with <code>renderTable()</code> look cleaner and more modern. 
While this won&rsquo;t break any older code, the finished look of your table will be quite a bit different, as the following image shows:</li></ul><p><img src="https://rstudioblog.files.wordpress.com/2016/09/render-table.png" alt="render-table"></p><p>For more, read our <a href="https://shiny.rstudio.com/articles/render-table.html">short article</a> about this update, experiment with all the new features in this <a href="https://gallery.shinyapps.io/109-render-table/">demo app</a>, or check out the <a href="https://shiny.rstudio.com/reference/shiny/latest/renderTable.html">reference documentation</a>.</p></description></item><item><title>forcats 0.1.0 </title><link>https://www.rstudio.com/blog/forcats-0-1-0/</link><pubDate>Wed, 31 Aug 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/forcats-0-1-0/</guid><description><p>I&rsquo;m excited to announce forcats, a new package for categorical variables, or factors. Factors have a bad rap in R because they often turn up when you don&rsquo;t want them. That&rsquo;s because historically, factors were more convenient than character vectors, as discussed in <a href="http://simplystatistics.org/2015/07/24/stringsasfactors-an-unauthorized-biography/"><em>stringsAsFactors: An unauthorized biography</em></a> by Roger Peng, and <a href="http://notstatschat.tumblr.com/post/124987394001/stringsasfactors-sigh"><em>stringsAsFactors = <sigh></em></a> by Thomas Lumley.</p><p>If you use packages from the tidyverse (like <a href="http://r4ds.had.co.nz/tibbles.html">tibble</a> and <a href="http://r4ds.had.co.nz/data-import.html">readr</a>) you don&rsquo;t need to worry about getting factors when you don&rsquo;t want them. But factors are a useful data structure in their own right, particularly for modelling and visualisation, because they allow you to control the order of the levels. Working with factors in base R can be a little frustrating because of a handful of missing tools. 
The goal of forcats is to fill in those missing pieces so you can access the power of factors with a minimum of pain.</p><p>Install forcats with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">forcats&#34;</span>)</code></pre></div><p>forcats provides two main types of tools to change either the values or the order of the levels. I&rsquo;ll call out some of the most important functions below, using the included <code>gss_cat</code> dataset, which contains a selection of categorical variables from the <a href="http://gss.norc.org/">General Social Survey</a>.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(dplyr)<span style="color:#06287e">library</span>(ggplot2)<span style="color:#06287e">library</span>(forcats)gss_cat<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 21,483 × 9</span><span style="color:#60a0b0;font-style:italic">#&gt; year marital age race rincome partyid</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; &lt;fctr&gt; &lt;int&gt; &lt;fctr&gt; &lt;fctr&gt; &lt;fctr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2000 Never married 26 White $8000 to 9999 Ind,near rep</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2000 Divorced 48 White $8000 to 9999 Not str republican</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2000 Widowed 67 White Not applicable Independent</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2000 Never married 39 White Not applicable Ind,near rep</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2000 Divorced 25 White Not applicable Not str democrat</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 6 2000 Married 25 White $20000 - 24999 Strong democrat</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... with 2.148e+04 more rows, and 3 more variables: relig &lt;fctr&gt;,</span><span style="color:#60a0b0;font-style:italic">#&gt; # denom &lt;fctr&gt;, tvhours &lt;int&gt;</span></code></pre></div><h2 id="change-level-values">Change level values</h2><p>You can recode specified factor levels with <a href="https://hadley.github.io/forcats/fct_recode.html"><code>fct_recode()</code></a>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">gss_cat <span style="color:#666">%&gt;%</span> <span style="color:#06287e">count</span>(partyid)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 10 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; partyid n</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;fctr&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 No answer 154</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Don&#39;t know 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Other party 393</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Strong republican 2314</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Not str republican 3032</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Ind,near rep 1791</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... 
with 4 more rows</span>gss_cat <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(partyid <span style="color:#666">=</span> <span style="color:#06287e">fct_recode</span>(partyid,<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Republican, strong&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Strong republican&#34;</span>,<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Republican, weak&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Not str republican&#34;</span>,<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Independent, near rep&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Ind,near rep&#34;</span>,<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Independent, near dem&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Ind,near dem&#34;</span>,<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Democrat, weak&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Not str democrat&#34;</span>,<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Democrat, strong&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Strong democrat&#34;</span>)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">count</span>(partyid)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 10 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; partyid n</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;fctr&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 No answer 154</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 2 Don&#39;t know 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Other party 393</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Republican, strong 2314</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Republican, weak 3032</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Independent, near rep 1791</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... with 4 more rows</span></code></pre></div><p>Note that unmentioned levels are left as is, and the order of the levels is preserved.</p><p><a href="https://hadley.github.io/forcats/fct_relump.html"><code>fct_lump()</code></a> allows you to lump the rarest (or most common) levels into a new &ldquo;other&rdquo; level. The default behaviour is to collapse the smallest levels into other, ensuring that it&rsquo;s still the smallest level. For the religion variable, that tells us that Protestants outnumber all other religions, which is interesting, but we probably want more levels.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">gss_cat <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(relig <span style="color:#666">=</span> <span style="color:#06287e">fct_lump</span>(relig)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">count</span>(relig)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 2 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; relig n</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;fctr&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Other 10637</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Protestant 10846</span></code></pre></div><p>Alternatively, you can supply the number of levels to keep, <code>n</code>, or the minimum proportion for inclusion, <code>prop</code>. 
If you use negative values, <code>fct_lump()</code> will change direction, and combine the most common values while preserving the rarest.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">gss_cat <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(relig <span style="color:#666">=</span> <span style="color:#06287e">fct_lump</span>(relig, n <span style="color:#666">=</span> <span style="color:#40a070">5</span>)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">count</span>(relig)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 6 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; relig n</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;fctr&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Other 913</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Christian 689</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 None 3523</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Jewish 388</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Catholic 5124</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Protestant 10846</span>gss_cat <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(relig <span style="color:#666">=</span> <span style="color:#06287e">fct_lump</span>(relig, prop <span style="color:#666">=</span> <span style="color:#40a070">-0.10</span>)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">count</span>(relig)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 12 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; relig n</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;fctr&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 No answer 93</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Don&#39;t know 
15</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Inter-nondenominational 109</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Native american 23</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Christian 689</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Orthodox-christian 95</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... with 6 more rows</span></code></pre></div><h2 id="change-level-order">Change level order</h2><p>There are four simple helpers for common operations:</p><ul><li><p><a href="https://hadley.github.io/forcats/fct_relevel.html"><code>fct_relevel()</code></a> is similar to <code>stats::relevel()</code> but allows you to move any number of levels to the front.</p></li><li><p><a href="https://hadley.github.io/forcats/fct_inorder.html"><code>fct_inorder()</code></a> orders according to the first appearance of each level.</p></li><li><p><a href="https://hadley.github.io/forcats/fct_infreq.html"><code>fct_infreq()</code></a> orders from most common to rarest.</p></li><li><p><a href="https://hadley.github.io/forcats/fct_rev.html"><code>fct_rev()</code></a> reverses the order of levels.</p></li></ul><p><a href="https://hadley.github.io/forcats/fct_reorder.html"><code>fct_reorder()</code></a> and <code>fct_reorder2()</code> are useful for visualisations. <code>fct_reorder()</code> reorders the factor levels by another variable. 
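</p><p>On a toy factor, the four simple helpers behave as follows (a quick sketch):</p>

```r
library(forcats)

f = factor(c("b", "a", "c", "a", "b", "b"))  # default levels: a b c

levels(fct_relevel(f, "c"))  # "c" "a" "b"  - "c" moved to the front
levels(fct_inorder(f))       # "b" "a" "c"  - order of first appearance
levels(fct_infreq(f))        # "b" "a" "c"  - most common level first
levels(fct_rev(f))           # "c" "b" "a"  - reversed
```

<p>These helpers only touch the level order, never the data itself. <code>fct_reorder()</code>, demonstrated in the plots that follow, instead derives the order from the values of another variable. 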
Reordering by another variable is useful when you map a categorical variable to position, as shown in the following example, which looks at the average number of hours spent watching television across religions.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">relig <span style="color:#666">&lt;-</span> gss_cat <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(relig) <span style="color:#666">%&gt;%</span><span style="color:#06287e">summarise</span>(age <span style="color:#666">=</span> <span style="color:#06287e">mean</span>(age, na.rm <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>),tvhours <span style="color:#666">=</span> <span style="color:#06287e">mean</span>(tvhours, na.rm <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>),n <span style="color:#666">=</span> <span style="color:#06287e">n</span>())<span style="color:#06287e">ggplot</span>(relig, <span style="color:#06287e">aes</span>(tvhours, relig)) <span style="color:#666">+</span> <span style="color:#06287e">geom_point</span>()</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/08/reorder-1.png" alt="reorder-1"></p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(relig, <span style="color:#06287e">aes</span>(tvhours, <span style="color:#06287e">fct_reorder</span>(relig, tvhours))) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>()</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/08/reorder-2.png" alt="reorder-2"></p><p><code>fct_reorder2()</code> extends the same idea to plots where a factor is mapped to another aesthetic, like colour.
The defaults are designed to make legends easier to read for line plots, as shown in the following example looking at marital status by age.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">by_age <span style="color:#666">&lt;-</span> gss_cat <span style="color:#666">%&gt;%</span><span style="color:#06287e">filter</span>(<span style="color:#666">!</span><span style="color:#06287e">is.na</span>(age)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(age, marital) <span style="color:#666">%&gt;%</span><span style="color:#06287e">count</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(prop <span style="color:#666">=</span> n <span style="color:#666">/</span> <span style="color:#06287e">sum</span>(n))<span style="color:#06287e">ggplot</span>(by_age, <span style="color:#06287e">aes</span>(age, prop)) <span style="color:#666">+</span><span style="color:#06287e">geom_line</span>(<span style="color:#06287e">aes</span>(colour <span style="color:#666">=</span> marital))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/08/reorder2-1.png" alt="reorder2-1"></p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(by_age, <span style="color:#06287e">aes</span>(age, prop)) <span style="color:#666">+</span><span style="color:#06287e">geom_line</span>(<span style="color:#06287e">aes</span>(colour <span style="color:#666">=</span> <span style="color:#06287e">fct_reorder2</span>(marital, age, prop))) <span style="color:#666">+</span><span style="color:#06287e">labs</span>(colour <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">marital&#34;</span>)</code></pre></div><p><img 
src="https://rstudioblog.files.wordpress.com/2016/08/reorder2-2.png" alt="reorder2-2"></p><h2 id="learning-more">Learning more</h2><p>You can learn more about forcats in <a href="http://r4ds.had.co.nz/factors.html">R for data science</a>, and on the <a href="https://hadley.github.io/forcats/">forcats website</a>.</p><p>Please <a href="https://github.com/hadley/forcats/issues">let me know</a> if you have more factor problems that forcats doesn&rsquo;t help with!</p></description></item><item><title>tibble 1.2.0</title><link>https://www.rstudio.com/blog/tibble-1-2-0/</link><pubDate>Mon, 29 Aug 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/tibble-1-2-0/</guid><description><p>We&rsquo;re proud to announce version 1.2.0 of the tibble package. Tibbles are a modern reimagining of the data frame, keeping what time has shown to be effective, and throwing out what is not. Grab the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tibble&#34;</span>)</code></pre></div><p>This is mostly a maintenance release, with the following major changes:</p><ul><li><p>More options for adding individual rows and (new!) 
columns</p></li><li><p>Improved function names</p></li><li><p>Minor tweaks to the output</p></li></ul><p>There are many other small improvements and bug fixes: please see the <a href="https://github.com/hadley/tibble/releases/tag/v1.2">release notes</a> for a complete list.</p><p>Thanks to <a href="https://github.com/jennybc">Jenny Bryan</a> for <code>add_row()</code> and <code>add_column()</code> improvements and ideas, to <a href="https://github.com/BillDunlap">William Dunlap</a> for pointing out a bug with tibble&rsquo;s implementation of <code>all.equal()</code>, to <a href="https://github.com/kwstat">Kevin Wright</a> for pointing out a rare bug with <code>glimpse()</code>, and to all the other contributors. Use the <a href="https://github.com/hadley/tibble/issues">issue tracker</a> to submit bugs or suggest ideas, your contributions are always welcome.</p><h2 id="adding-rows-and-columns">Adding rows and columns</h2><p>There are now more options for adding individual rows, and columns can be added in a similar way, illustrated with this small tibble:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> <span style="color:#06287e">tibble</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">3</span>, y <span style="color:#666">=</span> <span style="color:#40a070">3</span><span style="color:#666">:</span><span style="color:#40a070">1</span>)df<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 3 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 3</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3 
1</span></code></pre></div><p>The <code>add_row()</code> function allows control over where the new rows are added. In the following example, the row (4, 0) is added before the second row:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">%&gt;%</span><span style="color:#06287e">add_row</span>(x <span style="color:#666">=</span> <span style="color:#40a070">4</span>, y <span style="color:#666">=</span> <span style="color:#40a070">0</span>, .before <span style="color:#666">=</span> <span style="color:#40a070">2</span>)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 4 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;dbl&gt; &lt;dbl&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 3</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 4 0</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 3 1</span></code></pre></div><p>Adding more than one row is now fully supported, although not recommended in general because it can be a bit hard to read.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">%&gt;%</span><span style="color:#06287e">add_row</span>(x <span style="color:#666">=</span> <span style="color:#40a070">4</span><span style="color:#666">:</span><span style="color:#40a070">5</span>, y <span style="color:#666">=</span> <span style="color:#40a070">0</span><span style="color:#666">:</span><span style="color:#40a070">-1</span>)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 5 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; 
&lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 3</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 4 0</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 5 -1</span></code></pre></div><p>Columns can now be added in much the same way with the new <code>add_column()</code> function:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">%&gt;%</span><span style="color:#06287e">add_column</span>(z <span style="color:#666">=</span> <span style="color:#40a070">-1</span><span style="color:#666">:</span><span style="color:#40a070">1</span>, w <span style="color:#666">=</span> <span style="color:#40a070">0</span>)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 3 × 4</span><span style="color:#60a0b0;font-style:italic">#&gt; x y z w</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;dbl&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 3 -1 0</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 2 0 0</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3 1 1 0</span></code></pre></div><p>It also supports <code>.before</code> and <code>.after</code> arguments:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">%&gt;%</span><span style="color:#06287e">add_column</span>(z <span style="color:#666">=</span> <span style="color:#40a070">-1</span><span style="color:#666">:</span><span style="color:#40a070">1</span>, .after <span style="color:#666">=</span> <span style="color:#40a070">1</span>)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 3 × 3</span><span 
style="color:#60a0b0;font-style:italic">#&gt; x z y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 -1 3</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 0 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3 1 1</span>df <span style="color:#666">%&gt;%</span><span style="color:#06287e">add_column</span>(w <span style="color:#666">=</span> <span style="color:#40a070">0</span><span style="color:#666">:</span><span style="color:#40a070">2</span>, .before <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 3 × 3</span><span style="color:#60a0b0;font-style:italic">#&gt; w x y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 0 1 3</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1 2 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2 3 1</span></code></pre></div><p>The <code>add_column()</code> function will never alter your existing data: you can&rsquo;t overwrite existing columns, and you can&rsquo;t add new observations.</p><h2 id="function-names">Function names</h2><p><code>frame_data()</code> is now <code>tribble()</code>, which stands for &ldquo;transposed tibble&rdquo;. 
The old name still works, but will be deprecated eventually.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">tribble</span>(<span style="color:#666">~</span>x, <span style="color:#666">~</span>y,<span style="color:#40a070">1</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>,<span style="color:#40a070">2</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">z&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 2 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;dbl&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 z</span></code></pre></div><h2 id="output-tweaks">Output tweaks</h2><p>We&rsquo;ve tweaked the output again to use the multiply character <code>×</code> instead of <code>x</code> when printing dimensions (this still renders nicely on Windows).
We now surround non-syntactic column names with backticks, and <code>dttm</code> is now used instead of <code>time</code> to distinguish <code>POSIXt</code> and <code>hms</code> (or <code>difftime</code>) values.</p><p>The example below shows the new rendering:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">tibble</span>(`date and time` <span style="color:#666">=</span> <span style="color:#06287e">Sys.time</span>(), time <span style="color:#666">=</span> hms<span style="color:#666">::</span><span style="color:#06287e">hms</span>(minutes <span style="color:#666">=</span> <span style="color:#40a070">3</span>))<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 1 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; `date and time` time</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;dttm&gt; &lt;time&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2016-08-29 16:48:57 00:03:00</span></code></pre></div><p>Expect the printed output to continue to evolve in the next release. Stay tuned for a new function that reconstructs <code>tribble()</code> calls from existing data frames.</p></description></item><item><title>stringr 1.1.0</title><link>https://www.rstudio.com/blog/stringr-1-1-0/</link><pubDate>Wed, 24 Aug 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/stringr-1-1-0/</guid><description><p>I&rsquo;m pleased to announce version 1.1.0 of stringr. stringr makes string manipulation easier by using consistent function and argument names, and eliminating options that you don&rsquo;t need 95% of the time. To get started with stringr, check out the <a href="http://r4ds.had.co.nz/strings.html">strings chapter</a> in <a href="http://r4ds.had.co.nz/">R for data science</a>.
Install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">stringr&#34;</span>)</code></pre></div><p>This release is mostly bug fixes, but there are a couple of new features you might care about.</p><ul><li>There are three new datasets, <code>fruit</code>, <code>words</code> and <code>sentences</code>, to help you practice your regular expression skills:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">str_subset</span>(fruit, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">(..)\\1&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;banana&#34; &#34;coconut&#34; &#34;cucumber&#34; &#34;jujube&#34; &#34;papaya&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [6] &#34;salal berry&#34;</span><span style="color:#06287e">head</span>(words)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;a&#34; &#34;able&#34; &#34;about&#34; &#34;absolute&#34; &#34;accept&#34; &#34;account&#34;</span>sentences[1]<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;The birch canoe slid on the smooth planks.&#34;</span></code></pre></div><ul><li>More functions work with <code>boundary()</code>: <code>str_detect()</code> and <code>str_subset()</code> can detect boundaries, and <code>str_extract()</code> and <code>str_extract_all()</code> pull out the components between boundaries.
This is particularly useful if you want to extract logical constructs like words or sentences.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">This is harder than you might expect, e.g. punctuation!&#34;</span>x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str_extract_all</span>(<span style="color:#06287e">boundary</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">word&#34;</span>)) <span style="color:#666">%&gt;%</span> .[[1]]<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;This&#34; &#34;is&#34; &#34;harder&#34; &#34;than&#34; &#34;you&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [6] &#34;might&#34; &#34;expect&#34; &#34;e.g&#34; &#34;punctuation&#34;</span>x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str_extract</span>(<span style="color:#06287e">boundary</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">sentence&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;This is harder than you might expect, e.g. punctuation!&#34;</span></code></pre></div><ul><li><code>str_view()</code> and <code>str_view_all()</code> create HTML widgets that display regular expression matches. 
This is particularly useful for teaching.</li></ul><p>For a complete list of changes, please see the <a href="https://github.com/hadley/stringr/releases/tag/v1.1.0">release notes</a>.</p></description></item><item><title>Final Call: Hadley Wickham's Master R Workshop September 12 and 13 in NYC</title><link>https://www.rstudio.com/blog/final-call-hadley-wickhams-master-r-workshop-september-12-and-13-in-nyc/</link><pubDate>Thu, 18 Aug 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/final-call-hadley-wickhams-master-r-workshop-september-12-and-13-in-nyc/</guid><description><p>Want to Master R? There&rsquo;s no better time or place if you&rsquo;re within an easy train, plane, automobile ride or a short jog of Hadley Wickham&rsquo;s workshop on September 12th and 13th at the <a href="http://www.amaconferencecenter.org/new-york.htm">AMA Conference Center</a> in New York City.</p><p>Register here: <a href="https://www.eventbrite.com/e/master-r-developer-workshop-new-york-city-tickets-21347014495">https://www.eventbrite.com/e/master-r-developer-workshop-new-york-city-tickets-21347014495</a></p><p>As of today, there are just 20+ seats left. Discounts are still available for academics (students or faculty) and for 5 or more attendees from any organization. Email <a href="mailto:training@rstudio.com">training@rstudio.com</a> if you have any questions about the workshop that you don&rsquo;t find answered on the registration page.</p><p>Hadley has no Master R workshops planned for Boston, Washington DC, New York City or any location in the Northeast in the next year. If you&rsquo;ve always wanted to take Master R but haven&rsquo;t found the time, well, there&rsquo;s truly no better time!</p><p>P.S. We&rsquo;ve arranged a &ldquo;happy hour&rdquo; reception after class on Monday the 12th. 
Be sure to set aside an hour or so after the first day to talk to your classmates and Hadley about what&rsquo;s happening in R.</p></description></item><item><title>tidyr 0.6.0</title><link>https://www.rstudio.com/blog/tidyr-0-6-0/</link><pubDate>Mon, 15 Aug 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/tidyr-0-6-0/</guid><description><p>I&rsquo;m pleased to announce tidyr 0.6.0. tidyr makes it easy to &ldquo;tidy&rdquo; your data, storing it in a consistent form so that it&rsquo;s easy to manipulate, visualise and model. Tidy data has a simple convention: put variables in the columns and observations in the rows. You can learn more about it in the <a href="http://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html">tidy data</a> vignette. Install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tidyr&#34;</span>)</code></pre></div><p>I mostly released this version to bundle up a number of small tweaks needed for <a href="http://r4ds.had.co.nz/tidy-data.html">R for Data Science</a>. But there&rsquo;s one nice new feature, contributed by <a href="https://github.com/janschulz">Jan Schulz</a>: <code>drop_na()</code>. 
<code>drop_na()</code> drops rows containing missing values:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> <span style="color:#06287e">tibble</span>(x <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">1</span>, <span style="color:#40a070">2</span>, <span style="color:#007020;font-weight:bold">NA</span>), y <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span style="color:#007020;font-weight:bold">NA</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>))df<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 3 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;dbl&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 &lt;NA&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 NA b</span><span style="color:#60a0b0;font-style:italic"># Called without arguments, it drops rows containing</span><span style="color:#60a0b0;font-style:italic"># missing values in any variable:</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">drop_na</span>()<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 1 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;dbl&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a</span><span style="color:#60a0b0;font-style:italic"># Or you can restrict the variables it looks at,</span><span style="color:#60a0b0;font-style:italic"># using select() style syntax:</span>df <span style="color:#666">%&gt;%</span>
<span style="color:#06287e">drop_na</span>(x)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 2 × 2</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;dbl&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 &lt;NA&gt;</span></code></pre></div><p>Please see the <a href="https://github.com/hadley/tidyr/releases/tag/v0.6.0">release notes</a> for a complete list of changes.</p></description></item><item><title>A New Version of DT (0.2) on CRAN</title><link>https://www.rstudio.com/blog/a-new-version-of-dt-0-2-on-cran/</link><pubDate>Tue, 09 Aug 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/a-new-version-of-dt-0-2-on-cran/</guid><description><p>The R package<a href="http://rstudio.github.io/DT"> <strong>DT</strong></a> v0.2 is on <a href="https://cran.rstudio.com/web/packages/DT/">CRAN</a> now. You may install it from CRAN via <code>install.packages('DT')</code> or update your R packages if you have already installed it before. It has been over a year since the last CRAN release of <strong>DT</strong>, and there have been a lot of changes in both <strong>DT</strong> and the upstream <a href="http://datatables.net/blog/2015-08-13">DataTables</a> library. You may read the <a href="https://github.com/rstudio/DT/releases/tag/v0.2">release notes</a> to know all changes, and we want to highlight two major changes here:</p><ul><li><p>Two extensions &ldquo;TableTools&rdquo; and &ldquo;ColVis&rdquo; have been removed from DataTables, and a new extension named &ldquo;Buttons&rdquo; was added. See <a href="http://rstudio.github.io/DT/extensions.html">this page</a> for examples.</p></li><li><p>For tables in the server-side processing mode (the default mode for tables in Shiny), the selected row indices are integers instead of characters (row names) now. 
This is for consistency with the client-side mode (which returns integer indices). In many cases, it does not make much difference if you index an R object with integers or names, and we hope this will not be a breaking change to your Shiny apps.</p></li></ul><p>In terms of new features added in the new version of <strong>DT</strong>, the most notable ones are:</p><ul><li><p>Besides row selections, you can also select columns or cells. Please note the implementation is <em>not</em> based on the &ldquo;<a href="https://datatables.net/extensions/select">Select</a>&rdquo; extension of DataTables, so not all features of &ldquo;Select&rdquo; are available in <strong>DT</strong>. You can find examples of row/column/cell selections on <a href="http://rstudio.github.io/DT/shiny.html">this page</a>.</p></li><li><p>There are a number of new functions to modify an existing table instance in a Shiny app without rebuilding the full table widget. One significant advantage of this feature is that it will be much faster and more efficient to update certain aspects of a table, e.g., you can change the table caption, or set the global search keyword of a table without making <strong>DT</strong> create the whole table from scratch. You can even replace the data object behind the table on the fly (using <code>DT::replaceData()</code>), and after the data is updated, the table state can be preserved (e.g., sorting and filtering can remain the same).</p></li><li><p>A few formatting functions such as <code>formatSignif()</code> and <code>formatString()</code> were also added to the package.</p></li></ul><p>As always, you are welcome to test the new release and we will appreciate your feedback.
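</p><p>For instance, the proxy-based updates described above can be sketched as follows (a minimal sketch only: the output id <code>tbl</code>, the <code>input$refresh</code> button, and the <code>new_data()</code> reactive are hypothetical names, not from this post):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(shiny)
library(DT)

server <span style="color:#666">&lt;-</span> function(input, output, session) {
  # Render the table once (server-side processing is the default in Shiny).
  output$tbl <span style="color:#666">&lt;-</span> DT::renderDataTable(mtcars)

  # A proxy refers to the existing table without rebuilding the widget.
  proxy <span style="color:#666">&lt;-</span> DT::dataTableProxy(<span style="color:#4070a0">&#34;tbl&#34;</span>)

  # Swap in new data on demand; sorting and filtering state is preserved.
  observeEvent(input$refresh, {
    DT::replaceData(proxy, new_data())
  })
}</code></pre></div><p>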
Please file bug reports to <a href="https://github.com/rstudio/DT/issues">Github</a>, and you may ask questions on <a href="http://stackoverflow.com/questions/tagged/dt">StackOverflow</a> using the <code>DT</code> tag.</p></description></item><item><title>readr 1.0.0</title><link>https://www.rstudio.com/blog/readr-1-0-0/</link><pubDate>Fri, 05 Aug 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/readr-1-0-0/</guid><description><p>readr 1.0.0 is now available on CRAN. readr makes it easy to read many types of rectangular data, including csv, tsv and fixed width files. Compared to base equivalents like <code>read.csv()</code>, readr is much faster and gives more convenient output: it never converts strings to factors, can parse date/times, and it doesn&rsquo;t munge the column names. Install the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">readr&#34;</span>)</code></pre></div><p>Releasing a version 1.0.0 was a deliberate choice to reflect the maturity and stability of readr, thanks largely to work by Jim Hester. readr is by no means perfect, but I don&rsquo;t expect any major changes to the API in the future.</p><p>In this version we:</p><ul><li><p>Used a better strategy for guessing column types.</p></li><li><p>Improved the default date and time parsers.</p></li><li><p>Provided a full set of lower-level file and line readers and writers.</p></li><li><p>Fixed many bugs.</p></li></ul><h2 id="column-guessing">Column guessing</h2><p>The process by which readr guesses the types of columns has received a substantial overhaul to make it easier to fix problems when the initial guesses aren&rsquo;t correct, and to make it easier to generate reproducible code.
Now column specifications are printed by default when you read from a file:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mtcars2 <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_csv</span>(<span style="color:#06287e">readr_example</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mtcars.csv&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; Parsed with column specification:</span><span style="color:#60a0b0;font-style:italic">#&gt; cols(</span><span style="color:#60a0b0;font-style:italic">#&gt; mpg = col_double(),</span><span style="color:#60a0b0;font-style:italic">#&gt; cyl = col_integer(),</span><span style="color:#60a0b0;font-style:italic">#&gt; disp = col_double(),</span><span style="color:#60a0b0;font-style:italic">#&gt; hp = col_integer(),</span><span style="color:#60a0b0;font-style:italic">#&gt; drat = col_double(),</span><span style="color:#60a0b0;font-style:italic">#&gt; wt = col_double(),</span><span style="color:#60a0b0;font-style:italic">#&gt; qsec = col_double(),</span><span style="color:#60a0b0;font-style:italic">#&gt; vs = col_integer(),</span><span style="color:#60a0b0;font-style:italic">#&gt; am = col_integer(),</span><span style="color:#60a0b0;font-style:italic">#&gt; gear = col_integer(),</span><span style="color:#60a0b0;font-style:italic">#&gt; carb = col_integer()</span><span style="color:#60a0b0;font-style:italic">#&gt; )</span></code></pre></div><p>The thought is that once you&rsquo;ve figured out the correct column types for a file, you should make the parsing strict.
You can do this either by copying and pasting the printed column specification or by saving the spec to disk:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># Once you&#39;ve figured out the correct types</span>mtcars_spec <span style="color:#666">&lt;-</span> <span style="color:#06287e">write_rds</span>(<span style="color:#06287e">spec</span>(mtcars2), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mtcars2-spec.rds&#34;</span>)<span style="color:#60a0b0;font-style:italic"># Every subsequent load</span>mtcars2 <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_csv</span>(<span style="color:#06287e">readr_example</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mtcars.csv&#34;</span>),col_types <span style="color:#666">=</span> <span style="color:#06287e">read_rds</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mtcars2-spec.rds&#34;</span>))<span style="color:#60a0b0;font-style:italic"># In production, you might want to throw an error if there</span><span style="color:#60a0b0;font-style:italic"># are any parsing problems.</span><span style="color:#06287e">stop_for_problems</span>(mtcars2)</code></pre></div><p>You can now also adjust the number of rows that readr uses to guess the column types with <code>guess_max</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">challenge <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_csv</span>(<span style="color:#06287e">readr_example</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">challenge.csv&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; Parsed with column specification:</span><span 
style="color:#60a0b0;font-style:italic">#&gt; cols(</span><span style="color:#60a0b0;font-style:italic">#&gt; x = col_integer(),</span><span style="color:#60a0b0;font-style:italic">#&gt; y = col_character()</span><span style="color:#60a0b0;font-style:italic">#&gt; )</span><span style="color:#60a0b0;font-style:italic">#&gt; Warning: 1000 parsing failures.</span><span style="color:#60a0b0;font-style:italic">#&gt; row col expected actual</span><span style="color:#60a0b0;font-style:italic">#&gt; 1001 x no trailing characters .23837975086644292</span><span style="color:#60a0b0;font-style:italic">#&gt; 1002 x no trailing characters .41167997173033655</span><span style="color:#60a0b0;font-style:italic">#&gt; 1003 x no trailing characters .7460716762579978</span><span style="color:#60a0b0;font-style:italic">#&gt; 1004 x no trailing characters .723450553836301</span><span style="color:#60a0b0;font-style:italic">#&gt; 1005 x no trailing characters .614524137461558</span><span style="color:#60a0b0;font-style:italic">#&gt; .... ... ...................... ..................</span><span style="color:#60a0b0;font-style:italic">#&gt; See problems(...) 
for more details.</span>challenge <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_csv</span>(<span style="color:#06287e">readr_example</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">challenge.csv&#34;</span>), guess_max <span style="color:#666">=</span> <span style="color:#40a070">1500</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Parsed with column specification:</span><span style="color:#60a0b0;font-style:italic">#&gt; cols(</span><span style="color:#60a0b0;font-style:italic">#&gt; x = col_double(),</span><span style="color:#60a0b0;font-style:italic">#&gt; y = col_date(format = &#34;&#34;)</span><span style="color:#60a0b0;font-style:italic">#&gt; )</span></code></pre></div><p>(If you want to suppress the printed specification, just provide the dummy spec <code>col_types = cols()</code>)</p><p>You can now access the guessing algorithm from R: <code>guess_parser()</code> will tell you which parser readr will select.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">guess_parser</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1,234&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;number&#34;</span><span style="color:#60a0b0;font-style:italic"># Were previously guessed as numbers</span><span style="color:#06287e">guess_parser</span>(<span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">.&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">-&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;character&#34;</span><span style="color:#06287e">guess_parser</span>(<span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">10W&#34;</span>, <span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">20N&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;character&#34;</span><span style="color:#60a0b0;font-style:italic"># Now uses the default time format</span><span style="color:#06287e">guess_parser</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">10:30&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;time&#34;</span></code></pre></div><h2 id="date-time-parsing-improvements">Date-time parsing improvements</h2><p>The date-time parsers recognise three new format strings:</p><ul><li><code>%I</code> for 12-hour time format:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(hms)<span style="color:#06287e">parse_time</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1 pm&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">%I %p&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; 13:00:00</span></code></pre></div><p>Note that <code>parse_time()</code> returns <code>hms</code> from the <a href="https://github.com/rstats-db/hms">hms</a> package, rather than a custom <code>time</code> class.</p><ul><li><code>%AD</code> and <code>%AT</code> are &ldquo;automatic&rdquo; date and time parsers. They are both slightly less flexible than previous defaults. The automatic date parser requires a four-digit year, and only accepts <code>-</code> and <code>/</code> as separators. 
The flexible time parser now requires a colon between hours and minutes; seconds remain optional.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">parse_date</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2010-01-01&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">%AD&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;2010-01-01&#34;</span><span style="color:#06287e">parse_time</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">15:01&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">%AT&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; 15:01:00</span></code></pre></div><p>If the format argument is omitted in <code>parse_date()</code> or <code>parse_time()</code>, the default date and time formats specified in the locale will be used. These now default to <code>%AD</code> and <code>%AT</code> respectively. 
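</p><p>For example (a hypothetical sketch; the day/month/year format string here is illustrative, not from the release notes), you can pass an explicit format to <code>locale()</code> when parsing:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"># Suppose your files write dates in day/month/year order
parse_date(&#34;01/02/2010&#34;, locale = locale(date_format = &#34;%d/%m/%Y&#34;))
#&gt; [1] &#34;2010-02-01&#34;</code></pre></div><p>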
You may want to override these in your standard <code>locale()</code> if the conventions are different where you live.</p><h2 id="low-level-readers-and-writers">Low-level readers and writers</h2><p>readr now contains a full set of efficient lower-level readers:</p><ul><li><p><code>read_file()</code> reads a file into a length-1 character vector; <code>read_file_raw()</code> reads a file into a single raw vector.</p></li><li><p><code>read_lines()</code> reads a file into a character vector with one entry per line; <code>read_lines_raw()</code> reads into a list of raw vectors with one entry per line.</p></li></ul><p>These are paired with <code>write_lines()</code> and <code>write_file()</code> to efficiently write character and raw vectors back to disk.</p><h2 id="other-changes">Other changes</h2><ul><li><p><code>read_fwf()</code> was overhauled to reliably read only a partial set of columns, to read files with ragged final columns (by setting the final position/width to <code>NA</code>), and to skip comments (with the <code>comment</code> argument).</p></li><li><p>readr contains an experimental API for reading a file in chunks, e.g. <code>read_csv_chunked()</code> and <code>read_lines_chunked()</code>. These allow you to work with files that are bigger than memory. We haven&rsquo;t yet finalised the API so please use with care, and send us your feedback.</p></li><li><p>There are many other bug fixes and minor improvements. 
You can see a complete list in the <a href="https://github.com/hadley/readr/releases/tag/v1.0.0">release notes</a>.</p></li></ul><p>A big thanks goes to all the community members who contributed to this release: @<a href="https://github.com/antoine-lizee">antoine-lizee</a>, @<a href="https://github.com/fpinter">fpinter</a>, @<a href="https://github.com/ghaarsma">ghaarsma</a>, @<a href="https://github.com/jennybc">jennybc</a>, @<a href="https://github.com/jeroenooms">jeroenooms</a>, @<a href="https://github.com/leeper">leeper</a>, @<a href="https://github.com/LluisRamon">LluisRamon</a>, @<a href="https://github.com/noamross">noamross</a>, and @<a href="https://github.com/tvedebrink">tvedebrink</a>.</p></description></item><item><title>Don't miss Hadley Wickham's Master R Workshop September 12 and 13 in NYC</title><link>https://www.rstudio.com/blog/dont-miss-hadley-wickhams-master-r-workshop-september-12-and-13-in-nyc/</link><pubDate>Fri, 22 Jul 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dont-miss-hadley-wickhams-master-r-workshop-september-12-and-13-in-nyc/</guid><description><p>New York City is a wonderful place to be most of the time but especially in September!</p><p>If you live or work in the city or just want a good business reason to visit, consider joining RStudio Chief Data Scientist Hadley Wickham in the heart of Manhattan on September 12th and 13th, just by Times Square at the <a href="http://www.amaconferencecenter.org/new-york.htm">AMA Conference Center</a>. It&rsquo;s a rare opportunity to learn from one of the R community&rsquo;s most popular and innovative authors and package developers.</p><p>Hadley&rsquo;s workshops usually sell out. This is his only East US public workshop in 2016 and there are no plans to do another in NYC in 2017. 
If you&rsquo;re an active R user and have been meaning to take this class, now is the perfect time to do it!</p><p>Register here: <a href="https://www.eventbrite.com/e/master-r-developer-workshop-new-york-city-tickets-21347014495">https://www.eventbrite.com/e/master-r-developer-workshop-new-york-city-tickets-21347014495</a></p><p>We look forward to seeing you in New York!</p></description></item><item><title>Discover R and RStudio at JSM 2016 Chicago!</title><link>https://www.rstudio.com/blog/discover-r-and-rstudio-at-jsm-2016-chicago/</link><pubDate>Tue, 19 Jul 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/discover-r-and-rstudio-at-jsm-2016-chicago/</guid><description><p>The <a href="https://www.amstat.org/meetings/jsm/2016/">JSM conference</a> in Chicago, July 31 through August 4, 2016, is one of the largest conferences in statistics, with many terrific talks for R users. We&rsquo;ve listed some of the sessions that we&rsquo;re particularly excited about below. These include talks from RStudio employees, like Hadley Wickham, Yihui Xie, Mine Cetinkaya-Rundel, Garrett Grolemund, and Joe Cheng, but also include a bunch of other talks about R that we think look interesting.</p><p>When you&rsquo;re not in one of the sessions below, please visit us in the exhibition area, booth #126-128. 
We&rsquo;ll have copies of all our <a href="https://www.rstudio.com/resources/cheatsheets/">cheat sheets</a> and <a href="https://www.rstudio.com/about/gear/">stickers</a>, and it&rsquo;s a great place to learn about the other stuff we&rsquo;ve been working on lately: from <a href="https://spark.rstudio.com/">Sparklyr</a> and <a href="https://rmarkdown.rstudio.com/r_notebooks.html">R Markdown Notebooks</a> to the latest in <a href="https://www.rstudio.com/products/rstudio-server-pro/">RStudio Server Pro</a>, <a href="https://www.rstudio.com/products/shiny-server-pro/">Shiny Server Pro</a>, <a href="https://www.rstudio.com/products/shinyapps/">shinyapps.io</a>, <a href="https://www.rstudio.com/products/connect/">RStudio Connect (beta)</a> and <a href="https://www.rstudio.com/about/news-events/">more</a>!</p><p>Another great place to chat with people interested in R is the Statistical Computing and Graphics Mixer at 6pm on Monday in the Hilton Stevens Salon A4. It&rsquo;s advertised as a business meeting in the program, but don&rsquo;t let that put you off - it&rsquo;s open to all.</p><p><strong>SUNDAY</strong></p><p>Session 21: <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/ActivityDetails.cfm?SessionID=212713">Statistical Computing and Graphics Student Awards</a>Sunday, July 31, 2016 : 2:00 PM to 3:50 PM, CC-W175b</p><p>Session 47 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/ActivityDetails.cfm?SessionID=212343">Making the Most of R Tools</a>Hadley Wickham, RStudio (Discussant)Sunday, July 31, 2016: 4:00 PM to 4:50 PM, CC-W183b</p><p><a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318293">Thinking with Data Using R and RStudio: Powerful Idioms for Analysts</a>Nicholas Jon Horton, Amherst College; Randall Pruim, Calvin College ; Daniel Kaplan, Macalester College<a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318249">Transform Your 
Workflow and Deliverables with Shiny and R Markdown</a>Garrett Grolemund, RStudio</p><p>Session 54 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/ActivityDetails.cfm?SessionID=212554">Recent Advances in Information Visualization</a>Yihui Xie, RStudio (organizer)Sunday, July 31, 2016: 4:00 PM to 4:50 PM, CC-W183c</p><p>Session 85 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=321068">Reproducibility Promotes Transparency, Efficiency, and Aesthetics</a>Richard SchwinnSunday, July 31, 2016 : 5:35 PM to 5:50 PM, CC-W176a</p><p>Session 88 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=320629">Communicate Better with R, R Markdown, and Shiny</a>Garrett Grolemund, RStudio (Poster Session)Sunday, July 31, 2016: 6:00 PM to 8:00 PM, CC-Hall F1 West</p><p><strong>MONDAY</strong></p><p>Session 106 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318508">Linked Brushing in R</a>Hadley Wickham, RStudioMonday, August 1, 2016 : 8:35 AM to 8:55 AM, CC-W196b</p><p>Session 127 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/ActivityDetails.cfm?SessionID=212882">R Tools for Statistical Computing</a>Monday, August 1, 2016 : 8:30 AM to 10:20 AM, CC-W196c</p><p>8:35 AM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=320818">The Biglasso Package: Extending Lasso Model Fitting to Big Data in R</a> — Yaohui Zeng, University of Iowa ; Patrick Breheny, University of Iowa8:50 AM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=321324">Independent Sampling for a Spatial Model with Incomplete Data</a> — Harsimran Somal, University of Iowa ; Mary Kathryn Cowles, University of Iowa9:05 AM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318843">Introduction to the TextmineR Package for 
R</a> — Thomas Jones, Impact Research9:20 AM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=319146">Vector-Generalized Time Series Models</a> — Victor Miranda Soberanis, University of Auckland ; Thomas Yee, University of Auckland9:35 AM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=319305">New Computational Approaches to Large/Complex Mixed Effects Models</a> — Norman Matloff, University of California at Davis9:50 AM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=321492">Broom: An R Package for Converting Statistical Modeling Objects Into Tidy Data Frames</a> — David G. Robinson, Stack Overflow10:05 AM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318717">Exact Parametric and Nonparametric Likelihood-Ratio Tests for Two-Sample Comparisons</a> — Yang Zhao, SUNY Buffalo ; Albert Vexler, SUNY Buffalo ; Alan Hutson, SUNY Buffalo ; Xiwei Chen, SUNY Buffalo</p><p>Session 270 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=321774">Automated Analytics and Data Dashboards for Evaluating the Impacts of Educational Technologies</a>Daniel Stanhope and Joyce Yu and Karly RectanusMonday, August 1, 2016 : 3:05 PM to 3:50 PM, CC-Hall F1 West</p><p><strong>TUESDAY</strong></p><p>Session 276 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=319764">Statistical Tools for Clinical Neuroimaging</a>Ciprian CrainiceanuTuesday, August 2, 2016 : 7:00 AM to 8:15 AM, CC-W375a</p><p>Session 332 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/ActivityDetails.cfm?SessionID=212444">Doing More with Data in and Outside the Undergraduate Classroom</a>Mine Cetinkaya-Rundel, Duke University (organizer)Tuesday, August 2, 2016 : 10:30 AM to 12:20 PM, CC-W184bc</p><p>Session 407 <a 
href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/ActivityDetails.cfm?SessionID=212551">Interactive Visualizations and Web Applications for Analytics</a>Tuesday, August 2, 2016 : 2:00 PM to 3:50 PM, CC-W179a</p><p>2:05 PM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318484">Radiant: A Platform-Independent Browser-Based Interface for Business Analytics in R</a> — Vincent Nijs, Rady School of Management2:20 PM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318431">Rbokeh: An R Interface to the Bokeh Plotting Library</a> — Ryan Hafen, Hafen Consulting2:35 PM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318410">Composable Linked Interactive Visualizations in R with Htmlwidgets and Shiny</a> — Joseph Cheng, RStudio2:50 PM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318382">Papayar: A Better Interactive Neuroimage Plotter in R</a> — John Muschelli, The Johns Hopkins University3:05 PM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318325">Interactive and Dynamic Web-Based Graphics for Data Analysis</a> — Carson Sievert, Iowa State University3:20 PM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318314">HTML Widgets: Interactive Visualizations from R Made Easy!</a> — Yihui Xie, RStudio ; Ramnath Vaidyanathan, Alteryx</p><p><strong>WEDNESDAY</strong></p><p>Session 475 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/ActivityDetails.cfm?SessionID=212538">Steps Toward Reproducible Research</a>Yihui Xie, RStudio (Discussant)Wednesday, August 3, 2016 : 8:30 AM to 10:20 AM, CC-W196c</p><p>8:35 AM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318480">Reproducibility for All and Our Love/Hate Relationship with 
Spreadsheets</a> — Jennifer Bryan, University of British Columbia8:55 AM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318002">Steps Toward Reproducible Research</a> — Karl W. Broman, University of Wisconsin - Madison9:15 AM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318468">Enough with Trickle-Down Reproducibility: Scientists, Open This Gate! Scientists, Tear Down This Wall!</a> — Karthik Ram, University of California at Berkeley9:35 AM <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318487">Integrating Reproducibility into the Undergraduate Statistics Curriculum</a> — Mine Cetinkaya-Rundel, Duke University</p><p>Session 581 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=317995">Mining Text in R</a>David Marchette, Naval Surface Warfare CenterWednesday, August 3, 2016 : 2:05 PM to 2:40 PM, CC-W180</p><p><strong>THURSDAY</strong></p><p>Session 696 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/ActivityDetails.cfm?SessionID=212787">Statistics for Social Good</a>Hadley Wickham, RStudio (Chair)Thursday, August 4, 2016 : 10:30 AM to 12:20 PM, CC-W179a</p><p>Session 694 <a href="https://www.amstat.org/meetings/jsm/2016/onlineprogram/AbstractDetails.cfm?abstractid=318763">Web Application Teaching Tools for Statistics Using R and Shiny</a>Jimmy Doi and Gail Potter and Jimmy Wong and Irvin Alcaraz and Peter ChiThursday, August 4, 2016 : 11:05 AM to 11:20 AM, CC-W192a</p></description></item><item><title>httr 1.2.0</title><link>https://www.rstudio.com/blog/httr-1-2-0/</link><pubDate>Tue, 05 Jul 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/httr-1-2-0/</guid><description><p>httr 1.2.0 is now available on CRAN. The httr package makes it easy to talk to web APIs from R. 
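</p><p>As a minimal sketch of what that looks like (the endpoint here is a public test service, used only for illustration), a request and a response check might be:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(httr)
# Fetch a URL and inspect the HTTP status of the response
r &lt;- GET(&#34;http://httpbin.org/get&#34;)
status_code(r)  # 200 if the request succeeded</code></pre></div><p>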
Learn more in the <a href="http://cran.r-project.org/web/packages/httr/vignettes/quickstart.html">quick start</a> vignette. Install the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">httr&#34;</span>)</code></pre></div><p>There are a few small new features:</p><ul><li><p>New <code>RETRY()</code> function allows you to retry a request multiple times until it succeeds, which is useful if you are trying to talk to an unreliable service. To avoid hammering the server, it uses exponential backoff with jitter, as described in <a href="https://www.awsarchitectureblog.com/2015/03/backoff.html">https://www.awsarchitectureblog.com/2015/03/backoff.html</a>.</p></li><li><p><code>DELETE()</code> gains a body parameter.</p></li><li><p>New <code>encode = &quot;raw&quot;</code> parameter to functions that accept bodies. This allows you to do your own encoding.</p></li><li><p><code>http_type()</code> returns the content/mime type of a request, sans parameters.</p></li></ul><p>There is one important bug fix:</p><ul><li>No longer uses custom requests for standard <code>POST</code> requests. This has the side-effect of properly following redirects after <code>POST</code>, fixing some login issues in rvest.</li></ul><p>httr 1.2.1 includes a fix for a small bug that I discovered shortly after releasing 1.2.0.</p><p>For the complete list of improvements, please see the <a href="https://github.com/hadley/httr/releases/tag/v1.2.0">release notes</a>.</p></description></item><item><title>tibble 1.1</title><link>https://www.rstudio.com/blog/tibble-1-1/</link><pubDate>Tue, 05 Jul 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/tibble-1-1/</guid><description><p>We&rsquo;re proud to announce version 1.1 of the <code>tibble</code> package. 
Tibbles are a modern reimagining of the data frame, keeping what time has shown to be effective, and throwing out what is not. Grab the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tibble&#34;</span>)</code></pre></div><p>There are three major new features:</p><ul><li><p>A more consistent naming scheme</p></li><li><p>Changes to how columns are extracted</p></li><li><p>Tweaks to the output</p></li></ul><p>There are many other small improvements and bug fixes: please see the <a href="https://github.com/hadley/tibble/releases/tag/v1.1">release notes</a> for a complete list.</p><h2 id="a-better-naming-scheme">A better naming scheme</h2><p>It&rsquo;s caused some confusion that you use <code>data_frame()</code> and <code>as_data_frame()</code> to create and coerce tibbles. 
It&rsquo;s also more important to make the distinction between tibbles and data frames more clear as we evolve a little further away from the semantics of data frames.</p><p>Now, we&rsquo;re consistently using &ldquo;tibble&rdquo; as the key word in creation, coercion, and testing functions:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">tibble</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">5</span>, y <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">letters</span>[1<span style="color:#666">:</span><span style="color:#40a070">5</span>])<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 5 x 2</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 b</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3 c</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 4 d</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 5 e</span><span style="color:#06287e">as_tibble</span>(<span style="color:#06287e">data.frame</span>(x <span style="color:#666">=</span> <span style="color:#06287e">runif</span>(<span style="color:#40a070">5</span>)))<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 5 x 1</span><span style="color:#60a0b0;font-style:italic">#&gt; x</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;dbl&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 0.4603887</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 0.4824339</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 0.4546795</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 4 0.5042028</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 0.4558387</span><span style="color:#06287e">is_tibble</span>(<span style="color:#06287e">data.frame</span>())<span style="color:#60a0b0;font-style:italic">#&gt; [1] FALSE</span></code></pre></div><p>Previously <code>tibble()</code> was an alias for <code>frame_data()</code>. If you were using <code>tibble()</code> to create tibbles by rows, you&rsquo;ll need to switch to <code>frame_data()</code>. This is a breaking change, but we believe that the new naming scheme will be less confusing in the long run.</p><h2 id="extracting-columns">Extracting columns</h2><p>The previous version of tibble was a little too strict when you attempted to retrieve a column that did not exist: we had forgotten that many people check for the presence of column with <code>is.null(df$x)</code>. This is bad idea because of partial matching, but it is common:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df1 <span style="color:#666">&lt;-</span> <span style="color:#06287e">data.frame</span>(xyz <span style="color:#666">=</span> <span style="color:#40a070">1</span>)df1<span style="color:#666">$</span>x<span style="color:#60a0b0;font-style:italic">#&gt; [1] 1</span></code></pre></div><p>Now, instead of throwing an error, tibble will return <code>NULL</code>. 
If you use <code>$</code>, which is common in interactive scripts, tibble will generate a warning:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df2 <span style="color:#666">&lt;-</span> <span style="color:#06287e">tibble</span>(xyz <span style="color:#666">=</span> <span style="color:#40a070">1</span>)df2<span style="color:#666">$</span>x<span style="color:#60a0b0;font-style:italic">#&gt; Warning: Unknown column &#39;x&#39;</span><span style="color:#60a0b0;font-style:italic">#&gt; NULL</span>df2[[<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>]]<span style="color:#60a0b0;font-style:italic">#&gt; NULL</span></code></pre></div><p>We also provide a convenient helper for detecting the presence/absence of a column:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">has_name</span>(df1, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] FALSE</span><span style="color:#06287e">has_name</span>(df2, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] FALSE</span></code></pre></div><h2 id="output-tweaks">Output tweaks</h2><p>We&rsquo;ve tweaked the output to have a shorter header and more information in the footer. 
We&rsquo;re using <code>#</code> consistently to denote metadata, and we print missing character values as <code>&lt;NA&gt;</code> (instead of <code>NA</code>).</p><p>The example below shows the new rendering of the <code>flights</code> table.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">nycflights13<span style="color:#666">::</span>flights<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 336,776 x 19</span><span style="color:#60a0b0;font-style:italic">#&gt; year month day dep_time sched_dep_time dep_delay arr_time</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2013 1 1 517 515 2 830</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2013 1 1 533 529 4 850</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2013 1 1 542 540 2 923</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2013 1 1 544 545 -1 1004</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2013 1 1 554 600 -6 812</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 2013 1 1 554 558 -4 740</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 2013 1 1 555 600 -5 913</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 2013 1 1 557 600 -3 709</span><span style="color:#60a0b0;font-style:italic">#&gt; 9 2013 1 1 557 600 -3 838</span><span style="color:#60a0b0;font-style:italic">#&gt; 10 2013 1 1 558 600 -2 753</span><span style="color:#60a0b0;font-style:italic">#&gt; # ... 
with 336,766 more rows, and 12 more variables: sched_arr_time &lt;int&gt;,</span><span style="color:#60a0b0;font-style:italic">#&gt; # arr_delay &lt;dbl&gt;, carrier &lt;chr&gt;, flight &lt;int&gt;, tailnum &lt;chr&gt;,</span><span style="color:#60a0b0;font-style:italic">#&gt; # origin &lt;chr&gt;, dest &lt;chr&gt;, air_time &lt;dbl&gt;, distance &lt;dbl&gt;, hour &lt;dbl&gt;,</span><span style="color:#60a0b0;font-style:italic">#&gt; # minute &lt;dbl&gt;, time_hour &lt;time&gt;</span></code></pre></div><p>Thanks to <a href="http://github.com/lionel-">Lionel Henry</a> for contributing an option for determining the number of printed extra columns: <code>getOption(&quot;tibble.max_extra_cols&quot;)</code>. This is particularly important for the ultra-wide tables often released by statistical offices and other institutions.</p><p>Expect the printed output to continue to evolve. In the next version, we hope to do better with very wide columns (e.g. from long strings), and to make better use of now unused horizontal space (e.g. from long column names).</p></description></item><item><title>xml2 1.0.0</title><link>https://www.rstudio.com/blog/xml2-1-0-0/</link><pubDate>Tue, 05 Jul 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/xml2-1-0-0/</guid><description><p>We are pleased to announce that xml2 1.0.0 is now available on CRAN. xml2 is a wrapper around the comprehensive <a href="http://xmlsoft.org">libxml2</a> C library, and makes it easy to work with XML and HTML files in R. 
Install the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">xml2&#34;</span>)</code></pre></div><p>There are three major improvements in 1.0.0:</p><ol><li><p>You can now modify and create XML documents.</p></li><li><p><code>xml_find_first()</code> replaces <code>xml_find_one()</code>, and provides better semantics for missing nodes.</p></li><li><p>Improved namespace handling when working with XPath.</p></li></ol><p>There are many other small improvements and bug fixes: please see the <a href="https://github.com/hadley/xml2/releases/tag/v1.0.0">release notes</a> for a complete list.</p><h2 id="modification-and-creation">Modification and creation</h2><p>xml2 now supports modification and creation of XML nodes. This includes new functions <code>xml_new_document()</code>, <code>xml_new_child()</code>, <code>xml_new_sibling()</code>, <code>xml_set_namespace()</code>, <code>xml_remove()</code>, <code>xml_replace()</code>, <code>xml_root()</code>, and replacement methods for <code>xml_name()</code>, <code>xml_attr()</code>, <code>xml_attrs()</code> and <code>xml_text()</code>.</p><p>The basic process of creating an XML document by hand looks something like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">root <span style="color:#666">&lt;-</span> <span style="color:#06287e">xml_new_document</span>() <span style="color:#666">%&gt;%</span> <span style="color:#06287e">xml_add_child</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">root&#34;</span>)root <span style="color:#666">%&gt;%</span><span style="color:#06287e">xml_add_child</span>(<span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">a1&#34;</span>, x <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1&#34;</span>, y <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">xml_add_child</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">xml_add_child</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">c&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">invisible</span>()root <span style="color:#666">%&gt;%</span><span style="color:#06287e">xml_add_child</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a2&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">xml_add_sibling</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a3&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">invisible</span>()<span style="color:#06287e">cat</span>(<span style="color:#06287e">as.character</span>(root))<span style="color:#60a0b0;font-style:italic">#&gt; &lt;?xml version=&#34;1.0&#34;?&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;root&gt;&lt;a1 x=&#34;1&#34; y=&#34;2&#34;&gt;&lt;b&gt;&lt;c/&gt;&lt;/b&gt;&lt;/a1&gt;&lt;a2/&gt;&lt;a3/&gt;&lt;/root&gt;</span></code></pre></div><p>For a complete description of creation and mutation, please see <a href="https://cran.r-project.org/web/packages/xml2"><code>vignette(&quot;modification&quot;, package = &quot;xml2&quot;)</code></a>.</p><h2 id="xml_find_first"><code>xml_find_first()</code></h2><p><code>xml_find_one()</code> has been deprecated in favor of <code>xml_find_first()</code>. 
<code>xml_find_first()</code> now always returns a single node: if there are multiple matches, it returns the first (without a warning), and if there are no matches, it returns a new <code>xml_missing</code> object.</p><p>This makes it much easier to work with ragged/inconsistent hierarchies:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x1 <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_xml</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">&lt;a&gt;</span><span style="color:#4070a0"> &lt;b&gt;&lt;/b&gt;</span><span style="color:#4070a0"> &lt;b&gt;&lt;c&gt;See&lt;/c&gt;&lt;/b&gt;</span><span style="color:#4070a0"> &lt;b&gt;&lt;c&gt;Sea&lt;/c&gt;&lt;c /&gt;&lt;/b&gt;</span><span style="color:#4070a0">&lt;/a&gt;&#34;</span>)c <span style="color:#666">&lt;-</span> x1 <span style="color:#666">%&gt;%</span><span style="color:#06287e">xml_find_all</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">.//b&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">xml_find_first</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">.//c&#34;</span>)c<span style="color:#60a0b0;font-style:italic">#&gt; {xml_nodeset (3)}</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] &lt;NA&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [2] &lt;c&gt;See&lt;/c&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [3] &lt;c&gt;Sea&lt;/c&gt;</span></code></pre></div><p>Missing nodes are replaced by missing values in functions that return vectors:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">xml_name</span>(c)<span style="color:#60a0b0;font-style:italic">#&gt; [1] NA &#34;c&#34; &#34;c&#34;</span><span 
style="color:#06287e">xml_text</span>(c)<span style="color:#60a0b0;font-style:italic">#&gt; [1] NA &#34;See&#34; &#34;Sea&#34;</span></code></pre></div><h2 id="xpath-and-namespaces">XPath and namespaces</h2><p>XPath is challenging to use if your document contains any namespaces:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_xml</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0"></span><span style="color:#4070a0"> &lt;root&gt;</span><span style="color:#4070a0"> &lt;doc1 xmlns = &#34;http://foo.com&#34;&gt;&lt;baz /&gt;&lt;/doc1&gt;</span><span style="color:#4070a0"> &lt;doc2 xmlns = &#34;http://bar.com&#34;&gt;&lt;baz /&gt;&lt;/doc2&gt;</span><span style="color:#4070a0"> &lt;/root&gt;</span><span style="color:#4070a0">&#39;</span>)x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">xml_find_all</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">.//baz&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; {xml_nodeset (0)}</span></code></pre></div><p>To make life slightly easier, the default <code>xml_ns()</code> object is automatically passed to <code>xml_find_*()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">xml_ns</span>()<span style="color:#60a0b0;font-style:italic">#&gt; d1 &lt;-&gt; http://foo.com</span><span style="color:#60a0b0;font-style:italic">#&gt; d2 &lt;-&gt; http://bar.com</span>x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">xml_find_all</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">.//d1:baz&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; {xml_nodeset (1)}</span><span 
style="color:#60a0b0;font-style:italic">#&gt; [1] &lt;baz/&gt;</span></code></pre></div><p>If you just want to avoid the hassle of namespaces altogether, we have a new nuclear option: <code>xml_ns_strip()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">xml_ns_strip</span>(x)x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">xml_find_all</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">.//baz&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; {xml_nodeset (2)}</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] &lt;baz/&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [2] &lt;baz/&gt;</span></code></pre></div></description></item><item><title>Join us at rstudio::conf 2017!</title><link>https://www.rstudio.com/blog/join-us-at-rstudioconf-2017/</link><pubDate>Thu, 30 Jun 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/join-us-at-rstudioconf-2017/</guid><description><p>Following our initial and very gratifying Shiny Developer Conference this past January, which sold out in a few days, RStudio is very excited to announce a new and bigger conference today!</p><p><a href="https://www.rstudio.com/conference/"><strong>rstudio::conf</strong></a>, the conference about all things R and RStudio, will take place January 13 and 14, 2017 in Orlando, Florida. The conference will feature talks and tutorials from popular RStudio data scientists and developers like Hadley Wickham, Yihui Xie, Joe Cheng, Winston Chang, Garrett Grolemund, and J.J. Allaire, along with lightning talks from RStudio partners and customers.</p><p>Preceding the conference, on January 11 and 12, RStudio will offer two days of optional training. 
Training attendees can choose from Hadley Wickham&rsquo;s Master R training, a new Intermediate Shiny workshop from Shiny creator Joe Cheng, or a new workshop from Garrett Grolemund that is based on his soon-to-be-published book with Hadley: Introduction to Data Science with R.</p><p><a href="https://www.rstudio.com/conference/"><strong>rstudio::conf</strong></a> is for R and RStudio users who want to learn how to write better Shiny applications, explore all the new capabilities of the R Markdown authoring framework, apply R to big data and work effectively with Spark, understand the RStudio toolchain for data science with R, discover best practices and tips for coding with RStudio, and investigate enterprise-scale development and deployment practices and tools, including the new RStudio Connect.</p><p><em>Not to be missed, RStudio has also reserved Universal Studio&rsquo;s The Wizarding World of Harry Potter on Friday night, January 13, for the exclusive use of conference attendees!</em></p><p>Conference attendance is limited to 400. Training is limited to 70 students for each of the three 2-day workshops. All seats are available on a first-come, first-served basis.</p><p><strong>Please go to <a href="http://www.rstudio.com/conference">www.rstudio.com/conference</a> to purchase.</strong></p><p>We hope to see you in Florida at <a href="https://www.rstudio.com/conference/"><strong>rstudio::conf 2017</strong></a>!</p><p>For questions or issues registering, please email <a href="mailto:conf@rstudio.com">conf@rstudio.com</a>. To ask about sponsorship opportunities, contact <a href="mailto:anne@rstudio.com">anne@rstudio.com</a>.</p></description></item><item><title>dplyr 0.5.0</title><link>https://www.rstudio.com/blog/dplyr-0-5-0/</link><pubDate>Mon, 27 Jun 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dplyr-0-5-0/</guid><description><p>I&rsquo;m very pleased to announce that dplyr 0.5.0 is now available from CRAN.
Get the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">dplyr&#34;</span>)</code></pre></div><p>dplyr 0.5.0 is a big release with a heap of new features, a whole bunch of minor improvements, and many bug fixes, both from me and from the broader dplyr community. In this blog post, I&rsquo;ll highlight the most important changes:</p><ul><li><p>Some breaking changes to single table verbs.</p></li><li><p>New tibble and dtplyr packages.</p></li><li><p>New vector functions.</p></li><li><p>Replacements for <code>summarise_each()</code> and <code>mutate_each()</code>.</p></li><li><p>Improvements to SQL translation.</p></li></ul><p>To see the complete list, please read the <a href="https://github.com/hadley/dplyr/releases/tag/v0.5.0">release notes</a>.</p><h2 id="breaking-changes">Breaking changes</h2><p><code>arrange()</code> once again ignores grouping, reverting back to the behaviour of dplyr 0.3 and earlier. This makes <code>arrange()</code> inconsistent with other dplyr verbs, but I think this behaviour is generally more useful. 
Regardless, it&rsquo;s not going to change again, as more changes will just cause more confusion.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mtcars <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(cyl) <span style="color:#666">%&gt;%</span><span style="color:#06287e">arrange</span>(<span style="color:#06287e">desc</span>(mpg))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [32 x 11]</span><span style="color:#60a0b0;font-style:italic">#&gt; Groups: cyl [3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 32 x 11</span><span style="color:#60a0b0;font-style:italic">#&gt; mpg cyl disp hp drat wt qsec vs am gear carb</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1</span><span style="color:#60a0b0;font-style:italic">#&gt; ... with 27 more rows</span></code></pre></div><p>If you give <code>distinct()</code> a list of variables, it now only keeps those variables (instead of, as previously, keeping the first value from the other variables). 
To preserve the previous behaviour, use <code>.keep_all = TRUE</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> <span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">1</span>, <span style="color:#40a070">1</span>, <span style="color:#40a070">1</span>, <span style="color:#40a070">2</span>, <span style="color:#40a070">2</span>), y <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">5</span>)<span style="color:#60a0b0;font-style:italic"># Now only keeps x variable</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">distinct</span>(x)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 2 x 1</span><span style="color:#60a0b0;font-style:italic">#&gt; x</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;dbl&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2</span><span style="color:#60a0b0;font-style:italic"># Previous behaviour preserved all variables</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">distinct</span>(x, .keep_all <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 2 x 2</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;dbl&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 4</span></code></pre></div><p>The <code>select()</code> helper functions <code>starts_with()</code>, <code>ends_with()</code>, etc are now real exported functions. 
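</p><p>For instance, a quick sketch using these helpers directly on the built-in <code>mtcars</code> data:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(dplyr)
# Keep only the columns whose names start with &#34;d&#34; (disp and drat)
mtcars %&gt;% select(starts_with(&#34;d&#34;)) %&gt;% names()
#&gt; [1] &#34;disp&#34; &#34;drat&#34;</code></pre></div><p>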
This means that they have better documentation, and there&rsquo;s an extension mechanism if you want to write your own helpers.</p><h2 id="tibble-and-dtplyr-packages">Tibble and dtplyr packages</h2><p>Functions related to the creation and coercion of <code>tbl_df</code>s (&ldquo;tibbles&rdquo; for short) now live in their own package: <a href="https://blog.rstudio.com/2016/03/24/tibble-1-0-0/">tibble</a>. See <code>vignette(&quot;tibble&quot;)</code> for more details.</p><p>Similarly, all code related to the data.table dplyr backend has been separated out into a new <a href="https://github.com/hadley/dtplyr">dtplyr</a> package. This decouples the development of the data.table interface from the development of the dplyr package, and I hope it will spur improvements to the backend. If both data.table and dplyr are loaded, you&rsquo;ll get a message reminding you to load dtplyr.</p><h2 id="vector-functions">Vector functions</h2><p>This version of dplyr gains a number of vector functions inspired by SQL.
Two functions make it a little easier to eliminate or generate missing values:</p><ul><li>Given a set of vectors, <code>coalesce()</code> finds the first non-missing value in each position:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">1</span>, <span style="color:#40a070">2</span>, <span style="color:#007020;font-weight:bold">NA</span>, <span style="color:#40a070">4</span>, <span style="color:#007020;font-weight:bold">NA</span>, <span style="color:#40a070">6</span>)y <span style="color:#666">&lt;-</span> <span style="color:#06287e">c</span>(<span style="color:#007020;font-weight:bold">NA</span>, <span style="color:#40a070">2</span>, <span style="color:#40a070">3</span>, <span style="color:#40a070">4</span>, <span style="color:#40a070">5</span>, <span style="color:#007020;font-weight:bold">NA</span>)<span style="color:#60a0b0;font-style:italic"># Use this to piece together a complete vector:</span><span style="color:#06287e">coalesce</span>(x, y)<span style="color:#60a0b0;font-style:italic">#&gt; [1] 1 2 3 4 5 6</span><span style="color:#60a0b0;font-style:italic"># Or just replace missing value with a constant:</span><span style="color:#06287e">coalesce</span>(x, <span style="color:#40a070">0</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] 1 2 0 4 0 6</span></code></pre></div><ul><li>The complement of <code>coalesce()</code> is <code>na_if()</code>: it replaces a specified value with an <code>NA</code>.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">1</span>, <span style="color:#40a070">5</span>, <span style="color:#40a070">2</span>, <span 
style="color:#40a070">-99</span>, <span style="color:#40a070">-99</span>, <span style="color:#40a070">10</span>)<span style="color:#06287e">na_if</span>(x, <span style="color:#40a070">-99</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] 1 5 2 NA NA 10</span></code></pre></div><p>Three functions provide convenient ways of replacing values. In order from simplest to most complicated, they are:</p><ul><li><code>if_else()</code>, a vectorised if statement, takes a logical vector (usually created with a comparison operator like <code>==</code>, <code>&lt;</code>, or <code>%in%</code>) and replaces <code>TRUE</code>s with one vector and <code>FALSE</code>s with another.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x1 <span style="color:#666">&lt;-</span> <span style="color:#06287e">sample</span>(<span style="color:#40a070">5</span>)<span style="color:#06287e">if_else</span>(x1 <span style="color:#666">&lt;</span> <span style="color:#40a070">5</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">small&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">big&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;small&#34; &#34;small&#34; &#34;big&#34; &#34;small&#34; &#34;small&#34;</span></code></pre></div><p><code>if_else()</code> is similar to <code>base::ifelse()</code>, but has two useful improvements. First, it has a fourth argument that will replace missing values:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x2 <span style="color:#666">&lt;-</span> <span style="color:#06287e">c</span>(<span style="color:#007020;font-weight:bold">NA</span>, x1)<span style="color:#06287e">if_else</span>(x2 <span style="color:#666">&lt;</span> <span style="color:#40a070">5</span>, <span
style="color:#4070a0">&#34;</span><span style="color:#4070a0">small&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">big&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">unknown&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;unknown&#34; &#34;small&#34; &#34;small&#34; &#34;big&#34; &#34;small&#34; &#34;small&#34;</span></code></pre></div><p>Secondly, it also have stricter semantics that <code>ifelse()</code>: the <code>true</code> and <code>false</code> arguments must be the same type. This gives a less surprising return type, and preserves S3 vectors like dates and factors:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">factor</span>(<span style="color:#06287e">sample</span>(<span style="color:#007020;font-weight:bold">letters</span>[1<span style="color:#666">:</span><span style="color:#40a070">5</span>], <span style="color:#40a070">10</span>, replace <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>))<span style="color:#06287e">ifelse</span>(x <span style="color:#666">%in%</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">c&#34;</span>), x, <span style="color:#06287e">factor</span>(<span style="color:#007020;font-weight:bold">NA</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] NA NA 1 NA 3 2 3 NA 3 2</span><span style="color:#06287e">if_else</span>(x <span style="color:#666">%in%</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span 
style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">c&#34;</span>), x, <span style="color:#06287e">factor</span>(<span style="color:#007020;font-weight:bold">NA</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] &lt;NA&gt; &lt;NA&gt; a &lt;NA&gt; c b c &lt;NA&gt; c b</span><span style="color:#60a0b0;font-style:italic">#&gt; Levels: a b c d e</span></code></pre></div><p>Currently, <code>if_else()</code> is very strict, so you&rsquo;ll need to careful match the types of <code>true</code> and <code>false</code>. This is most likely to bite you when you&rsquo;re using missing values, and you&rsquo;ll need to use a specific <code>NA</code>: <code>NA_integer_</code>, <code>NA_real_</code>, or <code>NA_character_</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">if_else</span>(<span style="color:#007020;font-weight:bold">TRUE</span>, <span style="color:#40a070">1</span>, <span style="color:#007020;font-weight:bold">NA</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Error: `false` has type &#39;logical&#39; not &#39;double&#39;</span><span style="color:#06287e">if_else</span>(<span style="color:#007020;font-weight:bold">TRUE</span>, <span style="color:#40a070">1</span>, <span style="color:#007020;font-weight:bold">NA_real_</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] 1</span></code></pre></div><ul><li><code>recode()</code>, a vectorised <code>switch()</code>, takes a numeric vector, character vector, or factor, and replaces elements based on their values.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">sample</span>(<span 
style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">c&#34;</span>, <span style="color:#007020;font-weight:bold">NA</span>), <span style="color:#40a070">10</span>, replace <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)<span style="color:#60a0b0;font-style:italic"># The default is to leave non-replaced values as is</span><span style="color:#06287e">recode</span>(x, a <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Apple&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;c&#34; &#34;Apple&#34; NA NA &#34;c&#34; NA &#34;b&#34; NA</span><span style="color:#60a0b0;font-style:italic">#&gt; [9] &#34;c&#34; &#34;Apple&#34;</span><span style="color:#60a0b0;font-style:italic"># But you can choose to override the default:</span><span style="color:#06287e">recode</span>(x, a <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Apple&#34;</span>, .default <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">NA_character_</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] NA &#34;Apple&#34; NA NA NA NA NA NA</span><span style="color:#60a0b0;font-style:italic">#&gt; [9] NA &#34;Apple&#34;</span><span style="color:#60a0b0;font-style:italic"># You can also choose what value is used for missing values</span><span style="color:#06287e">recode</span>(x, a <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Apple&#34;</span>, .default <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">NA_character_</span>, .missing <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">Unknown&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] NA &#34;Apple&#34; &#34;Unknown&#34; &#34;Unknown&#34; NA &#34;Unknown&#34; NA</span><span style="color:#60a0b0;font-style:italic">#&gt; [8] &#34;Unknown&#34; NA &#34;Apple&#34;</span></code></pre></div><ul><li><code>case_when()</code>, is a vectorised set of <code>if</code> and <code>else if</code>s. You provide it a set of test-result pairs as formulas: The left side of the formula should return a logical vector, and the right hand side should return either a single value, or a vector the same length as the left hand side. All results must be the same type of vector.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">40</span><span style="color:#06287e">case_when</span>(x <span style="color:#666">%%</span> <span style="color:#40a070">35</span> <span style="color:#666">==</span> <span style="color:#40a070">0</span> <span style="color:#666">~</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">fizz buzz&#34;</span>,x <span style="color:#666">%%</span> <span style="color:#40a070">5</span> <span style="color:#666">==</span> <span style="color:#40a070">0</span> <span style="color:#666">~</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">fizz&#34;</span>,x <span style="color:#666">%%</span> <span style="color:#40a070">7</span> <span style="color:#666">==</span> <span style="color:#40a070">0</span> <span style="color:#666">~</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">buzz&#34;</span>,<span style="color:#007020;font-weight:bold">TRUE</span> <span style="color:#666">~</span> <span style="color:#06287e">as.character</span>(x))<span 
style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;1&#34; &#34;2&#34; &#34;3&#34; &#34;4&#34; &#34;fizz&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [6] &#34;6&#34; &#34;buzz&#34; &#34;8&#34; &#34;9&#34; &#34;fizz&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [11] &#34;11&#34; &#34;12&#34; &#34;13&#34; &#34;buzz&#34; &#34;fizz&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [16] &#34;16&#34; &#34;17&#34; &#34;18&#34; &#34;19&#34; &#34;fizz&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [21] &#34;buzz&#34; &#34;22&#34; &#34;23&#34; &#34;24&#34; &#34;fizz&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [26] &#34;26&#34; &#34;27&#34; &#34;buzz&#34; &#34;29&#34; &#34;fizz&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [31] &#34;31&#34; &#34;32&#34; &#34;33&#34; &#34;34&#34; &#34;fizz buzz&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [36] &#34;36&#34; &#34;37&#34; &#34;38&#34; &#34;39&#34; &#34;fizz&#34;</span></code></pre></div><p><code>case_when()</code> is still somewhat experiment and does not currently work inside <code>mutate()</code>. 
That will be fixed in a future version.</p><p>I also added one small helper for dealing with floating point comparisons: <code>near()</code> tests for equality with numeric tolerance (<code>abs(x - y) &lt; tolerance</code>).</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">sqrt</span>(<span style="color:#40a070">2</span>) ^ <span style="color:#40a070">2</span>x <span style="color:#666">==</span> <span style="color:#40a070">2</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] FALSE</span><span style="color:#06287e">near</span>(x, <span style="color:#40a070">2</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span></code></pre></div><h2 id="predicate-functions">Predicate functions</h2><p>Thanks to ideas and code from <a href="http://github.com/lionel-">Lionel Henry</a>, a new family of functions improves upon <code>summarise_each()</code> and <code>mutate_each()</code>:</p><ul><li><code>summarise_all()</code> and <code>mutate_all()</code> apply a function to all (non-grouped) columns:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mtcars <span style="color:#666">%&gt;%</span> <span style="color:#06287e">group_by</span>(cyl) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">summarise_all</span>(mean)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 3 x 11</span><span style="color:#60a0b0;font-style:italic">#&gt; cyl mpg disp hp drat wt qsec vs</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 4 26.66364 105.1364 82.63636 4.070909 2.285727 19.13727 0.9090909</span><span
style="color:#60a0b0;font-style:italic">#&gt; 2 6 19.74286 183.3143 122.28571 3.585714 3.117143 17.97714 0.5714286</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 8 15.10000 353.1000 209.21429 3.229286 3.999214 16.77214 0.0000000</span><span style="color:#60a0b0;font-style:italic">#&gt; ... with 3 more variables: am &lt;dbl&gt;, gear &lt;dbl&gt;, carb &lt;dbl&gt;</span></code></pre></div><ul><li><p><code>summarise_at()</code> and <code>mutate_at()</code> operate on a subset of columns. You can select columns with:</p><ul><li><p>a character vector of column names,</p></li><li><p>a numeric vector of column positions, or</p></li><li><p>a column specification with <code>select()</code> semantics generated with the new <code>vars()</code> helper.</p></li></ul><p>mtcars %&gt;% group_by(cyl) %&gt;% summarise_at(c(&ldquo;mpg&rdquo;, &ldquo;wt&rdquo;), mean)#&gt; # A tibble: 3 x 3#&gt; cyl mpg wt#&gt; <dbl> <dbl> <dbl>#&gt; 1 4 26.66364 2.285727#&gt; 2 6 19.74286 3.117143#&gt; 3 8 15.10000 3.999214mtcars %&gt;% group_by(cyl) %&gt;% summarise_at(vars(mpg, wt), mean)#&gt; # A tibble: 3 x 3#&gt; cyl mpg wt#&gt; <dbl> <dbl> <dbl>#&gt; 1 4 26.66364 2.285727#&gt; 2 6 19.74286 3.117143#&gt; 3 8 15.10000 3.999214</p></li><li><p><code>summarise_if()</code> and <code>mutate_if()</code> take a predicate function (a function that returns <code>TRUE</code> or <code>FALSE</code> when given a column). 
This makes it easy to apply a function only to numeric columns:</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">iris <span style="color:#666">%&gt;%</span> <span style="color:#06287e">summarise_if</span>(is.numeric, mean)<span style="color:#60a0b0;font-style:italic">#&gt; Sepal.Length Sepal.Width Petal.Length Petal.Width</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 5.843333 3.057333 3.758 1.199333</span></code></pre></div><p>All of these functions pass <code>...</code> on to the individual <code>funs</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">iris <span style="color:#666">%&gt;%</span> <span style="color:#06287e">summarise_if</span>(is.numeric, mean, trim <span style="color:#666">=</span> <span style="color:#40a070">0.25</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Sepal.Length Sepal.Width Petal.Length Petal.Width</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 5.802632 3.032895 3.934211 1.230263</span></code></pre></div><p>A new <code>select_if()</code> allows you to pick columns with a predicate function:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> <span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">3</span>, y <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">c&#34;</span>))df <span 
style="color:#666">%&gt;%</span> <span style="color:#06287e">select_if</span>(is.numeric)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 3 x 1</span><span style="color:#60a0b0;font-style:italic">#&gt; x</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">select_if</span>(is.character)<span style="color:#60a0b0;font-style:italic">#&gt; # A tibble: 3 x 1</span><span style="color:#60a0b0;font-style:italic">#&gt; y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 b</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 c</span></code></pre></div><p><code>summarise_each()</code> and <code>mutate_each()</code> will be deprecated in a future release.</p><h2 id="sql-translation">SQL translation</h2><p>I have completely overhauled the translation of dplyr verbs into SQL statements. Previously, dplyr used a rather ad-hoc approach which tried to guess when a new subquery was needed. Unfortunately this approach was fraught with bugs, so I have now implemented a richer internal data model. In the short-term, this is likely to lead to some minor performance decreases (as the generated SQL is more complex), but the dplyr is much more likely to generate correct SQL. In the long-term, these abstractions will make it possible to write a query optimiser/compiler in dplyr, which would make it possible to generate much more succinct queries. If you know anything about writing query optimisers or compilers and are interested in working on this problem, please let me know!</p></description></item><item><title>See RStudio at UseR! 
2016</title><link>https://www.rstudio.com/blog/see-rstudio-at-user-2016/</link><pubDate>Mon, 27 Jun 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/see-rstudio-at-user-2016/</guid><description><p>UseR! 2016 has arrived and the RStudio team is at Stanford to share our newest products and latest enhancements to Shiny, R Markdown, dplyr, and more. Here&rsquo;s a quick snapshot of RStudio related sessions. We hope to see you in as many of them as you can attend!</p><p><strong>Monday June 27</strong></p><p>Morning Tutorials</p><ul><li><p><a href="https://user2016.sched.org/event/7Baa/dynamic-documents-with-r-markdown-part-1">Dynamic Documents with R Markdown</a> -Karl Broman, Ian Lyttle, Yihui Xie</p></li><li><p><a href="https://user2016.sched.org/event/7Bad/using-git-and-github-with-r-rstudio-and-r-markdown-part-1">Using Git and Github with RStudio and R Markdown</a> - Jenny Bryan</p></li></ul><p>Afternoon Tutorials</p><ul><li><p><a href="https://user2016.sched.org/event/7Bb6">Effective Shiny Programming</a> - Joe Cheng</p></li><li><p><a href="https://user2016.sched.org/event/7Bb7/extracting-data-from-the-web-apis-and-beyond-part-1">Extracting Data from the Web, APIs and beyond </a>-Karthik Ram, Scott Chamberlain, Garrett Grolemund</p></li></ul><p>Afternoon short talks moderated by Hadley Wickham</p><ul><li><p><a href="https://user2016.sched.org/event/7BVC/whats-up-with-the-r-consortium">What&rsquo;s up with the R Consortium?</a> Speaker: Joseph Rickert</p></li><li><p><a href="https://user2016.sched.org/event/7BVD/presentation-of-the-women-in-r-task-force">Presentation of the Women in R task force</a> Speaker: Heather Turner</p></li><li><p><a href="https://user2016.sched.org/event/7BVE/r-ladies-presentation-1">R-Ladies Presentation (1)</a> Speaker: Gabriella de Queiroz</p></li><li><p><a href="https://user2016.sched.org/event/7BVF/r-ladies-presentation-2">R-Ladies Presentation (2) </a>Speakers: Alice Daish, Hannah Frick</p></li><li><p><a 
href="https://user2016.sched.org/event/7BVG/discussion">R Discussion</a></p></li></ul><p><strong>Tuesday June 28</strong></p><ul><li><p><a href="https://user2016.sched.org/event/78HJ/using-spark-with-shiny-and-r-markdown">Using Spark with Shiny and R Markdown</a> - Jeff Allen</p></li><li><p><a href="https://user2016.sched.org/event/7BXe/rcppparallel-a-toolkit-for-portable-high-performance-algorithms">RcppParallel: A Toolkit for Portable, High-Performance Algorithms</a> - Kevin Ushey</p></li><li><p><a href="https://user2016.sched.org/event/7BY1/linking-htmlwidgets-with-crosstalk-and-mobservable">Linking htmlwidgets with crosstalk and mobservable</a> - Joe Cheng</p></li><li><p><a href="https://user2016.sched.org/event/7BXs/covr-bringing-code-coverage-to-r">Bringing Code Coverage to R</a> - Jim Hester</p></li><li><p><a href="https://user2016.sched.org/event/7BXz/flexdashboard-easy-interactive-dashboards-for-r">flexdashboard: Easy Interactive Dashboards for R</a> - Jonathan McPherson</p></li></ul><p><strong>Wednesday June 29</strong></p><ul><li><p><a href="https://user2016.sched.org/event/7BaF/towards-a-grammar-of-interactive-graphics">Keynote: Towards a Grammar of Interactive Graphics</a> - Hadley Wickham</p></li><li><p><a href="https://user2016.sched.org/event/7BXl/notebooks-with-r-markdown">Notebooks with R Markdown</a> - J.J.
Allaire</p></li><li><p><a href="https://user2016.sched.org/event/7BXk/importing-modern-data-into-r">Importing Modern Data into R</a> - Javier Luraschi</p></li><li><p><a href="https://user2016.sched.org/event/7BY5/profvis-profiling-tools-for-faster-r-code">Profvis: Profiling Tools for Faster R Code</a> - Winston Chang</p></li></ul><p><strong>Thursday June 30</strong></p><ul><li><p><a href="https://user2016.sched.org/event/7BXi/shiny-gadgets-interactive-tools-for-programming-and-data-analysis">Shiny Gadgets: Interactive Tools</a> - Garrett Grolemund</p></li><li><p><a href="https://user2016.sched.org/event/7BXq/authoring-books-with-r-markdown">Authoring Books with R Markdown</a> - Hadley Wickham</p></li></ul><p><strong>Stop by the booth!</strong> Don&rsquo;t miss our table in the exhibition area during the conference. Come talk to us about your plans for R and learn how RStudio Server Pro and Shiny Server Pro can provide enterprise-ready support and scalability for your RStudio IDE and Shiny deployments.</p><p><em>Note: Although UseR! is sold out, arrangements have been made to stream the keynote talks from <a href="https://aka.ms/user2016conference">https://aka.ms/user2016conference</a>. Video recordings of the other sessions (where permitted by speakers) will be made available by UseR! organizers after the conference.</em></p></description></item><item><title>tidyr 0.5.0</title><link>https://www.rstudio.com/blog/tidyr-0-5-0/</link><pubDate>Mon, 13 Jun 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/tidyr-0-5-0/</guid><description><p>I&rsquo;m pleased to announce tidyr 0.5.0. tidyr makes it easy to &ldquo;tidy&rdquo; your data, storing it in a consistent form so that it&rsquo;s easy to manipulate, visualise and model. Tidy data has a simple convention: put variables in the columns and observations in the rows. You can learn more about it in the <a href="http://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html">tidy data</a> vignette.
Install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tidyr&#34;</span>)</code></pre></div><p>This release has three useful new features:</p><ol><li><code>separate_rows()</code> separates values that contain multiple entries separated by a delimiter into multiple rows. Thanks to <a href="https://github.com/aaronwolen">Aaron Wolen</a> for the contribution!</li></ol><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> <span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">2</span>, y <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a,b&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">d,e,f&#34;</span>))df <span style="color:#666">%&gt;%</span><span style="color:#06287e">separate_rows</span>(y, sep <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">,&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [5 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1 b</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2 d</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2 e</span><span
style="color:#60a0b0;font-style:italic">#&gt; 5 2 f</span></code></pre></div><p>Compare with <code>separate()</code> which separates into (named) columns:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">%&gt;%</span><span style="color:#06287e">separate</span>(y, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">y1&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">y2&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">y3&#34;</span>), sep <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">,&#34;</span>, fill <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">right&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 4]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y1 y2 y3</span><span style="color:#60a0b0;font-style:italic">#&gt; * &lt;int&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a b &lt;NA&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 d e f</span></code></pre></div><ol start="2"><li><code>spread()</code> gains a <code>sep</code> argument. Setting this will name columns as &ldquo;key|sep|value&rdquo;. 
This is useful when you&rsquo;re spreading based on a numeric column:</li></ol><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> <span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">1</span>, <span style="color:#40a070">2</span>, <span style="color:#40a070">1</span>),key <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">1</span>, <span style="color:#40a070">1</span>, <span style="color:#40a070">2</span>),val <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">c&#34;</span>))df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">spread</span>(key, val)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x 1 2</span><span style="color:#60a0b0;font-style:italic">#&gt; * &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a c</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 b &lt;NA&gt;</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">spread</span>(key, val, sep <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">_&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x key_1 key_2</span><span 
style="color:#60a0b0;font-style:italic">#&gt; * &lt;dbl&gt; &lt;chr&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a c</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 b &lt;NA&gt;</span></code></pre></div><ol start="3"><li><code>unnest()</code> gains a <code>.sep</code> argument. This is useful if you have multiple columns of data frames that have the same variable names:</li></ol><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> <span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">2</span>,y1 <span style="color:#666">=</span> <span style="color:#06287e">list</span>(<span style="color:#06287e">data_frame</span>(y <span style="color:#666">=</span> <span style="color:#40a070">1</span>),<span style="color:#06287e">data_frame</span>(y <span style="color:#666">=</span> <span style="color:#40a070">2</span>)),y2 <span style="color:#666">=</span> <span style="color:#06287e">list</span>(<span style="color:#06287e">data_frame</span>(y <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>),<span style="color:#06287e">data_frame</span>(y <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>)))df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">unnest</span>()<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; &lt;dbl&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1 a</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 2 2 2 b</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">unnest</span>(.sep <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">_&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y1_y y2_y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; &lt;dbl&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 2 b</span></code></pre></div><p>It also gains a <code>.id</code> column that makes the names of the list explicit:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> <span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">2</span>,y <span style="color:#666">=</span> <span style="color:#06287e">list</span>(a <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">3</span>,b <span style="color:#666">=</span> <span style="color:#40a070">3</span><span style="color:#666">:</span><span style="color:#40a070">1</span>))df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">unnest</span>()<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [6 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; &lt;int&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 2 1 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 1 3</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2 3</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 2 1</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">unnest</span>(.id <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">id&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [6 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y id</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;int&gt; &lt;int&gt; &lt;chr&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1 2 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 1 3 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2 3 b</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2 2 b</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 2 1 b</span></code></pre></div><p>tidyr 0.5.0 also includes a bumper crop of bug fixes, including fixes for <code>spread()</code> and <code>gather()</code> in the presence of list-columns. Please see the <a href="https://github.com/hadley/tidyr/releases/tag/v0.5.0">release notes</a> for a complete list of changes.</p></description></item><item><title>Profiling with RStudio and profvis</title><link>https://www.rstudio.com/blog/profiling-with-rstudio-and-profvis/</link><pubDate>Mon, 23 May 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/profiling-with-rstudio-and-profvis/</guid><description><p>&ldquo;How can I make my code faster?&rdquo; If you write R code, then you&rsquo;ve probably asked yourself this question. 
A profiler is an important tool for doing this: it records how the computer spends its time, and once you know that, you can focus on the slow parts to make them faster.</p><p>The <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview releases</a> of RStudio now have integrated support for profiling R code and for visualizing profiling data. R itself has long had a built-in profiler, and now it&rsquo;s easier than ever to use the profiler and interpret the results.</p><p>To profile code with RStudio, select it in the editor, and then click on <strong>Profile -&gt; Profile Selected Line(s)</strong>. R will run that code with the profiler turned on, and then open up an interactive visualization.</p><p><img src="https://rstudioblog.files.wordpress.com/2016/05/profile1.gif" alt=""></p><p>In the visualization, there are two main parts: on top, there is the code with information about the amount of time spent executing each line, and on the bottom there is a <em>flame graph</em>, which shows what R was doing over time. In the flame graph, the horizontal direction represents time, moving from left to right, and the vertical direction represents the <em>call stack</em>: the functions that are currently being called. (Each time a function calls another function, it goes on top of the stack, and when a function exits, it is removed from the stack.)</p><p><img src="https://rstudioblog.files.wordpress.com/2016/05/profile.png" alt="profile.png"></p><p>The <strong>Data</strong> tab contains a call tree, showing which function calls are most expensive:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/05/data1.png" alt="Profiling data pane"></p><p>Armed with this information, you&rsquo;ll know what parts of your code to focus on to speed things up!</p><p>The interactive profile visualizations are created with the <a href="https://rstudio.github.io/profvis/">profvis</a> package, which can be used separately from the RStudio IDE.
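</p><p>profvis can also be invoked directly on an expression, which is handy in scripts; a minimal sketch (assuming the profvis package is installed, with a toy workload chosen purely for illustration):</p><pre><code>library(profvis)

profvis({
  x &lt;- rnorm(5e6)  # simulate some data
  x &lt;- sort(x)     # a comparatively slow step the profiler will highlight
})
</code></pre><p>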
If you use profvis outside of RStudio, the visualizations will open in a web browser.</p><p>To learn more about interpreting profiling data, check out the <a href="https://rstudio.github.io/profvis/">profvis website</a>, which has interactive demos. You can also find out more about <a href="http://rstudio.github.io/profvis/rstudio.html">profiling with RStudio</a> there.</p></description></item><item><title>flexdashboard: Easy interactive dashboards for R</title><link>https://www.rstudio.com/blog/flexdashboard-easy-interactive-dashboards-for-r/</link><pubDate>Tue, 17 May 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/flexdashboard-easy-interactive-dashboards-for-r/</guid><description><p>Today we&rsquo;re excited to announce <a href="https://rmarkdown.rstudio.com/flexdashboard/">flexdashboard</a>, a new package that enables you to easily create flexible, attractive, interactive dashboards with R. Authoring and customization of dashboards is done using <a href="http://rmarkdown.rstudio.com">R Markdown</a> and you can optionally include <a href="https://shiny.rstudio.com">Shiny</a> components for additional interactivity.</p><p><img src="https://rstudioblog.files.wordpress.com/2016/05/neighborhood-diversity-flexdashboard.png" alt="neighborhood-diversity-flexdashboard"></p><p>Highlights of the <a href="https://rmarkdown.rstudio.com/flexdashboard/">flexdashboard</a> package include:</p><ul><li><p>Support for a wide variety of components including interactive <a href="http://www.htmlwidgets.org/">htmlwidgets</a>; base, lattice, and grid graphics; tabular data; gauges; and value boxes.</p></li><li><p>Flexible and easy to specify row and column-based <a href="https://rmarkdown.rstudio.com/flexdashboard/layouts.html">layouts</a>. 
Components are intelligently re-sized to fill the browser and adapted for display on mobile devices.</p></li><li><p>Extensive support for text annotations to include assumptions, contextual narrative, and analysis within dashboards.</p></li><li><p><a href="https://rstudio.github.io/flexdashboard/articles/using.html">Storyboard</a> layouts for presenting sequences of visualizations and related commentary.</p></li><li><p>By default dashboards are standard HTML documents that can be deployed on any web server or even attached to an email message. You can optionally add <a href="https://shiny.rstudio.com/">Shiny</a> components for additional interactivity and then <a href="https://shiny.rstudio.com/deploy/">deploy</a> on Shiny Server or shinyapps.io.</p></li></ul><h3 id="getting-started">Getting Started</h3><p>The flexdashboard package is available on CRAN; you can install it as follows:</p><pre><code>install.packages(&quot;flexdashboard&quot;, type = &quot;source&quot;)</code></pre><p>To author a flexdashboard you create an <a href="https://rmarkdown.rstudio.com/">R Markdown</a> document with the <code>flexdashboard::flex_dashboard</code> output format. You can do this from within RStudio using the <strong>New R Markdown</strong> dialog:</p><p><img src="https://rmarkdown.rstudio.com/flexdashboard/images/NewRMarkdown.png" alt=""></p><p>Dashboards are simple R Markdown documents where each level 3 header (<code>###</code>) defines a section of the dashboard. 
For example, here&rsquo;s a simple dashboard layout with 3 charts arranged top to bottom:</p><pre><code>---
title: &quot;My Dashboard&quot;
output: flexdashboard::flex_dashboard
---

### Chart 1

```{r}
```

### Chart 2

```{r}
```

### Chart 3

```{r}
```
</code></pre><p>You can use level 2 headers (<code>-----------</code>) to introduce rows and columns into your dashboard and section attributes to control their relative size:</p><pre><code>---
title: &quot;My Dashboard&quot;
output: flexdashboard::flex_dashboard
---

Column {data-width=600}
-------------------------------------

### Chart 1

```{r}
```

Column {data-width=400}
-------------------------------------

### Chart 2

```{r}
```

### Chart 3

```{r}
```
</code></pre><h3 id="learning-more">Learning More</h3><p>The <a href="https://rmarkdown.rstudio.com/flexdashboard/">flexdashboard website</a> includes extensive documentation on building your own dashboards, including:</p><ul><li><p>A <a href="https://rstudio.github.io/flexdashboard/articles/using.html">user guide</a> for all of the features and options of flexdashboard, including layout orientations (row vs.
column based), chart sizing, the various supported components, theming, and creating dashboards with multiple pages.</p></li><li><p>Details on using <a href="https://rmarkdown.rstudio.com/flexdashboard/shiny.html">Shiny</a> to create dashboards that enable viewers to change underlying parameters and see the results immediately, or that update themselves incrementally as their underlying data changes.</p></li><li><p>A variety of <a href="https://rmarkdown.rstudio.com/flexdashboard/layouts.html">sample layouts</a> which you can use as a starting point for your own dashboards.</p></li><li><p>Many <a href="https://rmarkdown.rstudio.com/flexdashboard/examples.html">examples</a> of flexdashboard in action (including links to source code if you want to dig into how each example was created).</p></li></ul><p>The examples below illustrate the use of flexdashboard with various packages and layouts (click the thumbnail to view a running version of each dashboard):</p><p><a href="https://beta.rstudioconnect.com/jjallaire/htmlwidgets-d3heatmap/"><img src="https://rstudioblog.files.wordpress.com/2016/05/htmlwidgets-d3heatmap.png" alt="htmlwidgets-d3heatmap"></a></p><p class="caption">d3heatmap: NBA scoring</p><p><a href="https://beta.rstudioconnect.com/jjallaire/htmlwidgets-ggplotly-geoms/"><img src="https://rstudioblog.files.wordpress.com/2016/05/plotly.png" alt="ggplotly: ggplot2 geoms"></a></p><p class="caption">ggplotly: ggplot2 geoms</p><p><a href="https://jjallaire.shinyapps.io/shiny-biclust/"><img src="https://rstudioblog.files.wordpress.com/2016/05/shiny-biclust.png" alt="Shiny: biclust example"></a></p><p class="caption">Shiny: biclust example</p><p><a href="https://beta.rstudioconnect.com/jjallaire/htmlwidgets-dygraphs/"><img src="https://rstudioblog.files.wordpress.com/2016/05/dygraphs.png" alt="dygraphs: Linked time series"></a></p><p class="caption">dygraphs: linked time series</p><p><a href="https://beta.rstudioconnect.com/jjallaire/htmlwidgets-highcharter/"><img 
src="https://rstudioblog.files.wordpress.com/2016/05/htmlwidgets-highcharter.png" alt="highcharter: sales report"></a></p><p class="caption">highcharter: sales report</p><p><a href="https://beta.rstudioconnect.com/jjallaire/htmlwidgets-showcase-storyboard/"><img src="https://rstudioblog.files.wordpress.com/2016/05/htmlwidgets-showcase-storyboard.png" alt="Storyboard: htmlwidgets showcase"></a></p><p class="caption">Storyboard: htmlwidgets showcase</p><p><a href="https://beta.rstudioconnect.com/jjallaire/htmlwidgets-rbokeh-iris/"><img src="https://rstudioblog.files.wordpress.com/2016/05/htmlwidgets-rbokeh-iris.png" alt="rbokeh: iris dataset"></a></p><p class="caption">rbokeh: iris dataset</p><p><a href="https://jjallaire.shinyapps.io/shiny-ggplot2-diamonds/"><img src="https://rstudioblog.files.wordpress.com/2016/05/shiny-diamonds-explorer.png" alt="Shiny: diamonds explorer"></a></p><p class="caption">Shiny: diamonds explorer</p><h3 id="try-it-out">Try It Out</h3><p>The <a href="https://rmarkdown.rstudio.com/flexdashboard/">flexdashboard</a> package provides a simple yet powerful framework for creating dashboards from R. If you know R Markdown you already know enough to begin creating dashboards right now! We hope you&rsquo;ll try it out and <a href="https://github.com/rstudio/flexdashboard/issues">let us know</a> how it&rsquo;s working and what else we can do to make it better.</p></description></item><item><title>Shiny JavaScript Tutorials</title><link>https://www.rstudio.com/blog/shiny-javascript-tutorials/</link><pubDate>Fri, 06 May 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-javascript-tutorials/</guid><description><p>We are happy to announce a new series of tutorials that will take your Shiny apps to the next level. 
In the tutorials, Herman Sontrop and Erwin Schuijtvlot of <a href="http://www.friss.eu/en/">FRISS</a> will teach you how to create custom JavaScript widgets and embed them into your Shiny apps.</p><p>The JavaScript language is a powerful tool when combined with Shiny. You can use JavaScript code to create highly sophisticated actions, and the code can be run by your user&rsquo;s web browser. Best of all, JavaScript comes with a host of amazing visualization libraries that are ready to use out of the box, like <a href="http://c3js.org/">c3.js</a>, <a href="https://d3js.org/">d3.js</a>, <a href="http://introjs.com/">intro.js</a> and more.</p><p>The <a href="https://shiny.rstudio.com/articles/js-build-widget.html">first tutorial</a> is ready now, and we will publish each new lesson at the <a href="https://shiny.rstudio.com/articles/#extensions">Shiny Development Center</a> as it becomes available.</p><h2 id="about-friss">About FRISS</h2><p><a href="http://www.friss.eu/en"><img src="https://rstudioblog.files.wordpress.com/2016/05/friss.jpg" alt="FRISS"></a></p><p>FRISS (<a href="http://www.friss.eu/en">friss.eu</a>) is a software company with a 100% focus on fraud, risk &amp; compliance for insurers worldwide. Shiny is an important component of the analytics framework employed by FRISS for its clients. In these tutorials, FRISS shares its expertise in developing Shiny apps with JavaScript.</p></description></item><item><title>ShinyDevCon videos now available</title><link>https://www.rstudio.com/blog/shinydevcon-videos-now-available/</link><pubDate>Thu, 05 May 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shinydevcon-videos-now-available/</guid><description><p>This past January, we held the first ever Shiny Developer Conference.
It was a chance to gather together a group of intermediate to advanced Shiny users, and take their skills to the next level.</p><p>It was an exciting event for me in particular, as I&rsquo;ve been dying to share some of these intermediate and advanced Shiny concepts for years now. There are many concepts that aren&rsquo;t strictly required to be productive with Shiny, but make a huge difference in helping you write efficient, robust, maintainable apps—and also make Shiny app authoring a lot more satisfying.</p><p>The feedback we received from conference attendees was overwhelmingly positive: everyone from relative novices to the most advanced users told us they gained new insights into how to improve their Shiny apps, or had a perspective shift on concepts they thought they already understood. The user-contributed lightning talks were also a big hit, helping people see what&rsquo;s possible using Shiny and inspiring them to push their own apps further.</p><p>If you weren&rsquo;t able to attend but are still interested in building your Shiny skills, we&rsquo;re happy to announce the availability of videos of the tutorials and talks:</p><p><a href="https://www.rstudio.com/resources/webinars/shiny-developer-conference/">Shiny Developer Conference 2016 Videos</a></p><p>At the moment, these videos are our best sources of info on the topics of reactive programming, Shiny gadgets, Shiny modules, debugging Shiny apps, and performance. 
If you&rsquo;re at all serious about writing Shiny apps, we highly recommend you take the time to watch!</p><p>If you&rsquo;re interested in attending next year&rsquo;s conference, you can sign up for our email list using the subscription form at the top of the <a href="https://www.rstudio.com/resources/webinars/shiny-developer-conference/">Shiny Developer Conference 2016 Videos</a> page, and we&rsquo;ll let you know when more details are available.</p></description></item><item><title>Register now for Hadley Wickham's Master R in Amsterdam</title><link>https://www.rstudio.com/blog/register-now-for-hadley-wickhams-master-r-in-amsterdam/</link><pubDate>Sat, 30 Apr 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/register-now-for-hadley-wickhams-master-r-in-amsterdam/</guid><description><p>On May 19 and 20, 2016, Hadley Wickham will teach his two day Master R Developer Workshop in the centrally located European city of Amsterdam.</p><p>This is the first time we&rsquo;ve offered Hadley&rsquo;s workshop in Europe. It&rsquo;s a rare chance to learn from Hadley in person. Only 3 public Master R Developer Workshop classes are offered per year and no future classes in Europe are planned at this time for 2016 or 2017.</p><p>If you don&rsquo;t want to miss this opportunity, <a href="https://www.eventbrite.com/e/master-r-developer-workshop-amsterdam-tickets-21345736673">register now to secure your seat</a>!</p><p>For the convenience of those who may travel to the workshop, it will be held at the <a href="http://www.nh-hotels.com/events/en/event-detail/30729/rstudio_public_workshop.html">Hotel NH Amsterdam Schiphol Airport</a>.</p><p>We look forward to seeing you soon!</p></description></item><item><title>testthat 1.0.0</title><link>https://www.rstudio.com/blog/testthat-1-0-0/</link><pubDate>Fri, 29 Apr 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/testthat-1-0-0/</guid><description><p>testthat 1.0.0 is now available on CRAN. 
Testthat makes it easy to turn your existing informal tests into formal automated tests that you can rerun quickly and easily. Learn more at <a href="http://r-pkgs.had.co.nz/tests.html">http://r-pkgs.had.co.nz/tests.html</a>. Install the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">testthat&#34;</span>)</code></pre></div><p>This version of testthat saw a major behind-the-scenes overhaul. This is the reason for the 1.0.0 release, and it will make it easier to add new expectations and reporters in the future. As well as the internal changes, there are improvements in four main areas:</p><ul><li><p>New expectations.</p></li><li><p>Support for the pipe.</p></li><li><p>More consistent tests for side-effects.</p></li><li><p>Support for testing C++ code.</p></li></ul><p>These are described in detail below. For a complete set of changes, please see the <a href="https://github.com/hadley/testthat/releases/tag/v1.0.0">release notes</a>.</p><h2 id="improved-expectations">Improved expectations</h2><p>There are five new expectations:</p><ul><li><p><code>expect_type()</code> checks the base type of an object (with <code>typeof()</code>), <code>expect_s3_class()</code> tests that an object is S3 with given class, and <code>expect_s4_class()</code> tests that an object is S4 with given class. I recommend using these more specific expectations instead of the generic <code>expect_is()</code>, because they more clearly convey intent.</p></li><li><p><code>expect_length()</code> checks that an object has the expected length.</p></li><li><p><code>expect_output_file()</code> compares the output of a function with a text file, optionally updating the file.
This is useful for regression tests for <code>print()</code> methods.</p></li></ul><p>A number of older expectations have been deprecated:</p><ul><li><p><code>expect_more_than()</code> and <code>expect_less_than()</code> have been deprecated. Please use <code>expect_gt()</code> and <code>expect_lt()</code> instead.</p></li><li><p><code>takes_less_than()</code> has been deprecated.</p></li><li><p><code>not()</code> has been deprecated. Please use the explicit individual forms <code>expect_error(..., NA)</code>, <code>expect_warning(..., NA)</code>, etc.</p></li></ul><p>We also did a thorough review of the documentation, ensuring that related expectations are documented together.</p><h2 id="piping">Piping</h2><p>Most expectations now invisibly return the input <code>object</code>. This makes it possible to chain together expectations with magrittr:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">factor</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">expect_type</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">integer&#34;</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">expect_s3_class</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">factor&#34;</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">expect_length</span>(<span style="color:#40a070">1</span>)</code></pre></div><p>To make this style even easier, testthat now imports and re-exports the pipe so you don&rsquo;t need to explicitly attach magrittr.</p><h2 id="side-effects">Side-effects</h2><p>Expectations that test for side-effects (i.e.
<code>expect_message()</code>, <code>expect_warning()</code>, <code>expect_error()</code>, and <code>expect_output()</code>) are now more consistent:</p><ul><li><code>expect_message(f(), NA)</code> will fail if a message is produced (i.e. it&rsquo;s not missing), and similarly for <code>expect_output()</code>, <code>expect_warning()</code>, and <code>expect_error()</code>.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">quiet <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>() {}
noisy <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>() <span style="color:#06287e">message</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Hi!&#34;</span>)
<span style="color:#06287e">expect_message</span>(<span style="color:#06287e">quiet</span>(), <span style="color:#007020;font-weight:bold">NA</span>)
<span style="color:#06287e">expect_message</span>(<span style="color:#06287e">noisy</span>(), <span style="color:#007020;font-weight:bold">NA</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; Error: noisy() showed 1 message.</span>
<span style="color:#60a0b0;font-style:italic">#&gt; * Hi!</span></code></pre></div><ul><li><code>expect_message(f(), NULL)</code> will fail if a message isn&rsquo;t produced, and similarly for <code>expect_output()</code>, <code>expect_warning()</code>, and <code>expect_error()</code>.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">expect_message</span>(<span style="color:#06287e">quiet</span>(), <span style="color:#007020;font-weight:bold">NULL</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; Error: quiet() showed 0 messages</span>
<span style="color:#06287e">expect_message</span>(<span style="color:#06287e">noisy</span>(), <span style="color:#007020;font-weight:bold">NULL</span>)</code></pre></div><p>There were three other changes made in the interest of consistency:</p><ul><li><p>Previously, testing for one side-effect (e.g. messages) tended to muffle other side-effects (e.g. warnings). This is no longer the case.</p></li><li><p>Warnings that are not captured explicitly by <code>expect_warning()</code> are tracked and reported. These do not currently cause a test suite to fail, but may do so in the future.</p></li><li><p>If you want to test a print method, <code>expect_output()</code> now requires you to explicitly print the object: <code>expect_output(&quot;a&quot;, &quot;a&quot;)</code> will fail, <code>expect_output(print(&quot;a&quot;), &quot;a&quot;)</code> will succeed. This makes it more consistent with the other side-effect functions.</p></li></ul><h2 id="c">C++</h2><p>Thanks to the work of <a href="http://github.com/kevinushey">Kevin Ushey</a>, testthat now includes a simple interface to unit test C++ code using the <a href="https://github.com/philsquared/Catch">Catch</a> library. Using Catch in your packages is easy – just call <code>testthat::use_catch()</code> and the necessary infrastructure, alongside a few sample test files, will be generated for your package. By convention, you can place your unit tests in <code>src/test-&lt;name&gt;.cpp</code>. Here&rsquo;s a simple example of a test file you might write when using testthat + Catch:</p><pre><code>#include &lt;testthat.h&gt;

context(&quot;Addition&quot;) {
  test_that(&quot;two plus two equals four&quot;) {
    int result = 2 + 2;
    expect_true(result == 4);
  }
}</code></pre><p>These unit tests will be compiled and run during calls to <code>devtools::test()</code>, as well as <code>R CMD check</code>.
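Whether written in R or C++, an expectation boils down to the same contract: check a condition, report a failure loudly, and on success hand the input back so calls can be chained (the behaviour described under &ldquo;Piping&rdquo; above). A minimal Python sketch of that contract, purely illustrative and not testthat&rsquo;s implementation:

```python
def expect_type(obj, expected):
    # Fail loudly when the check does not hold...
    assert type(obj).__name__ == expected, f"expected type {expected!r}"
    # ...and return the input so expectations can be chained.
    return obj

def expect_length(obj, n):
    assert len(obj) == n, f"expected length {n}, got {len(obj)}"
    return obj

value = expect_length(expect_type([1, 2, 3], "list"), 3)
print(value)  # [1, 2, 3] -- the input passes through unchanged
```

The invisible-return design is what makes the magrittr pipeline style possible: each expectation is a no-op pass-through until something goes wrong.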
See <code>?use_catch</code> for a full list of functions supported by testthat, and for more details.</p><p>For now, Catch unit tests will only be compiled when using the gcc and clang compilers – this implies that the unit tests you write will not be compiled + run on Solaris, which should make it easier to submit packages that use testthat for C++ unit tests to CRAN.</p></description></item><item><title>Feather: A Fast On-Disk Format for Data Frames for R and Python, powered by Apache Arrow</title><link>https://www.rstudio.com/blog/feather/</link><pubDate>Tue, 29 Mar 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/feather/</guid><description><p>Wes McKinney, Software Engineer, Cloudera<br>Hadley Wickham, Chief Scientist, RStudio</p><p>This past January, we (Hadley and Wes) met and discussed some of the systems challenges facing the Python and R open source communities. In particular, we wanted to see if there were some opportunities to collaborate on tools for improving interoperability between Python, R, and external compute and storage systems.</p><p>One thing that struck us was that while R&rsquo;s data frames and Python&rsquo;s pandas data frames utilize very different internal memory representations, they share a very similar semantic model. In both R and pandas, data frames are lists of named, equal-length columns, which can be numeric, boolean, date-and-time, categorical (factors), or string. Every column can have missing values.</p><p>Around this time, the open source community had just started the new Apache <a href="http://arrow.apache.org/">Arrow</a> project, designed to improve data interoperability for systems dealing with columnar tabular data.</p><p>In discussing Apache Arrow in the context of Python and R, we wanted to see if we could use the insights from Arrow to design a very fast file format for storing data frames that could be used by both languages.
Thus, the Feather format was born.</p><p><strong>What is Feather?</strong></p><p>Feather is a fast, lightweight, and easy-to-use binary file format for storing data frames. It has a few specific design goals:</p><ul><li><p>Lightweight, minimal API: make pushing data frames in and out of memory as simple as possible</p></li><li><p>Language agnostic: Feather files are the same whether written by Python or R code. Other languages can read and write Feather files, too.</p></li><li><p>High read and write performance. When possible, Feather operations should be bound by local disk performance.</p></li></ul><p><strong>Code examples</strong></p><p>The Feather API is designed to make reading and writing data frames as easy as possible. In R, the code might look like:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(feather)
path <span style="color:#666">&lt;-</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">my_data.feather&#34;</span>
<span style="color:#06287e">write_feather</span>(df, path)
df <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_feather</span>(path)</code></pre></div><p>Analogously, in Python, we have:</p><pre><code>import feather

path = 'my_data.feather'
feather.write_dataframe(df, path)
df = feather.read_dataframe(path)</code></pre><p><strong>How fast is Feather?</strong></p><p>Feather is extremely fast. Since Feather does not currently use any compression internally, it works best when used with solid-state drives such as those that come with most of today&rsquo;s laptop computers.
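The core idea behind this kind of format — raw column values on disk behind a small header, rather than a row-by-row text encoding — can be illustrated with nothing but the Python standard library. The following is a deliberately toy format for intuition only; real Feather files store Arrow column memory plus Flatbuffers metadata:

```python
import os
import struct
import tempfile

def write_column(path, values):
    # Toy columnar layout: an 8-byte little-endian count,
    # then the float64 values written contiguously.
    with open(path, "wb") as f:
        f.write(struct.pack("<q", len(values)))
        f.write(struct.pack(f"<{len(values)}d", *values))

def read_column(path):
    # Read the count header, then slurp the raw values back.
    with open(path, "rb") as f:
        (n,) = struct.unpack("<q", f.read(8))
        return list(struct.unpack(f"<{n}d", f.read(8 * n)))

path = os.path.join(tempfile.mkdtemp(), "col.bin")
write_column(path, [1.5, 2.5, 3.5])
print(read_column(path))  # [1.5, 2.5, 3.5]
```

Because each column is a contiguous block of fixed-width values, reads and writes are essentially large sequential I/O operations — which is why a format like this can be bound by disk speed rather than by parsing.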
For this first release, we prioritized a simple implementation and are thus writing unmodified Arrow memory to disk.</p><p>To give you an idea, here is a Python benchmark writing an approximately 800MB pandas DataFrame to disk:</p><pre><code>import feather
import pandas as pd
import numpy as np

arr = np.random.randn(10000000) # 10% nulls
arr[::10] = np.nan
df = pd.DataFrame({'column_{0}'.format(i): arr for i in range(10)})
feather.write_dataframe(df, 'test.feather')</code></pre><p>On Wes&rsquo;s laptop (latest-gen Intel processor with SSD), this takes:</p><pre><code>In [9]: %time df = feather.read_dataframe('test.feather')
CPU times: user 316 ms, sys: 944 ms, total: 1.26 s
Wall time: 1.26 s

In [11]: 800 / 1.26
Out[11]: 634.9206349206349</code></pre><p>This is effective performance of over 600 MB/s. Of course, the performance you see will depend on your hardware configuration.</p><p>And in R (on Hadley&rsquo;s laptop, which is very similar):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(feather)
x <span style="color:#666">&lt;-</span> <span style="color:#06287e">runif</span>(<span style="color:#40a070">1e7</span>)
x<span style="color:#06287e">[sample</span>(<span style="color:#40a070">1e7</span>, <span style="color:#40a070">1e6</span>)] <span style="color:#666">&lt;-</span> <span style="color:#007020;font-weight:bold">NA</span> <span style="color:#60a0b0;font-style:italic"># 10% NAs</span>
df <span style="color:#666">&lt;-</span> <span style="color:#06287e">as.data.frame</span>(<span style="color:#06287e">replicate</span>(<span style="color:#40a070">10</span>, x))
<span style="color:#06287e">write_feather</span>(df, <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">test.feather&#39;</span>)
<span style="color:#06287e">system.time</span>(<span style="color:#06287e">read_feather</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">test.feather&#39;</span>))
<span style="color:#60a0b0;font-style:italic">#&gt; user system elapsed</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 0.731 0.287 1.020</span></code></pre></div><p><strong>How can I get Feather?</strong></p><p>The Feather source code is hosted at <a href="http://github.com/wesm/feather">http://github.com/wesm/feather</a>.</p><p><strong>Installing Feather for R</strong></p><p>Feather is currently available from GitHub, and you can install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">devtools<span style="color:#666">::</span><span style="color:#06287e">install_github</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">wesm/feather/R&#34;</span>)</code></pre></div><p>Feather uses C++11, so if you&rsquo;re on Windows, you&rsquo;ll need the new <a href="https://github.com/rwinlib/r-base/wiki/Testing-Packages-with-Experimental-R-Devel-Build-for-Windows">gcc 4.93 toolchain</a>. (All going well, this will be included in R 3.3.0, which is scheduled for release on April 14. We&rsquo;ll aim for a CRAN release soon after that.)</p><p><strong>What should you <em>not</em> use Feather for?</strong></p><p>Feather is not designed for long-term data storage. At this time, we do not guarantee that the file format will be stable between versions.
Instead, use Feather for quickly exchanging data between Python and R code, or for short-term storage of data frames as part of some analysis.</p><p><strong>Feather, Apache Arrow, and the community</strong></p><p>One of the great parts of Feather is that the file format is language agnostic. Other languages, such as Julia or Scala (for Spark users), can read and write the format without knowledge of details of Python or R.</p><p>Feather is one of the first projects to bring the tangible benefits of the Arrow spec to users in the form of an efficient, language-agnostic representation of tabular data on disk. Since Arrow does not provide for a file format, we are using Google&rsquo;s Flatbuffers library (github.com/google/flatbuffers) to serialize column types and related metadata in a language-independent way in the file.</p><p>The Python interface uses Cython to expose Feather&rsquo;s C++11 core to users, while the R interface uses Rcpp for the same task.</p></description></item><item><title>RStudio at the Open Data Science Conference</title><link>https://www.rstudio.com/blog/rstudio-at-the-open-data-science-conference/</link><pubDate>Mon, 28 Mar 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-at-the-open-data-science-conference/</guid><description><p>If you&rsquo;re a data wrangler or data scientist, <a href="http://www.odsc.com/boston/">ODSC East</a> in Boston from May 20-22 is a wonderful opportunity to get up-to-date on the latest open source tools and trends. R and RStudio will have a significant presence.</p><p><strong>J.J. 
Allaire</strong>, RStudio founder and CEO, will talk about recent and upcoming improvements in <a href="https://rmarkdown.rstudio.com/">R Markdown</a>.</p><p>The creator of Shiny and CTO of RStudio, <strong>Joe Cheng</strong>, will review the progress made bridging modern web browsers and R, along with the newest updates to the <a href="http://www.htmlwidgets.org/">htmlwidgets</a> and <a href="https://shiny.rstudio.com/">Shiny</a> frameworks. In addition, Joe will join <strong><a href="http://www.zevross.com/">Zev Ross Spatial Analysis</a></strong> to offer a Shiny developer workshop for those interested in a deeper dive.</p><p>Other notable R speakers include <strong>Max Kuhn</strong>, the author of the caret package for machine learning, and <strong>Jared Lander</strong>, R contributor and author of <em>R for Everyone</em>.</p><p>For RStudio and R enthusiasts, ODSC has graciously offered <a href="https://odsceast.eventbrite.com/?discount=odsc-rs">discounted tickets</a>.</p><p>We hope to see you there!</p></description></item><item><title>tibble 1.0.0</title><link>https://www.rstudio.com/blog/tibble-1-0-0/</link><pubDate>Thu, 24 Mar 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/tibble-1-0-0/</guid><description><p>I&rsquo;m pleased to announce tibble, a new package for manipulating and printing data frames in R. Tibbles are a modern reimagining of the data.frame, keeping what time has proven to be effective, and throwing out what is not.
The name comes from dplyr: originally you created these objects with <code>tbl_df()</code>, which was most easily pronounced as &ldquo;tibble diff&rdquo;.</p><p>Install tibble with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tibble&#34;</span>)</code></pre></div><p>This package extracts out the <code>tbl_df</code> class associated functions from dplyr. <a href="https://github.com/krlmlr">Kirill Müller</a> extracted the code from dplyr, enhanced the tests, and added a few minor improvements.</p><h2 id="creating-tibbles">Creating tibbles</h2><p>You can create a tibble from an existing object with <code>as_data_frame()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">as_data_frame</span>(iris)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [150 x 5]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; Sepal.Length Sepal.Width Petal.Length Petal.Width Species</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl) (dbl) (dbl) (dbl) (fctr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 5.1 3.5 1.4 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 4.9 3.0 1.4 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 4.7 3.2 1.3 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 4.6 3.1 1.5 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 5.0 3.6 1.4 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 5.4 3.9 1.7 0.4 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 4.6 3.4 1.4 0.3 setosa</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 8 5.0 3.4 1.5 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 9 4.4 2.9 1.4 0.2 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; 10 4.9 3.1 1.5 0.1 setosa</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ...</span></code></pre></div><p>This works for data frames, lists, matrices, and tables.</p><p>You can also create a new tibble from individual vectors with <code>data_frame()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">5</span>, y <span style="color:#666">=</span> <span style="color:#40a070">1</span>, z <span style="color:#666">=</span> x ^ <span style="color:#40a070">2</span> <span style="color:#666">+</span> y)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [5 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y z</span><span style="color:#60a0b0;font-style:italic">#&gt; (int) (dbl) (dbl)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 1 5</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3 1 10</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 4 1 17</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 5 1 26</span></code></pre></div><p><code>data_frame()</code> does much less than <code>data.frame()</code>: it never changes the type of the inputs (e.g. it never converts strings to factors!), it never changes the names of variables, and it never creates <code>row.names()</code>. 
You can read more about these features in the vignette, <code>vignette(&quot;tibble&quot;)</code>.</p><p>You can define a tibble row-by-row with <code>frame_data()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">frame_data</span>(<span style="color:#666">~</span>x, <span style="color:#666">~</span>y, <span style="color:#666">~</span>z,
<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span style="color:#40a070">2</span>, <span style="color:#40a070">3.6</span>,
<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>, <span style="color:#40a070">1</span>, <span style="color:#40a070">8.5</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 3]</span>
<span style="color:#60a0b0;font-style:italic">#&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; x y z</span>
<span style="color:#60a0b0;font-style:italic">#&gt; (chr) (dbl) (dbl)</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 1 a 2 3.6</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 2 b 1 8.5</span></code></pre></div><h2 id="tibbles-vs-data-frames">Tibbles vs data frames</h2><p>There are two main differences in the usage of a data frame vs a tibble: printing, and subsetting.</p><p>Tibbles have a refined print method that shows only the first 10 rows, and all the columns that fit on screen. This makes it much easier to work with large data.
In addition to its name, each column reports its type, a nice feature borrowed from <code>str()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(nycflights13)flights<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [336,776 x 16]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year month day dep_time dep_delay arr_time arr_delay carrier tailnum</span><span style="color:#60a0b0;font-style:italic">#&gt; (int) (int) (int) (int) (dbl) (int) (dbl) (chr) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2013 1 1 517 2 830 11 UA N14228</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2013 1 1 533 4 850 20 UA N24211</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2013 1 1 542 2 923 33 AA N619AA</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2013 1 1 544 -1 1004 -18 B6 N804JB</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2013 1 1 554 -6 812 -25 DL N668DN</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 2013 1 1 554 -4 740 12 UA N39463</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 2013 1 1 555 -5 913 19 B6 N516JB</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 2013 1 1 557 -3 709 -14 EV N829AS</span><span style="color:#60a0b0;font-style:italic">#&gt; 9 2013 1 1 557 -3 838 -8 B6 N593JB</span><span style="color:#60a0b0;font-style:italic">#&gt; 10 2013 1 1 558 -2 753 8 AA N3ALAA</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ... ... ... ... 
...</span>
<span style="color:#60a0b0;font-style:italic">#&gt; Variables not shown: flight (int), origin (chr), dest (chr), air_time</span>
<span style="color:#60a0b0;font-style:italic">#&gt; (dbl), distance (dbl), hour (dbl), minute (dbl).</span></code></pre></div><p>Tibbles are strict about subsetting. If you try to access a variable that does not exist, you&rsquo;ll get an error:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">flights<span style="color:#666">$</span>yea
<span style="color:#60a0b0;font-style:italic">#&gt; Error: Unknown column &#39;yea&#39;</span></code></pre></div><p>Tibbles also clearly delineate <code>[</code> and <code>[[</code>: <code>[</code> always returns another tibble, <code>[[</code> always returns a vector. No more <code>drop = FALSE</code>!</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">class</span>(iris[ , <span style="color:#40a070">1</span>])
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;numeric&#34;</span>
<span style="color:#06287e">class</span>(iris[ , <span style="color:#40a070">1</span>, drop <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>])
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;data.frame&#34;</span>
<span style="color:#06287e">class</span>(<span style="color:#06287e">as_data_frame</span>(iris)[ , <span style="color:#40a070">1</span>])
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;tbl_df&#34; &#34;tbl&#34; &#34;data.frame&#34;</span></code></pre></div><h2 id="interacting-with-legacy-code">Interacting with legacy code</h2><p>A handful of functions don&rsquo;t work with tibbles because they expect <code>df[, 1]</code> to return a vector, not a data frame.
If you encounter one of these functions, use <code>as.data.frame()</code> to turn a tibble back to a data frame:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">class</span>(<span style="color:#06287e">as.data.frame</span>(<span style="color:#06287e">tbl_df</span>(iris)))</code></pre></div></description></item><item><title>R Markdown Custom Formats</title><link>https://www.rstudio.com/blog/r-markdown-custom-formats/</link><pubDate>Mon, 21 Mar 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-markdown-custom-formats/</guid><description><p>The R Markdown package ships with a raft of output formats including HTML, PDF, MS Word, R package vignettes, as well as Beamer and HTML5 presentations. This isn&rsquo;t the entire universe of available formats though (far from it!). R Markdown formats are fully extensible and as a result there are several R packages that provide additional formats. In this post we wanted to highlight a few of these packages, including:</p><ul><li><p><a href="http://rstudio.github.io/tufte/">tufte</a> — Documents in the style of Edward Tufte</p></li><li><p><a href="https://github.com/rstudio/rticles">rticles</a> — Formats for creating LaTeX based journal articles</p></li><li><p><a href="https://github.com/juba/rmdformats">rmdformats</a> — Formats for creating HTML documents</p></li></ul><p>We&rsquo;ll also discuss how to create your own custom formats as well as re-usable document templates for existing formats.</p><h3 id="using-custom-formats">Using Custom Formats</h3><p>Custom R Markdown formats are just R functions which return a definition of the format&rsquo;s behavior. 
For example, here&rsquo;s the metadata for a document that uses the <code>html_document</code> format:</p><pre><code>---
title: &quot;My Document&quot;
output: html_document
---</code></pre><p>When rendering, R Markdown calls the <code>rmarkdown::html_document</code> function to get the definition of the output format. A custom format works just the same way but is also qualified with the name of the package that contains it. For example, here&rsquo;s the metadata for a document that uses the <code>tufte_handout</code> format:</p><pre><code>---
title: &quot;My Document&quot;
output: tufte::tufte_handout
---</code></pre><p>Custom formats also typically register a template that helps you get started with using them. If you are using RStudio you can easily create a new document based on a custom format via the <strong>New R Markdown</strong> dialog:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/03/screen-shot-2016-03-21-at-11-16-04-am.png" alt="Screen Shot 2016-03-21 at 11.16.04 AM"></p><h3 id="tufte-handouts">Tufte Handouts</h3><p>The <a href="http://rstudio.github.io/tufte/">tufte</a> package includes custom formats for creating documents in the style that <a href="http://www.edwardtufte.com/tufte/">Edward Tufte</a> uses in his books and handouts. Tufte&rsquo;s style is known for its extensive use of sidenotes, tight integration of graphics with text, and well-set typography. Formats for both LaTeX and HTML/CSS output are provided (these are in turn based on the work in <a href="https://github.com/tufte-latex/tufte-latex">tufte-latex</a> and <a href="https://github.com/edwardtufte/tufte-css">tufte-css</a>). Here&rsquo;s some example output from the LaTeX format:</p><p><img src="https://rmarkdown.rstudio.com/images/tufte-handout.png" alt=""></p><p>If you want LaTeX/PDF output, you can use the <code>tufte_handout</code> format for handouts and <code>tufte_book</code> for books. For HTML output, you use the <code>tufte_html</code> format. 
For example:</p><pre><code>---
title: &quot;An Example Using the Tufte Style&quot;
author: &quot;John Smith&quot;
output:
  tufte::tufte_handout: default
  tufte::tufte_html: default
---</code></pre><p>You can install the tufte package from CRAN as follows:</p><pre><code>install.packages(&quot;tufte&quot;)</code></pre><p>See the <a href="http://rstudio.github.io/tufte/">tufte package website</a> for additional documentation on using the Tufte custom formats.</p><h3 id="journal-articles">Journal Articles</h3><p>The <strong>rticles</strong> package provides a suite of custom <a href="https://rmarkdown.rstudio.com/">R Markdown</a> LaTeX formats and templates for various journal article formats, including:</p><ul><li><p><a href="http://www.jstatsoft.org/">JSS</a> articles</p></li><li><p><a href="http://journal.r-project.org/">R Journal</a> articles</p></li><li><p><a href="http://ctex.org/">CTeX</a> documents</p></li><li><p><a href="http://www.acm.org/">ACM</a> articles</p></li><li><p><a href="http://pubs.acs.org/">ACS</a> articles</p></li><li><p><a href="https://www.elsevier.com/">Elsevier</a> journal submissions.</p></li></ul><p><img src="https://rstudioblog.files.wordpress.com/2016/03/screen-shot-2016-03-21-at-11-48-40-am.png" alt="Screen Shot 2016-03-21 at 11.48.40 AM"></p><p>You can install the <a href="https://github.com/rstudio/rticles">rticles</a> package from CRAN as follows:</p><pre><code>install.packages(&quot;rticles&quot;)</code></pre><p>See the <a href="https://github.com/rstudio/rticles">rticles repository</a> for more details on using the formats included with the package. 
The <a href="https://github.com/rstudio/rticles/tree/master/R">source code</a> of the rticles package is an excellent resource for learning how to create LaTeX-based custom formats.</p><h3 id="rmdformats-package">rmdformats Package</h3><p>The <a href="https://github.com/juba/rmdformats">rmdformats</a> package from Julien Barnier includes three HTML-based document formats that provide nice alternatives to the default html_document format that is included in the rmarkdown package. The <code>readthedown</code> format is inspired by the <a href="https://readthedocs.org/">Read the docs</a> Sphinx theme and is fully responsive, with collapsible navigation:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/03/readthedown.png" alt="readthedown"></p><p>The <code>html_docco</code> and <code>html_clean</code> formats both provide automatic thumbnails for figures with lightbox display, and <code>html_clean</code> provides an automatic and dynamic table of contents:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/03/html_docco.png" alt="html_docco"> <img src="https://rstudioblog.files.wordpress.com/2016/03/html_clean.png" alt="html_clean"></p><p>You can install the <a href="https://github.com/juba/rmdformats">rmdformats</a> package from CRAN as follows:</p><pre><code>install.packages(&quot;rmdformats&quot;)</code></pre><p>See the <a href="https://github.com/juba/rmdformats">rmdformats repository</a> for documentation on using the <code>readthedown</code>, <code>html_docco</code>, and <code>html_clean</code> formats.</p><h3 id="creating-new-formats">Creating New Formats</h3><p>Hopefully checking out some of the custom formats described above has you inspired to create your very own new formats. The R Markdown website includes documentation on <a href="https://rmarkdown.rstudio.com/developer_custom_formats.html">how to create a custom format</a>. 
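</p><p>As a rough sketch of the pattern those packages follow (the function name below is hypothetical; the <code>rmarkdown::output_format()</code> API is real), a custom format is just a function that returns a format definition, usually built on an existing base format:</p>

```r
# Hypothetical custom format: wraps html_document() and changes a few
# knitr chunk defaults. Exported from a package, documents would use it
# with `output: mypkg::quarterly_report` in their YAML metadata.
quarterly_report = function(...) {
  rmarkdown::output_format(
    knitr = rmarkdown::knitr_options(
      opts_chunk = list(echo = FALSE, dev = "png")
    ),
    pandoc = rmarkdown::pandoc_options(to = "html"),
    base_format = rmarkdown::html_document(...)
  )
}
```

<p>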
In addition, the source code of the <a href="https://github.com/rstudio/tufte">tufte</a>, <a href="https://github.com/rstudio/rticles">rticles</a>, and <a href="https://github.com/juba/rmdformats">rmdformats</a> packages provide good examples to work from.</p><p>Short of creating a brand new format, it&rsquo;s also possible to create a re-usable document template that shows up within the RStudio <strong>New R Markdown</strong> dialog box. This would be appropriate if an existing template met your needs but you wanted to have an easy way to create documents with a pre-set list of options and skeletal content. See the article on <a href="https://rmarkdown.rstudio.com/developer_document_templates.html">document templates</a> for additional details on how to do this.</p></description></item><item><title>R Markdown v0.9.5</title><link>https://www.rstudio.com/blog/rmarkdown-v0-9-5/</link><pubDate>Mon, 21 Mar 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rmarkdown-v0-9-5/</guid><description><p>A new release of the <a href="https://rmarkdown.rstudio.com/">rmarkdown</a> package is now available on CRAN. This release features some long-requested enhancements to the <a href="https://rmarkdown.rstudio.com/html_document_format.html">HTML document</a> format, including:</p><ol><li><p>The ability to have a floating (i.e. always visible) table of contents.</p></li><li><p>Folding and unfolding for R code (to easily show and hide code for either an entire document or for individual chunks).</p></li><li><p>Support for presenting content within tabbed sections (e.g. 
several plots could each have their own tab).</p></li><li><p>Five new themes including &ldquo;lumen&rdquo;, &ldquo;paper&rdquo;, &ldquo;sandstone&rdquo;, &ldquo;simplex&rdquo;, &amp; &ldquo;yeti&rdquo;.</p></li></ol><p>There are also three new formats for creating <a href="https://rmarkdown.rstudio.com/github_document_format.html">GitHub</a>, <a href="https://rmarkdown.rstudio.com/odt_document_format.html">OpenDocument</a>, and <a href="https://rmarkdown.rstudio.com/rtf_document_format.html">RTF</a> documents as well as a number of smaller enhancements and bug fixes (see the package <a href="https://cran.r-project.org/web/packages/rmarkdown/NEWS">NEWS</a> for all of the details).</p><h3 id="floating-toc">Floating TOC</h3><p>You can specify the <code>toc_float</code> option to float the table of contents to the left of the main document content. The floating table of contents will always be visible even when the document is scrolled. For example:</p><pre><code>---
title: &quot;Habits&quot;
output:
  html_document:
    toc: true
    toc_float: true
---</code></pre><p>Here&rsquo;s what the floating table of contents looks like on one of the R Markdown website&rsquo;s pages:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/03/screen-shot-2016-03-21-at-7-19-59-am.png" alt="FloatingTOC"></p><h3 id="code-folding">Code Folding</h3><p>When the knitr chunk option <code>echo = TRUE</code> is specified (the default behavior) the R source code within chunks is included within the rendered document. In some cases it may be appropriate to exclude code entirely (<code>echo = FALSE</code>) but in other cases you might want the code available but not visible by default.</p><p>The <code>code_folding: hide</code> option enables you to include R code but have it hidden by default. Users can then choose to show hidden R code chunks either individually or document-wide. 
For example:</p><pre><code>---
title: &quot;Habits&quot;
output:
  html_document:
    code_folding: hide
---</code></pre><p>Here&rsquo;s the default HTML document template with code folding enabled. Note that each chunk has its own toggle for showing or hiding code and there is also a global menu for operating on all chunks at once.</p><p><img src="https://rstudioblog.files.wordpress.com/2016/03/screen-shot-2016-03-21-at-7-27-40-am.png" alt="Screen Shot 2016-03-21 at 7.27.40 AM"></p><p>Note that you can specify <code>code_folding: show</code> to still show all R code by default but then allow users to hide the code if they wish.</p><h3 id="tabbed-sections">Tabbed Sections</h3><p>You can organize content using tabs by applying the <code>.tabset</code> class attribute to headers within a document. This will cause all sub-headers of the header with the <code>.tabset</code> attribute to appear within tabs rather than as standalone sections. For example:</p><pre><code>## Sales Report {.tabset}

### By Product

(tab content)

### By Region

(tab content)</code></pre><p>Here&rsquo;s what tabbed sections look like within a rendered document:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/03/screen-shot-2016-03-21-at-7-43-38-am.png" alt="Screen Shot 2016-03-21 at 7.43.38 AM"></p><h3 id="authoring-enhancements">Authoring Enhancements</h3><p>We also shouldn&rsquo;t fail to mention that the <a href="https://www.rstudio.com/products/rstudio/download/">most recent release</a> of RStudio included several enhancements to R Markdown document editing. 
There&rsquo;s now an optional outline view that enables quick navigation across larger documents:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/12/screen-shot-2015-12-22-at-9-27-34-am.png" alt="Screen Shot 2015-12-22 at 9.27.34 AM"></p><p>We&rsquo;ve also added inline UI to code chunks for running individual chunks, running all previous chunks, and specifying various commonly used knit options:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/12/screen-shot-2015-12-22-at-9-30-11-am.png" alt="Screen Shot 2015-12-22 at 9.30.11 AM"></p><h3 id="whats-next">What&rsquo;s Next</h3><p>We&rsquo;ve got lots of additional work planned for R Markdown including new document formats, additional authoring enhancements in RStudio, and some new tools to make it easier to publish and manage documents created with R Markdown. More details to follow soon!</p></description></item><item><title>R on Travis-CI</title><link>https://www.rstudio.com/blog/r-on-travis-ci/</link><pubDate>Wed, 09 Mar 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-on-travis-ci/</guid><description><p>Support for building R projects on Travis has recently undergone improvements which we hope will make it an even better tool for the R community. 
Feature highlights include:</p><ul><li><p>Support for Travis&rsquo; <a href="https://docs.travis-ci.com/user/workers/container-based-infrastructure/">container-based infrastructure</a>.</p></li><li><p>Package dependency caching (on the container-based builds).</p></li><li><p>Building with multiple R versions (R-devel, R-release (3.2.3) and R-oldrel (3.1.3)).</p></li><li><p>Log filtering to improve readability and hide less relevant information.</p></li><li><p>Updated dependencies: TeX Live (2015) and pandoc (1.15.2).</p></li></ul><p>See the Travis documentation on <a href="https://docs.travis-ci.com/user/languages/r">building an R project</a> for complete details on the available options.</p><p>Using the container-based infrastructure with package caching is now recommended for nearly all projects. There are more compute and network resources available for container-based builds, which means they start processing in less time and run faster. The package caching makes package installation comparable to or faster than using binary packages.</p><p>A minimal .travis.yml file that is suitable for most cases is</p><pre><code>language: r
sudo: false
cache: packages</code></pre><p>New packages can omit <code>sudo: false</code>, as it is the default for new repositories. However older repositories will have to explicitly set <code>sudo: false</code> to use the container-based infrastructure.</p><p>If your package depends on development packages that are not on CRAN (such as GitHub) we recommend you use the <a href="https://github.com/hadley/devtools/blob/master/vignettes/dependencies.Rmd">Remotes:</a> annotation in your package <code>DESCRIPTION</code> file. This will allow your package and dependencies to be easily installed by <code>devtools::install_github()</code> as well as on Travis (<a href="https://github.com/search?utf8=%E2%9C%93&amp;q=filename%3ADESCRIPTION+path%3A%2F+Remotes&amp;type=Code&amp;ref=searchresults">Examples</a>). 
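</p><p>For instance, a package that needs the development version of a dependency might carry a <code>DESCRIPTION</code> excerpt like the following (the package names are purely illustrative):</p>

```
Imports:
    dplyr
Remotes:
    hadley/dplyr
```

<p>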
It is generally no longer necessary to use <code>r_github_packages</code>, <code>r_packages</code>, <code>r_binary_packages</code>, etc. as this can be handled with <code>Remotes</code>.</p><p>If you need system dependencies, first check to see if they&rsquo;re available with the <a href="https://docs.travis-ci.com/user/installing-dependencies/#Installing-Packages-with-the-APT-Addon">apt-addon</a> and include them in your <code>.travis.yml</code>. This will allow you to install them without sudo and still use the container-based infrastructure.</p><pre><code>addons:
  apt:
    packages:
    - libv8-dev</code></pre><p>We hope these improvements will make your use of Travis with R simple and useful. Please file any issues found at <a href="https://github.com/travis-ci/travis-ci/issues">https://github.com/travis-ci/travis-ci/issues</a> and mention @craigcitro, @hadley and @jimhester in the issue.</p></description></item><item><title>ggplot2 2.1.0</title><link>https://www.rstudio.com/blog/ggplot2-2-1-0/</link><pubDate>Thu, 03 Mar 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/ggplot2-2-1-0/</guid><description><p>I&rsquo;m very pleased to announce the release of ggplot2 2.1.0, scales 0.4.0, and gtable 0.2.0. These are a set of relatively minor updates that fix a whole bunch of little problems that crept in during the <a href="https://blog.rstudio.com/2015/12/21/ggplot2-2-0-0/">last big update</a>. The most important changes are described below.</p><ol><li>When mapping an aesthetic to a constant the default guide title is the name of the aesthetic (i.e. &ldquo;colour&rdquo;), not the value (i.e. &ldquo;loess&rdquo;). 
This is a really handy technique for labelling individual layers:</li></ol><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, <span style="color:#40a070">1</span> <span style="color:#666">/</span> hwy)) <span style="color:#666">+</span>
  <span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span>
  <span style="color:#06287e">geom_smooth</span>(method <span style="color:#666">=</span> lm, <span style="color:#06287e">aes</span>(colour <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">lm&#34;</span>), se <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>) <span style="color:#666">+</span>
  <span style="color:#06287e">geom_smooth</span>(<span style="color:#06287e">aes</span>(colour <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">loess&#34;</span>), se <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/03/unnamed-chunk-2-1.png" alt="unnamed-chunk-2-1"></p><ol start="2"><li><code>stat_bin()</code> (which powers <code>geom_histogram()</code> and <code>geom_freqpoly()</code>) has been overhauled to use the same algorithm as ggvis. This has considerably better parameters and defaults thanks to the work of <a href="http://www.calvin.edu/~rpruim/">Randall Pruim</a>. Changes include:</li></ol><pre><code>* Better arguments and a better algorithm for determining the origin. You can now specify either `boundary` (i.e. the position of the left or right side) or the `center` of a bin. 
`origin` has been deprecated in favour of these arguments.
* `drop` is deprecated in favour of `pad`, which adds extra 0-count bins at either end, as is needed for frequency polygons. `geom_histogram()` defaults to `pad = FALSE` which considerably improves the default limits for the histogram, especially when the bins are big.
* The default algorithm does a (somewhat) better job at picking nice widths and origins across a wider range of input data.</code></pre><p>You can see the impact of these changes on the following two histograms:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(diamonds, <span style="color:#06287e">aes</span>(carat)) <span style="color:#666">+</span>
  <span style="color:#06287e">geom_histogram</span>(binwidth <span style="color:#666">=</span> <span style="color:#40a070">1</span>)

<span style="color:#06287e">ggplot</span>(diamonds, <span style="color:#06287e">aes</span>(carat)) <span style="color:#666">+</span>
  <span style="color:#06287e">geom_histogram</span>(binwidth <span style="color:#666">=</span> <span style="color:#40a070">1</span>, boundary <span style="color:#666">=</span> <span style="color:#40a070">0</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/03/unnamed-chunk-3-1.png" alt="unnamed-chunk-3-1"><img src="https://rstudioblog.files.wordpress.com/2016/03/unnamed-chunk-3-2.png" alt=""></p><ol start="3"><li><p>All layer functions (<code>geom_*()</code> + <code>stat_*()</code>) now have a consistent argument order: <code>data</code>, <code>mapping</code>, then <code>geom</code>/<code>stat</code>/<code>position</code>, then <code>...</code>, then layer specific arguments, then common layer arguments. This might break some code if you were relying on partial name matching, but in the long-term should make ggplot2 easier to use. 
In particular, you can now set the <code>n</code> parameter in <code>geom_density2d()</code> without it partially matching <code>na.rm</code>.</p></li><li><p>For geoms with both <code>colour</code> and <code>fill</code>, <code>alpha</code> once again only affects fill. <code>alpha</code> was changed to modify both <code>colour</code> and <code>fill</code> in 2.0.0, but I&rsquo;ve reverted it to the old behaviour because it was causing pain for quite a few people.</p></li></ol><p>You can see a full list of changes in the <a href="https://github.com/hadley/ggplot2/releases/tag/v2.1.0">release notes</a>.</p></description></item><item><title>Shinyapps.io Update Notification</title><link>https://www.rstudio.com/blog/shinyapps-io-update-notification/</link><pubDate>Fri, 19 Feb 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shinyapps-io-update-notification/</guid><description><p>RStudio is pleased to notify account holders of recent updates to shinyapps.io.</p><p>Note: Action is required if your shiny application URL includes <strong>internal.shinyapps.io</strong></p><p><strong>What&rsquo;s New?</strong></p><p>We have updated the authentication and invitation system to improve the user experience, security, and extensibility for anyone with private applications. You may have already noticed some changes to the authentication flow for your applications if you are a Standard or Professional account holder.</p><p>As a part of these changes, we have eliminated the IFRAME and the associated RStudio branding, except for customers using custom domains where the IFRAME is still required.</p><p>For customers on free plans, we will replace the RStudio branding bar with a softer, less intrusive branding overlay.</p><p><strong>Possible Action Required</strong></p><p>If you have used the provided URL from shinyapps.io for your shiny applications like most accounts, no action is needed. 
Your applications will simply benefit from the improvements.</p><p>If your shiny application URL begins with <strong>internal.shinyapps.io</strong> you must change it.</p><p>To complete the update we will <strong>SHUT DOWN</strong> all internal.shinyapps.io URLs on <strong>March 2, 2016</strong>. If you have publicly linked your application to <a href="https://www.rstudio.com/products/shinyapps/">internal.shinyapps.io</a> or you have embedded applications on your website by directly referring to the <a href="https://www.rstudio.com/products/shinyapps/">internal.shinyapps.io</a> URL, <strong>you MUST change your links</strong> to the URL you see in the <a href="https://www.rstudio.com/products/shinyapps/">shinyapps.io</a> dashboard for your application.</p><p>While relatively few accounts are impacted and no action is required for most shinyapps.io users, if you have questions please contact <a href="mailto:shinyapps-support@rstudio.com">shinyapps-support@rstudio.com</a>.</p><p>Thank you all for your help and thanks for using <a href="https://www.rstudio.com/products/shinyapps/">shinyapps.io</a>!</p><p>The RStudio <a href="https://www.rstudio.com/products/shinyapps/">shinyapps.io</a> Team</p></description></item><item><title>New Release of RStudio (v0.99.878)</title><link>https://www.rstudio.com/blog/new-release-of-rstudio-v0-99-878/</link><pubDate>Tue, 09 Feb 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/new-release-of-rstudio-v0-99-878/</guid><description><p>We&rsquo;re pleased to announce that a new release of RStudio (v0.99.878) is <a href="https://www.rstudio.com/products/rstudio/">available for download</a> now. 
Highlights of this release include:</p><ul><li><p>Support for registering custom <a href="http://rstudio.github.io/rstudioaddins/">RStudio Addins</a>.</p></li><li><p>R Markdown editing improvements including outline view and inline UI for chunk execution.</p></li><li><p>Support for multiple <a href="https://support.rstudio.com/hc/en-us/articles/207126217">source windows</a> (tear editor tabs off main window).</p></li><li><p>Pane zooming for working distraction free within a single pane.</p></li><li><p>Editor and IDE keyboard shortcuts can <a href="https://support.rstudio.com/hc/en-us/articles/206382178">now be customized</a>.</p></li><li><p>New <a href="https://support.rstudio.com/hc/en-us/articles/210928128">Emacs keybindings</a> mode for the source editor.</p></li><li><p>Support for <a href="https://rmarkdown.rstudio.com/developer_parameterized_reports.html">parameterized</a> R Markdown reports.</p></li><li><p>Various improvements to RStudio Server Pro including <a href="https://support.rstudio.com/hc/en-us/articles/211789298">multiple concurrent R sessions</a>, use of <a href="https://support.rstudio.com/hc/en-us/articles/212364537">multiple R versions</a>, and <a href="https://support.rstudio.com/hc/en-us/articles/211659737">shared projects</a> for collaboration.</p></li></ul><p>There are lots of other small improvements across the product, check out the <a href="https://www.rstudio.com/products/rstudio/release-notes/">release notes</a> for full details.</p><h3 id="rstudio-addins">RStudio Addins</h3><p>RStudio Addins provide a mechanism for executing custom R functions interactively from within the RStudio IDE—either through keyboard shortcuts, or through the <em>Addins</em> menu. 
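</p><p>As a minimal sketch (the function name, binding, and snippet below are illustrative), an addin is just an exported R function that a package registers with RStudio via an <code>inst/rstudio/addins.dcf</code> file:</p>

```r
# Illustrative addin: inserts the magrittr pipe at the cursor.
# Registered in the package's inst/rstudio/addins.dcf, e.g.:
#   Name: Insert Pipe
#   Description: Inserts %>% at the cursor position.
#   Binding: insertPipeAddin
#   Interactive: false
insertPipeAddin = function() {
  rstudioapi::insertText(text = " %>% ")
}
```

<p>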
Coupled with the <a href="https://cran.rstudio.com/web/packages/rstudioapi/index.html">rstudioapi</a> package, users can now write R code to interact with and modify the contents of documents open in RStudio.</p><p>An addin can be as simple as a function that inserts a commonly used snippet of text, and as complex as a Shiny application that accepts input from the user and uses it to transform the contents of the active editor. The sky is the limit!</p><p>Here&rsquo;s an example of an addin that enables interactive subsetting of a data frame with live preview:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/12/subset-addin.gif" alt="subset-addin"></p><p>This addin is implemented using a <a href="https://shiny.rstudio.com/articles/gadgets.html">Shiny Gadget</a> (see the <a href="https://github.com/rstudio/addinexamples/blob/master/R/subsetAddin.R">source code</a> for more details). RStudio Addins are distributed as <a href="http://r-pkgs.had.co.nz/">R packages</a>. Once you&rsquo;ve installed an R package that contains addins, they&rsquo;ll immediately become available within RStudio.</p><p>You can learn more about using and developing addins here: <a href="http://rstudio.github.io/rstudioaddins/">http://rstudio.github.io/rstudioaddins/</a>.</p><h3 id="r-markdown">R Markdown</h3><p>We&rsquo;ve made a number of improvements to R Markdown authoring. 
There&rsquo;s now an optional outline view that enables quick navigation across larger documents:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/12/screen-shot-2015-12-22-at-9-27-34-am.png" alt="Screen Shot 2015-12-22 at 9.27.34 AM"></p><p>We&rsquo;ve also added inline UI to code chunks for running individual chunks, running all previous chunks, and specifying various commonly used knit options:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/12/screen-shot-2015-12-22-at-9-30-11-am.png" alt="Screen Shot 2015-12-22 at 9.30.11 AM"></p><h3 id="multiple-source-windows">Multiple Source Windows</h3><p>There are two ways to open a new source window:</p><p><strong>Pop out an editor</strong>: click the Show in New Window button in any source editor tab.</p><p><img src="https://support.rstudio.com/hc/en-us/article_attachments/202540607/popout.png" alt=""></p><p><strong>Tear off a pane:</strong> drag a tab out of the main window and onto the desktop; a new source window will be opened where you dropped the tab.</p><p><img src="https://support.rstudio.com/hc/en-us/article_attachments/202617168/tabassign.gif" alt=""></p><p>You can have as many source windows open as you like. 
Each source window has its own set of tabs; these tabs are independent of the tabs in RStudio&rsquo;s main source pane.</p><h3 id="customizable-keyboard-shortcuts">Customizable Keyboard Shortcuts</h3><p>You can now customize keyboard shortcuts in RStudio &ndash; you can bind keys to execute RStudio application commands, editor commands, or even user-defined R functions.</p><p>Access the keyboard shortcuts by clicking <code>Tools -&gt; Modify Keyboard Shortcuts...</code>:</p><p>This will present a dialog that enables remapping of all available editor commands (commands that affect the current document&rsquo;s contents, or the current selection) and RStudio commands (commands whose actions are scoped beyond just the current editor).</p><p><img src="https://support.rstudio.com/hc/en-us/article_attachments/202570217/Screen_Shot_2015-07-31_at_12.52.59_PM.png" alt=""></p><h2 id="emacs-keybindings">Emacs Keybindings</h2><p>We&rsquo;ve introduced a new keybindings mode to go along with the default bindings and Vim bindings already supported. 
Emacs mode provides a base set of keybindings for navigation and selection, including:</p><ul><li><p><code>C-p</code>, <code>C-n</code>, <code>C-b</code> and <code>C-f</code> to move the cursor up, down, left and right by characters,</p></li><li><p><code>M-b</code>, <code>M-f</code> to move left and right by words,</p></li><li><p><code>C-a</code>, <code>C-e</code> to navigate to the start or end of the line,</p></li><li><p><code>C-k</code> to &lsquo;kill&rsquo; to end of line, and <code>C-y</code> to &lsquo;yank&rsquo; the last kill,</p></li><li><p><code>C-s</code>, <code>C-r</code> to initiate an Emacs-style incremental search (forward / reverse),</p></li><li><p><code>C-Space</code> to set/unset mark, and <code>C-w</code> to kill the marked region.</p></li></ul><p>There are some additional keybindings that <a href="http://ess.r-project.org/">Emacs Speaks Statistics (ESS)</a> users might find familiar:</p><ul><li><p><code>C-c C-v</code> displays help for the object under the cursor,</p></li><li><p><code>C-c C-n</code> evaluates the current line / selection,</p></li><li><p><code>C-x b</code> allows you to visit another file,</p></li><li><p><code>M-C-a</code> moves the cursor to the beginning of the current function,</p></li><li><p><code>M-C-e</code> moves to the end of the current function,</p></li><li><p><code>C-c C-f</code> evaluates the current function.</p></li></ul><p>We&rsquo;ve also introduced a number of keybindings that allow you to interact with the IDE as you might normally do in Emacs:</p><ul><li><p><code>C-x C-n</code> to create a new document,</p></li><li><p><code>C-x C-f</code> to find / open an existing document,</p></li><li><p><code>C-x C-s</code> to save the current document,</p></li><li><p><code>C-x k</code> to close the current file.</p></li></ul><h3 id="rstudio-server-pro">RStudio Server Pro</h3><p>We&rsquo;ve introduced a number of significant enhancements to <a 
href="https://www.rstudio.com/products/rstudio-server-pro/">RStudio Server Pro</a> in this release, including:</p><ul><li>The ability to open multiple concurrent R sessions. Multiple concurrent sessions are useful for running multiple analyses in parallel and for switching between different tasks.</li></ul><p><img src="https://support.rstudio.com/hc/en-us/article_attachments/203432748/multipleRSessions3.png" alt=""></p><ul><li>Flexible use of multiple R versions on the same server. This is useful when you have some analysts or projects that require older versions of R or R packages and some that require newer versions.</li></ul><p><img src="https://support.rstudio.com/hc/en-us/article_attachments/203432008/mutlipleRVersions2.png" alt=""></p><ul><li>Project sharing for easy collaboration within workgroups. When you share a project, RStudio Server securely grants other users access to the project, and when multiple users are active in the project at once, you can see each other&rsquo;s activity and work together in a shared editor.</li></ul><p><img src="https://support.rstudio.com/hc/en-us/article_attachments/203236337/Screen_Shot_2015-09-30_at_3.55.46_PM.png" alt=""></p><p>See the updated <a href="https://www.rstudio.com/products/rstudio-server-pro/">RStudio Server Pro</a> page for additional details, including a set of videos which demonstrate the new features.</p><h3 id="try-it-out">Try it Out</h3><p>RStudio v0.99.878 is <a href="https://www.rstudio.com/products/rstudio/download/">available for download</a> now. 
We hope you enjoy the new release and as always please <a href="https://support.rstudio.com/">let us know</a> how it&rsquo;s working and what else we can do to make the product better.</p></description></item><item><title>Hadley Wickham's Advanced R in Amsterdam</title><link>https://www.rstudio.com/blog/hadley-wickhams-advanced-r-in-amsterdam/</link><pubDate>Sat, 06 Feb 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/hadley-wickhams-advanced-r-in-amsterdam/</guid><description><p>On May 19 and 20, 2016, Hadley Wickham will teach his two day Master R Developer Workshop in the centrally located European city of Amsterdam.</p><p>Are you ready to upgrade your R skills? <a href="https://www.eventbrite.com/e/master-r-developer-workshop-amsterdam-tickets-21345736673">Register soon to secure your seat</a>.</p><p>For the convenience of those who may travel to the workshop, it will be held at the <a href="http://www.nh-hotels.com/events/en/event-detail/30729/rstudio_public_workshop.html">Hotel NH Amsterdam Schiphol Airport</a>.</p><p>Hadley teaches a few workshops each year and this is the only one planned for Europe. They are very popular and hotel rooms are limited. Please register soon.</p><p>We look forward to seeing you in the month of May!</p></description></item><item><title>Devtools 1.10.0</title><link>https://www.rstudio.com/blog/devtools-1-10-0/</link><pubDate>Tue, 02 Feb 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/devtools-1-10-0/</guid><description><p>Devtools 1.10.0 is now available on CRAN. Devtools makes package building so easy that a package can become your default way to organise code, data, documentation, and tests. You can learn more about creating your own package in <a href="http://r-pkgs.had.co.nz/">R packages</a>. 
Install devtools with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">devtools&#34;</span>)</code></pre></div><p>This version is mostly a collection of bug fixes and minor improvements. For example:</p><ul><li><p>Devtools employs a new strategy for detecting Rtools on Windows: we now only check for Rtools if you need to <code>load_all()</code> or <code>build()</code> a package with compiled code. This should make life easier for most Windows users.</p></li><li><p>Package installation received a lot of tweaks from the community. Devtools now makes use of the <code>Additional_repositories</code> field, which is useful if you’re using <a href="http://dirk.eddelbuettel.com/code/drat.html">drat</a> for non-CRAN packages. <code>install_github()</code> is now lazy and won’t reinstall if the currently installed version is the same as the one on GitHub. Local installs now add git and GitHub metadata, if available.</p></li><li><p><code>use_news_md()</code> adds a (very) basic <code>NEWS.md</code> template. 
CRAN now accepts <code>NEWS.md</code> files so <code>release()</code> warns if you’ve previously added it to <code>.Rbuildignore</code>.</p></li><li><p><code>use_mit_license()</code> writes the necessary infrastructure to declare that your package is MIT licensed (in a CRAN-compliant way).</p></li><li><p><code>check(cran = TRUE)</code> automatically adds <code>--run-donttest</code> as this is a de facto CRAN standard.</p></li></ul><p>To see the full list of changes, please read the <a href="https://github.com/hadley/devtools/releases/tag/v1.10.0">release notes</a>.</p></description></item><item><title>httr 1.1.0 (and 1.0.0)</title><link>https://www.rstudio.com/blog/httr-1-1-0/</link><pubDate>Tue, 02 Feb 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/httr-1-1-0/</guid><description><p>httr 1.1.0 is now available on CRAN. The httr package makes it easy to talk to web APIs from R. Learn more in the <a href="http://cran.r-project.org/web/packages/httr/vignettes/quickstart.html">quick start</a> vignette.</p><p>Install the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">httr&#34;</span>)</code></pre></div><p>When writing this blog post I discovered that I forgot to announce httr 1.0.0. This was a major release marking the transition from the RCurl package to the <a href="https://github.com/jeroenooms/curl">curl</a> package, a modern binding to <a href="https://curl.haxx.se/libcurl/">libcurl</a> written by <a href="https://jeroenooms.github.io">Jeroen Ooms</a>. 
This makes httr more reliable, less likely to leak memory, and prevents the diabolical &ldquo;easy handle already used in multi handle&rdquo; error.</p><p>httr 1.1.0 includes a couple of new features:</p><ul><li><p><code>stop_for_status()</code>, <code>warn_for_status()</code> and (new) <code>message_for_status()</code> replace the old <code>message</code> argument with a new <code>task</code> argument that optionally describes the current task. This allows API wrappers to provide more informative error messages on failure.</p></li><li><p><code>http_error()</code> replaces <code>url_ok()</code> and <code>url_successful()</code>. <code>http_error()</code> more clearly conveys intent and works with URLs, responses and status codes.</p></li></ul><p>Otherwise, OAuth support continues to improve thanks to support from the community:</p><ul><li><p><a href="https://github.com/nathangoulding">Nathan Goulding</a> added RSA-SHA1 signature support to <code>oauth1.0_token()</code>. He also fixed bugs in <code>oauth_service_token()</code> and improved the caching behaviour of <code>refresh_oauth2.0()</code>. This makes httr easier to use with Google&rsquo;s <a href="https://developers.google.com/identity/protocols/OAuth2ServiceAccount">service accounts</a>.</p></li><li><p><a href="https://github.com/grahamrp">Graham Parsons</a> added support for HTTP basic authentication to <code>oauth2.0_token()</code> with the <code>use_basic_auth</code> argument. This is now the default method used when retrieving a token.</p></li><li><p><a href="https://github.com/cornf4ke">Daniel Lockau</a> implemented <code>user_params</code> which allows you to pass arbitrary additional parameters to the token access endpoint when acquiring or refreshing a token. This allows you to use httr with Microsoft Azure. 
He also wrote a demo so you can see exactly how this works.</p></li></ul><p>To see the full list of changes, please read the release notes for <a href="https://github.com/hadley/httr/releases/tag/v1.0.0">1.0.0</a> and <a href="https://github.com/hadley/httr/releases/tag/v1.1.0">1.1.0</a>.</p></description></item><item><title>memoise 1.0.0</title><link>https://www.rstudio.com/blog/memoise-1-0-0/</link><pubDate>Tue, 02 Feb 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/memoise-1-0-0/</guid><description><p>We are pleased to announce that version 1.0.0 of the memoise package is now available on <a href="https://cran.r-project.org/web/packages/memoise/">CRAN</a>. <a href="https://en.wikipedia.org/wiki/Memoization">Memoization</a> stores the value of a function call and returns the cached result when the function is called again with the same arguments.</p><p>The following function computes <a href="https://en.wikipedia.org/wiki/Fibonacci_number">Fibonacci numbers</a> and illustrates the usefulness of memoization. Because the function definition is recursive, the intermediate results can be looked up rather than recalculated at each level of recursion, which reduces the runtime drastically. 
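Conceptually, a memoised function just keeps a lookup table of previous calls. The following is a minimal base-R sketch of that idea, written for illustration only (it is not how the memoise package is actually implemented; the package hashes arguments robustly and supports caches, timeouts and argument forwarding):

```r
# Minimal memoiser: cache results in an environment keyed by the arguments.
# Illustration only -- not the memoise package's real implementation.
memo <- function(f) {
  cache <- new.env(parent = emptyenv())
  function(...) {
    key <- paste(deparse(list(...)), collapse = " ")
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(...), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

slow_square <- function(x) { Sys.sleep(0.2); x * x }
fast_square <- memo(slow_square)
fast_square(4)  # slow: computed, then cached
fast_square(4)  # fast: returned straight from the cache
```

The second call skips `slow_square()` entirely because the cache already holds an entry for that argument list.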
The last time the memoised function is called, the final result can simply be returned, so no measurable execution time is recorded.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">fib <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>(n) {
  <span style="color:#06287e">if </span>(n <span style="color:#666">&lt;</span> <span style="color:#40a070">2</span>) {<span style="color:#06287e">return</span>(n)} else {<span style="color:#06287e">return</span>(<span style="color:#06287e">fib</span>(n<span style="color:#40a070">-1</span>) <span style="color:#666">+</span> <span style="color:#06287e">fib</span>(n<span style="color:#40a070">-2</span>))}
}
<span style="color:#06287e">system.time</span>(x <span style="color:#666">&lt;-</span> <span style="color:#06287e">fib</span>(<span style="color:#40a070">30</span>))
<span style="color:#60a0b0;font-style:italic">#&gt; user system elapsed</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 4.454 0.010 4.472</span>
fib <span style="color:#666">&lt;-</span> <span style="color:#06287e">memoise</span>(fib)
<span style="color:#06287e">system.time</span>(y <span style="color:#666">&lt;-</span> <span style="color:#06287e">fib</span>(<span style="color:#40a070">30</span>))
<span style="color:#60a0b0;font-style:italic">#&gt; user system elapsed</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 0.004 0.000 0.004</span>
<span style="color:#06287e">system.time</span>(z <span style="color:#666">&lt;-</span> <span style="color:#06287e">fib</span>(<span style="color:#40a070">30</span>))
<span style="color:#60a0b0;font-style:italic">#&gt; user system elapsed</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 0 0 0</span>
<span style="color:#06287e">all.equal</span>(x, y)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span>
<span style="color:#06287e">all.equal</span>(x, z)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span></code></pre></div><p>Memoization is also very useful for storing queries to external resources, such as network APIs and databases.</p><p>Improvements in this release make memoised functions much nicer to use interactively. Memoised functions now have a print method which outputs the original function definition rather than the memoization code.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mem_sum <span style="color:#666">&lt;-</span> <span style="color:#06287e">memoise</span>(sum)
mem_sum
<span style="color:#60a0b0;font-style:italic">#&gt; Memoised Function:</span>
<span style="color:#60a0b0;font-style:italic">#&gt; function (..., na.rm = FALSE) .Primitive(&#34;sum&#34;)</span></code></pre></div><p>Memoised functions now forward their arguments from the original function rather than simply passing them with <code>...</code>. This allows autocompletion to work transparently for memoised functions and also fixes a bug related to non-constant default arguments. 
[<a href="https://github.com/hadley/memoise/issues/6">1</a>]</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mem_scan <span style="color:#666">&lt;-</span> <span style="color:#06287e">memoise</span>(scan)<span style="color:#06287e">args</span>(mem_scan)<span style="color:#60a0b0;font-style:italic">#&gt; function (file = &#34;&#34;, what = double(), nmax = -1L, n = -1L, sep = &#34;&#34;,</span><span style="color:#60a0b0;font-style:italic">#&gt; quote = if (identical(sep, &#34;\n&#34;)) &#34;&#34; else &#34;&#39;\&#34;&#34;, dec = &#34;.&#34;,</span><span style="color:#60a0b0;font-style:italic">#&gt; skip = 0L, nlines = 0L, na.strings = &#34;NA&#34;, flush = FALSE,</span><span style="color:#60a0b0;font-style:italic">#&gt; fill = FALSE, strip.white = FALSE, quiet = FALSE, blank.lines.skip = TRUE,</span><span style="color:#60a0b0;font-style:italic">#&gt; multi.line = TRUE, comment.char = &#34;&#34;, allowEscapes = FALSE,</span><span style="color:#60a0b0;font-style:italic">#&gt; fileEncoding = &#34;&#34;, encoding = &#34;unknown&#34;, text, skipNul = FALSE)</span><span style="color:#60a0b0;font-style:italic">#&gt; NULL</span></code></pre></div><p>Memoisation can now depend on external variables aside from the function arguments. 
This feature can be used in a variety of ways, such as invalidating the memoisation when a new package is attached.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mem_f <span style="color:#666">&lt;-</span> <span style="color:#06287e">memoise</span>(runif, <span style="color:#666">~</span><span style="color:#06287e">search</span>())
<span style="color:#06287e">mem_f</span>(<span style="color:#40a070">2</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] 0.009113091 0.988083122</span>
<span style="color:#06287e">mem_f</span>(<span style="color:#40a070">2</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] 0.009113091 0.988083122</span>
<span style="color:#06287e">library</span>(ggplot2)
<span style="color:#06287e">mem_f</span>(<span style="color:#40a070">2</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] 0.89150566 0.01128355</span></code></pre></div><p>Or invalidating the memoisation after a given amount of time has elapsed. 
A <code>timeout()</code> helper function is provided to make this feature easier to use.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mem_f <span style="color:#666">&lt;-</span> <span style="color:#06287e">memoise</span>(runif, <span style="color:#666">~</span><span style="color:#06287e">timeout</span>(<span style="color:#40a070">10</span>))
<span style="color:#06287e">mem_f</span>(<span style="color:#40a070">2</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] 0.6935329 0.3584699</span>
<span style="color:#06287e">mem_f</span>(<span style="color:#40a070">2</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] 0.6935329 0.3584699</span>
<span style="color:#06287e">Sys.sleep</span>(<span style="color:#40a070">10</span>)
<span style="color:#06287e">mem_f</span>(<span style="color:#40a070">2</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] 0.2008418 0.4538413</span></code></pre></div><p>A great amount of thanks for this release goes to <a href="http://krlmlr.github.io/">Kirill Müller</a>, who wrote the argument forwarding implementation and added comprehensive tests to the package. [<a href="https://github.com/hadley/memoise/pull/13">2</a>, <a href="https://github.com/hadley/memoise/pull/14">3</a>]</p><p>See the <a href="https://github.com/hadley/memoise/releases/tag/v1.0.0">release notes</a> for a complete list of changes.</p></description></item><item><title>tidyr 0.4.0</title><link>https://www.rstudio.com/blog/tidyr-0-4-0/</link><pubDate>Tue, 02 Feb 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/tidyr-0-4-0/</guid><description><p>I&rsquo;m pleased to announce tidyr 0.4.0. tidyr makes it easy to &ldquo;tidy&rdquo; your data, storing it in a consistent form so that it&rsquo;s easy to manipulate, visualise and model. Tidy data has a simple convention: put variables in the columns and observations in the rows. 
You can learn more about it in the <a href="http://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html">tidy data</a> vignette. Install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tidyr&#34;</span>)</code></pre></div><p>There are two big features in this release: support for nested data frames, and improved tools for turning implicit missing values into explicit missing values. These are described in detail below. As well as these big features, all tidyr verbs now handle <code>grouped_df</code> objects created by dplyr, <code>gather()</code> makes a character <code>key</code> column (instead of a factor), and there are lots of other minor fixes and improvements. Please see the <a href="https://github.com/hadley/tidyr/releases/tag/v0.4.0">release notes</a> for a complete list of changes.</p><h2 id="nested-data-frames">Nested data frames</h2><p><code>nest()</code> and <code>unnest()</code> have been overhauled to support a new way of structuring your data: the <strong>nested</strong> data frame. In a grouped data frame, you have one row per observation, and additional metadata define the groups. In a nested data frame, you have one <strong>row</strong> per group, and the individual observations are stored in a column that is a list of data frames. 
This is a useful structure when you have lists of other objects (like models) with one element per group.</p><p>For example, take the <a href="https://github.com/jennybc/gapminder">gapminder</a> dataset:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(gapminder)<span style="color:#06287e">library</span>(dplyr)gapminder<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [1,704 x 6]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; country continent year lifeExp pop gdpPercap</span><span style="color:#60a0b0;font-style:italic">#&gt; (fctr) (fctr) (int) (dbl) (int) (dbl)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Afghanistan Asia 1952 28.8 8425333 779</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Afghanistan Asia 1957 30.3 9240934 821</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Afghanistan Asia 1962 32.0 10267083 853</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Afghanistan Asia 1967 34.0 11537966 836</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Afghanistan Asia 1972 36.1 13079460 740</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Afghanistan Asia 1977 38.4 14880372 786</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 Afghanistan Asia 1982 39.9 12881816 978</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 Afghanistan Asia 1987 40.8 13867957 852</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ... 
...</span></code></pre></div><p>We can plot the trend in life expectancy for each country:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(ggplot2)
<span style="color:#06287e">ggplot</span>(gapminder, <span style="color:#06287e">aes</span>(year, lifeExp)) <span style="color:#666">+</span>
  <span style="color:#06287e">geom_line</span>(<span style="color:#06287e">aes</span>(group <span style="color:#666">=</span> country))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2016/02/unnamed-chunk-4-1.png" alt="unnamed-chunk-4-1"></p><p>But it&rsquo;s hard to see what&rsquo;s going on because of all the overplotting. One interesting solution is to summarise each country with a linear model. To do that most naturally, you want one data frame for each country. <code>nest()</code> creates this structure:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">by_country <span style="color:#666">&lt;-</span> gapminder <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">group_by</span>(continent, country) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">nest</span>()
by_country
<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [142 x 3]</span>
<span style="color:#60a0b0;font-style:italic">#&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; continent country data</span>
<span style="color:#60a0b0;font-style:italic">#&gt; (fctr) (fctr) (list)</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 1 Asia Afghanistan &lt;tbl_df [12,4]&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 2 Europe Albania &lt;tbl_df [12,4]&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 3 Africa Algeria &lt;tbl_df [12,4]&gt;</span><span
style="color:#60a0b0;font-style:italic">#&gt; 4 Africa Angola &lt;tbl_df [12,4]&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Americas Argentina &lt;tbl_df [12,4]&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Oceania Australia &lt;tbl_df [12,4]&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 Europe Austria &lt;tbl_df [12,4]&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 Asia Bahrain &lt;tbl_df [12,4]&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ...</span></code></pre></div><p>The intriguing thing about this data frame is that it now contains one row per group, and to store the original data we have a new <code>data</code> column, a list of data frames. If we look at the first one, we can see that it contains the complete data for Afghanistan (sans grouping columns):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">by_country<span style="color:#666">$</span>data[[1]]<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [12 x 4]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year lifeExp pop gdpPercap</span><span style="color:#60a0b0;font-style:italic">#&gt; (int) (dbl) (int) (dbl)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1952 43.1 9279525 2449</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1957 45.7 10270856 3014</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 1962 48.3 11000948 2551</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 1967 51.4 12760499 3247</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 1972 54.5 14760787 4183</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 1977 58.0 17152804 4910</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 1982 61.4 20033753 5745</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 8 1987 65.8 23254956 5681</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ...</span></code></pre></div><p>This form is natural because there are other vectors where you&rsquo;ll have one value per country. For example, we could fit a linear model to each country with <a href="http://r4ds.had.co.nz/lists.html">purrr</a>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">by_country <span style="color:#666">&lt;-</span> by_country <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(model <span style="color:#666">=</span> purrr<span style="color:#666">::</span><span style="color:#06287e">map</span>(data, <span style="color:#666">~</span> <span style="color:#06287e">lm</span>(lifeExp <span style="color:#666">~</span> year, data <span style="color:#666">=</span> .)))by_country<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [142 x 4]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; continent country data model</span><span style="color:#60a0b0;font-style:italic">#&gt; (fctr) (fctr) (list) (list)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Asia Afghanistan &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Europe Albania &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Africa Algeria &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Africa Angola &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Americas Argentina &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Oceania Australia &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt;</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 7 Europe Austria &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 Asia Bahrain &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ...</span></code></pre></div><p>Because we used <code>mutate()</code>, we get an extra column containing one linear model per country.</p><p>It might seem unnatural to store a list of linear models in a data frame. However, I think it is actually a really convenient and powerful strategy because it allows you to keep related vectors together. If you filter or arrange the vector of models, there&rsquo;s no way for the other components to get out of sync.</p><p><code>nest()</code> got us into this form; <code>unnest()</code> gets us out. You give it the list-columns that you want to unnested, and tidyr will automatically repeat the grouping columns. Unnesting <code>data</code> gets us back to the original form:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">by_country <span style="color:#666">%&gt;%</span> <span style="color:#06287e">unnest</span>(data)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [1,704 x 6]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; continent country year lifeExp pop gdpPercap</span><span style="color:#60a0b0;font-style:italic">#&gt; (fctr) (fctr) (int) (dbl) (int) (dbl)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Asia Afghanistan 1952 43.1 9279525 2449</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Asia Afghanistan 1957 45.7 10270856 3014</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Asia Afghanistan 1962 48.3 11000948 2551</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Asia Afghanistan 1967 51.4 12760499 
3247</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Asia Afghanistan 1972 54.5 14760787 4183</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Asia Afghanistan 1977 58.0 17152804 4910</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 Asia Afghanistan 1982 61.4 20033753 5745</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 Asia Afghanistan 1987 65.8 23254956 5681</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ... ...</span></code></pre></div><p>When working with models, unnesting is particularly useful when you combine it with <a href="https://github.com/dgrtwo/broom">broom</a> to extract model summaries:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># Extract model summaries:</span>by_country <span style="color:#666">%&gt;%</span> <span style="color:#06287e">unnest</span>(model <span style="color:#666">%&gt;%</span> purrr<span style="color:#666">::</span><span style="color:#06287e">map</span>(broom<span style="color:#666">::</span>glance))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [142 x 15]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; continent country data model r.squared</span><span style="color:#60a0b0;font-style:italic">#&gt; (fctr) (fctr) (list) (list) (dbl)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Asia Afghanistan &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt; 0.985</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Europe Albania &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt; 0.888</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Africa Algeria &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt; 0.967</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Africa Angola &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt; 0.034</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 5 Americas Argentina &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt; 0.919</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Oceania Australia &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt; 0.766</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 Europe Austria &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt; 0.680</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 Asia Bahrain &lt;tbl_df [12,4]&gt; &lt;S3:lm&gt; 0.493</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ...</span><span style="color:#60a0b0;font-style:italic">#&gt; Variables not shown: adj.r.squared (dbl), sigma (dbl),</span><span style="color:#60a0b0;font-style:italic">#&gt; statistic (dbl), p.value (dbl), df (int), logLik (dbl),</span><span style="color:#60a0b0;font-style:italic">#&gt; AIC (dbl), BIC (dbl), deviance (dbl), df.residual (int).</span><span style="color:#60a0b0;font-style:italic"># Extract coefficients:</span>by_country <span style="color:#666">%&gt;%</span> <span style="color:#06287e">unnest</span>(model <span style="color:#666">%&gt;%</span> purrr<span style="color:#666">::</span><span style="color:#06287e">map</span>(broom<span style="color:#666">::</span>tidy))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [284 x 7]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; continent country term estimate std.error</span><span style="color:#60a0b0;font-style:italic">#&gt; (fctr) (fctr) (chr) (dbl) (dbl)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Asia Afghanistan (Intercept) -1.07e+03 43.8022</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Asia Afghanistan year 5.69e-01 0.0221</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Europe Albania (Intercept) -3.77e+02 46.5834</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Europe Albania year 2.09e-01 0.0235</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 5 Africa Algeria (Intercept) -6.13e+02 38.8918</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Africa Algeria year 3.34e-01 0.0196</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 Africa Angola (Intercept) -6.55e+01 202.3625</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 Africa Angola year 6.07e-02 0.1022</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ...</span><span style="color:#60a0b0;font-style:italic">#&gt; Variables not shown: statistic (dbl), p.value (dbl).</span><span style="color:#60a0b0;font-style:italic"># Extract residuals etc:</span>by_country <span style="color:#666">%&gt;%</span> <span style="color:#06287e">unnest</span>(model <span style="color:#666">%&gt;%</span> purrr<span style="color:#666">::</span><span style="color:#06287e">map</span>(broom<span style="color:#666">::</span>augment))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [1,704 x 11]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; continent country lifeExp year .fitted .se.fit</span><span style="color:#60a0b0;font-style:italic">#&gt; (fctr) (fctr) (dbl) (int) (dbl) (dbl)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Asia Afghanistan 43.1 1952 43.4 0.718</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Asia Afghanistan 45.7 1957 46.2 0.627</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Asia Afghanistan 48.3 1962 49.1 0.544</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Asia Afghanistan 51.4 1967 51.9 0.472</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Asia Afghanistan 54.5 1972 54.8 0.416</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Asia Afghanistan 58.0 1977 57.6 0.386</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 Asia Afghanistan 61.4 1982 60.5 0.386</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 8 Asia Afghanistan 65.8 1987 63.3 0.416</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ... ...</span><span style="color:#60a0b0;font-style:italic">#&gt; Variables not shown: .resid (dbl), .hat (dbl), .sigma</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl), .cooksd (dbl), .std.resid (dbl).</span></code></pre></div><p>I think storing multiple models in a data frame is a powerful and convenient technique, and I plan to write more about it in the future.</p><h2 id="expanding">Expanding</h2><p>The <code>complete()</code> function allows you to turn implicit missing values into explicit missing values. For example, imagine you&rsquo;ve collected some data every year basis, but unfortunately some of your data has gone missing:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">resources <span style="color:#666">&lt;-</span> <span style="color:#06287e">frame_data</span>(<span style="color:#666">~</span>year, <span style="color:#666">~</span>metric, <span style="color:#666">~</span>value,<span style="color:#40a070">1999</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">coal&#34;</span>, <span style="color:#40a070">100</span>,<span style="color:#40a070">2001</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">coal&#34;</span>, <span style="color:#40a070">50</span>,<span style="color:#40a070">2001</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">steel&#34;</span>, <span style="color:#40a070">200</span>)resources<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [3 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year metric value</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl) (chr) (dbl)</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 1 1999 coal 100</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2001 coal 50</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2001 steel 200</span></code></pre></div><p>Here the value for steel in 1999 is implicitly missing: it&rsquo;s simply absent from the data frame. We can use <code>complete()</code> to make this missing row explicit, adding that combination of the variables and inserting a placeholder <code>NA</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">resources <span style="color:#666">%&gt;%</span> <span style="color:#06287e">complete</span>(year, metric)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [4 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year metric value</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl) (chr) (dbl)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1999 coal 100</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1999 steel NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2001 coal 50</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2001 steel 200</span></code></pre></div><p>With complete you&rsquo;re not limited to just combinations that exist in the data. 
For example, here we know that there should be data for every year, so we can use the <code>full_seq()</code> function to generate every year over the range of the data:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">resources <span style="color:#666">%&gt;%</span> <span style="color:#06287e">complete</span>(year <span style="color:#666">=</span> <span style="color:#06287e">full_seq</span>(year, <span style="color:#40a070">1L</span>), metric)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [6 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year metric value</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl) (chr) (dbl)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1999 coal 100</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1999 steel NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2000 coal NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2000 steel NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2001 coal 50</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 2001 steel 200</span></code></pre></div><p>In other scenarios, you may not want to generate the full set of combinations. For example, imagine you have an experiment where each person is assigned one treatment. You don&rsquo;t want to expand the combinations of person and treatment, but you do want to make sure every person has all replicates.
You can use <code>nesting()</code> to prevent the full Cartesian product from being generated:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">experiment <span style="color:#666">&lt;-</span> <span style="color:#06287e">data_frame</span>(person <span style="color:#666">=</span> <span style="color:#06287e">rep</span>(<span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Alex&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Robert&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Sam&#34;</span>), <span style="color:#06287e">c</span>(<span style="color:#40a070">3</span>, <span style="color:#40a070">2</span>, <span style="color:#40a070">1</span>)),trt <span style="color:#666">=</span> <span style="color:#06287e">rep</span>(<span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>), <span style="color:#06287e">c</span>(<span style="color:#40a070">3</span>, <span style="color:#40a070">2</span>, <span style="color:#40a070">1</span>)),rep <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">1</span>, <span style="color:#40a070">2</span>, <span style="color:#40a070">3</span>, <span style="color:#40a070">1</span>, <span style="color:#40a070">2</span>, <span style="color:#40a070">1</span>),measurment_1 <span style="color:#666">=</span> <span style="color:#06287e">runif</span>(<span style="color:#40a070">6</span>),measurment_2 <span style="color:#666">=</span> <span style="color:#06287e">runif</span>(<span style="color:#40a070">6</span>))experiment<span style="color:#60a0b0;font-style:italic">#&gt; 
Source: local data frame [6 x 5]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; person trt rep measurment_1 measurment_2</span><span style="color:#60a0b0;font-style:italic">#&gt; (chr) (chr) (dbl) (dbl) (dbl)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Alex a 1 0.7161 0.927</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Alex a 2 0.3231 0.942</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Alex a 3 0.4548 0.668</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Robert b 1 0.0356 0.667</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Robert b 2 0.5081 0.143</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Sam a 1 0.6917 0.753</span>experiment <span style="color:#666">%&gt;%</span> <span style="color:#06287e">complete</span>(<span style="color:#06287e">nesting</span>(person, trt), rep)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [9 x 5]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; person trt rep measurment_1 measurment_2</span><span style="color:#60a0b0;font-style:italic">#&gt; (chr) (chr) (dbl) (dbl) (dbl)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Alex a 1 0.7161 0.927</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Alex a 2 0.3231 0.942</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Alex a 3 0.4548 0.668</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Robert b 1 0.0356 0.667</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Robert b 2 0.5081 0.143</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Robert b 3 NA NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 Sam a 1 0.6917 0.753</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 Sam a 2 NA NA</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... 
...</span></code></pre></div></description></item><item><title>Shiny 0.13.0</title><link>https://www.rstudio.com/blog/shiny-0-13-0/</link><pubDate>Wed, 20 Jan 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-0-13-0/</guid><description><p>Shiny 0.13.0 is now available on CRAN! This release has some of the most exciting features we&rsquo;ve shipped since the first version of Shiny. Highlights include:</p><ul><li><p>Shiny Gadgets</p></li><li><p>HTML templates</p></li><li><p>Shiny modules</p></li><li><p>Error stack traces</p></li><li><p>Checking for missing inputs</p></li><li><p>New JavaScript events</p></li></ul><p>For a comprehensive list of changes, see the <a href="https://cran.r-project.org/web/packages/shiny/NEWS">NEWS file</a>.</p><p>To install the new version from CRAN, run:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">shiny&#34;</span>)</code></pre></div><p>Read on for details about these new features!</p><!-- more --><h2 id="shiny-gadgets">Shiny Gadgets</h2><p>With Shiny Gadgets, you can use Shiny to create interactive graphical tools that run locally, taking your data as input and returning a result. This means that Shiny isn&rsquo;t just for creating applications to be delivered over the web – it can also be part of your interactive data analysis toolkit!</p><p>Your workflow could, for example, look something like this:</p><ol><li><p>At the R console, read in and massage your data.</p></li><li><p>Use a Shiny Gadget&rsquo;s graphical interface to build a model and tweak model parameters. 
When finished, the Gadget returns the model object.</p></li><li><p>At the R console, use the model to make predictions.</p></li></ol><p>Here&rsquo;s a Shiny Gadget in action (<a href="https://gist.github.com/wch/c4b857d73493e6550cba">code here</a>). This Gadget fits an <code>lm</code> model to a data set, and lets the user interactively exclude data points used to build the model; when finished, it returns the data with points excluded, and the model object:</p><p><img src="https://rstudioblog.files.wordpress.com/2016/01/lm_gadget.gif" alt="lm_gadget"></p><p>When used in RStudio, Shiny Gadgets integrate seamlessly, appearing in the Viewer panel, or in a pop-up dialog window. You can even declare your Shiny Gadgets to be <a href="http://rstudio.github.io/rstudioaddins/">RStudio Add-ins</a>, so they can be launched from the RStudio Add-ins menu or a customizable keyboard shortcut.</p><p>When used outside of RStudio, Shiny Gadgets have the same functionality – the only differences are that you invoke them by executing their R function, and that they open in a separate browser window.</p><p>Best of all, if you know how to write Shiny apps, you&rsquo;re 90% of the way to writing Gadgets! For the other 10%, see the <a href="https://shiny.rstudio.com/articles/gadgets.html">article</a> in the Shiny Dev Center.</p><h2 id="html-templates">HTML templates</h2><p>In previous versions of Shiny, you could choose between writing your UI using either ui.R (R function calls like <code>fluidPage</code>, <code>plotOutput</code>, and <code>div</code>), or index.html (plain old HTML markup).</p><p>With Shiny 0.13.0, you can have the best of both worlds in a single app, courtesy of the new HTML templating system (from the <strong>htmltools</strong> package). 
You can author the structure and style of your page in HTML, but still conveniently insert input and output widgets using R functions.</p><pre><code>&lt;!DOCTYPE html&gt;
&lt;html&gt;
  &lt;head&gt;
    &lt;link href=&quot;custom.css&quot; rel=&quot;stylesheet&quot; /&gt;
    {{ headContent() }}
  &lt;/head&gt;
  &lt;body&gt;
    {{ sliderInput(&quot;x&quot;, &quot;X&quot;, 1, 100, sliderValue) }}
    {{ button }}
  &lt;/body&gt;
&lt;/html&gt;</code></pre><p>To use the template for your UI, you process it with <code>htmlTemplate()</code>. The text within the <code>{{ ... }}</code> is evaluated as R code, and is replaced with the return value.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">htmlTemplate</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">template.html&#34;</span>,
  button <span style="color:#666">=</span> <span style="color:#06287e">actionButton</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">go&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Go&#34;</span>))</code></pre></div><p>In the example above, the template is used to generate an entire web page. Templates can also be used for pieces of HTML that are inserted into a web page. You could, for example, create a reusable UI component which uses an HTML template.</p><p>If you want to learn more, see the <a href="https://shiny.rstudio.com/articles/templates.html">HTML templates</a> article.</p><h2 id="shiny-modules">Shiny modules</h2><p>We&rsquo;ve been surprised at the number of users making large, complex Shiny apps &ndash; to the point that abstractions for managing Shiny code complexity have become a frequent request.</p><p>After much discussion and iteration, we&rsquo;ve come up with a <a href="https://shiny.rstudio.com/articles/modules.html">modules feature</a> that should be a huge help for these apps.
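</p><p>To make the idea concrete before the details below, here is a minimal sketch of the pattern (the counter module and all of its names are our own invention, not taken from the modules article):</p><pre><code class="language-r">library(shiny)

# A hypothetical &quot;counter&quot; module: the UI function namespaces its IDs
# with NS(), so two instances of the module never collide.
counterUI &lt;- function(id) {
  ns &lt;- NS(id)
  tagList(actionButton(ns(&quot;go&quot;), &quot;Count&quot;), textOutput(ns(&quot;n&quot;)))
}
counter &lt;- function(input, output, session) {
  output$n &lt;- renderText(input$go)  # shows the click count
}

ui &lt;- fluidPage(counterUI(&quot;a&quot;), counterUI(&quot;b&quot;))
server &lt;- function(input, output) {
  callModule(counter, &quot;a&quot;)  # each instance keeps its own state
  callModule(counter, &quot;b&quot;)
}
shinyApp(ui, server)</code></pre><p>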
A Shiny module is like a fragment of UI and server logic that can be embedded in either a Shiny app, or in another Shiny module. Shiny modules use namespaces, so you can create and interact with UI elements without worrying about their input and output IDs conflicting with anyone else&rsquo;s. You can even embed a Shiny module in a single app multiple times, and each instance of the module will be independent of the others.</p><p>To get started, check out the <a href="https://shiny.rstudio.com/articles/modules.html">Shiny modules</a> article.</p><p>(Special thanks to <a href="http://twitter.com/ijlyttle">Ian Lyttle</a>, whose earlier work with <a href="https://github.com/ijlyttle/shinychord">shinychord</a> provided inspiration for modules.)</p><h2 id="better-debugging-with-stack-traces">Better debugging with stack traces</h2><p>In previous versions of Shiny, if your code threw an error, it would tell you that an error occurred (the app would keep running), but wouldn&rsquo;t tell you where it came from:</p><pre><code>Listening on http://127.0.0.1:6212
Error in : length(n) == 1L is not TRUE</code></pre><p>As of 0.13.0, Shiny gives a stack trace so you can easily find where the problem occurred:</p><pre><code>Listening on http://127.0.0.1:6212
Warning: Error in : length(n) == 1L is not TRUE
Stack trace (innermost first):
    96: stopifnot
    95: head.default
    94: head
    93: reactive mydata [~/app.R#10]
    82: mydata
    81: ggplot
    80: renderPlot [~/app.R#14]
    72: output$plot
     5: &lt;Anonymous&gt;
     4: do.call
     3: print.shiny.appobj
     2: print
     1: source</code></pre><p>In this case, the error was in a reactive named <code>mydata</code> in <code>app.R</code>, line 10, when it called the <code>head()</code> function.
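</p><p>For reference, a hypothetical <code>app.R</code> along these lines would produce that kind of trace (the data set and plot here are our own stand-ins, not the original app):</p><pre><code class="language-r">library(shiny)
library(ggplot2)

ui &lt;- fluidPage(plotOutput(&quot;plot&quot;))
server &lt;- function(input, output) {
  # head() requires a length-one n, so this reactive throws
  # &quot;length(n) == 1L is not TRUE&quot; when the plot is rendered
  mydata &lt;- reactive(head(cars, n = 1:2))
  output$plot &lt;- renderPlot(ggplot(mydata(), aes(speed, dist)) + geom_point())
}
shinyApp(ui, server)</code></pre><p>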
Notice that the stack trace only shows stack frames that are relevant to the app – there are many frames that are internal Shiny code, and they are hidden from view by default.</p><p>For more information, see the <a href="https://shiny.rstudio.com/articles/debugging.html">debugging</a> article.</p><h2 id="checking-inputs-with-req">Checking inputs with <code>req()</code></h2><p>In Shiny apps, it&rsquo;s common to have a reactive expression or an output that can only proceed if certain conditions are met. For example, an input might need to have a selected value, or an <code>actionButton</code> might need to be clicked before an output should be shown.</p><p>Previously, you would need to use a check like <code>if (is.null(input$x)) return()</code>, or <code>validate(need(input$x))</code>, and a similar check would be needed in all downstream reactives/observers that rely on that reactive expression.</p><p>Shiny 0.13.0 provides a new function, <code>req()</code>, which simplifies this process. It can be used as <code>req(input$x)</code>. Reactives and observers which are downstream will not need a separate check because a <code>req()</code> upstream will cause them to stop.</p><p>You can call <code>req()</code> with multiple arguments to check multiple inputs. And you can also check for specific conditions besides the presence or absence of an input by passing a logical value, e.g. <code>req(Sys.time() &lt;= endTime)</code> will stop if the current time is later than <code>endTime</code>.</p><p>For more details, see the <a href="https://shiny.rstudio.com/articles/req.html">article</a> in the Shiny Dev Center.</p><h2 id="javascript-events">JavaScript Events</h2><p>For developers who want to write JavaScript code to interact with Shiny in the client&rsquo;s browser, Shiny now has a set of JavaScript events to which event handler functions can be attached.
For example, the <code>shiny:inputchanged</code> event is triggered when an input changes, and the <code>shiny:disconnected</code> event is triggered when the connection to the server ends.</p><p>See the <a href="https://shiny.rstudio.com/articles/js-events.html">article</a> for more.</p></description></item><item><title>RcppParallel: Getting R and C++ to work (some more) in parallel</title><link>https://www.rstudio.com/blog/rcppparallel-getting-r-and-c-to-work-some-more-in-parallel/</link><pubDate>Fri, 15 Jan 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rcppparallel-getting-r-and-c-to-work-some-more-in-parallel/</guid><description><p>(Post by <a href="http://dirk.eddelbuettel.com/">Dirk Eddelbuettel</a> and <a href="https://github.com/jjallaire">JJ Allaire</a>)</p><p>A common theme over the last few decades was that we could afford to simply sit back and let computer (hardware) engineers take care of increases in computing speed thanks to <a href="http://en.wikipedia.org/wiki/Moore%27s_law">Moore&rsquo;s law</a>. That same line of thought now frequently points out that we are getting closer and closer to the physical limits of what Moore&rsquo;s law can do for us.</p><p>So the new best hope is (and has been) parallel processing. Even our smartphones have multiple cores, and most if not all retail PCs now possess two, four or more cores. Real computers, aka somewhat decent servers, can be had with 24, 32 or more cores as well, and all that is before we even consider GPU coprocessors or <a href="http://en.wikipedia.org/wiki/Xeon_Phi">other upcoming changes</a>.</p><p>Sometimes our tasks are embarrassingly simple as is the case with many data-parallel jobs: we can use higher-level operations such as those offered by the base R package <a href="https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf">parallel</a> to spawn multiple processing tasks and gather the results. 
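</p><p>As a quick sketch of that higher-level, data-parallel style (the toy computation is ours):</p><pre><code class="language-r">library(parallel)

# Fork worker processes, apply the function to each element,
# and gather the results back into a list.
# (mc.cores &gt; 1 requires a Unix-alike; on Windows use parLapply() instead.)
res &lt;- mclapply(1:8, function(x) x^2, mc.cores = 2)
unlist(res)
#&gt; [1]  1  4  9 16 25 36 49 64</code></pre><p>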
Dirk covered all this in some detail in previous <a href="http://dirk.eddelbuettel.com/presentations.html">talks</a> on High Performance Computing with R (and you can also consult the <a href="http://cran.r-project.org/web/views/HighPerformanceComputing.html">CRAN </a><a href="http://cran.r-project.org/web/views/HighPerformanceComputing.html">Task View on High Performance Computing with R</a>).</p><p>But sometimes we cannot use data-parallel approaches. Hence we have to redo our algorithms. Which is <em>really hard</em>. R itself has been relying on the (fairly mature) <a href="http://openmp.org/wp/">OpenMP</a> standard for some of its operations. <a href="http://www.rinfinance.com/agenda/2014/talk/LukeTierney.pdf">Luke Tierney&rsquo;s </a><a href="http://www.rinfinance.com/agenda/2014/talk/LukeTierney.pdf">keynote</a> at the 2014 R/Finance conference mentioned some of the issues related to OpenMP, which works really well on Linux but currently not so well on other platforms. R is expected to make wider use of it in future versions once compiler support for OpenMP on Windows and OS X improves.</p><p>In the meantime, the <a href="http://rcppcore.github.io/RcppParallel">RcppParallel</a> package provides a complete toolkit for creating portable, high-performance parallel algorithms without requiring direct manipulation of operating system threads. 
RcppParallel includes:</p><ul><li><p><a href="https://www.threadingbuildingblocks.org/">Intel Thread Building Blocks</a> (v4.3), a C++ library for task parallelism with a wide variety of parallel algorithms and data structures (Windows, OS X, Linux, and Solaris x86 only).</p></li><li><p><a href="http://tinythreadpp.bitsnbites.eu/">TinyThread</a>, a C++ library for portable use of operating system threads.</p></li><li><p><code>RVector</code> and <code>RMatrix</code> wrapper classes for safe and convenient access to R data structures in a multi-threaded environment.</p></li><li><p>High level parallel functions (<code>parallelFor</code> and <code>parallelReduce</code>) that use Intel TBB as a back-end on systems that support it and TinyThread on other platforms.</p></li></ul><p>RcppParallel is <a href="https://cran.r-project.org/web/packages/RcppParallel/index.html">available on CRAN</a> now and several packages including <a href="https://cran.r-project.org/web/packages/dbmss/index.html">dbmss</a>, <a href="https://cran.r-project.org/web/packages/gaston/index.html">gaston</a>, <a href="https://cran.r-project.org/web/packages/markovchain/index.html">markovchain</a>, <a href="https://cran.r-project.org/web/packages/rPref/index.html">rPref</a>, <a href="https://cran.r-project.org/web/packages/SpatPCA/index.html">SpatPCA</a>, <a href="https://cran.r-project.org/web/packages/StMoSim/index.html">StMoSim</a>, and <a href="https://cran.r-project.org/web/packages/text2vec/index.html">text2vec</a> are already taking advantage of it (you can read more about the text2vec implementation <a href="http://dsnotes.com/blog/text2vec/2016/01/09/fast-parallel-async-adagrad/">here</a>).</p><p>For more background and documentation see the <a href="http://rcppcore.github.io/RcppParallel">RcppParallel web site</a> as well as the slides from the <a href="http://dirk.eddelbuettel.com/papers/rcpp_parallel_talk_jan2015.pdf">talk we gave on RcppParallel</a> at the Workshop for Distributed
Computing in R.</p><p>In addition, the <a href="http://gallery.rcpp.org/">Rcpp Gallery</a> includes several pieces demonstrating the use of RcppParallel, including:</p><ul><li><p><a href="http://gallery.rcpp.org/articles/parallel-matrix-transform">A parallel matrix transformation</a></p></li><li><p><a href="http://gallery.rcpp.org/articles/parallel-vector-sum">A parallel vector summation</a></p></li><li><p><a href="http://gallery.rcpp.org/articles/parallel-inner-product">A parallel inner product</a></p></li><li><p><a href="http://gallery.rcpp.org/articles/parallel-distance-matrix">A parallel distance matrix calculation </a></p></li></ul><p>All four are interesting and demonstrate different aspects of parallel computing via <a href="http://rcppcore.github.io/RcppParallel">RcppParallel</a>. But the last article is key—it shows how a particular matrix distance metric (which is missing from R) can be implemented in a serial manner in both R, and also via Rcpp. The fastest implementation, however, uses both Rcpp and <a href="http://rcppcore.github.io/RcppParallel">RcppParallel</a> and thereby achieves a truly impressive speed gain as the gains from using compiled code (via Rcpp) and from using a parallel algorithm (via RcppParallel) are multiplicative. On a couple of four-core machines the RcppParallel version was between 200 and 300 times faster than the R version.</p><p>Exciting times for parallel programming in R! To learn more head over to the <a href="http://rcppcore.github.io/RcppParallel">RcppParallel</a> package and start playing.</p></description></item><item><title>purrr 0.2.0</title><link>https://www.rstudio.com/blog/purrr-0-2-0/</link><pubDate>Wed, 06 Jan 2016 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/purrr-0-2-0/</guid><description><p>I&rsquo;m pleased to announce purrr 0.2.0. 
Purrr fills in the missing pieces in R&rsquo;s functional programming tools, and is designed to make your pure (and now type-stable) functions purr.</p><p>I&rsquo;m still working out exactly what purrr should do, and how it compares to existing functions in base R, dplyr, and tidyr. One main insight that has affected much of the current version is that functions designed for programming should be type-stable. Type-stability is an idea brought to my attention by the Julia programming language. Even though functions in R and Julia can return different types of output, by and large, you should strive to make functions that always return the same type of data structure. This makes functions more robust to varying input, and makes them easier to reason about (and in Julia, to optimise). (But not every function can be type-stable - how could <code>$</code> work?)</p><p>Purrr 0.2.0 adds type-stable alternatives for maps, flattens, and <code>try()</code>, as described below. There were a lot of other minor improvements, bug fixes, and a number of deprecations. Please see the <a href="https://github.com/hadley/purrr/releases/tag/v0.2.0">release notes</a> for a complete list of changes.</p><h2 id="type-stable-maps">Type-stable maps</h2><p>A <strong>map</strong> is a function that calls another function on each element of a vector. Map functions in base R are the &ldquo;applys&rdquo;: <code>lapply()</code>, <code>sapply()</code>, <code>vapply()</code>, etc. <code>lapply()</code> is type-stable: no matter what the inputs are, the output is always a list. <code>sapply()</code> is not type-stable: it can return different types of output depending on the input.
The following code shows a simple (if somewhat contrived) example of <code>sapply()</code> returning either a vector, a matrix, or a list, depending on its inputs:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> <span style="color:#06287e">data.frame</span>(a <span style="color:#666">=</span> <span style="color:#40a070">1L</span>,b <span style="color:#666">=</span> <span style="color:#40a070">1.5</span>,y <span style="color:#666">=</span> <span style="color:#06287e">Sys.time</span>(),z <span style="color:#666">=</span> <span style="color:#06287e">ordered</span>(<span style="color:#40a070">1</span>))df[1<span style="color:#666">:</span><span style="color:#40a070">4</span>] <span style="color:#666">%&gt;%</span> <span style="color:#06287e">sapply</span>(class) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; List of 4</span><span style="color:#60a0b0;font-style:italic">#&gt; $ a: chr &#34;integer&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; $ b: chr &#34;numeric&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; $ y: chr [1:2] &#34;POSIXct&#34; &#34;POSIXt&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; $ z: chr [1:2] &#34;ordered&#34; &#34;factor&#34;</span>df[1<span style="color:#666">:</span><span style="color:#40a070">2</span>] <span style="color:#666">%&gt;%</span> <span style="color:#06287e">sapply</span>(class) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; Named chr [1:2] &#34;integer&#34; &#34;numeric&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; - attr(*, &#34;names&#34;)= chr [1:2] &#34;a&#34; &#34;b&#34;</span>df[3<span style="color:#666">:</span><span style="color:#40a070">4</span>] <span 
style="color:#666">%&gt;%</span> <span style="color:#06287e">sapply</span>(class) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; chr [1:2, 1:2] &#34;POSIXct&#34; &#34;POSIXt&#34; &#34;ordered&#34; &#34;factor&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; - attr(*, &#34;dimnames&#34;)=List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ : NULL</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ : chr [1:2] &#34;y&#34; &#34;z&#34;</span></code></pre></div><p>This behaviour makes <code>sapply()</code> appropriate for interactive use, since it usually guesses correctly and gives a useful data structure. It&rsquo;s not appropriate for use in package or production code because if the input isn&rsquo;t what you expect, it won&rsquo;t fail, and will instead return an unexpected data structure. This typically causes an error further along the process, so you get a confusing error message and it&rsquo;s difficult to isolate the root cause.</p><p>Base R has a type-stable version of <code>sapply()</code> called <code>vapply()</code>. It takes an additional argument that determines what the output will be. purrr takes a different approach. Instead of one function that does it all, purrr has multiple functions, one for each common type of output: <code>map_lgl()</code>, <code>map_int()</code>, <code>map_dbl()</code>, <code>map_chr()</code>, and <code>map_df()</code>. These either produce the specified type of output or throw an error. 
This forces you to deal with the problem right away:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df[1<span style="color:#666">:</span><span style="color:#40a070">4</span>] <span style="color:#666">%&gt;%</span> <span style="color:#06287e">map_chr</span>(class)<span style="color:#60a0b0;font-style:italic">#&gt; Error: Result 3 is not a length 1 atomic vector</span>df[1<span style="color:#666">:</span><span style="color:#40a070">4</span>] <span style="color:#666">%&gt;%</span> <span style="color:#06287e">map_chr</span>(<span style="color:#666">~</span> <span style="color:#06287e">paste</span>(<span style="color:#06287e">class</span>(.), collapse <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">/&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; a b y z</span><span style="color:#60a0b0;font-style:italic">#&gt; &#34;integer&#34; &#34;numeric&#34; &#34;POSIXct/POSIXt&#34; &#34;ordered/factor&#34;</span></code></pre></div><p>Other variants of <code>map()</code> have similar suffixes. 
For example, <code>map2()</code> allows you to iterate over two vectors in parallel:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">list</span>(<span style="color:#40a070">1</span>, <span style="color:#40a070">3</span>, <span style="color:#40a070">5</span>)y <span style="color:#666">&lt;-</span> <span style="color:#06287e">list</span>(<span style="color:#40a070">2</span>, <span style="color:#40a070">4</span>, <span style="color:#40a070">6</span>)<span style="color:#06287e">map2</span>(x, y, c)<span style="color:#60a0b0;font-style:italic">#&gt; [[1]]</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 1 2</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [[2]]</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 3 4</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [[3]]</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 5 6</span></code></pre></div><p><code>map2()</code> always returns a list. If you want to add together the corresponding values and store the result as a double vector, you can use <code>map2_dbl()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">map2_dbl</span>(x, y, `+`)<span style="color:#60a0b0;font-style:italic">#&gt; [1] 3 7 11</span></code></pre></div><p>Another map variant is <code>invoke_map()</code>, which takes a list of functions and list of arguments. 
It also has type-stable suffixes:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">spread <span style="color:#666">&lt;-</span> <span style="color:#06287e">list</span>(sd <span style="color:#666">=</span> sd, iqr <span style="color:#666">=</span> IQR, mad <span style="color:#666">=</span> mad)x <span style="color:#666">&lt;-</span> <span style="color:#06287e">rnorm</span>(<span style="color:#40a070">100</span>)<span style="color:#06287e">invoke_map_dbl</span>(spread, x <span style="color:#666">=</span> x)<span style="color:#60a0b0;font-style:italic">#&gt; sd iqr mad</span><span style="color:#60a0b0;font-style:italic">#&gt; 0.9121309 1.2515807 0.9774154</span></code></pre></div><h2 id="type-stable-flatten">Type-stable flatten</h2><p>Another situation when type-stability is important is flattening a nested list into a simpler data structure. Base R has <code>unlist()</code>, but it&rsquo;s dangerous because it always succeeds. 
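</p><p>A small illustration of that danger, using a toy list of our own: give <code>unlist()</code> mixed types and it silently coerces everything to the most general type instead of failing:</p><pre><code class="language-r">x &lt;- list(1L, &quot;a&quot;, TRUE)
# No error, no warning - the integer and logical are quietly
# coerced to character:
unlist(x)
#&gt; [1] &quot;1&quot;    &quot;a&quot;    &quot;TRUE&quot;</code></pre><p>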
As an alternative, purrr provides <code>flatten_lgl()</code>, <code>flatten_int()</code>, <code>flatten_dbl()</code>, and <code>flatten_chr()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">list</span>(<span style="color:#40a070">1L</span>, <span style="color:#40a070">2</span><span style="color:#666">:</span><span style="color:#40a070">3</span>, <span style="color:#40a070">4L</span>)x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; List of 3</span><span style="color:#60a0b0;font-style:italic">#&gt; $ : int 1</span><span style="color:#60a0b0;font-style:italic">#&gt; $ : int [1:2] 2 3</span><span style="color:#60a0b0;font-style:italic">#&gt; $ : int 4</span>x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">flatten</span>() <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; List of 4</span><span style="color:#60a0b0;font-style:italic">#&gt; $ : int 1</span><span style="color:#60a0b0;font-style:italic">#&gt; $ : int 2</span><span style="color:#60a0b0;font-style:italic">#&gt; $ : int 3</span><span style="color:#60a0b0;font-style:italic">#&gt; $ : int 4</span>x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">flatten_int</span>() <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; int [1:4] 1 2 3 4</span></code></pre></div><h2 id="type-stable-try">Type-stable <code>try()</code></h2><p>Another function in base R that is not type-stable is <code>try()</code>. 
<code>try()</code> ensures that an expression always succeeds, either returning the original value or the error message:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">str</span>(<span style="color:#06287e">try</span>(<span style="color:#06287e">log</span>(<span style="color:#40a070">10</span>)))<span style="color:#60a0b0;font-style:italic">#&gt; num 2.3</span><span style="color:#06287e">str</span>(<span style="color:#06287e">try</span>(<span style="color:#06287e">log</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>), silent <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>))<span style="color:#60a0b0;font-style:italic">#&gt; Class &#39;try-error&#39; atomic [1:1] Error in log(&#34;a&#34;) : non-numeric argument to mathematical function</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; ..- attr(*, &#34;condition&#34;)=List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ..$ message: chr &#34;non-numeric argument to mathematical function&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ..$ call : language log(&#34;a&#34;)</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ..- attr(*, &#34;class&#34;)= chr [1:3] &#34;simpleError&#34; &#34;error&#34; &#34;condition&#34;</span></code></pre></div><p><code>safely()</code> is a type-stable version of try. 
It always returns a list of two elements, the result and the error, and one will always be <code>NULL</code>.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">safely</span>(log)(<span style="color:#40a070">10</span>)<span style="color:#60a0b0;font-style:italic">#&gt; $result</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 2.302585</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; $error</span><span style="color:#60a0b0;font-style:italic">#&gt; NULL</span><span style="color:#06287e">safely</span>(log)(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; $result</span><span style="color:#60a0b0;font-style:italic">#&gt; NULL</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; $error</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;simpleError in .f(...): non-numeric argument to mathematical function&gt;</span></code></pre></div><p>Notice that <code>safely()</code> takes a function as input and returns a &ldquo;safe&rdquo; function, a function that never throws an error. 
A powerful technique is to use <code>safely()</code> and <code>map()</code> together to attempt an operation on each element of a list:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">safe_log <span style="color:#666">&lt;-</span> <span style="color:#06287e">safely</span>(log)x <span style="color:#666">&lt;-</span> <span style="color:#06287e">list</span>(<span style="color:#40a070">10</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span style="color:#40a070">5</span>)log_x <span style="color:#666">&lt;-</span> x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">map</span>(safe_log)<span style="color:#06287e">str</span>(log_x)<span style="color:#60a0b0;font-style:italic">#&gt; List of 3</span><span style="color:#60a0b0;font-style:italic">#&gt; $ :List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ result: num 2.3</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ error : NULL</span><span style="color:#60a0b0;font-style:italic">#&gt; $ :List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ result: NULL</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ error :List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ..$ message: chr &#34;non-numeric argument to mathematical function&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ..$ call : language .f(...)</span><span style="color:#60a0b0;font-style:italic">#&gt; .. 
..- attr(*, &#34;class&#34;)= chr [1:3] &#34;simpleError&#34; &#34;error&#34; &#34;condition&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; $ :List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ result: num 1.61</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ error : NULL</span></code></pre></div><p>This output is slightly inconvenient because you&rsquo;d rather have a list of three results, and another list of three errors. You can use the new <code>transpose()</code> function to switch the order of the first and second levels in the hierarchy:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">log_x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">transpose</span>() <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; $ result:List of 3</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ : num 2.3</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ : NULL</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ : num 1.61</span><span style="color:#60a0b0;font-style:italic">#&gt; $ error :List of 3</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ : NULL</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ :List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ..$ message: chr &#34;non-numeric argument to mathematical function&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ..$ call : language .f(...)</span><span style="color:#60a0b0;font-style:italic">#&gt; .. 
..- attr(*, &#34;class&#34;)= chr [1:3] &#34;simpleError&#34; &#34;error&#34; &#34;condition&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ : NULL</span></code></pre></div><p>This makes it easy to extract the inputs where the original functions failed, or just keep the successful results:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">results <span style="color:#666">&lt;-</span> x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">map</span>(safe_log) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">transpose</span>()(ok <span style="color:#666">&lt;-</span> results<span style="color:#666">$</span>error <span style="color:#666">%&gt;%</span> <span style="color:#06287e">map_lgl</span>(is_null))<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE FALSE TRUE</span>(bad_inputs <span style="color:#666">&lt;-</span> x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">discard</span>(ok))<span style="color:#60a0b0;font-style:italic">#&gt; [[1]]</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;a&#34;</span>(successes <span style="color:#666">&lt;-</span> results<span style="color:#666">$</span>result <span style="color:#666">%&gt;%</span> <span style="color:#06287e">keep</span>(ok) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">flatten_dbl</span>())<span style="color:#60a0b0;font-style:italic">#&gt; [1] 2.302585 1.609438</span></code></pre></div></description></item><item><title>ggplot2 2.0.0</title><link>https://www.rstudio.com/blog/ggplot2-2-0-0/</link><pubDate>Mon, 21 Dec 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/ggplot2-2-0-0/</guid><description><p>I&rsquo;m very pleased to announce the release of ggplot2 2.0.0. 
I know I promised <a href="https://blog.rstudio.com/2015/01/09/ggplot2-updates/">that there wouldn&rsquo;t be any more updates</a>, but while working on the 2nd edition of the ggplot2 book, I just couldn&rsquo;t stop myself from fixing some long-standing problems.</p><p>On the scale of ggplot2 releases, this one is huge with over one hundred fixes and improvements. This might break some of your existing code (although I&rsquo;ve tried to minimise breakage as much as possible), but I hope the new features make up for any short term hassle. This blog post documents the most important changes:</p><ul><li><p>ggplot2 now has an official extension mechanism.</p></li><li><p>There are a handful of new geoms, and updates to existing geoms.</p></li><li><p>The default appearance has been thoroughly tweaked so most plots should look better.</p></li><li><p>Facets have a much richer set of labelling options.</p></li><li><p>The documentation has been overhauled to be more helpful, and require less navigation across multiple pages.</p></li><li><p>A number of older and less used features have been deprecated.</p></li></ul><p>These are described in more detail below. See the <a href="https://github.com/hadley/ggplot2/releases/tag/v2.0.0">release notes</a> for a complete list of all changes.</p><h2 id="extensibility">Extensibility</h2><p>Perhaps the biggest news in this release is that ggplot2 now has an official extension mechanism. This means that others can now easily create their own stats, geoms and positions, and provide them in other packages. This should allow the ggplot2 community to flourish, even as less development work happens in ggplot2 itself. See <a href="https://cran.r-project.org/web/packages/ggplot2/vignettes/extending-ggplot2.html"><code>vignette(&quot;extending-ggplot2&quot;)</code></a> for details.</p><p>Coupled with this change, ggplot2 no longer uses proto or reference classes. Instead, we now use ggproto, a new OO system designed specifically for ggplot2. 
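To give a flavour (a minimal example in the style of the extensibility vignette): a ggproto object bundles fields and methods, and each method receives <code>self</code> explicitly:<div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(ggplot2)
# a tiny stateful ggproto object: a field plus a method that updates it
Adder <span style="color:#666">&lt;-</span> ggproto(<span style="color:#4070a0">&#34;Adder&#34;</span>, NULL,
  x = <span style="color:#40a070">0</span>,
  add = function(self, n) {
    self$x <span style="color:#666">&lt;-</span> self$x + n
    self$x
  }
)
Adder$add(<span style="color:#40a070">10</span>)
#&gt; [1] 10</code></pre></div>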
Unlike proto and RC, ggproto supports clean cross-package inheritance, which is necessary for extensibility. Creating a new OO system isn&rsquo;t usually the right solution, but I&rsquo;m pretty sure it was necessary here. Read more about it in the vignette.</p><h2 id="new-and-updated-geoms">New and updated geoms</h2><ul><li>ggplot no longer throws an error if your plot has no layers. Instead it automatically adds <code>geom_blank()</code>:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(cyl, hwy))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-2-1.png" alt=""></p><ul><li><code>geom_count()</code> (a new alias for the old <code>stat_sum()</code>) counts the number of points at unique locations on a scatterplot, and maps the size of the point to the count:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(cty, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>()<span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(cty, hwy)) <span style="color:#666">+</span><span style="color:#06287e">geom_count</span>()</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-3-1.png" alt=""><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-3-2.png" alt=""></p><ul><li><code>geom_curve()</code> draws curved lines in the same way that <code>geom_segment()</code> draws straight lines:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span 
style="color:#666">&lt;-</span> <span style="color:#06287e">expand.grid</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">2</span>, y <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">2</span>)<span style="color:#06287e">ggplot</span>(df, <span style="color:#06287e">aes</span>(x, y, xend <span style="color:#666">=</span> x <span style="color:#666">+</span> <span style="color:#40a070">0.5</span>, yend <span style="color:#666">=</span> y <span style="color:#666">+</span> <span style="color:#40a070">0.5</span>)) <span style="color:#666">+</span><span style="color:#06287e">geom_curve</span>(<span style="color:#06287e">aes</span>(colour <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">curve&#34;</span>)) <span style="color:#666">+</span><span style="color:#06287e">geom_segment</span>(<span style="color:#06287e">aes</span>(colour <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">segment&#34;</span>))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-4-1.png" alt=""></p><ul><li><code>geom_bar()</code> now behaves differently from <code>geom_histogram()</code>. 
Instead of binning the data, it counts the number of unique observations at each location:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(cyl)) <span style="color:#666">+</span><span style="color:#06287e">geom_bar</span>()<span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(cyl)) <span style="color:#666">+</span><span style="color:#06287e">geom_histogram</span>(binwidth <span style="color:#666">=</span> <span style="color:#40a070">1</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-5-1.png" alt=""><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-5-2.png" alt=""></p><p>If you got into the (bad) habit of using <code>geom_histogram()</code> to create bar charts, or <code>geom_bar()</code> to create histograms, you&rsquo;ll need to switch.</p><ul><li><p>Layers are now much stricter about their arguments - you will get an error if you&rsquo;ve supplied an argument that isn&rsquo;t an aesthetic or a parameter. This breaks the handful of geoms/stats that used <code>...</code> to pass additional arguments on to the underlying computation. Now <code>geom_smooth()</code>/<code>stat_smooth()</code> and <code>geom_quantile()</code>/<code>stat_quantile()</code> use <code>method.args</code> instead; and <code>stat_summary()</code>, <code>stat_summary_hex()</code>, and <code>stat_summary2d()</code> use <code>fun.args</code>. This is likely to cause some short-term pain but in the long-term it will make it much easier to spot spelling mistakes and other errors.</p></li><li><p><code>geom_text()</code> has been overhauled to make labelling your data a little easier. You can use <code>nudge_x</code> and <code>nudge_y</code> arguments to offset labels from their corresponding points. 
<code>check_overlap = TRUE</code> provides a simple way to avoid overplotting of labels: labels that would otherwise overlap are omitted.</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mtcars, <span style="color:#06287e">aes</span>(wt, mpg, label <span style="color:#666">=</span> <span style="color:#06287e">rownames</span>(mtcars))) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">geom_text</span>(nudge_y <span style="color:#666">=</span> <span style="color:#40a070">0.5</span>, check_overlap <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-6-1.png" alt=""></p><p>(Labelling points well is still a huge pain, but at least these new features make life a little better.)</p><ul><li><code>geom_label()</code> works like <code>geom_text()</code> but draws a rounded rectangle underneath each label:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">grid <span style="color:#666">&lt;-</span> <span style="color:#06287e">expand.grid</span>(x <span style="color:#666">=</span> <span style="color:#06287e">seq</span>(<span style="color:#666">-</span><span style="color:#007020;font-weight:bold">pi</span>, <span style="color:#007020;font-weight:bold">pi</span>, length <span style="color:#666">=</span> <span style="color:#40a070">50</span>),y <span style="color:#666">=</span> <span style="color:#06287e">seq</span>(<span style="color:#666">-</span><span style="color:#007020;font-weight:bold">pi</span>, <span style="color:#007020;font-weight:bold">pi</span>, length <span style="color:#666">=</span> <span 
style="color:#40a070">50</span>)) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">mutate</span>(r <span style="color:#666">=</span> x ^ <span style="color:#40a070">2</span> <span style="color:#666">+</span> y ^ <span style="color:#40a070">2</span>, z <span style="color:#666">=</span> <span style="color:#06287e">cos</span>(r ^ <span style="color:#40a070">2</span>) <span style="color:#666">*</span> <span style="color:#06287e">exp</span>(<span style="color:#666">-</span>r <span style="color:#666">/</span> <span style="color:#40a070">6</span>))<span style="color:#06287e">ggplot</span>(grid, <span style="color:#06287e">aes</span>(x, y)) <span style="color:#666">+</span><span style="color:#06287e">geom_raster</span>(<span style="color:#06287e">aes</span>(fill <span style="color:#666">=</span> z)) <span style="color:#666">+</span><span style="color:#06287e">geom_label</span>(data <span style="color:#666">=</span> <span style="color:#06287e">data.frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">0</span>, y <span style="color:#666">=</span> <span style="color:#40a070">0</span>), label <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Center&#34;</span>) <span style="color:#666">+</span><span style="color:#06287e">theme</span>(legend.position <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">none&#34;</span>) <span style="color:#666">+</span><span style="color:#06287e">coord_fixed</span>()</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-7-1.png" alt=""></p><ul><li><code>aes_()</code> replaces <code>aes_q()</code>, and works like the SE functions in dplyr and my other recent packages. It supports formulas, so the most concise SE version of <code>aes(carat, price)</code> is now <code>aes_(~carat, ~price)</code>. 
You may want to use this form in packages, as it will avoid spurious <code>R CMD check</code> warnings about undefined global variables.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes_</span>(<span style="color:#666">~</span>displ, <span style="color:#666">~</span>cty)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>()<span style="color:#60a0b0;font-style:italic"># Same as</span><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ, cty)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>()</code></pre></div><h2 id="appearance">Appearance</h2><p>I&rsquo;ve made a number of small tweaks to the default appearance:</p><ul><li><p>The default <code>theme_grey()</code> background colour has been changed from &ldquo;grey90&rdquo; to &ldquo;grey92&rdquo;: this makes the background a little less visually prominent.</p></li><li><p>Labels and titles have been tweaked for readability. Axis labels are darker, and legend titles get the same visual treatment as axis labels.</p></li><li><p>The default font size dropped from 12 to 11. You might be surprised that I&rsquo;ve made the default text size smaller as it was already hard for many people to read. It turns out there was a bug in RStudio (<a href="https://www.rstudio.com/products/rstudio/download/preview/">fixed in 0.99.724</a>), that shrunk the text of all grid based graphics. Once that was resolved the defaults seemed too big to my eyes.</p></li><li><p><code>scale_size()</code> now maps values to <em>area</em>, not radius. Use <code>scale_radius()</code> if you want the old behaviour (not recommended, except perhaps for lines). 
Continue to use <code>scale_size_area()</code> if you want 0 values to have 0 area.</p></li><li><p>Bar and rectangle legends no longer get a diagonal line. Instead, the border has been tweaked to make it visible, and more closely match the size of line drawn on the plot.</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(<span style="color:#06287e">factor</span>(cyl), fill <span style="color:#666">=</span> drv)) <span style="color:#666">+</span><span style="color:#06287e">geom_bar</span>(colour <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">black&#34;</span>, size <span style="color:#666">=</span> <span style="color:#40a070">1</span>) <span style="color:#666">+</span><span style="color:#06287e">coord_flip</span>()</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-9-1.png" alt=""></p><ul><li><p><code>geom_point()</code> now uses shape 19 instead of 16. This looks much better on the default Linux graphics device. (It&rsquo;s very slightly smaller than the old point, but it shouldn&rsquo;t affect any graphics significantly). You can now control the width of the outline on shapes 21-25 with the <code>stroke</code> parameter.</p></li><li><p>The default legend will now allocate multiple rows (if vertical) or columns (if horizontal) in order to make a legend that is more likely to fit on the screen. 
You can override with the <code>nrow</code>/<code>ncol</code> arguments to <code>guide_legend()</code></p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">p <span style="color:#666">&lt;-</span> <span style="color:#06287e">ggplot</span>(mpg, <span style="color:#06287e">aes</span>(displ,hwy, colour <span style="color:#666">=</span> manufacturer)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">theme</span>(legend.position <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">bottom&#34;</span>)p<span style="color:#60a0b0;font-style:italic"># Revert back to previous behaviour</span>p <span style="color:#666">+</span> <span style="color:#06287e">guides</span>(colour <span style="color:#666">=</span> <span style="color:#06287e">guide_legend</span>(nrow <span style="color:#666">=</span> <span style="color:#40a070">1</span>))</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-10-1.png" alt=""><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-10-2.png" alt=""></p><ul><li>Two new themes were contributed by <a href="http://github.com/jiho">Jean-Olivier Irisson</a>: <code>theme_void()</code> is completely empty and <code>theme_dark()</code> has a dark background designed to make colours pop out.</li></ul><h2 id="facet-labels">Facet labels</h2><p>Thanks to the work of <a href="https://github.com/lionel-">Lionel Henry</a>, facet labels have received three major improvements:</p><ol><li><p>You can switch the position of facet labels so they&rsquo;re next to the axes.</p></li><li><p><code>facet_wrap()</code> now supports custom labellers.</p></li><li><p>You can create combined labels when facetting by multiple variables.</p></li></ol><h3 id="switching-the-labels">Switching the 
labels</h3><p>The new <code>switch</code> argument allows you to switch the labels to display near the axes:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">data <span style="color:#666">&lt;-</span> <span style="color:#06287e">transform</span>(mtcars,am <span style="color:#666">=</span> <span style="color:#06287e">factor</span>(am, levels <span style="color:#666">=</span> <span style="color:#40a070">0</span><span style="color:#666">:</span><span style="color:#40a070">1</span>, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Automatic&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Manual&#34;</span>)),gear <span style="color:#666">=</span> <span style="color:#06287e">factor</span>(gear, levels <span style="color:#666">=</span> <span style="color:#40a070">3</span><span style="color:#666">:</span><span style="color:#40a070">5</span>, labels <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Three&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Four&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Five&#34;</span>)))<span style="color:#06287e">ggplot</span>(data, <span style="color:#06287e">aes</span>(mpg, disp)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">facet_grid</span>(am <span style="color:#666">~</span> gear, switch <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">both&#34;</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-11-1.png" alt=""></p><p>This is especially useful when the labels directly characterise the axes. 
In that situation, switching the labels can make the plot clearer and more readable. You may also want to use a neutral label background by setting <code>strip.background</code> to <code>element_blank()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">data <span style="color:#666">&lt;-</span> mtcars <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(Logarithmic <span style="color:#666">=</span> <span style="color:#06287e">log</span>(mpg),Inverse <span style="color:#666">=</span> <span style="color:#40a070">1</span> <span style="color:#666">/</span> mpg,Cubic <span style="color:#666">=</span> mpg ^ <span style="color:#40a070">3</span>,Original <span style="color:#666">=</span> mpg) <span style="color:#666">%&gt;%</span> tidyr<span style="color:#666">::</span><span style="color:#06287e">gather</span>(transformation, mpg2, Logarithmic<span style="color:#666">:</span>Original)<span style="color:#06287e">ggplot</span>(data, <span style="color:#06287e">aes</span>(mpg2, disp)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">facet_wrap</span>(<span style="color:#666">~</span>transformation, scales <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">free&#34;</span>, switch <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>) <span style="color:#666">+</span><span style="color:#06287e">theme</span>(strip.background <span style="color:#666">=</span> <span style="color:#06287e">element_blank</span>())</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-12-1.png" alt=""></p><h3 id="wrap-labeller">Wrap labeller</h3><p>A longstanding issue in ggplot was that <code>facet_wrap()</code> did not support 
custom labellers. Labellers are small functions that make it easy to customise the labels. You can now supply labellers to both wrap and grid facets:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(data, <span style="color:#06287e">aes</span>(mpg2, disp)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">facet_wrap</span>(<span style="color:#666">~</span>transformation, scales <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">free&#34;</span>, labeller <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">label_both&#34;</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-13-1.png" alt=""></p><h3 id="composite-margins">Composite margins</h3><p>Labellers now have better support for composite margins when you facet over multiple variables with <code>+</code>. All labellers gain a <code>multi_line</code> argument to control whether labels should be displayed as a single line or over multiple lines, one for each factor.</p><p>The labellers still work the same way except for <code>label_bquote()</code>. That labeller makes it easy to write mathematical expressions involving the values of facetted factors. Historically, <code>label_bquote()</code> could only specify a single expression for all margins and factors. The factor value was referred to via the backquoted placeholder <code>.(x)</code>. Now that it supports expressions combining multiple factors, you must backquote the variable names themselves. 
In addition, you can provide different expressions for each margin:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">my_labeller <span style="color:#666">&lt;-</span> <span style="color:#06287e">label_bquote</span>(rows <span style="color:#666">=</span> .(am) <span style="color:#666">/</span> alpha,cols <span style="color:#666">=</span> .(vs) ^ .(cyl))<span style="color:#06287e">ggplot</span>(mtcars, <span style="color:#06287e">aes</span>(wt, mpg)) <span style="color:#666">+</span><span style="color:#06287e">geom_point</span>() <span style="color:#666">+</span><span style="color:#06287e">facet_grid</span>(am <span style="color:#666">~</span> vs <span style="color:#666">+</span> cyl, labeller <span style="color:#666">=</span> my_labeller)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/12/unnamed-chunk-14-1.png" alt=""></p><h2 id="documentation">Documentation</h2><p>I&rsquo;ve given the documentation a thorough overhaul:</p><ul><li><p>Tightly linked geoms and stats (e.g. <code>geom_boxplot()</code> and <code>stat_boxplot()</code>) are now documented in the same file so you can see all the arguments in one place. Similarly, variations on a theme (like <code>geom_path()</code>, <code>geom_line()</code>, and <code>geom_step()</code>) are documented together.</p></li><li><p>I&rsquo;ve tried to reduce the use of <code>...</code> so that you can see all the documentation in one place rather than having to follow links around. In some cases this has involved adding additional arguments to geoms to make it more clear what you can do.</p></li><li><p>Thanks to <a href="https://github.com/hrbrmstr">Bob Rudis</a>, the use of <code>qplot()</code> in examples has been greatly reduced. 
This is in line with the 2nd edition of the ggplot2 book, which eliminates <code>qplot()</code> in favour of <code>ggplot()</code>.</p></li></ul><h2 id="deprecated-features">Deprecated features</h2><ul><li><p>The <code>order</code> aesthetic is officially deprecated. It never really worked, and was poorly documented.</p></li><li><p>The <code>stat</code> and <code>position</code> arguments to <code>qplot()</code> have been deprecated. <code>qplot()</code> is designed for quick plots - if you need to specify position or stat, use <code>ggplot()</code> instead.</p></li><li><p>The theme setting <code>axis.ticks.margin</code> has been deprecated: now use the margin property of <code>axis.ticks</code>.</p></li><li><p><code>stat_abline()</code>, <code>stat_hline()</code> and <code>stat_vline()</code> have been removed: these were never suitable for use other than with their corresponding geoms and were not documented.</p></li><li><p><code>show_guide</code> has been renamed to <code>show.legend</code>: this more accurately reflects what it does (controls appearance of layer in legend), and uses the same convention as other ggplot2 arguments (i.e. a <code>.</code> between names). (Yes, I know that&rsquo;s inconsistent with function names (which use <code>_</code>) but it&rsquo;s too late to change now.)</p></li></ul><p>A number of geoms have been renamed to be more consistent. The previous names will continue to work for the foreseeable future, but you should switch to the new names for new work.</p><ul><li><p><code>stat_binhex()</code> and <code>stat_bin2d()</code> have been renamed to <code>stat_bin_hex()</code> and <code>stat_bin_2d()</code>. 
<code>stat_summary2d()</code> has been renamed to <code>stat_summary_2d()</code>, and <code>geom_density2d()</code>/<code>stat_density2d()</code> have been renamed to <code>geom_density_2d()</code>/<code>stat_density_2d()</code>.</p></li><li><p><code>stat_spoke()</code> is now <code>geom_spoke()</code> since I realised it&rsquo;s a reparameterisation of <code>geom_segment()</code>.</p></li><li><p><code>stat_bindot()</code> has been removed because it&rsquo;s so tightly coupled to <code>geom_dotplot()</code>. If you happened to use <code>stat_bindot()</code>, just change to <code>geom_dotplot()</code>.</p></li></ul><p>All defunct functions have been removed.</p></description></item><item><title>svglite 1.0.0</title><link>https://www.rstudio.com/blog/svglite-1-0-0/</link><pubDate>Thu, 10 Dec 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/svglite-1-0-0/</guid><description><p>I&rsquo;m pleased to announce a new package for producing <a href="https://en.wikipedia.org/wiki/Scalable_Vector_Graphics">SVG</a>s from R: <a href="http://github.com/hadley/svglite">svglite</a>. This package is a fork of <a href="https://github.com/mdecorde">Matthieu Decorde</a>&rsquo;s RSvgDevice and wouldn&rsquo;t be possible without his hard work. I&rsquo;d also like to thank <a href="https://github.com/davidgohel">David Gohel</a>, who wrote the gdtools package: it solves all the hardest problems associated with making good SVGs from R.</p><p>Today, most browsers have good support for SVG and it is a great way of displaying vector graphics on the web. Unfortunately, R&rsquo;s built-in <code>svg()</code> device is focussed on high quality rendering, not size or speed. It renders text as individual polygons: this ensures a graphic will look exactly the same regardless of what fonts you have installed, but makes output considerably larger (and harder to edit in other tools). 
svglite produces hand-optimised SVG that is as small as possible.</p><h2 id="features">Features</h2><p>svglite is a complete graphics device: that means you can give it any graphic and it will look the same as the equivalent <code>.pdf</code> or <code>.png</code>. Please <a href="https://github.com/hadley/svglite/issues">file an issue</a> if you discover a plot that doesn&rsquo;t look right.</p><h2 id="use">Use</h2><p>In an interactive session, you use it like any other R graphics device:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">svglite<span style="color:#666">::</span><span style="color:#06287e">svglite</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">myfile.svg&#34;</span>)
<span style="color:#06287e">plot</span>(<span style="color:#06287e">runif</span>(<span style="color:#40a070">10</span>), <span style="color:#06287e">runif</span>(<span style="color:#40a070">10</span>))
<span style="color:#06287e">dev.off</span>()</code></pre></div><p>If you want to use it in knitr, just set your chunk options as follows:</p><pre><code>```{r setup, include = FALSE}
library(svglite)
knitr::opts_chunk$set(dev = &quot;svglite&quot;, fig.ext = &quot;.svg&quot;)
```</code></pre><p>(Thanks to Bob Rudis for <a href="https://twitter.com/hrbrmstr/status/662708164597563392">the tip</a>)</p><p>There are also a few helper functions:</p><ul><li><p><code>htmlSVG()</code> makes it easy to preview the SVG in RStudio.</p></li><li><p><code>editSVG()</code> opens the SVG file in your default SVG editor.</p></li><li><p><code>xmlSVG()</code> returns the SVG as an <a href="http://github.com/hadley/xml2">xml2</a> object.</p></li></ul></description></item><item><title>Register for Hadley Wickham's Master R Developer Workshop - San Francisco</title><link>https://www.rstudio.com/blog/register-for-hadley-wickhams-master-r-developer-workshop-san-francisco/</link><pubDate>Wed, 09 Dec 2015 
00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/register-for-hadley-wickhams-master-r-developer-workshop-san-francisco/</guid><description><p>Are you ready to upgrade your R skills? <a href="https://www.eventbrite.com/e/master-r-developer-workshop-san-francisco-tickets-18345399584">Register soon to secure your seat</a>.</p><p>On January 28 and 29, 2016, Hadley Wickham will teach his popular Master R Developer Workshop at <a href="https://www.google.com/maps/place/The+Westin+San+Francisco+Airport/@37.603403,-122.376052,15z/data=!4m2!3m1!1s0x0:0xffffcdd47141a784">the Westin San Francisco Airport</a>. The workshop is offered only 3 times a year and the San Francisco class is already nearly 50% full. This is the only Master R Developer Workshop Hadley is planning for the US West Coast in 2016.</p><p>We look forward to seeing you there!</p></description></item><item><title>RStudio Essentials Webinar Series</title><link>https://www.rstudio.com/blog/rstudio-essentials-webinar-series/</link><pubDate>Tue, 01 Dec 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-essentials-webinar-series/</guid><description><p>The RStudio IDE is bursting with capabilities and features. Do you know how to use them all? Tomorrow, we begin an &ldquo;RStudio Essentials&rdquo; webinar series. This will be the perfect way to learn how to use the IDE to its fullest. The series is broken into six sections, each on a Wednesday at 11 a.m. 
EDT:</p><ul><li><p><a href="http://pages.rstudio.net/Webinar-Dec2015Series_Registration.html">Programming Part 1 (Writing code in RStudio)</a> - December 2nd</p></li><li><p><a href="http://pages.rstudio.net/Webinar-Dec2015Series_Registration.html">Programming Part 2 (Debugging code in RStudio)</a> - December 9th</p></li><li><p><a href="http://pages.rstudio.net/Webinar-Dec2015Series_Registration.html">Programming Part 3 (Package Writing in RStudio)</a> - December 16th</p></li><li><p><a href="http://pages.rstudio.net/Webinar-Dec2015Series_Registration.html">Managing Change Part 1 (Projects in RStudio)</a> - January 6th</p></li><li><p><a href="http://pages.rstudio.net/Webinar-Dec2015Series_Registration.html">Managing Change Part 2 (Github and RStudio)</a> - January 20th</p></li><li><p><a href="http://pages.rstudio.net/Webinar-Dec2015Series_Registration.html">Managing Change Part 3 (Package versioning with Packrat)</a> - February 3rd</p></li></ul><p>Each webinar will be 30 minutes long, making them easy to attend. If you miss a live webinar or want to review them, recorded versions will be available to registrants. Register <a href="http://pages.rstudio.net/Webinar-Dec2015Series_Registration.html">here</a>.</p><p>p.s. Don&rsquo;t forget that you can watch many useful past webinars at our <a href="https://www.rstudio.com/resources/webinars/">webinars archive</a>.</p></description></item><item><title>roxygen2 5.0.0</title><link>https://www.rstudio.com/blog/roxygen2-5-0-0/</link><pubDate>Thu, 29 Oct 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/roxygen2-5-0-0/</guid><description><p>roxygen2 5.0.0 is now available on CRAN. roxygen2 helps you document your packages by turning specially formatted inline comments into R&rsquo;s standard Rd format. 
Learn more at <a href="http://r-pkgs.had.co.nz/man.html">http://r-pkgs.had.co.nz/man.html</a>.</p><p>In this release:</p><ul><li><p>Roxygen records its version in a single place: the <code>RoxygenNote</code> field in your <code>DESCRIPTION</code>. This should make it easier to see what&rsquo;s changed when you upgrade roxygen2, because only files with differences will be modified. Previously every Rd file was modified to update the version number.</p></li><li><p>You can now easily document functions that you&rsquo;ve imported from another package:</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#&#39; @importFrom magrittr %&gt;%</span><span style="color:#60a0b0;font-style:italic">#&#39; @export</span>magrittr<span style="color:#666">::</span>`%&gt;%`</code></pre></div><p>All imported-and-re-exported functions will be documented in the same file (<code>rexports.Rd</code>), with a brief description and links to the original documentation.</p><ul><li>You can more easily generate package documentation by documenting the special string &ldquo;_PACKAGE&rdquo;:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#&#39; @details Details</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">_PACKAGE&#34;</span></code></pre></div><p>The title and description will be automatically filled in from the <code>DESCRIPTION</code>.</p><ul><li><p>New tags <code>@rawRd</code> and <code>@rawNamespace</code> allow you to insert raw (unescaped) text in Rd and the <code>NAMESPACE</code>. <code>@evalRd()</code> is similar, but instead of literal Rd, you give it R code that produces literal Rd code when run. 
This should make it easier to experiment with new types of output.</p></li><li><p>Roxygen2 now parses the source code files in the order specified in the <code>Collate</code> field in <code>DESCRIPTION</code>. This improves the ordering of the generated documentation when using <code>@describeIn</code> and/or <code>@rdname</code> split across several <code>.R</code> files, as often happens when working with S4.</p></li><li><p>The parser has been completely rewritten in C++. This gives a nice performance boost and improves the error messages: you now get the line number of the tag, not the start of the block.</p></li><li><p><code>@family</code> now cross-links each manual page only once, instead of linking to all aliases.</p></li></ul><p>There were many other minor improvements and bug fixes; please see the <a href="https://github.com/klutometis/roxygen/releases/tag/v5.0.0">release notes</a> for a complete list. A big thanks goes to all the <a href="https://github.com/klutometis/roxygen/graphs/contributors?from=2015-06-04&amp;to=2015-10-29&amp;type=c">contributors</a> who made this release possible.</p></description></item><item><title>Shiny Developer Conference | Stanford University | January 2016</title><link>https://www.rstudio.com/blog/shiny-developer-conference-stanford-university-january-2016/</link><pubDate>Thu, 29 Oct 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-developer-conference-stanford-university-january-2016/</guid><description><p><em><strong>Update Nov 2 2015:</strong> Wow, that was fast. Registration is full. If you add yourself to the <a href="https://www.eventbrite.com/waitlist?eid=19153967031&amp;tid=0">waitlist</a>, we&rsquo;ll contact you first if/when we do this conference again.</em></p><p>In the three years since we launched Shiny, our focus has been on helping people get started with Shiny. 
But there&rsquo;s a huge difference between using Shiny and using it <em>well</em>, and we want to start getting serious about helping people use Shiny most effectively. It&rsquo;s the difference between having apps that merely work, and apps that are performant, robust, and maintainable.</p><p>That&rsquo;s why RStudio is thrilled to announce the first ever Shiny Developer Conference, to be held at Stanford University on January 30-31, 2016, three months from today. We&rsquo;ll skip past the basics, and dig into principles and practices that will simultaneously simplify and improve the robustness of your code. We&rsquo;ll introduce you to some brand new tools we&rsquo;ve created to help you build ever larger and more complex apps. And we&rsquo;ll show you what to do if things go wrong.</p><p>Check out the <a href="http://shiny2016.eventbrite.com/">agenda</a> to see the complete lineup of speakers and talks.</p><p>We&rsquo;re capping the conference at just 90 people, so if you&rsquo;d like to level up your Shiny skills, register now at <a href="http://shiny2016.eventbrite.com">http://shiny2016.eventbrite.com</a>.</p><p>Hope to see you there!</p><hr><p><em>Note that this conference is intended for R users who are already comfortable writing Shiny apps. We won&rsquo;t cover the basics of Shiny app creation at all. If you&rsquo;re looking to get started with Shiny, please see our <a href="https://shiny.rstudio.com/tutorial/">tutorial</a>.</em></p></description></item><item><title>readr 0.2.0</title><link>https://www.rstudio.com/blog/readr-0-2-0/</link><pubDate>Wed, 28 Oct 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/readr-0-2-0/</guid><description><p>readr 0.2.0 is now available on CRAN. readr makes it easy to read many types of tabular data, including csv, tsv and fixed width. 
Compared to base equivalents like <code>read.csv()</code>, readr is much faster and gives more convenient output: it never converts strings to factors, can parse date/times, and it doesn&rsquo;t munge the column names.</p><p>This is a big release, so below I describe the new features divided into four main categories:</p><ul><li><p>Improved support for international data.</p></li><li><p>Column parsing improvements.</p></li><li><p>File parsing improvements, including support for comments.</p></li><li><p>Improved writers.</p></li></ul><p>There were too many minor improvements and bug fixes to describe in detail here. See the <a href="https://github.com/hadley/readr/releases/tag/v0.2.0">release notes</a> for a complete list.</p><h2 id="internationalisation">Internationalisation</h2><p>readr now has a strategy for dealing with settings that vary across languages and localities: <strong>locales</strong>. A locale, created with <code>locale()</code>, includes:</p><ul><li><p>The names of months and days, used when parsing dates.</p></li><li><p>The default time zone, used when parsing datetimes.</p></li><li><p>The character encoding, used when reading non-ASCII strings.</p></li><li><p>Default date format, used when guessing column types.</p></li><li><p>The decimal and grouping marks, used when reading numbers.</p></li></ul><p>I&rsquo;ll cover the most important of these parameters below. For more details, see <code>vignette(&quot;locales&quot;)</code>. To override the default US-centric locale, you pass a custom locale to <code>read_csv()</code>, <code>read_tsv()</code>, or <code>read_fwf()</code>. 
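For example, here is a minimal sketch of a custom locale in action (the file name <code>ventes.csv</code> and its contents are hypothetical); <code>read_csv2()</code> is readr&rsquo;s semicolon-delimited variant, which pairs naturally with comma decimal marks:</p><pre><code class="language-r"># A French locale: French month/day names, comma as decimal mark,
# period as grouping mark, and Paris as the default time zone
french &lt;- locale(&quot;fr&quot;, decimal_mark = &quot;,&quot;, grouping_mark = &quot;.&quot;, tz = &quot;Europe/Paris&quot;)
read_csv2(&quot;ventes.csv&quot;, locale = french)</code></pre><p>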
Rather than showing those functions here, I&rsquo;ll use the <code>parse_*()</code> functions because they work with character vectors instead of files, but are otherwise identical.</p><h3 id="names-of-months-and-days">Names of months and days</h3><p>The first argument to <code>locale()</code> is <code>date_names</code> which controls what values are used for month and day names. The easiest way to specify them is with an ISO 639 language code:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">locale</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">ko&#34;</span>) <span style="color:#60a0b0;font-style:italic"># Korean</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;locale&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; Numbers: 123,456.78</span><span style="color:#60a0b0;font-style:italic">#&gt; Formats: %Y%.%m%.%d / %H:%M</span><span style="color:#60a0b0;font-style:italic">#&gt; Timezone: UTC</span><span style="color:#60a0b0;font-style:italic">#&gt; Encoding: UTF-8</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;date_names&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; Days: 일요일 (일), 월요일 (월), 화요일 (화), 수요일 (수), 목요일 (목),</span><span style="color:#60a0b0;font-style:italic">#&gt; 금요일 (금), 토요일 (토)</span><span style="color:#60a0b0;font-style:italic">#&gt; Months: 1월, 2월, 3월, 4월, 5월, 6월, 7월, 8월, 9월, 10월, 11월, 12월</span><span style="color:#60a0b0;font-style:italic">#&gt; AM/PM: 오전/오후</span><span style="color:#06287e">locale</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">fr&#34;</span>) <span style="color:#60a0b0;font-style:italic"># French</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;locale&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; Numbers: 123,456.78</span><span 
style="color:#60a0b0;font-style:italic">#&gt; Formats: %Y%.%m%.%d / %H:%M</span><span style="color:#60a0b0;font-style:italic">#&gt; Timezone: UTC</span><span style="color:#60a0b0;font-style:italic">#&gt; Encoding: UTF-8</span><span style="color:#60a0b0;font-style:italic">#&gt; &lt;date_names&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; Days: dimanche (dim.), lundi (lun.), mardi (mar.), mercredi (mer.),</span><span style="color:#60a0b0;font-style:italic">#&gt; jeudi (jeu.), vendredi (ven.), samedi (sam.)</span><span style="color:#60a0b0;font-style:italic">#&gt; Months: janvier (janv.), février (févr.), mars (mars), avril (avr.), mai</span><span style="color:#60a0b0;font-style:italic">#&gt; (mai), juin (juin), juillet (juil.), août (août),</span><span style="color:#60a0b0;font-style:italic">#&gt; septembre (sept.), octobre (oct.), novembre (nov.),</span><span style="color:#60a0b0;font-style:italic">#&gt; décembre (déc.)</span><span style="color:#60a0b0;font-style:italic">#&gt; AM/PM: AM/PM</span></code></pre></div><p>This allows you to parse dates in other languages:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">parse_date</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1 janvier 2015&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">%d %B %Y&#34;</span>, locale <span style="color:#666">=</span> <span style="color:#06287e">locale</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">fr&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;2015-01-01&#34;</span><span style="color:#06287e">parse_date</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">14 oct. 
1979&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">%d %b %Y&#34;</span>, locale <span style="color:#666">=</span> <span style="color:#06287e">locale</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">fr&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;1979-10-14&#34;</span></code></pre></div><h3 id="timezones">Timezones</h3><p>readr assumes that times are in <a href="https://en.wikipedia.org/wiki/Coordinated_Universal_Time">Coordinated Universal Time</a>, aka UTC. UTC is the best timezone for data because it doesn&rsquo;t have daylight savings. If your data isn&rsquo;t already in UTC, you&rsquo;ll need to supply a <code>tz</code> in the locale:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">parse_datetime</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2001-10-10 20:10&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;2001-10-10 20:10:00 UTC&#34;</span><span style="color:#06287e">parse_datetime</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2001-10-10 20:10&#34;</span>,locale <span style="color:#666">=</span> <span style="color:#06287e">locale</span>(tz <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Pacific/Auckland&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;2001-10-10 20:10:00 NZDT&#34;</span><span style="color:#06287e">parse_datetime</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2001-10-10 20:10&#34;</span>,locale <span style="color:#666">=</span> <span style="color:#06287e">locale</span>(tz <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Europe/Dublin&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; 
[1] &#34;2001-10-10 20:10:00 IST&#34;</span></code></pre></div><p>List all available time zones with <code>OlsonNames()</code>. If you&rsquo;re American, note that &ldquo;EST&rdquo; is not Eastern Standard Time – it&rsquo;s a Canadian time zone that doesn&rsquo;t have DST! Instead of relying on ambiguous abbreviations, use:</p><ul><li><p>PST/PDT = &ldquo;US/Pacific&rdquo;</p></li><li><p>CST/CDT = &ldquo;US/Central&rdquo;</p></li><li><p>MST/MDT = &ldquo;US/Mountain&rdquo;</p></li><li><p>EST/EDT = &ldquo;US/Eastern&rdquo;</p></li></ul><h3 id="default-formats">Default formats</h3><p>Locales also provide default date and time formats. The time format isn&rsquo;t currently used for anything, but the date format is used when guessing column types. The default date format is <code>%Y-%m-%d</code> because that&rsquo;s unambiguous:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">str</span>(<span style="color:#06287e">parse_guess</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2010-10-10&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; Date[1:1], format: &#34;2010-10-10&#34;</span></code></pre></div><p>If you&rsquo;re an American, you might want to use your illogical date system:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">str</span>(<span style="color:#06287e">parse_guess</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">01/02/2013&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; chr &#34;01/02/2013&#34;</span><span style="color:#06287e">str</span>(<span style="color:#06287e">parse_guess</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">01/02/2013&#34;</span>,locale <span style="color:#666">=</span> <span 
style="color:#06287e">locale</span>(date_format <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">%d/%m/%Y&#34;</span>)))<span style="color:#60a0b0;font-style:italic">#&gt; Date[1:1], format: &#34;2013-02-01&#34;</span></code></pre></div><h3 id="character-encoding">Character encoding</h3><p>All readr functions yield strings encoded in UTF-8. This encoding is the most likely to give good results in the widest variety of settings. By default, readr assumes that your input is also in UTF-8, which is less likely to be the case, especially when you&rsquo;re working with older datasets. To parse a dataset that&rsquo;s not in UTF-8, you need to supply an <code>encoding</code>. The following code creates a string encoded with latin1 (aka ISO-8859-1), and shows how it&rsquo;s different from the string encoded as UTF-8, and how to parse it with readr:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Émigré cause célèbre déjà vu.\n&#34;</span>y <span style="color:#666">&lt;-</span> stringi<span style="color:#666">::</span><span style="color:#06287e">stri_conv</span>(x, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">UTF-8&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Latin1&#34;</span>)<span style="color:#60a0b0;font-style:italic"># These strings look like they&#39;re identical:</span>x<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;Émigré cause célèbre déjà vu.\n&#34;</span>y<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;Émigré cause célèbre déjà vu.\n&#34;</span><span style="color:#06287e">identical</span>(x, y)<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span><span style="color:#60a0b0;font-style:italic"># But they have 
different encodings:</span><span style="color:#06287e">Encoding</span>(x)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;UTF-8&#34;</span><span style="color:#06287e">Encoding</span>(y)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;latin1&#34;</span><span style="color:#60a0b0;font-style:italic"># That means while they print the same, their raw (binary)</span><span style="color:#60a0b0;font-style:italic"># representation is actually rather different:</span><span style="color:#06287e">charToRaw</span>(x)<span style="color:#60a0b0;font-style:italic">#&gt; [1] c3 89 6d 69 67 72 c3 a9 20 63 61 75 73 65 20 63 c3 a9 6c c3 a8 62 72</span><span style="color:#60a0b0;font-style:italic">#&gt; [24] 65 20 64 c3 a9 6a c3 a0 20 76 75 2e 0a</span><span style="color:#06287e">charToRaw</span>(y)<span style="color:#60a0b0;font-style:italic">#&gt; [1] c9 6d 69 67 72 e9 20 63 61 75 73 65 20 63 e9 6c e8 62 72 65 20 64 e9</span><span style="color:#60a0b0;font-style:italic">#&gt; [24] 6a e0 20 76 75 2e 0a</span><span style="color:#60a0b0;font-style:italic"># readr expects strings to be encoded as UTF-8. 
If they&#39;re</span><span style="color:#60a0b0;font-style:italic"># not, you&#39;ll get weird characters</span><span style="color:#06287e">parse_character</span>(x)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;Émigré cause célèbre déjà vu.\n&#34;</span><span style="color:#06287e">parse_character</span>(y)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;\xc9migr\xe9 cause c\xe9l\xe8bre d\xe9j\xe0 vu.\n&#34;</span><span style="color:#60a0b0;font-style:italic"># If you know the encoding, supply it:</span><span style="color:#06287e">parse_character</span>(y, locale <span style="color:#666">=</span> <span style="color:#06287e">locale</span>(encoding <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">latin1&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;Émigré cause célèbre déjà vu.\n&#34;</span></code></pre></div><p>If you don&rsquo;t know what encoding the file uses, try <code>guess_encoding()</code>. 
It&rsquo;s not 100% perfect (as it&rsquo;s fundamentally a heuristic), but should at least get you pointed in the right direction:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">guess_encoding</span>(y)<span style="color:#60a0b0;font-style:italic">#&gt; encoding confidence</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 ISO-8859-2 0.4</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 ISO-8859-1 0.3</span><span style="color:#60a0b0;font-style:italic"># Note that the first guess produces a valid string,</span><span style="color:#60a0b0;font-style:italic"># but isn&#39;t correct:</span><span style="color:#06287e">parse_character</span>(y, locale <span style="color:#666">=</span> <span style="color:#06287e">locale</span>(encoding <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">ISO-8859-2&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;Émigré cause célčbre déjŕ vu.\n&#34;</span><span style="color:#60a0b0;font-style:italic"># But ISO-8859-1 is another name for latin1</span><span style="color:#06287e">parse_character</span>(y, locale <span style="color:#666">=</span> <span style="color:#06287e">locale</span>(encoding <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">ISO-8859-1&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;Émigré cause célèbre déjà vu.\n&#34;</span></code></pre></div><h3 id="numbers">Numbers</h3><p>Some countries use the decimal point, while others use the decimal comma. 
The <code>decimal_mark</code> option controls which readr uses when parsing doubles:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">parse_double</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1,23&#34;</span>, locale <span style="color:#666">=</span> <span style="color:#06287e">locale</span>(decimal_mark <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">,&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] 1.23</span></code></pre></div><p>The <code>grouping_mark</code> option describes which character is used to space groups of digits. Do you write <code>1,000,000</code>, <code>1.000.000</code>, <code>1 000 000</code>, or <code>1'000'000</code>? Specifying the grouping mark allows <code>parse_number()</code> to parse large numbers as they&rsquo;re commonly written:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">parse_number</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1,234.56&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] 1234.56</span><span style="color:#60a0b0;font-style:italic"># readr is smart enough to guess that if you&#39;re using , for</span><span style="color:#60a0b0;font-style:italic"># decimals then you&#39;re probably using . 
for grouping:</span><span style="color:#06287e">parse_number</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1.234,56&#34;</span>, locale <span style="color:#666">=</span> <span style="color:#06287e">locale</span>(decimal_mark <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">,&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] 1234.56</span></code></pre></div><h2 id="column-parsing-improvements">Column parsing improvements</h2><p>One of the most useful parts of readr is the column parsers: the tools that turn character input into usefully typed data frame columns. This process is now described more fully in a new vignette: <code>vignette(&quot;column-types&quot;)</code>. By default, column types are guessed by looking at the data. I&rsquo;ve made a number of tweaks to make it more likely that your code will load correctly the first time:</p><ul><li><p>readr now looks at the first 1000 rows (instead of just the first 100) when guessing column types: this only takes a fraction more time, but should hopefully yield better guesses for more inputs.</p></li><li><p><code>col_date()</code> and <code>col_datetime()</code> no longer recognise partial dates like 19, 1900, 1900-01. These triggered many false positives and, after re-reading the ISO8601 spec, I believe they actually refer to periods of time, so should not be parsed into a specific instant.</p></li><li><p><code>col_integer()</code> no longer recognises values starting with zeros (e.g. 0001), as these are often used as identifiers.</p></li><li><p><code>col_number()</code> will automatically recognise numbers containing the grouping mark (see below for more details).</p></li></ul><p>But you can override these defaults with the <code>col_types</code> argument. 
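For instance, the leading-zero tweak above means identifier-like columns now stay as character. A quick sketch, assuming readr is attached (the column name is invented):

```r
library(readr)

# "0001" and "0002" look like identifiers, so the column is
# guessed as character rather than integer:
df = read_csv("id\n0001\n0002")
class(df$id)
```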
In this version, <code>col_types</code> gains some much-needed flexibility:</p><ul><li>New <code>cols()</code> function takes care of assembling the list of column types, and with its <code>.default</code> argument, allows you to control the default column type:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">read_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x,y\n1,2&#34;</span>, col_types <span style="color:#666">=</span> <span style="color:#06287e">cols</span>(.default <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">c&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [1 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; (chr) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 2</span></code></pre></div><p>You can refer to parsers with their full name (e.g. <code>col_character()</code>) or their one-letter abbreviation (e.g. <code>c</code>). 
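For example, these two specifications should be interchangeable. A quick sketch, assuming readr is attached (the column names are invented):

```r
library(readr)

# Full parser names and one-letter abbreviations describe
# the same column types:
full = cols(x = col_character(), y = col_integer())
abbr = cols(x = "c", y = "i")

read_csv("x,y\na,1", col_types = full)
read_csv("x,y\na,1", col_types = abbr)
```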
The default value of <code>.default</code> is &ldquo;?&rdquo;: guess the column type from the data.</p><ul><li><code>cols_only()</code> allows you to load only the specified columns:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">read_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a,b,c\n1,2,3&#34;</span>, col_types <span style="color:#666">=</span> <span style="color:#06287e">cols_only</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">?&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [1 x 1]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; b</span><span style="color:#60a0b0;font-style:italic">#&gt; (int)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2</span></code></pre></div><p>Many of the individual parsers have also been improved:</p><ul><li><p><code>col_integer()</code> and <code>col_double()</code> no longer silently ignore trailing characters after the number.</p></li><li><p>New <code>col_number()</code>/<code>parse_number()</code> replace the old <code>col_numeric()</code>/<code>parse_numeric()</code>. This parser is less flexible, so it&rsquo;s less likely to silently ignore bad input. It&rsquo;s designed specifically to read currencies and percentages. 
It only reads the first number from a string, ignoring the grouping mark defined by the locale:</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">parse_number</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1,234,566&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] 1234566</span><span style="color:#06287e">parse_number</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">$1,234&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] 1234</span><span style="color:#06287e">parse_number</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">27%&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] 27</span></code></pre></div><ul><li>New <code>parse_time()</code> and <code>col_time()</code> allow you to parse times. They have an optional <code>format</code> argument, that uses the same components as <code>parse_datetime()</code>. If <code>format</code> is omitted, they use a flexible parser that looks for hours, then an optional colon, then minutes, then an optional colon, then optional seconds, then optional am/pm.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">parse_time</span>(<span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1:45 PM&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1345&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">13:45:00&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] 13:45:00 13:45:00 13:45:00</span></code></pre></div><p><code>parse_time()</code> returns the number of seconds since midnight as an integer with class &ldquo;time&rdquo;. 
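That representation is easy to check by stripping the class. A quick sketch, assuming readr is attached (unclass() just drops the class attribute):

```r
library(readr)

x = parse_time("13:45")
# 13 hours and 45 minutes past midnight:
# 13 * 3600 + 45 * 60 = 49500 seconds
unclass(x)
```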
readr includes a basic print method.</p><ul><li><code>parse_date()</code>/<code>col_date()</code> and <code>parse_datetime()</code>/<code>col_datetime()</code> gain two new format strings: &ldquo;%+&rdquo; skips one or more non-digits, and <code>%p</code> reads in AM/PM (and am/pm).</li></ul><h2 id="file-parsing-improvements">File parsing improvements</h2><p><code>read_csv()</code>, <code>read_tsv()</code>, and <code>read_delim()</code> gain extra arguments that allow you to parse more files:</p><ul><li>Multiple NA values can be specified by passing a character vector to <code>na</code>. The default has been changed to <code>na = c(&quot;&quot;, &quot;NA&quot;)</code>.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">read_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a,b\n.,NA\n1,3&#34;</span>, na <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">.&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">NA&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; a b</span><span style="color:#60a0b0;font-style:italic">#&gt; (int) (int)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 NA NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1 3</span></code></pre></div><ul><li>New <code>comment</code> argument allows you to ignore all text after a string:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">read_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">#This is a 
comment</span><span style="color:#4070a0">#This is another comment</span><span style="color:#4070a0">a,b</span><span style="color:#4070a0">1,10</span><span style="color:#4070a0">2,20&#34;</span>, comment <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">#&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; a b</span><span style="color:#60a0b0;font-style:italic">#&gt; (int) (int)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 10</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 20</span></code></pre></div><ul><li><code>trim_ws</code> argument controls whether leading and trailing whitespace is removed. It defaults to <code>TRUE</code>.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">read_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a,b\n 1, 2&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [1 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; a b</span><span style="color:#60a0b0;font-style:italic">#&gt; (int) (int)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 2</span><span style="color:#06287e">read_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a,b\n 1, 2&#34;</span>, trim_ws <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [1 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; a b</span><span 
style="color:#60a0b0;font-style:italic">#&gt; (chr) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 2</span></code></pre></div><p>Specifying the wrong number of column names, or having rows with an unexpected number of columns, now gives a warning, rather than an error:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">read_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a,b,c\n1,2\n1,2,3,4&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Warning: 2 parsing failures.</span><span style="color:#60a0b0;font-style:italic">#&gt; row col expected actual</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 -- 3 columns 2 columns</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 -- 3 columns 4 columns</span><span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; a b c</span><span style="color:#60a0b0;font-style:italic">#&gt; (int) (int) (int)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 2 NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1 2 3</span></code></pre></div><p>Note that the warning message now also shows you the first five problems. I hope this will often allow you to iterate immediately, rather than having to look at the full <code>problems()</code>.</p><h2 id="writers">Writers</h2><p>Despite the name, readr also provides some tools for writing data frames to disk. 
In this version there are three output functions:</p><ul><li><p><code>write_csv()</code> and <code>write_tsv()</code> write comma- and tab-delimited files, and <code>write_delim()</code> writes with a user-specified delimiter.</p></li><li><p><code>write_rds()</code> and <code>read_rds()</code> wrap around <code>saveRDS()</code> and <code>readRDS()</code>, defaulting to no compression, because you&rsquo;re usually more interested in saving time (expensive) than disk space (cheap).</p></li></ul><p>All these functions invisibly return their output so you can use them as part of a pipeline:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">my_df <span style="color:#666">%&gt;%</span><span style="color:#06287e">some_manipulation</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">write_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">interim-a.csv&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">some_more_manipulation</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">write_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">interim-b.csv&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">even_more_manipulation</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">write_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">final.csv&#34;</span>)</code></pre></div><p>You can now control how missing values are written with the <code>na</code> argument, and the quoting algorithm has been further refined to only add quotes when needed: when the string contains a quote, the delimiter, a newline, or the same text as the missing value. Output for doubles now uses the same precision as R, and POSIXt vectors are saved in an ISO8601-compatible format. For testing, you can use 
<code>format_csv()</code>, <code>format_tsv()</code>, and <code>format_delim()</code> to write csv to a string:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mtcars <span style="color:#666">%&gt;%</span><span style="color:#06287e">head</span>(<span style="color:#40a070">4</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">format_csv</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">cat</span>()<span style="color:#60a0b0;font-style:italic">#&gt; mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb</span><span style="color:#60a0b0;font-style:italic">#&gt; 21,6,160,110,3.9,2.62,16.46,0,1,4,4</span><span style="color:#60a0b0;font-style:italic">#&gt; 21,6,160,110,3.9,2.875,17.02,0,1,4,4</span><span style="color:#60a0b0;font-style:italic">#&gt; 22.8,4,108,93,3.85,2.32,18.61,1,1,4,1</span><span style="color:#60a0b0;font-style:italic">#&gt; 21.4,6,258,110,3.08,3.215,19.44,1,0,3,1</span></code></pre></div><p>This is particularly useful for generating <a href="https://github.com/jennybc/reprex">reprexes</a>.</p></description></item><item><title>testthat 0.11.0</title><link>https://www.rstudio.com/blog/testthat-0-11-0/</link><pubDate>Thu, 15 Oct 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/testthat-0-11-0/</guid><description><p>testthat 0.11.0 is now available on CRAN. Testthat makes it easy to turn your existing informal tests into formal automated tests that you can rerun quickly and easily. Learn more at <a href="http://r-pkgs.had.co.nz/tests.html">http://r-pkgs.had.co.nz/tests.html</a>. 
Install the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">testthat&#34;</span>)</code></pre></div><p>In this version:</p><ul><li>New <code>expect_silent()</code> ensures that code produces no output, messages, or warnings. <code>expect_output()</code>, <code>expect_message()</code>, <code>expect_warning()</code>, and <code>expect_error()</code> now accept <code>NA</code> as the second argument to indicate that there shouldn&rsquo;t be any output, messages, warnings, or errors (i.e. they should be missing).</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">f <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>() {<span style="color:#06287e">print</span>(<span style="color:#40a070">1</span>)<span style="color:#06287e">message</span>(<span style="color:#40a070">2</span>)<span style="color:#06287e">warning</span>(<span style="color:#40a070">3</span>)}<span style="color:#06287e">expect_silent</span>(<span style="color:#06287e">f</span>())<span style="color:#60a0b0;font-style:italic">#&gt; Error: f() produced output, warnings, messages</span><span style="color:#06287e">expect_warning</span>(<span style="color:#06287e">log</span>(<span style="color:#40a070">-1</span>), <span style="color:#007020;font-weight:bold">NA</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Error: log(-1) expected no warnings:</span><span style="color:#60a0b0;font-style:italic">#&gt; * NaNs produced</span></code></pre></div><ul><li><p>Praise gets more diverse thanks to Gabor Csardi&rsquo;s <a href="https://github.com/gaborcsardi/praise">praise</a> package, and you now also get random encouragement if your tests don&rsquo;t 
pass.</p></li><li><p>testthat no longer muffles warning messages. This was a bug in the previous version, as warning messages are usually important and should be dealt with explicitly, either by resolving the problem or explicitly capturing them with <code>expect_warning()</code>.</p></li><li><p>Two new skip functions make it easier to skip tests that don&rsquo;t work in certain environments: <code>skip_on_os()</code> skips tests on the specified operating system, and <code>skip_on_appveyor()</code> skips tests on <a href="http://www.appveyor.com">Appveyor</a>.</p></li></ul><p>There were a number of other minor improvements and bug fixes. See the <a href="https://github.com/hadley/testthat/releases/tag/v0.11.0">release notes</a> for a complete list.</p><p>A big thanks goes out to all the contributors who made this release happen. There&rsquo;s no way I could be as productive without the fantastic community of R developers who come up with thoughtful new features, and who discover and fix my bugs!</p></description></item><item><title>purrr 0.1.0</title><link>https://www.rstudio.com/blog/purrr-0-1-0/</link><pubDate>Tue, 29 Sep 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/purrr-0-1-0/</guid><description><p>Purrr is a new package that fills in the missing pieces in R&rsquo;s functional programming tools: it&rsquo;s designed to make your pure functions purrr. 
Like many of my recent packages, it works with <a href="https://github.com/smbache/magrittr">magrittr</a> to allow you to express complex operations by combining simple pieces in a standard way.</p><p>Install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">purrr&#34;</span>)</code></pre></div><p>Purrr wouldn&rsquo;t be possible without <a href="https://github.com/lionel-">Lionel Henry</a>. He wrote a lot of the package and his insightful comments helped me rapidly iterate towards a stable, useful, and understandable package.</p><h2 id="map-functions">Map functions</h2><p>The core of purrr is a set of functions for manipulating vectors (atomic vectors, lists, and data frames). The goal is similar to dplyr: help you tackle the most common 90% of data manipulation challenges. But where dplyr focusses on data frames, purrr focusses on vectors. 
For example, the following code splits the built-in mtcars dataset up by number of cylinders (using the base <code>split()</code> function), fits a linear model to each piece, summarises each model, then extracts the R²:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mtcars <span style="color:#666">%&gt;%</span><span style="color:#06287e">split</span>(.$cyl) <span style="color:#666">%&gt;%</span><span style="color:#06287e">map</span>(<span style="color:#666">~</span><span style="color:#06287e">lm</span>(mpg <span style="color:#666">~</span> wt, data <span style="color:#666">=</span> .)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">map</span>(summary) <span style="color:#666">%&gt;%</span><span style="color:#06287e">map_dbl</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">r.squared&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; 4 6 8</span><span style="color:#60a0b0;font-style:italic">#&gt; 0.509 0.465 0.423</span></code></pre></div><p>The first argument to all map functions is the vector to operate on. The second argument, <code>.f</code>, specifies what to do with each piece. It can be:</p><ul><li><p>A function, like <code>summary()</code>.</p></li><li><p>A formula, which is converted to an anonymous function, so that <code>~ lm(mpg ~ wt, data = .)</code> is shorthand for <code>function(x) lm(mpg ~ wt, data = x)</code>.</p></li><li><p>A string or number, which is used to extract components, i.e. <code>&quot;r.squared&quot;</code> is shorthand for <code>function(x) x[[&quot;r.squared&quot;]]</code> and <code>1</code> is shorthand for <code>function(x) x[[1]]</code>.</p></li></ul><p>Map functions come in a few different variations based on their inputs and output:</p><ul><li><p><code>map()</code> takes a vector (list or atomic vector) and returns a list. 
<code>map_lgl()</code>, <code>map_int()</code>, <code>map_dbl()</code>, and <code>map_chr()</code> take a vector and return an atomic vector. <code>flatmap()</code> works similarly, but allows the function to return arbitrary-length vectors.</p></li><li><p><code>map_if()</code> only applies <code>.f</code> to those elements of the list where <code>.p</code> is true. For example, the following snippet converts factors into characters:</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">iris <span style="color:#666">%&gt;%</span> <span style="color:#06287e">map_if</span>(is.factor, as.character) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; &#39;data.frame&#39;: 150 obs. of 5 variables:</span><span style="color:#60a0b0;font-style:italic">#&gt; $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...</span><span style="color:#60a0b0;font-style:italic">#&gt; $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...</span><span style="color:#60a0b0;font-style:italic">#&gt; $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...</span><span style="color:#60a0b0;font-style:italic">#&gt; $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...</span><span style="color:#60a0b0;font-style:italic">#&gt; $ Species : chr &#34;setosa&#34; &#34;setosa&#34; &#34;setosa&#34; &#34;setosa&#34; ...</span></code></pre></div><p><code>map_at()</code> works similarly but instead of working with a logical vector or predicate function, it works with an integer vector of element positions.</p><ul><li><code>map2()</code> takes a pair of lists and iterates through them in parallel:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">map2</span>(<span 
style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">3</span>, <span style="color:#40a070">2</span><span style="color:#666">:</span><span style="color:#40a070">4</span>, c)<span style="color:#60a0b0;font-style:italic">#&gt; [[1]]</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 1 2</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [[2]]</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 2 3</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [[3]]</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 3 4</span><span style="color:#06287e">map2</span>(<span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">3</span>, <span style="color:#40a070">2</span><span style="color:#666">:</span><span style="color:#40a070">4</span>, <span style="color:#666">~</span> .x <span style="color:#666">*</span> (.y <span style="color:#666">-</span> <span style="color:#40a070">1</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [[1]]</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 1</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [[2]]</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 4</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; [[3]]</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 9</span></code></pre></div><p><code>map3()</code> does the same thing for three lists, and <code>map_n()</code> does it in general.</p><ul><li><code>invoke()</code>, <code>invoke_lgl()</code>, <code>invoke_int()</code>, <code>invoke_dbl()</code>, and <code>invoke_chr()</code> take a list of functions, and call each one with the supplied arguments:</li></ul><div class="highlight"><pre 
style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">list</span>(m1 <span style="color:#666">=</span> mean, m2 <span style="color:#666">=</span> median) <span style="color:#666">%&gt;%</span><span style="color:#06287e">invoke_dbl</span>(<span style="color:#06287e">rcauchy</span>(<span style="color:#40a070">100</span>))<span style="color:#60a0b0;font-style:italic">#&gt; m1 m2</span><span style="color:#60a0b0;font-style:italic">#&gt; 9.765 0.117</span></code></pre></div><ul><li><code>walk()</code> takes a vector, calls a function on each piece, and returns its original input. It&rsquo;s useful for functions called for their side-effects; it returns the input so you can use it in a pipe.</li></ul><h3 id="purrr-and-dplyr">Purrr and dplyr</h3><p>I&rsquo;m becoming increasingly enamoured with list-columns in data frames. The following example combines purrr and dplyr to generate 100 random test-training splits in order to compute an unbiased estimate of prediction quality. 
These tools are still experimental (and currently need quite a bit of extra scaffolding), but I think the basic approach is really appealing.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(dplyr)random_group <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>(n, probs) {probs <span style="color:#666">&lt;-</span> probs <span style="color:#666">/</span> <span style="color:#06287e">sum</span>(probs)g <span style="color:#666">&lt;-</span> <span style="color:#06287e">findInterval</span>(<span style="color:#06287e">seq</span>(<span style="color:#40a070">0</span>, <span style="color:#40a070">1</span>, length <span style="color:#666">=</span> n), <span style="color:#06287e">c</span>(<span style="color:#40a070">0</span>, <span style="color:#06287e">cumsum</span>(probs)),rightmost.closed <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)<span style="color:#06287e">names</span>(probs)<span style="color:#06287e">[sample</span>(g)]}partition <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>(df, n, probs) {n <span style="color:#666">%&gt;%</span><span style="color:#06287e">replicate</span>(<span style="color:#06287e">split</span>(df, <span style="color:#06287e">random_group</span>(<span style="color:#06287e">nrow</span>(df), probs)), <span style="color:#007020;font-weight:bold">FALSE</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">zip_n</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">as_data_frame</span>()}msd <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>(x, y) <span style="color:#06287e">sqrt</span>(<span style="color:#06287e">mean</span>((x <span style="color:#666">-</span> y) ^ <span style="color:#40a070">2</span>))<span 
style="color:#60a0b0;font-style:italic"># Generate 100 random test-training splits,</span>cv <span style="color:#666">&lt;-</span> mtcars <span style="color:#666">%&gt;%</span><span style="color:#06287e">partition</span>(<span style="color:#40a070">100</span>, <span style="color:#06287e">c</span>(training <span style="color:#666">=</span> <span style="color:#40a070">0.8</span>, test <span style="color:#666">=</span> <span style="color:#40a070">0.2</span>)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">mutate</span>(<span style="color:#60a0b0;font-style:italic"># Fit the model</span>model <span style="color:#666">=</span> <span style="color:#06287e">map</span>(training, <span style="color:#666">~</span> <span style="color:#06287e">lm</span>(mpg <span style="color:#666">~</span> wt, data <span style="color:#666">=</span> .)),<span style="color:#60a0b0;font-style:italic"># Make predictions on test data</span>pred <span style="color:#666">=</span> <span style="color:#06287e">map2</span>(model, test, predict),<span style="color:#60a0b0;font-style:italic"># Calculate mean squared difference</span>diff <span style="color:#666">=</span> <span style="color:#06287e">map2</span>(pred, test <span style="color:#666">%&gt;%</span> <span style="color:#06287e">map</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mpg&#34;</span>), msd) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">flatten</span>())cv<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [100 x 5]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; test training model pred diff</span><span style="color:#60a0b0;font-style:italic">#&gt; (list) (list) (list) (list) (dbl)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 &lt;data.frame [7,11]&gt; &lt;data.frame [25,11]&gt; &lt;S3:lm&gt; &lt;dbl[7]&gt; 3.70</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 2 &lt;data.frame [7,11]&gt; &lt;data.frame [25,11]&gt; &lt;S3:lm&gt; &lt;dbl[7]&gt; 2.03</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 &lt;data.frame [7,11]&gt; &lt;data.frame [25,11]&gt; &lt;S3:lm&gt; &lt;dbl[7]&gt; 2.29</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 &lt;data.frame [7,11]&gt; &lt;data.frame [25,11]&gt; &lt;S3:lm&gt; &lt;dbl[7]&gt; 4.88</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 &lt;data.frame [7,11]&gt; &lt;data.frame [25,11]&gt; &lt;S3:lm&gt; &lt;dbl[7]&gt; 3.20</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 &lt;data.frame [7,11]&gt; &lt;data.frame [25,11]&gt; &lt;S3:lm&gt; &lt;dbl[7]&gt; 4.68</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 &lt;data.frame [7,11]&gt; &lt;data.frame [25,11]&gt; &lt;S3:lm&gt; &lt;dbl[7]&gt; 3.39</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 &lt;data.frame [7,11]&gt; &lt;data.frame [25,11]&gt; &lt;S3:lm&gt; &lt;dbl[7]&gt; 3.82</span><span style="color:#60a0b0;font-style:italic">#&gt; 9 &lt;data.frame [7,11]&gt; &lt;data.frame [25,11]&gt; &lt;S3:lm&gt; &lt;dbl[7]&gt; 2.56</span><span style="color:#60a0b0;font-style:italic">#&gt; 10 &lt;data.frame [7,11]&gt; &lt;data.frame [25,11]&gt; &lt;S3:lm&gt; &lt;dbl[7]&gt; 3.40</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ...</span><span style="color:#06287e">mean</span>(cv<span style="color:#666">$</span>diff)<span style="color:#60a0b0;font-style:italic">#&gt; [1] 3.22</span></code></pre></div><h2 id="other-functions">Other functions</h2><p>There are too many other pieces of purrr to describe in detail here. 
A few of the most useful functions are noted below:</p><ul><li><code>zip_n()</code> allows you to turn a list of lists &ldquo;inside-out&rdquo;:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">list</span>(<span style="color:#06287e">list</span>(a <span style="color:#666">=</span> <span style="color:#40a070">1</span>, b <span style="color:#666">=</span> <span style="color:#40a070">2</span>), <span style="color:#06287e">list</span>(a <span style="color:#666">=</span> <span style="color:#40a070">2</span>, b <span style="color:#666">=</span> <span style="color:#40a070">1</span>))x <span style="color:#666">%&gt;%</span> <span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; $ :List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ a: num 1</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ b: num 2</span><span style="color:#60a0b0;font-style:italic">#&gt; $ :List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ a: num 2</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ b: num 1</span>x <span style="color:#666">%&gt;%</span><span style="color:#06287e">zip_n</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; $ a:List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ : num 1</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ : num 2</span><span style="color:#60a0b0;font-style:italic">#&gt; $ b:List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ : num 2</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ : num 1</span>x <span 
style="color:#666">%&gt;%</span><span style="color:#06287e">zip_n</span>(.simplify <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; $ a: num [1:2] 1 2</span><span style="color:#60a0b0;font-style:italic">#&gt; $ b: num [1:2] 2 1</span></code></pre></div><ul><li><code>keep()</code> and <code>discard()</code> allow you to filter a vector based on a predicate function. <code>compact()</code> is a helpful wrapper that throws away empty elements of a list.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">10</span> <span style="color:#666">%&gt;%</span> <span style="color:#06287e">keep</span>(<span style="color:#666">~</span>. <span style="color:#666">%%</span> <span style="color:#40a070">2</span> <span style="color:#666">==</span> <span style="color:#40a070">0</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] 2 4 6 8 10</span><span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">10</span> <span style="color:#666">%&gt;%</span> <span style="color:#06287e">discard</span>(<span style="color:#666">~</span>. 
<span style="color:#666">%%</span> <span style="color:#40a070">2</span> <span style="color:#666">==</span> <span style="color:#40a070">0</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] 1 3 5 7 9</span><span style="color:#06287e">list</span>(<span style="color:#06287e">list</span>(x <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>, y <span style="color:#666">=</span> <span style="color:#40a070">10</span>), <span style="color:#06287e">list</span>(x <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>, y <span style="color:#666">=</span> <span style="color:#40a070">20</span>)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">keep</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; List of 1</span><span style="color:#60a0b0;font-style:italic">#&gt; $ :List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ x: logi TRUE</span><span style="color:#60a0b0;font-style:italic">#&gt; ..$ y: num 10</span><span style="color:#06287e">list</span>(<span style="color:#007020;font-weight:bold">NULL</span>, <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">3</span>, <span style="color:#007020;font-weight:bold">NULL</span>, <span style="color:#40a070">7</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">compact</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">str</span>()<span style="color:#60a0b0;font-style:italic">#&gt; List of 2</span><span style="color:#60a0b0;font-style:italic">#&gt; $ : int [1:3] 1 2 3</span><span style="color:#60a0b0;font-style:italic">#&gt; $ : num 7</span></code></pre></div><ul><li><p><code>lift()</code> (and friends) allow you to convert a function that takes multiple 
arguments into a function that takes a list. It helps you compose functions by lifting their domain from one kind of input to another. The domain can be changed to and from a list (l), a vector (v) and dots (d).</p></li><li><p><code>cross2()</code>, <code>cross3()</code> and <code>cross_n()</code> allow you to create the Cartesian product of the inputs (with optional filtering).</p></li><li><p>A number of functions let you manipulate functions: <code>negate()</code>, <code>compose()</code>, <code>partial()</code>.</p></li><li><p>A complete set of predicate functions provides predictable versions of the <code>is.*</code> functions: <code>is_logical()</code>, <code>is_list()</code>, <code>is_bare_double()</code>, <code>is_scalar_character()</code>, etc.</p></li><li><p>Other functions wrap existing base R functions in the consistent design of purrr: <code>replicate()</code> -&gt; <code>rerun()</code>, <code>Reduce()</code> -&gt; <code>reduce()</code>, <code>Find()</code> -&gt; <code>detect()</code>, <code>Position()</code> -&gt; <code>detect_index()</code>.</p></li></ul><h2 id="design-philosophy">Design philosophy</h2><p>The goal of purrr is not to turn R into Haskell: it does not implement currying, or destructuring binds, or pattern matching. The goal is to give you similar expressiveness to a classical FP language, while allowing you to write code that looks and feels like R.</p><ul><li><p>Anonymous functions are verbose in R, so we provide two convenient shorthands. For predicate functions, <code>~ .x + 1</code> is equivalent to <code>function(.x) .x + 1</code>. For chains of transformations, <code>. %&gt;% f() %&gt;% g()</code> is equivalent to <code>function(.) . %&gt;% f() %&gt;% g()</code>.</p></li><li><p>R is weakly typed, so we can implement a general <code>zip_n()</code>, rather than having to specialise on the number of arguments. 
That said, we still provide <code>map2()</code> and <code>map3()</code> since it&rsquo;s useful to clearly separate which arguments are vectorised over. Functions are designed to be output type-stable (respecting <a href="https://en.wikipedia.org/wiki/Robustness_principle">Postel&rsquo;s law</a>) so you can rely on the output being as you expect.</p></li><li><p>R has named arguments, so instead of providing different functions for minor variations (e.g. <code>detect()</code> and <code>detectLast()</code>) we use a named arguments.</p></li><li><p>Instead of currying, we use <code>...</code> to pass in extra arguments. Arguments of purrr functions always start with <code>.</code> to avoid matching to the arguments of <code>.f</code> passed in via <code>...</code>.</p></li><li><p>Instead of point free style, use the pipe, <code>%&gt;%</code>, to write code that can be read from left to right.</p></li></ul></description></item><item><title>rvest 0.3.0</title><link>https://www.rstudio.com/blog/rvest-0-3-0/</link><pubDate>Thu, 24 Sep 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rvest-0-3-0/</guid><description><p>I&rsquo;m pleased to announce rvest 0.3.0 is now available on CRAN. <a href="https://blog.rstudio.com/2014/11/24/rvest-easy-web-scraping-with-r/">Rvest</a> makes it easy to scrape (or harvest) data from html web pages, inspired by libraries like <a href="http://www.crummy.com/software/BeautifulSoup/">beautiful soup</a>. It is designed to work with <a href="https://github.com/smbache/magrittr">pipes</a> so that you can express complex operations by composed simple pieces. 
Install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rvest&#34;</span>)</code></pre></div><h2 id="whats-new">What&rsquo;s new</h2><p>The biggest change in this version is that rvest now uses the <a href="https://blog.rstudio.com/2015/04/21/xml2/">xml2</a> package instead of <a href="https://cran.r-project.org/web/packages/XML/index.html">XML</a>. This makes rvest much simpler, eliminates memory leaks, and should improve performance a little.</p><p>A number of functions have changed names to improve consistency with other packages: most importantly <code>html()</code> is now <code>read_html()</code>, and <code>html_tag()</code> is now <code>html_name()</code>. The old versions still work, but are deprecated and will be removed in rvest 0.4.0.</p><p><code>html_node()</code> now throws an error if there are no matches, and a warning if there&rsquo;s more than one match. I think this should make it more likely to fail clearly when the structure of the page changes. If you don&rsquo;t want this behaviour, use <code>html_nodes()</code>.</p><p>There were a number of other bug fixes and minor improvements as described in the <a href="https://github.com/hadley/rvest/releases/tag/v0.3.0">release notes</a>.</p></description></item><item><title>Are you headed to Strata? 
It's next week!</title><link>https://www.rstudio.com/blog/2328/</link><pubDate>Wed, 23 Sep 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/2328/</guid><description><p>RStudio will again teach the new essentials for doing (big) data science in R at this year&rsquo;s Strata NYC conference, September 29 2015 (<a href="http://strataconf.com/big-data-conference-ny-2015/public/schedule/detail/44154)">http://strataconf.com/big-data-conference-ny-2015/public/schedule/detail/44154)</a>. You will learn from Garrett Grolemund, Yihui Xie, and Nathan Stephens who are all working on fascinating new ways to keep the R ecosystem apace of the challenges facing those who work with data.</p><p>Topics include:</p><ul><li><p>R Quickstart: Wrangle, transform, and visualize dataInstructor: Garrett Grolemund (90 minutes)</p></li><li><p>Work with Big Data in RInstructor: Nathan Stephens (90 minutes)</p></li><li><p>Reproducible Reports with Big DataInstructor: Yihui Xie (90 minutes)</p></li><li><p>Interactive Shiny Applications built on Big DataInstructor: Garrett Grolemund (90 minutes)</p></li></ul><p>If you plan to stay for the full Strata Conference+Hadoop World be sure to look us up at booth 633 during the Expo Hall hours. We&rsquo;ll have the latest books from RStudio authors and &ldquo;shiny&rdquo; t-shirts to win. Share with us what you&rsquo;re doing with RStudio and get your product and company questions answered by RStudio employees.</p><p>See you in New York City! (<a href="http://strataconf.com/big-data-conference-ny-2015">http://strataconf.com/big-data-conference-ny-2015</a>)</p></description></item><item><title>devtools 1.9.1</title><link>https://www.rstudio.com/blog/devtools-1-9-1/</link><pubDate>Sun, 13 Sep 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/devtools-1-9-1/</guid><description><p>Devtools 1.9.1 is now available on CRAN. 
Devtools makes package building so easy that a package can become your default way to organise code, data, and documentation. You can learn more about developing packages in <a href="http://r-pkgs.had.co.nz/">R packages</a>, my book about package development that&rsquo;s freely available online.</p><p>Get the latest version of devtools with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">devtools&#34;</span>)</code></pre></div><p>There are three major improvements that I contributed:</p><ul><li><p><code>check()</code> is now much closer to what CRAN does: it passes <code>--as-cran</code> on to <code>R CMD check</code>, using an env var to turn off the incoming CRAN checks. These are turned off because they&rsquo;re slow (they have to retrieve data from CRAN), and are not necessary except just prior to release (so <code>release()</code> turns them back on again).</p></li><li><p><code>install_deps()</code> now automatically upgrades out-of-date dependencies. This is typically what you want when you&rsquo;re working on a development version of a package: otherwise you can get an unpleasant surprise when you go to submit your package to CRAN and discover it doesn&rsquo;t work with the latest version of its dependencies. 
To suppress this behaviour, set <code>upgrade_dependencies = FALSE</code>.</p></li><li><p><code>revdep_check()</code> received a number of tweaks that I&rsquo;ve found helpful when preparing my packages for CRAN:</p><ul><li><p>Suggested dependencies of the revdeps are installed by default.</p></li><li><p>The <code>NOT_CRAN</code> env var is set to <code>false</code> so tests that are skipped on CRAN are also skipped for you.</p></li><li><p>The <code>RGL_USE_NULL</code> env var is set to <code>true</code> to stop rgl windows from popping up during testing.</p></li><li><p>All revdep sources are downloaded at the start of the checks. This makes life a bit easier if you&rsquo;re on a flaky internet connection.</p></li></ul></li></ul><p>But like many recent devtools releases, most of the coolest new features have been contributed by the community:</p><ul><li><p><a href="http://www.jimhester.com">Jim Hester</a> implemented experimental support for remote dependencies in <code>install()</code>. You can now tell devtools where to find dependencies with a <code>Remotes</code> field:</p><p><code>Imports: MASS, testthat</code><br/><code>Remotes: hadley/testthat</code></p></li></ul><p>The default allows you to refer to GitHub repos, but you can easily add dependencies from any of the other sources that devtools supports: see <code>vignette(&quot;dependencies&quot;)</code> for more details.</p><p>Support for installing development dependencies is still experimental, so we appreciate any feedback.</p><ul><li><p><a href="http://www.stat.ubc.ca/~jenny/">Jenny Bryan</a> considerably improved the existing GitHub integration. <code>use_github()</code> now pushes to the newly created GitHub repo, and sets a remote tracking branch. 
It also populates the URL and BugReports fields of your <code>DESCRIPTION</code>.</p></li><li><p><a href="https://github.com/krlmlr">Kirill Müller</a> contributed many bug fixes, minor improvements and test cases.</p></li></ul><p>See the <a href="https://github.com/hadley/devtools/releases/tag/v1.9.1">release notes</a> for a complete list of bug fixes and other minor changes.</p></description></item><item><title>tidyr 0.3.0</title><link>https://www.rstudio.com/blog/tidyr-0-3-0/</link><pubDate>Sun, 13 Sep 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/tidyr-0-3-0/</guid><description><p>tidyr 0.3.0 is now available on CRAN. tidyr makes it easy to &ldquo;tidy&rdquo; your data, storing it in a consistent form so that it&rsquo;s easy to manipulate, visualise and model. Tidy data has variables in columns and observations in rows, and is described in more detail in the <a href="http://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html">tidy data</a> vignette. Install tidyr with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tidyr&#34;</span>)</code></pre></div><p>tidyr contains four new verbs: <code>fill()</code>, <code>replace_na()</code>, <code>complete()</code>, and <code>unnest()</code>, along with lots of smaller bug fixes and improvements.</p><h2 id="fill"><code>fill()</code></h2><p>The new <code>fill()</code> function fills in missing observations by carrying the last non-missing value forward. 
This is useful if you&rsquo;re getting data from Excel users who haven&rsquo;t read Karl Broman&rsquo;s excellent <a href="http://kbroman.org/dataorg/">data organisation guide</a> and <a href="http://kbroman.org/dataorg/pages/no_empty_cells.html">leave cells blank</a> to indicate that the previous value should be carried forward:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> dplyr<span style="color:#666">::</span><span style="color:#06287e">data_frame</span>(year <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">2015</span>, <span style="color:#007020;font-weight:bold">NA</span>, <span style="color:#007020;font-weight:bold">NA</span>, <span style="color:#007020;font-weight:bold">NA</span>),trt <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">A&#34;</span>, <span style="color:#007020;font-weight:bold">NA</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">B&#34;</span>, <span style="color:#007020;font-weight:bold">NA</span>))df<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [4 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year trt</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2015 A</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 NA NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 NA B</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 NA NA</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">fill</span>(year, trt)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [4 x 2]</span><span 
style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year trt</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2015 A</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2015 A</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2015 B</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2015 B</span></code></pre></div><h2 id="replace_na-and-complete"><code>replace_na()</code> and <code>complete()</code></h2><p><code>replace_na()</code> makes it easy to replace missing values on a column-by-column basis:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> dplyr<span style="color:#666">::</span><span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">1</span>, <span style="color:#40a070">2</span>, <span style="color:#007020;font-weight:bold">NA</span>),y <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span style="color:#007020;font-weight:bold">NA</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>))df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">replace_na</span>(<span style="color:#06287e">list</span>(x <span style="color:#666">=</span> <span style="color:#40a070">0</span>, y <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">unknown&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [3 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span 
style="color:#60a0b0;font-style:italic">#&gt; (dbl) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 unknown</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 0 b</span></code></pre></div><p>It is particularly useful when called from <code>complete()</code>, which makes it easy to fill in missing combinations of your data:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> dplyr<span style="color:#666">::</span><span style="color:#06287e">data_frame</span>(group <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">2</span>, <span style="color:#40a070">1</span>),item_id <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">2</span>, <span style="color:#40a070">2</span>),item_name <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>),value1 <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">3</span>,value2 <span style="color:#666">=</span> <span style="color:#40a070">4</span><span style="color:#666">:</span><span style="color:#40a070">6</span>)df<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [3 x 5]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; group item_id item_name value1 
value2</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl) (dbl) (chr) (int) (int)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1 a 1 4</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 2 b 2 5</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 1 2 b 3 6</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">complete</span>(group, <span style="color:#06287e">c</span>(item_id, item_name))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [4 x 5]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; group item_id item_name value1 value2</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl) (dbl) (chr) (int) (int)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1 a 1 4</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1 2 b 3 6</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2 1 a NA NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2 2 b 2 5</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">complete</span>(group, <span style="color:#06287e">c</span>(item_id, item_name),fill <span style="color:#666">=</span> <span style="color:#06287e">list</span>(value1 <span style="color:#666">=</span> <span style="color:#40a070">0</span>))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [4 x 5]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; group item_id item_name value1 value2</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl) (dbl) (chr) (dbl) (int)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1 a 1 4</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1 2 b 3 6</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2 1 a 0 NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2 2 b 2 
5</span></code></pre></div><p>Note how I&rsquo;ve grouped <code>item_id</code> and <code>item_name</code> together with <code>c(item_id, item_name)</code>. This treats them as nested, not crossed, so we don&rsquo;t get every combination of <code>group</code>, <code>item_id</code> and <code>item_name</code>, as we would otherwise:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">complete</span>(group, item_id, item_name)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [8 x 5]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; group item_id item_name value1 value2</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl) (dbl) (chr) (int) (int)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1 a 1 4</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1 1 b NA NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 1 2 a NA NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 1 2 b 3 6</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2 1 a NA NA</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ...</span></code></pre></div><p>Read more about this behaviour in <code>?expand</code>.</p><h2 id="unnest"><code>unnest()</code></h2><p><code>unnest()</code> is out of beta, and is now ready to help you unnest columns that are lists of vectors. 
This can occur when you have hierarchical data that&rsquo;s been collapsed into a string:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> dplyr<span style="color:#666">::</span><span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">2</span>, y <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1,2&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">3,4,5,6,7&#34;</span>))df<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; (int) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1,2</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 3,4,5,6,7</span>df <span style="color:#666">%&gt;%</span>dplyr<span style="color:#666">::</span><span style="color:#06287e">mutate</span>(y <span style="color:#666">=</span> <span style="color:#06287e">strsplit</span>(y, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">,&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; (int) (list)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 &lt;chr[2]&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 &lt;chr[5]&gt;</span>df <span style="color:#666">%&gt;%</span>dplyr<span style="color:#666">::</span><span 
style="color:#06287e">mutate</span>(y <span style="color:#666">=</span> <span style="color:#06287e">strsplit</span>(y, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">,&#34;</span>)) <span style="color:#666">%&gt;%</span><span style="color:#06287e">unnest</span>()<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [7 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; (int) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2 3</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2 4</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2 5</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ...</span></code></pre></div><p><code>unnest()</code> also works on columns that are lists of data frames. This is admittedly esoteric, but I think it might be useful when you&rsquo;re generating pairs of test-training splits. I&rsquo;m still thinking about this idea, so look for more examples and better support across my packages in the future.</p><h2 id="minor-improvements">Minor improvements</h2><p>There were 13 minor improvements and bug fixes. The most important are listed below. 
To read about the rest, please consult the <a href="https://github.com/hadley/tidyr/releases/tag/v0.3.0">release notes</a>.</p><ul><li><p><code>%&gt;%</code> is re-exported from magrittr: this means that you no longer need to load dplyr or magrittr if you want to use the pipe.</p></li><li><p><code>extract()</code> and <code>separate()</code> now return multiple NA columns for NA inputs:</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> dplyr<span style="color:#666">::</span><span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a-b&#34;</span>, <span style="color:#007020;font-weight:bold">NA</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">c-d&#34;</span>))df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">separate</span>(x, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">y&#34;</span>), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">-&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [3 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; (chr) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 a b</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 NA NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 c d</span></code></pre></div><ul><li><code>separate()</code> gains finer control if there are too few matches:</li></ul><div class="highlight"><pre 
style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> dplyr<span style="color:#666">::</span><span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a-b-c&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a-c&#34;</span>))df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">separate</span>(x, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">y&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">z&#34;</span>), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">-&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Warning: Too few values at 1 locations: 2</span><span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y z</span><span style="color:#60a0b0;font-style:italic">#&gt; (chr) (chr) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 a b c</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 a c NA</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">separate</span>(x, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">y&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">z&#34;</span>), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">-&#34;</span>, fill <span style="color:#666">=</span> <span 
style="color:#4070a0">&#34;</span><span style="color:#4070a0">right&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y z</span><span style="color:#60a0b0;font-style:italic">#&gt; (chr) (chr) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 a b c</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 a c NA</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">separate</span>(x, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">y&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">z&#34;</span>), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">-&#34;</span>, fill <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">left&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y z</span><span style="color:#60a0b0;font-style:italic">#&gt; (chr) (chr) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 a b c</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 NA c a</span></code></pre></div><p>This complements the support for too many matches:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> dplyr<span style="color:#666">::</span><span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">a-b-c&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a-c&#34;</span>))df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">separate</span>(x, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">y&#34;</span>), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">-&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Warning: Too many values at 1 locations: 1</span><span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; (chr) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 a b</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 a c</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">separate</span>(x, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">y&#34;</span>), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">-&#34;</span>, extra <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">merge&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; (chr) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 a b-c</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 a c</span>df <span style="color:#666">%&gt;%</span> <span 
style="color:#06287e">separate</span>(x, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">y&#34;</span>), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">-&#34;</span>, extra <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">drop&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; (chr) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 a b</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 a c</span></code></pre></div><ul><li>tidyr no longer depends on reshape2. This should fix issues when you load reshape and tidyr at the same time. It also frees tidyr to evolve in a different direction to the more general reshape2.</li></ul></description></item><item><title>dplyr 0.4.3</title><link>https://www.rstudio.com/blog/dplyr-0-4-3/</link><pubDate>Fri, 04 Sep 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dplyr-0-4-3/</guid><description><p>dplyr 0.4.3 includes over 30 minor improvements and bug fixes, which are described in detail in the <a href="https://github.com/hadley/dplyr/releases/tag/v0.4.3">release notes</a>. Here I wanted to draw your attention to five small, but important, changes:</p><ul><li><p><code>mutate()</code> no longer randomly crashes! (Sorry it took us so long to fix this - I know it&rsquo;s been causing a lot of pain.)</p></li><li><p>dplyr now has much better support for non-ASCII column names. 
It&rsquo;s probably not perfect, but should be a lot better than previous versions.</p></li><li><p>When printing a <code>tbl_df</code>, you now see the types of all columns, not just those that don&rsquo;t fit on the screen:</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">3</span>, y <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">letters</span>[x], z <span style="color:#666">=</span> <span style="color:#06287e">factor</span>(y))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [3 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y z</span><span style="color:#60a0b0;font-style:italic">#&gt; (int) (chr) (fctr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 b b</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3 c c</span></code></pre></div><ul><li><code>bind_rows()</code> gains a <code>.id</code> argument. 
When supplied, it creates a new column that gives the name of each data frame:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">a <span style="color:#666">&lt;-</span> <span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span>, y <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>)b <span style="color:#666">&lt;-</span> <span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">2</span>, y <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">c&#34;</span>)<span style="color:#06287e">bind_rows</span>(a <span style="color:#666">=</span> a, b <span style="color:#666">=</span> b)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 c</span><span style="color:#06287e">bind_rows</span>(a <span style="color:#666">=</span> a, b <span style="color:#666">=</span> b, .id <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">source&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; source x y</span><span style="color:#60a0b0;font-style:italic">#&gt; (chr) (dbl) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 a 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 b 2 c</span><span 
style="color:#60a0b0;font-style:italic"># Or equivalently</span><span style="color:#06287e">bind_rows</span>(<span style="color:#06287e">list</span>(a <span style="color:#666">=</span> a, b <span style="color:#666">=</span> b), .id <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">source&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [2 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; source x y</span><span style="color:#60a0b0;font-style:italic">#&gt; (chr) (dbl) (chr)</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 a 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 b 2 c</span></code></pre></div><ul><li>dplyr is now more forgiving of unknown attributes. All functions should now copy column attributes from the input to the output, instead of complaining. Additionally <code>arrange()</code>, <code>filter()</code>, <code>slice()</code>, and <code>summarise()</code> preserve attributes of the data frame itself.</li></ul></description></item><item><title>RStudio adds a new Starter Plan, More Active Hours, and a Performance Boost to shinyapps.io</title><link>https://www.rstudio.com/blog/rstudio-adds-a-new-starter-plan-more-active-hours-and-a-performance-boost-to-shinyapps-io/</link><pubDate>Mon, 24 Aug 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-adds-a-new-starter-plan-more-active-hours-and-a-performance-boost-to-shinyapps-io/</guid><description><p>Five months ago we launched <a href="https://www.rstudio.com/products/shinyapps/">shinyapps.io</a>. Since then, more than 25,000 accounts have been created and countless Shiny applications have been deployed. It&rsquo;s incredibly exciting to see!</p><p>It&rsquo;s also given us lots of data and feedback on how we can make <a href="https://www.rstudio.com/products/shinyapps/">shinyapps.io</a> better. 
Today, we&rsquo;re happy to tell you about some changes to our subscription Plans that we hope will make <a href="https://www.rstudio.com/products/shinyapps/">shinyapps.io</a> an even better experience for Shiny developers and their application users.</p><p><strong>New Starter Plan - More active hours and apps, less money</strong></p><p>For many people the price difference between the Free and the Basic plan was too much. We heard you. Effective today there is a new Starter Plan for only $9 per month or $100 per year. The Starter Plan has the same features as the Free plan but allows 100 active hours per month and up to 25 applications. It&rsquo;s perfect for the active Shiny developer on a budget!</p><p><strong>More Active Hours for Basic, Standard, and Professional Plans</strong></p><p>Once you&rsquo;re up and running with Shiny we want to make sure even the most prolific developers and popular applications have the active hours they need. Today we&rsquo;re doubling the number of active hours per month for the Basic (now 500), Standard (now 2,000), and Professional (now 10,000) plans. In practice, very few accounts exceeded the old limits for these plans but now you can be sure your needs are covered.</p><p><strong>New Performance Boost features for the Basic Plan</strong></p><p>In addition to supporting multiple R worker processes per application, which keeps your application responsive as more people use it, we&rsquo;ve added more memory (up to 8GB) on Basic plans and above. While the data shows that most applications work fine without these enhancements, if you expect many users at the same time or your application is memory or CPU intensive, the Basic Plan has the performance boost you need. 
The Basic plan also allows unlimited applications and 500 active hours per month.</p></description></item><item><title>Secure HTTPS Connections for R</title><link>https://www.rstudio.com/blog/secure-https-connections-for-r/</link><pubDate>Mon, 17 Aug 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/secure-https-connections-for-r/</guid><description><p>Traditionally, the mechanisms for obtaining R and related software have used standard HTTP connections. This isn&rsquo;t ideal though, as without a secure (HTTPS) connection there is less assurance that you are downloading code from a legitimate source rather than from another server posing as one.</p><p>Recently there have been a number of changes that make it easier to use HTTPS for installing R, RStudio, and packages from CRAN:</p><ol><li><p>Downloads of R from the main CRAN website now use HTTPS;</p></li><li><p>Downloads of RStudio from our website now use HTTPS; and</p></li><li><p>It is now possible to install packages from CRAN over HTTPS.</p></li></ol><p>There are a number of ways to ensure that installation of packages from CRAN are performed using HTTPS. The <a href="https://cran.rstudio.com/">most recent version of R</a> (v3.2.2) makes this the default behavior. The <a href="https://www.rstudio.com/products/rstudio/download/">most recent version of RStudio</a> (v0.99.473) also attempts to configure secure downloads from CRAN by default (even for older versions of R). 
Finally, any version of R or RStudio can use secure HTTPS downloads by making some configuration changes as described in the <a href="https://support.rstudio.com/hc/en-us/articles/206827897">Secure Package Downloads for R</a> article in our Knowledge Base.</p><h2 id="configuring-secure-connections-to-cran">Configuring Secure Connections to CRAN</h2><p>While the simplest way to ensure secure connections to CRAN is to run the updated versions mentioned above, it&rsquo;s important to note that it is <strong>not necessary to upgrade R or RStudio</strong> to achieve this end. Rather, two configuration changes can be made:</p><ol><li><p>The R <code>download.file.method</code> option needs to specify a method that is capable of HTTPS; and</p></li><li><p>The CRAN mirror you are using must be capable of HTTPS connections (not all of them are).</p></li></ol><p>The specifics of the required changes for various products, platforms, and versions of R are described in-depth in the <a href="https://support.rstudio.com/hc/en-us/articles/206827897">Secure Package Downloads for R</a> article in our Knowledge Base.</p><h2 id="recommendations-for-rstudio-users">Recommendations for RStudio Users</h2><p>We&rsquo;ve made several changes to RStudio IDE to ensure that HTTPS connections are used throughout the product:</p><ol><li><p>The default <code>download.file.method</code> option is set to an HTTPS compatible method (with a warning displayed if a secure method can&rsquo;t be set);</p></li><li><p>The configured CRAN mirror is tested for HTTPS compatibility and a warning is displayed if the mirror doesn&rsquo;t support HTTPS;</p></li><li><p>HTTPS is used for user selection of a non-default CRAN mirror;</p></li><li><p>HTTPS is used for in-product documentation links;</p></li><li><p>HTTPS is used when checking for updated versions of RStudio (applies to desktop version only); and</p></li><li><p>HTTPS is used when downloading <a 
href="https://cran.r-project.org/bin/windows/Rtools/">Rtools</a> (applies to desktop version only).</p></li></ol><p>If you are running RStudio on the desktop, we strongly recommend that you <a href="https://www.rstudio.com/products/rstudio/download/">update to the latest version</a> (v0.99.473).</p><h2 id="recommendations-for-server-administrators">Recommendations for Server Administrators</h2><p>If you are running RStudio Server, it&rsquo;s possible to make the most important security enhancements by changing your configuration rather than updating to a new version. The <a href="https://support.rstudio.com/hc/en-us/articles/206827897">Secure Package Downloads for R</a> article in our Knowledge Base provides documentation on how to do this.</p><p>In this case, in-product documentation links and user selection of a non-default CRAN mirror will continue to use HTTP rather than HTTPS; however, these are less pressing concerns than CRAN package installation. If you&rsquo;d like these functions to also be performed over HTTPS, then you should upgrade your server to the <a href="https://www.rstudio.com/products/rstudio/download/">latest version</a> of RStudio.</p><p>If you are running Shiny Server, we recommend that you modify your configuration to support HTTPS package downloads as described in the <a href="https://support.rstudio.com/hc/en-us/articles/206827897">Secure Package Downloads for R</a> article.</p></description></item><item><title>JSM 2015 | Check out these talks and visit RStudio!</title><link>https://www.rstudio.com/blog/jsm-2015/</link><pubDate>Mon, 03 Aug 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/jsm-2015/</guid><description><p>The <a href="https://www.amstat.org/meetings/jsm/2015/">Joint Statistics Meetings</a> starting August 8 is the biggest meetup for statisticians in the world. 
Navigating the sheer quantity of interesting talks is challenging - there can be up to 50 sessions going on at a time!</p><p>To prepare for Seattle, we asked RStudio&rsquo;s Chief Data Scientist Hadley Wickham for his top session picks. Here are 9 talks, admittedly biased towards R, graphics, and education, that really interested him and might interest you, too.</p><h2 id="check-out-these-talks">Check out these talks</h2><p><a href="https://www.amstat.org/meetings/jsm/2015/onlineprogram/ActivityDetails.cfm?SessionID=211267">Undergraduate Curriculum: The Pathway to Sustainable Growth in Our Discipline</a>, Sunday, 1400-1550, CC-607. <a href="https://www.amstat.org/meetings/jsm/2015/onlineprogram/ActivityDetails.cfm?SessionID=211254">Statistics with Computing in the Evolving Undergraduate Curriculum</a>, Sunday, 1600-1750. In these back-to-back sessions, learn how statisticians are rising to the challenge of teaching computing and big(ger) data.</p><p><a href="https://www.amstat.org/meetings/jsm/2015/onlineprogram/ActivityDetails.cfm?SessionID=211258">Recent Advances in Interactive Graphics for Data Analysis</a>, Monday, 1030-1220, CC-608. This is an exciting session discussing innovations in interactive visualisation, and it&rsquo;s telling that all of them connect with R. Hadley will be speaking about ggvis in this session.</p><p><a href="https://www.amstat.org/meetings/jsm/2015/onlineprogram/ActivityDetails.cfm?SessionID=211039">Preparing Students to Work in Industry</a>, Monday, 1400-1550, CC-4C4. If you&rsquo;re a student about to graduate, we bet there will be some interesting discussion for you here.</p><p><a href="https://www.amstat.org/meetings/jsm/2015/onlineprogram/ActivityDetails.cfm?SessionID=212166">Stat computing and graphics mixer</a>, Monday, 1800-2000, S-Ravenna. This is advertised as a business meeting, but don&rsquo;t be confused. 
It&rsquo;s the premier social event for anyone interested in computing or visualisation!</p><p><a href="https://www.amstat.org/meetings/jsm/2015/onlineprogram/ActivityDetails.cfm?SessionID=211492">Statistical Computing and Graphics Student Paper Competition</a>, Tuesday, 0830-1020, CC-308. Hear this year&rsquo;s winners of the student paper award talk about visualising phylogenetic data, multiple change point detection, capture-recapture data, and teaching intro stat with R. All four talks come with accompanying R packages!</p><p><a href="https://www.amstat.org/meetings/jsm/2015/onlineprogram/ActivityDetails.cfm?SessionID=211266">The Statistics Identity Crisis: Are We Really Data Scientists?</a>, Tuesday, 0830-1020, CC-609. This session, organised by <a href="http://simplystatistics.org/author/jtleek/">Jeffrey Leek</a>, features an all-star cast of <a href="http://alyssafrazee.com">Alyssa Frazee</a>, <a href="http://www2.research.att.com/~volinsky/">Chris Volinsky</a>, <a href="http://web1.sph.emory.edu/users/lwaller/">Lance Waller</a>, and <a href="http://www.stat.ubc.ca/~jenny/">Jenny Bryan</a>.</p><p><a href="https://www.amstat.org/meetings/jsm/2015/onlineprogram/ActivityDetails.cfm?SessionID=211178">Doing Good with Data Viz</a>, Wednesday, 0830-1020, CC-2B. Hear <a href="http://channel.nationalgeographic.com/the-numbers-game/">Jake Porway</a>, <a href="https://hrdag.org/patrickball/">Patrick Ball</a>, and <a href="https://about.me/dinocitraro">Dino Citraro</a> talk about using data to do good. Of all the sessions you shouldn&rsquo;t miss, this is the one you <em>really</em> shouldn&rsquo;t miss. 
(But unfortunately it conflicts with another great session - you have difficult choices ahead.)</p><p><a href="https://www.amstat.org/meetings/jsm/2015/onlineprogram/ActivityDetails.cfm?SessionID=211368">Statistics at Scale: Applications from Tech Companies</a>, Wednesday, 0830-1020, CC-204. <a href="http://hilaryparker.com">Hilary Parker</a> has organised a fantastic session where you&rsquo;ll learn how companies like Etsy, Microsoft, and Facebook do statistics at scale. Get there early because this session is going to be PACKED!</p><p>There are hundreds of sessions at the JSM, so no doubt we&rsquo;ve missed other great ones. If you think we&rsquo;ve missed a &ldquo;don&rsquo;t miss&rdquo; session, please add it to the comments so others can find it.</p><h2 id="visit-rstudio-at-booth-435">Visit RStudio at booth #435</h2><p>In between session times you&rsquo;ll find the RStudio team hanging out at booth #435 in the expo center (at the far back on the right). Please stop by and say hi! We&rsquo;ll have stickers to show your R pride and printed copies of many of <a href="https://www.rstudio.com/resources/cheatsheets/">our cheatsheets</a>.</p><p>See you in Seattle!</p></description></item><item><title>New R Markdown articles section, plus .Rmd to .docx super powers!</title><link>https://www.rstudio.com/blog/new-r-markdown-articles-section-plus-rmd-to-docx-super-powers/</link><pubDate>Wed, 22 Jul 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/new-r-markdown-articles-section-plus-rmd-to-docx-super-powers/</guid><description><p>We&rsquo;ve added a new articles section to the R Markdown development center at <a href="https://rmarkdown.rstudio.com/articles.html">rmarkdown.rstudio.com/articles.html</a>. 
Here you can find expert advice and tips on how to use R Markdown efficiently.</p><p>In one of the first articles, Richard Layton of <a href="http://www.graphdoctor.com/">graphdoctor.com</a> explains the best tips for using R Markdown to generate Microsoft Word documents. You&rsquo;ll learn how to</p><ul><li><p>set Word styles</p></li><li><p>tweak margins</p></li><li><p>handle relative paths</p></li><li><p>make better tables</p></li><li><p>add bibliographies, and more</p></li></ul><p><img src="https://dl.dropboxusercontent.com/u/36189294/wp/test-report-04.png" alt=""></p><p><a href="https://rmarkdown.rstudio.com/articles_docx.html">Check it out</a>; and then <a href="https://rmarkdown.rstudio.com/articles.html">check back often</a>.</p></description></item><item><title>Article Spotlight: Persistent data storage in Shiny apps</title><link>https://www.rstudio.com/blog/article-spotlight-persistent-data-storage-in-shiny-apps/</link><pubDate>Wed, 15 Jul 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/article-spotlight-persistent-data-storage-in-shiny-apps/</guid><description><p>The articles section on <a href="https://shiny.rstudio.com">shiny.rstudio.com</a> has lots of great advice for Shiny developers.</p><p>A <a href="https://shiny.rstudio.com/articles/persistent-data-storage.html">recent article</a> by Dean Attali demonstrates how to save data from a Shiny app to persistent storage structures, like local files, servers, databases, and more. 
When you do this, your data remains after the app has closed, which opens new doors for data collection and analysis.</p><p><a href="https://shiny.rstudio.com/articles/persistent-data-storage.html"><img src="https://rstudioblog.files.wordpress.com/2015/07/screen-shot-2015-07-15-at-4-27-10-pm.png" alt="persistent-storage"></a></p><p>Read Dean&rsquo;s article and more at <a href="https://shiny.rstudio.com/articles">shiny.rstudio.com/articles</a></p></description></item><item><title>Spark 1.4 for RStudio</title><link>https://www.rstudio.com/blog/spark-1-4-for-rstudio/</link><pubDate>Tue, 14 Jul 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/spark-1-4-for-rstudio/</guid><description><p><em>Today&rsquo;s guest post is written by Vincent Warmerdam of <a href="http://www.godatadriven.com/">GoDataDriven</a> and is reposted with Vincent&rsquo;s permission from <a href="http://blog.godatadriven.com/spark-rstudio.html">blog.godatadriven.com</a>. You can learn more about how to use SparkR with RStudio at the <a href="http://www.earl-conference.com/">2015 EARL Conference</a> in Boston November 2-4, where Vincent will be speaking live.</em></p><p>This document contains a tutorial on how to provision a spark cluster with RStudio. You will need a machine that can run bash scripts and a functioning account on AWS. Note that this tutorial is meant for Spark 1.4.0. Future versions will most likely be provisioned in another way but this should be good enough to help you get started. At the end of this tutorial you will have a fully provisioned spark cluster that allows you to handle simple dataframe operations on gigabytes of data within RStudio.</p><h3 id="aws-prep">AWS prep</h3><p>Make sure you have an AWS account with billing. 
Next, make sure that you have downloaded your <code>.pem</code> files and that you have your keys ready.</p><h3 id="spark-startup">Spark Startup</h3><p>Next, go and get Spark locally on your machine from <a href="https://spark.apache.org/downloads.html">the Spark homepage</a>. It&rsquo;s a pretty big blob. Once it is downloaded, unzip it and go to the <code>ec2</code> folder in the Spark folder. Run the following command from the command line.</p><pre><code>./spark-ec2 \
  --key-pair=spark-df \
  --identity-file=/Users/code/Downloads/spark-df.pem \
  --region=eu-west-1 \
  -s 1 \
  --instance-type c3.2xlarge \
  launch mysparkr</code></pre><p>This script will use your keys to connect to Amazon and set up a Spark standalone cluster for you. You can specify what type of machines you want to use, as well as how many and where on Amazon. You will only need to wait until everything is installed, which can take up to 10 minutes. More info can be found <a href="https://spark.apache.org/docs/latest/ec2-scripts.html">here</a>. When the command signals that it is done, you can ssh into your machine via the command line.</p><pre><code>./spark-ec2 -k spark-df -i /Users/code/Downloads/spark-df.pem --region=eu-west-1 login mysparkr</code></pre><p>Once you are in your Amazon machine, you can immediately run SparkR from the terminal.</p><pre><code>chmod u+w /root/spark/
./spark/bin/sparkR</code></pre><p>As just a toy example, you should be able to confirm that the following code already works.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">ddf <span style="color:#666">&lt;-</span> <span style="color:#06287e">createDataFrame</span>(sqlContext, faithful)
<span style="color:#06287e">head</span>(ddf)
<span style="color:#06287e">printSchema</span>(ddf)</code></pre></div><p>This <code>ddf</code> dataframe is no ordinary dataframe object. 
It is a distributed dataframe, one that can be distributed across a network of workers such that we could query it for parallelized commands through Spark.</p><h3 id="spark-ui">Spark UI</h3><p>The R command you have just run launches a Spark job. Spark has a web UI so you can keep track of the cluster. To visit the web UI, first confirm the IP address of the master node via this command:</p><pre><code>curl icanhazip.com</code></pre><p>You can now visit the web UI via your browser.</p><pre><code>&lt;master-node-ip&gt;:4040</code></pre><p>From here you can view anything you may want to know about your Spark cluster (like executor status, job progress, and even a DAG visualisation).<img src="https://i.imgur.com/CsNys83.png" alt="">This is a good moment to stand still and realize that this in its own right is already very cool. We can start up a Spark cluster in 15 minutes and use R to control it. We can specify how many servers we need by only changing a number on the command line, and without any real developer effort we gain access to all this parallelizing power. Still, working from a terminal might not be too productive. We&rsquo;d prefer to work with a GUI, and we would like some basic plotting functionality when working with data. So let&rsquo;s install RStudio and get some tools connected.</p><h3 id="rstudio-setup">RStudio setup</h3><p>Get out of the <code>SparkR</code> shell by entering <code>q()</code>. Next, download and install RStudio.</p><pre><code>wget http://download2.rstudio.org/rstudio-server-rhel-0.99.446-x86_64.rpm
sudo yum install --nogpgcheck -y rstudio-server-rhel-0.99.446-x86_64.rpm
rstudio-server restart</code></pre><p>While this is installing, make sure the TCP connection on port 8787 is open in the AWS security group settings for the master node. A recommended setting is to only allow access from your IP.<img src="https://i.imgur.com/cBfbL9v.png" alt="">Then, add a user that can access RStudio. 
We make sure that this user can also access all the RStudio files.</p><pre><code>adduser analyst
passwd analyst</code></pre><p>You also need to do this (the details of why are a bit involved). These edits need to be made because the analyst user doesn&rsquo;t have root permissions. <code>chmod a+w /mnt/spark</code> <code>chmod a+w /mnt2/spark</code> <code>sed -e 's/^ulimit/#ulimit/g' /root/spark/conf/spark-env.sh &gt; /root/spark/conf/spark-env2.sh</code> <code>mv /root/spark/conf/spark-env2.sh /root/spark/conf/spark-env.sh</code> <code>ulimit -n 1000000</code> When this is done, point the browser to <code>&lt;master-ip-adr&gt;:8787</code>. Then log in as analyst.</p><h3 id="rstudio---spark-link">RStudio - Spark link</h3><p>Awesome. RStudio is set up. First, restart the Spark master and workers.</p><pre><code>/root/spark/sbin/stop-all.sh
/root/spark/sbin/start-all.sh</code></pre><p>This will reboot Spark (both the master and slave nodes). You can confirm that Spark works after this command by pointing the browser to <code>&lt;ip-adr&gt;:8080</code>. Next, let&rsquo;s go and start Spark from RStudio. Start a new R script, and run the following code: <code>print('Now connecting to Spark for you.')</code></p><p><code>spark_link &lt;- system('cat /root/spark-ec2/cluster-url', intern=TRUE)</code></p><p><code>.libPaths(c(.libPaths(), '/root/spark/R/lib')) </code><code>Sys.setenv(SPARK_HOME = '/root/spark') </code><code>Sys.setenv(PATH = paste(Sys.getenv(c('PATH')), '/root/spark/bin', sep=':')) </code><code>library(SparkR) </code></p><p><code>sc &lt;- sparkR.init(spark_link) </code><code>sqlContext &lt;- sparkRSQL.init(sc) </code></p><p><code>print('Spark Context available as \&quot;sc\&quot;. \\n')</code><code>print('Spark SQL Context available as \&quot;sqlContext\&quot;. 
\\n')</code></p><h3 id="loading-data-from-s3">Loading data from S3</h3><p>Let&rsquo;s confirm that we can now play with the RStudio stack by downloading some libraries and having it run against data that lives on S3. <code>small_file = &quot;s3n://&lt;AWS-ID&gt;:&lt;AWS-SECRET-KEY&gt;@&lt;bucket_name&gt;/data.json&quot;</code> <code>dist_df &lt;- read.df(sqlContext, small_file, &quot;json&quot;) %&gt;% cache </code> This <code>dist_df</code> is now a distributed dataframe, which has a different API than the normal R dataframe but is similar to <code>dplyr</code>. <code>head(summarize(groupBy(dist_df, df$type), count = n(df$auc)))</code> Also, we can install <code>magrittr</code> to make our code look a lot nicer.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">local_df <span style="color:#666">&lt;-</span> dist_df <span style="color:#666">%&gt;%</span>
<span style="color:#06287e">groupBy</span>(df<span style="color:#666">$</span>type) <span style="color:#666">%&gt;%</span>
<span style="color:#06287e">summarize</span>(count <span style="color:#666">=</span> <span style="color:#06287e">n</span>(df<span style="color:#666">$</span>id)) <span style="color:#666">%&gt;%</span>
collect</code></pre></div><p>The <code>collect</code> method pulls the distributed dataframe back into a normal dataframe on a single machine so you can use plotting methods on it again and use R as you would normally. A common use case would be to use Spark to sample or aggregate a large dataset, which can then be further explored in R. Again, if you want to view the Spark UI for these jobs you can just go to:</p><pre><code>&lt;master-node-ip&gt;:4040</code></pre><h3 id="a-more-complete-stack">A more complete stack</h3><p>Unfortunately this stack has an old version of R (we need version 3.2 to get the newest version of ggplot2/dplyr).
Also, there is no support for the machine learning libraries yet. These are known issues at the moment, and version 1.5 should bring some fixes. Version 1.5 will also feature RStudio installation as part of the ec2 stack. Another issue is that the namespace of <code>dplyr</code> currently conflicts with <code>SparkR</code>; time will tell how this gets resolved. The same goes for other data features like windowing functions and more elaborate data types.</p><h3 id="killing-the-cluster">Killing the cluster</h3><p>When you are done with the cluster, you only need to exit the ssh connection and run the following command: <code>./spark-ec2 -k spark-df -i /Users/code/Downloads/spark-df.pem --region=eu-west-1 destroy mysparkr</code></p><h3 id="conclusion">Conclusion</h3><p>The economics of Spark are very interesting. We only pay Amazon for the time that we are using Spark as a compute engine; all other times we&rsquo;d only pay for S3. This means that if we analyse for 8 hours, we&rsquo;d only pay for 8 hours. Spark is also very flexible in that it allows us to continue coding in R (or Python or Scala) without having to learn multiple domain-specific languages or frameworks, as in Hadoop.
Spark makes big data really simple again. This document is meant to help you get started with Spark and RStudio, but in a production environment there are a few things you still need to account for:</p><ul><li><p><strong>security</strong>: our web connection does not go through https; even though we are telling Amazon to only allow our IP, we may be at risk if there is a man in the middle listening.</p></li><li><p><strong>multiple users</strong>: this setup will work fine for a single user, but if multiple users are working on such a cluster you may need to rethink some steps with regard to user groups, file access and resource management.</p></li><li><p><strong>privacy</strong>: this setup works well for ec2, but if you have sensitive, private user data then you may need to do this on premise because the data cannot leave your own datacenter. Most install steps would be the same, but the initial installation of Spark would require the most work. See the <a href="https://spark.apache.org/docs/latest/spark-standalone.html">docs</a> for more information.</p></li></ul><p>Spark is an amazing tool; expect more features in the future.</p><h4 id="possible-gotya">Possible Gotcha</h4><h5 id="hanging">Hanging</h5><p>It can happen that the <code>ec2</code> script hangs in the <code>Waiting for cluster to enter 'ssh-ready' state</code> part. This can happen if you use Amazon a lot. To prevent this you may want to remove some lines from <code>~/.ssh/known_hosts</code>. More info <a href="http://stackoverflow.com/questions/28002443/cluster-hangs-in-ssh-ready-state-using-spark-1-2-ec2-launch-script">here</a>.
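</p><p>For instance, a stale cached host key can be dropped with <code>ssh-keygen</code> (the hostname below is a placeholder, not a value from this post):</p><pre><code>ssh-keygen -R &lt;master-node-hostname&gt;</code></pre><p>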
Another option is to add the following lines to your <code>~/.ssh/config</code> file.</p><pre><code># AWS EC2 public hostnames (changing IPs)
Host *.compute.amazonaws.com
StrictHostKeyChecking no
UserKnownHostsFile /dev/null</code></pre></description></item><item><title>Accelerating R: RStudio and the new R Consortium</title><link>https://www.rstudio.com/blog/accelerating-r-rstudio-and-the-new-r-consortium/</link><pubDate>Tue, 30 Jun 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/accelerating-r-rstudio-and-the-new-r-consortium/</guid><description><p>To paraphrase Yogi Berra, &ldquo;Predicting is hard, especially about the future&rdquo;. In 1993, when Ross Ihaka and Robert Gentleman first started working on R, who would have predicted that it would be used by millions in a world that increasingly rewards data literacy? It&rsquo;s impossible to know where R will go in the next 20 years, but at RStudio we&rsquo;re working hard to make sure the future is bright.</p><p>Today, we&rsquo;re excited to announce our participation in the <a href="https://www.r-consortium.org/">R Consortium</a>, a new 501(c)6 nonprofit organization. The R Consortium is a collaboration between the R Foundation, RStudio, Microsoft, TIBCO, Google, Oracle, HP and others. It&rsquo;s chartered to fund and inspire ideas that will enable R to become an even better platform for science, research, and industry. The R Consortium complements the R Foundation by providing a convenient funding vehicle for the many commercial beneficiaries of R to give back to the community, and will provide the resources to embark on ambitious new projects to make R even better.</p><p>We believe the R Consortium is critically important to the future of R and despite our small size, we chose to join it at the highest contributor level (alongside Microsoft).
Open source is a key component of our mission and giving back to the community is extremely important to us.</p><p>The community of R users and developers have a big stake in the language and its long-term success. We all want free and open source R to continue thriving and growing for the next 20 years and beyond. The fact that so many of the technology industry&rsquo;s largest companies are willing to stand behind R as part of the consortium is remarkable and we think bodes incredibly well for the future of R.</p></description></item><item><title>d3heatmap: Interactive heat maps</title><link>https://www.rstudio.com/blog/d3heatmap/</link><pubDate>Wed, 24 Jun 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/d3heatmap/</guid><description><p>We&rsquo;re pleased to announce <strong><a href="https://github.com/rstudio/d3heatmap">d3heatmap</a></strong>, our new package for generating interactive heat maps using <a href="http://d3js.org/">d3.js</a> and <a href="http://www.htmlwidgets.org/">htmlwidgets</a>. <a href="http://www.r-statistics.com/">Tal Galili</a>, author of <a href="http://cran.r-project.org/package=dendextend">dendextend</a>, collaborated with us on this package.</p><p>d3heatmap is designed to have a familiar feature set and API for anyone who has used <a href="http://www.rdocumentation.org/packages/stats/functions/heatmap">heatmap</a> or <a href="http://www.rdocumentation.org/packages/gplots/functions/heatmap.2">heatmap.2</a> to create static heatmaps. 
You can specify dendrogram, clustering, and scaling options in the same way.</p><p>d3heatmap includes the following features:</p><ul><li><p>Shows the row/column/value under the mouse cursor</p></li><li><p>Click row/column labels to highlight</p></li><li><p>Drag a rectangle over the image to zoom in</p></li><li><p>Works from the R console, in RStudio, with <a href="https://rmarkdown.rstudio.com/">R Markdown</a>, and with <a href="https://shiny.rstudio.com/">Shiny</a></p></li></ul><h2 id="installation">Installation</h2><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">d3heatmap&#34;</span>)</code></pre></div><h2 id="examples">Examples</h2><p>Here&rsquo;s a very simple example (source: <a href="http://flowingdata.com/2010/01/21/how-to-make-a-heatmap-a-quick-and-easy-solution/">flowingdata</a>):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(d3heatmap)
url <span style="color:#666">&lt;-</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">http://datasets.flowingdata.com/ppg2008.csv&#34;</span>
nba_players <span style="color:#666">&lt;-</span> <span style="color:#06287e">read.csv</span>(url, row.names <span style="color:#666">=</span> <span style="color:#40a070">1</span>)
<span style="color:#06287e">d3heatmap</span>(nba_players, scale <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">column&#34;</span>)</code></pre></div><p><a href="http://rpubs.com/jcheng/nba1"><img src="https://rstudioblog.files.wordpress.com/2015/06/screen-shot-2015-06-24-at-1-50-07-pm.png" alt="d3heatmap"></a></p><p>You can easily customize the colors using the <code>colors</code>
parameter. This can take an <a href="http://cran.r-project.org/package=RColorBrewer">RColorBrewer</a> palette name, a vector of colors, or a function that takes (potentially scaled) data points as input and returns colors.</p><!-- more --><p>Let&rsquo;s modify the previous example by using the <code>&quot;Blues&quot;</code> colorbrewer palette, and dropping the clustering and dendrograms:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">d3heatmap</span>(nba_players, scale <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">column&#34;</span>, dendrogram <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">none&#34;</span>,color <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Blues&#34;</span>)</code></pre></div><p><a href="http://rpubs.com/jcheng/nba2"><img src="https://rstudioblog.files.wordpress.com/2015/06/screen-shot-2015-06-24-at-1-39-15-pm.png" alt="d3heatmap"></a></p><p>If you want to use discrete colors instead of continuous, you can use the <code>col_*</code> functions from the <a href="http://cran.r-project.org/package=scales">scales</a> package.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">d3heatmap</span>(nba_players, scale <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">column&#34;</span>, dendrogram <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">none&#34;</span>,color <span style="color:#666">=</span> scales<span style="color:#666">::</span><span style="color:#06287e">col_quantile</span>(<span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">Blues&#34;</span>, <span style="color:#007020;font-weight:bold">NULL</span>, <span style="color:#40a070">5</span>))</code></pre></div><p><a href="http://rpubs.com/jcheng/nba3"><img src="https://rstudioblog.files.wordpress.com/2015/06/screen-shot-2015-06-24-at-1-40-21-pm.png" alt="d3heatmap"></a>Thanks to integration with the dendextend package, you can customize dendrograms with cluster colors:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">d3heatmap</span>(nba_players, colors <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Blues&#34;</span>, scale <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">col&#34;</span>,dendrogram <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">row&#34;</span>, k_row <span style="color:#666">=</span> <span style="color:#40a070">3</span>)</code></pre></div><p><a href="http://rpubs.com/jcheng/nba4"><img src="https://rstudioblog.files.wordpress.com/2015/06/screen-shot-2015-06-24-at-1-57-20-pm.png" alt="d3heatmap"></a>For issue reports or feature requests, please see our <a href="https://github.com/rstudio/d3heatmap">GitHub repo</a>.</p></description></item><item><title>DT: An R interface to the DataTables library</title><link>https://www.rstudio.com/blog/dt-an-r-interface-to-the-datatables-library/</link><pubDate>Wed, 24 Jun 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dt-an-r-interface-to-the-datatables-library/</guid><description><p>We are happy to announce a new package <strong>DT</strong> is available on CRAN now. 
<strong>DT</strong> is an interface to the JavaScript library <a href="http://datatables.net/">DataTables</a> based on the <strong><a href="http://htmlwidgets.org">htmlwidgets</a></strong> framework, to present rectangular R data objects (such as data frames and matrices) as HTML tables. You can filter, search, and sort the data in the table. See <a href="http://rstudio.github.io/DT/">http://rstudio.github.io/DT/</a> for the full documentation and examples of this package. To install the package, run</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">DT&#39;</span>)<span style="color:#60a0b0;font-style:italic"># run DT::datatable(iris) to see a &#34;hello world&#34; example</span></code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/06/screenshot-from-2015-06-17-232010.png" alt="DataTables"></p><p>The main function in this package is <code>datatable()</code>, which returns a table widget that can be rendered in R Markdown documents, Shiny apps, and the R console. It is easy to customize the style (cell borders, row striping, and row highlighting, etc), theme (default or Bootstrap), row/column names, table caption, and so on.</p><!-- more --><h2 id="datatables-options">DataTables Options</h2><p>The DataTables library supports a large number of initialization options. Through <strong>DT</strong>, you can specify these options using a list in R. 
For example, we can disable searching, change the default page length from 10 to 5, and customize the length menu to use page lengths 5, 10, 15, and 20:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(DT)
<span style="color:#06287e">datatable</span>(iris, options <span style="color:#666">=</span> <span style="color:#06287e">list</span>(searching <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>,
pageLength <span style="color:#666">=</span> <span style="color:#40a070">5</span>,
lengthMenu <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">5</span>, <span style="color:#40a070">10</span>, <span style="color:#40a070">15</span>, <span style="color:#40a070">20</span>)))</code></pre></div><p>When you need to write literal JavaScript code in these options (e.g. the callback functions), you can use the <code>JS()</code> function.
An example of the <code>initComplete</code> callback:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">datatable</span>(iris, options <span style="color:#666">=</span> <span style="color:#06287e">list</span>(initComplete <span style="color:#666">=</span> <span style="color:#06287e">JS</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0"></span>
<span style="color:#4070a0"> function(settings, json) {</span>
<span style="color:#4070a0"> $(this.api().table().header()).css({</span>
<span style="color:#4070a0"> &#39;background-color&#39;: &#39;#000&#39;,</span>
<span style="color:#4070a0"> &#39;color&#39;: &#39;#fff&#39;</span>
<span style="color:#4070a0"> });</span>
<span style="color:#4070a0"> }&#34;</span>)))</code></pre></div><p>Being able to write JavaScript gives you full flexibility to customize the table. However, one of the goals of <strong>DT</strong> is to avoid writing JavaScript in your R scripts, and we hope users can express everything in pure R syntax, so we have provided a few R helper functions in <strong>DT</strong> that essentially generate JavaScript code for users to fulfill some common tasks, such as formatting table columns and cells.</p><h2 id="formatting-functions">Formatting Functions</h2><p>The functions <code>formatCurrency()</code>, <code>formatPercentage()</code>, <code>formatRound()</code>, and <code>formatDate()</code> can be used to format table columns.
For example, for a data frame with five columns A, B, C, D, and E, we format the columns A and C as euros, B as percentages (rounded to 2 decimal places), round D to 3 decimal places, and format E as date strings (the pipe operator <code>%&gt;%</code> comes from the <strong>magrittr</strong> package):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(DT)
df <span style="color:#666">&lt;-</span> <span style="color:#06287e">data.frame</span>(A <span style="color:#666">=</span> <span style="color:#06287e">rpois</span>(<span style="color:#40a070">100</span>, <span style="color:#40a070">1e4</span>),
B <span style="color:#666">=</span> <span style="color:#06287e">runif</span>(<span style="color:#40a070">100</span>),
C <span style="color:#666">=</span> <span style="color:#06287e">rpois</span>(<span style="color:#40a070">100</span>, <span style="color:#40a070">1e3</span>),
D <span style="color:#666">=</span> <span style="color:#06287e">rnorm</span>(<span style="color:#40a070">100</span>),
E <span style="color:#666">=</span> <span style="color:#06287e">Sys.Date</span>() <span style="color:#666">+</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">100</span>)
<span style="color:#06287e">datatable</span>(df) <span style="color:#666">%&gt;%</span>
<span style="color:#06287e">formatCurrency</span>(<span style="color:#06287e">c</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">A&#39;</span>, <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">C&#39;</span>), <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">€&#39;</span>) <span style="color:#666">%&gt;%</span>
<span style="color:#06287e">formatPercentage</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">B&#39;</span>, <span style="color:#40a070">2</span>) <span style="color:#666">%&gt;%</span>
<span style="color:#06287e">formatRound</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">D&#39;</span>, <span style="color:#40a070">3</span>) <span style="color:#666">%&gt;%</span>
<span style="color:#06287e">formatDate</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">E&#39;</span>, <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">toDateString&#39;</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/06/screenshot-from-2015-06-18-150030.png" alt="Format table columns"></p><p>It is also easy to style the table cells according to their values using the <code>formatStyle()</code> function. You can apply different CSS styles to cells, e.g. use bold font for those cells with <code>Sepal.Length &gt; 5</code>, gray background for <code>Sepal.Width &lt;= 3.4</code> and yellow for <code>Sepal.Width &gt; 3.4</code>, and so on. See the <a href="http://rstudio.github.io/DT/functions.html">documentation page</a> for these formatting functions for more information.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/06/screenshot-from-2015-06-18-151117.png" alt="Format cells"></p><h2 id="server-side-processing">Server-side Processing</h2><p>Interactions with the table can be processed either on the client side (using JavaScript in the web browser), or on the server side. Server-side processing is suitable for large data objects, since filtering, sorting, and pagination can be much faster in R than JavaScript in the browser. In theory, you can use any server-side processing language to process the data, and we have implemented it in R, which you can trivially enable by using <strong>DT</strong> in Shiny apps (the default mode is just server-side processing).</p><h2 id="column-filters">Column Filters</h2><p>DataTables does not come with column filters by default. It only provides a global search box.
We have added filters for individual columns in <strong>DT</strong>, and you can enable column filters using the argument <code>filter = 'top'</code> or <code>'bottom'</code> in <code>datatable()</code>. Currently, three types of filters are provided:</p><ul><li><p>For numeric/date/time columns, <a href="http://refreshless.com/nouislider/">range sliders</a> are used to filter rows within ranges;</p></li><li><p>For factor columns, <a href="http://brianreavis.github.io/selectize.js/">selectize inputs</a> are used to display all possible categories, and you can select multiple categories there (note you can also type in the box to search all categories);</p></li><li><p>For character columns, ordinary search boxes are used to match the values you typed in the boxes;</p></li></ul><p>These filters are similar to the ones introduced in the <a href="https://blog.rstudio.com/2015/05/26/new-version-of-rstudio-v0-99/">RStudio 0.99</a> Data Viewer. Column filters work in both server-side and client-side processing modes. You can enable search result highlighting by the option <a href="http://rstudio.github.io/DT/006-highlight.html">searchHighlight = TRUE</a>.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/06/column-filters.png" alt="Column Filters"></p><h2 id="shiny">Shiny</h2><p>If you have used DataTables before in Shiny (i.e. the functions <code>dataTableOutput()</code> and <code>renderDataTable()</code>), it should be trivial to switch from Shiny to <strong>DT</strong>. <strong>DT</strong> has provided two functions of the same names, and the usage is very similar. Basically, all you have to do is to load <strong>DT</strong> after <strong>shiny</strong>, so that <code>dataTableOutput()</code> and <code>renderDataTable()</code> in <strong>DT</strong> can override the functions in <strong>shiny</strong>. If you want to be sure to use the functions in <strong>DT</strong>, you can add the prefix <code>DT::</code> to these functions. 
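</p><p>As a minimal sketch of such an app (a hypothetical skeleton; the output id <code>tbl</code> is our own choice, not from this post):</p><pre><code>library(shiny)
library(DT)

ui &lt;- fluidPage(DT::dataTableOutput('tbl'))
server &lt;- function(input, output) {
  # the explicit DT:: prefix guarantees the DT versions are used
  output$tbl &lt;- DT::renderDataTable(iris)
}
shinyApp(ui, server)</code></pre><p>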
We will deprecate <code>dataTableOutput()</code> and <code>renderDataTable()</code> in <strong>shiny</strong> eventually as <strong>DT</strong> becomes mature and stable.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(shiny)<span style="color:#06287e">library</span>(DT) <span style="color:#60a0b0;font-style:italic"># make sure you load DT *after* shiny</span></code></pre></div><p>As mentioned before, <strong>DT</strong> uses the server-side processing mode in <strong>shiny</strong>. To go back to client-side processing, you can use <code>renderDataTable(data, server = FALSE)</code>.</p><p>The first argument of the function <code>renderDataTable()</code> can be either a data object (e.g. a data frame), or a table widget object (returned by <code>datatable()</code>). The latter form is useful when you need to further process the table widget, e.g. format certain columns or cells.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">renderDataTable</span>({<span style="color:#06287e">datatable</span>(iris) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">formatStyle</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">Sepal.Width&#39;</span>,backgroundColor <span style="color:#666">=</span> <span style="color:#06287e">styleInterval</span>(<span style="color:#40a070">3.4</span>, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">gray&#39;</span>, <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">yellow&#39;</span>)))})</code></pre></div><p>When a table is rendered in a Shiny app, you can obtain some information about the state of the table via the <code>input</code> object in Shiny. 
For example, for a table output <code>dataTableOutput('foo')</code>, the indices of the selected rows can be obtained from <code>input$foo_rows_selected</code>, and the indices of rows on the current page are available via <code>input$foo_rows_current</code> (<a href="https://yihui.shinyapps.io/DT-info/">live example</a>). <a href="http://rstudio.github.io/DT/shiny.html">This page</a> has more information about using <strong>DT</strong> in Shiny.</p><h2 id="datatables-extensions">DataTables Extensions</h2><p>DataTables has several extensions, and we have integrated all of them into <strong>DT</strong>. You may enable extensions via the extensions argument of <code>datatable()</code>. For example, you can reorder columns using the ColReorder extension, show/hide columns using the ColVis extension, fix certain columns on the left and/or right via FixedColumns when scrolling horizontally in the table, and so on. Please see the <a href="http://rstudio.github.io/DT/extensions.html">documentation page for extensions</a> for details.</p><p>We hope you will enjoy this package, and please <a href="https://github.com/rstudio/DT/issues">let us know</a> if you have any questions, comments, or feature requests.</p></description></item><item><title>Leaflet: Interactive web maps with R</title><link>https://www.rstudio.com/blog/leaflet-interactive-web-maps-with-r/</link><pubDate>Wed, 24 Jun 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/leaflet-interactive-web-maps-with-r/</guid><description><p>We are excited to announce that a new package <strong>leaflet</strong> has been released on CRAN. The R package <strong>leaflet</strong> is an interface to the JavaScript library <a href="http://leafletjs.com/">Leaflet</a> to create interactive web maps. It was developed on top of the <strong><a href="http://htmlwidgets.org">htmlwidgets</a></strong> framework, which means the maps can be rendered in R Markdown (v2) documents, Shiny apps, and RStudio IDE / the R console. 
Please see <a href="http://rstudio.github.io/leaflet">http://rstudio.github.io/leaflet</a> for the full documentation. To install the package, run</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">leaflet&#39;</span>)</code></pre></div><p>We quietly introduced this package in December when we <a href="https://blog.rstudio.com/2014/12/18/htmlwidgets-javascript-data-visualization-for-r/">announced htmlwidgets</a>, but in the months since then we&rsquo;ve added a lot of new features and launched a new set of <a href="http://rstudio.github.io/leaflet">documentation</a>. If you haven&rsquo;t looked at leaflet lately, now is a great time to get reacquainted!</p><h2 id="the-map-widget">The Map Widget</h2><p>The basic usage of this package is that you create a map widget using the <code>leaflet()</code> function, and add layers to the map using the layer functions such as <code>addTiles()</code>, <code>addMarkers()</code>, and so on. 
Adding layers can be done through the pipe operator <code>%&gt;%</code> from <strong>magrittr</strong> (you are not required to use <code>%&gt;%</code>, though):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(leaflet)
m <span style="color:#666">&lt;-</span> <span style="color:#06287e">leaflet</span>() <span style="color:#666">%&gt;%</span>
<span style="color:#06287e">addTiles</span>() <span style="color:#666">%&gt;%</span> <span style="color:#60a0b0;font-style:italic"># Add default OpenStreetMap map tiles</span>
<span style="color:#06287e">addMarkers</span>(lng<span style="color:#666">=</span><span style="color:#40a070">174.768</span>, lat<span style="color:#666">=</span><span style="color:#40a070">-36.852</span>, popup<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">The birthplace of R&#34;</span>)
m <span style="color:#60a0b0;font-style:italic"># Print the map</span></code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/06/leaflet-basic.png" alt="leaflet-basic"></p><p>There are a variety of layers that you can add to a map widget, including:</p><ul><li><p>Map tiles</p></li><li><p>Markers / Circle Markers</p></li><li><p>Polygons / Rectangles</p></li><li><p>Lines</p></li><li><p>Popups</p></li><li><p>GeoJSON / TopoJSON</p></li><li><p>Raster Images</p></li><li><p>Color Legends</p></li><li><p>Layer Groups and Layer Control</p></li></ul><p>There is a set of methods to manipulate the attributes of a map, such as <code>setView()</code> and <code>fitBounds()</code>, etc.
You can find the details in the help page <code>?setView</code>.</p><!-- more --><p>The <code>leaflet()</code> function and all layer functions have a <code>data</code> argument that can take several types of spatial data objects, including matrices and data frames with latitude and longitude columns, spatial objects from the <strong>sp</strong> package (e.g. <code>SpatialPoints</code> and <code>SpatialPointsDataFrame</code>), and the data frame returned from <code>maps::map()</code>. Once you have passed a data object to <code>leaflet()</code> or a layer function, you can use the formula interface to pass values of variables to function arguments.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">m <span style="color:#666">&lt;-</span> <span style="color:#06287e">leaflet</span>(df) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">addTiles</span>()
m <span style="color:#666">%&gt;%</span> <span style="color:#06287e">addCircleMarkers</span>(radius <span style="color:#666">=</span> <span style="color:#666">~</span>size, color <span style="color:#666">=</span> <span style="color:#666">~</span>color)
<span style="color:#60a0b0;font-style:italic"># this is more compact than radius = df$size, color = df$color</span></code></pre></div><h2 id="basemaps">Basemaps</h2><p>You can add basemaps to a map widget using map tiles. The default tiles provided by <code>addTiles()</code> are OpenStreetMap tiles, and you can easily add third-party tiles via <code>addProviderTiles()</code>. WMS (Web Map Service) tiles can be added via <code>addWMSTiles()</code>. You may use more than one tile layer on a map, too.</p><h2 id="markers-and-popups">Markers and Popups</h2><p>Icon markers and circle markers can be placed at the locations specified by latitudes/longitudes on a map via <code>addMarkers()</code> and <code>addCircleMarkers()</code>, respectively. 
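Putting the pieces so far together, here is a sketch combining the formula interface, a third-party basemap, and circle markers. The data frame and its columns are invented for illustration, and "CartoDB.Positron" is one of many provider names accepted by <code>addProviderTiles()</code>:

```r
library(leaflet)

# Illustrative data: a handful of points near the addMarkers() example above
df <- data.frame(
  lat  = -36.852 + rnorm(10, sd = 0.01),
  lng  = 174.768 + rnorm(10, sd = 0.01),
  size = runif(10, 5, 15)
)

leaflet(df) %>%
  addProviderTiles("CartoDB.Positron") %>%                 # third-party tiles
  addCircleMarkers(lng = ~lng, lat = ~lat, radius = ~size) # formula interface
```

The `~` formulas are evaluated against the data frame passed to `leaflet()`, so each argument can refer to columns directly.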
You can change the default appearance of icon markers (dropped pins) and use custom icon images, and you can also customize the appearance of circle markers (radius, color, and so on). When there are a large number of markers on a map, you may cluster them into groups (each group containing multiple markers close to each other), and see individual markers as you zoom into the map. When you add a marker to a map, you can also bind a popup to it through the <code>popup</code> argument, so users can see more information after they click on the marker. It is possible to add popups separately without markers as well via <code>addPopups()</code>.</p><h2 id="lines-and-shapes">Lines and Shapes</h2><p>Polygons, polylines, circles, and rectangles can be added to the map through <code>addPolygons()</code>, <code>addPolylines()</code>, <code>addCircles()</code>, and <code>addRectangles()</code>. For example, you can create a choropleth map by adding polygons with different colors.</p><h2 id="geojson--topojson">GeoJSON / TopoJSON</h2><p>When your data is in the GeoJSON or TopoJSON format, you can add it to the map using <code>addGeoJSON()</code> and <code>addTopoJSON()</code>, respectively. The features in the JSON data can be styled via either the styles specified inside the data, or the arguments of the functions <code>addGeoJSON()</code>/<code>addTopoJSON()</code>.</p><h2 id="raster-images">Raster Images</h2><p>Two-dimensional <code>RasterLayer</code> objects (from the <a href="http://cran.r-project.org/package=raster"><strong>raster</strong> package</a>) can be turned into images and added to Leaflet maps using the <code>addRasterImage()</code> function. 
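The marker clustering and popup binding described above can be sketched with the <code>quakes</code> data set that ships with R (the cluster behavior is opt-in via <code>clusterOptions</code>):

```r
library(leaflet)

# quakes has lat, long, and mag columns; popups are bound per marker
leaflet(quakes) %>%
  addTiles() %>%
  addMarkers(
    lng = ~long, lat = ~lat,
    popup = ~paste("Magnitude:", mag),
    clusterOptions = markerClusterOptions()  # group nearby markers into clusters
  )
```

Zooming in expands the clusters into the individual markers.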
You can color the image through the colors argument that accepts a variety of color specifications.</p><h2 id="shiny-integration">Shiny Integration</h2><p>Like most Shiny output widgets, you just create a leaflet output element in the UI using <code>leafletOutput()</code>, and render the widget on the server side using <code>renderLeaflet()</code>, with a leaflet widget object passed to <code>renderLeaflet()</code>.</p><p>After a widget is rendered in Shiny, you may modify it using the <code>leafletProxy()</code> object. All you need to do is replace <code>leaflet()</code> with <code>leafletProxy()</code>. For example, suppose the output ID of the map is <code>mymap</code>, and you have two numeric inputs <code>x</code> and <code>y</code> (specifying lng and lat) in the app, then you can add new markers to the map via:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">observe</span>({<span style="color:#06287e">leafletProxy</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mymap&#34;</span>) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">addMarkers</span>(input<span style="color:#666">$</span>x, input<span style="color:#666">$</span>y)})</code></pre></div><p>If you added layers with layer ID&rsquo;s to a map, you will be able to remove certain layers according to the layer ID&rsquo;s (e.g. <code>removeMarker()</code>). You can also clear entire layers (e.g. <code>clearMarkers()</code>).</p><p>When interacting with a map or its layers, you can obtain some information about the interaction from the Shiny input object. For example, <code>input$mymap_shape_click</code> will be a list of the form <code>list(lat = lat, lng = lng, id = layerId)</code> after you click on a shape object (e.g. 
a marker or a circle) on the map.</p><h2 id="color-palettes-and-legends">Color Palettes and Legends</h2><p>We have provided four types of color palettes in this package: <code>colorNumeric()</code>, <code>colorBin()</code>, <code>colorQuantile()</code>, and <code>colorFactor()</code>. These palette functions return functions that can be applied to numeric or factor values to generate colors. If you have used one of these color palettes, you may also use <code>addLegend()</code> to add a color legend to the map.</p><h2 id="layer-groups-and-layer-control">Layer Groups and Layer Control</h2><p>Normally a layer function has a <code>group</code> argument, which can be used to assign multiple layer elements to a group. Later you may use <code>addLayersControl()</code> to add a layer control to the map to show/hide groups.</p><p>We hope you will enjoy using this package. Please let us know if you have any comments or questions when the R package documentation or the website <a href="http://rstudio.github.io/leaflet">http://rstudio.github.io/leaflet</a> is not clear enough. 
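The palette-and-legend workflow described above can be sketched like this (again using the built-in <code>quakes</code> data set; the "Blues" palette is an arbitrary choice):

```r
library(leaflet)

# colorNumeric() returns a *function* that maps numbers in `domain` to colors
pal <- colorNumeric(palette = "Blues", domain = quakes$mag)

leaflet(quakes) %>%
  addTiles() %>%
  addCircleMarkers(lng = ~long, lat = ~lat, color = ~pal(mag)) %>%
  addLegend(position = "bottomright", pal = pal, values = ~mag,
            title = "Magnitude")
```

Because `pal` is a function, the same palette drives both the marker colors and the legend, so the two always agree.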
You are welcome to <a href="https://github.com/rstudio/leaflet/issues">file bug reports / feature requests</a> on the GitHub repository or ask questions in the <a href="https://groups.google.com/forum/#!forum/shiny-discuss">shiny-discuss</a> mailing list.</p></description></item><item><title>RStudio adds custom domains, bigger data and package support to shinyapps.io</title><link>https://www.rstudio.com/blog/rstudio-adds-custom-domains-bigger-data-and-package-support-to-shinyapps-io/</link><pubDate>Wed, 24 Jun 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-adds-custom-domains-bigger-data-and-package-support-to-shinyapps-io/</guid><description><p>It&rsquo;s time to share &ldquo;What&rsquo;s New&rdquo; in shinyapps.io!</p><ul><li><p><strong>Custom Domains</strong> – host your Shiny applications on your own domain (Professional Plan only). <a href="https://shiny.rstudio.com/articles/custom-domains.html">Learn more</a>.</p></li><li><p><strong>Bigger applications</strong> – include up to 1GB of data with your application bundles!</p></li><li><p><strong>Bigger packages</strong> – until now <a href="https://www.rstudio.com/products/shinyapps/">shinyapps.io</a> could only support installation of packages under 100MB; now it&rsquo;s 1GB! (attention users of Bioconductor packages especially)</p></li><li><p><strong>Better locale detection</strong> – the newest shinyapps package now detects and maps your locale appropriately if it varies from the locale of <a href="https://www.rstudio.com/products/shinyapps">shinyapps.io</a> (you will need to update your shinyapps package)</p></li><li><p><strong>Application deletion</strong> – you can now delete applications permanently. First archive your application, then delete it. 
Note: Be careful; all deletes are permanent.</p></li><li><p><strong>Transfer / Rename accounts</strong> – select a different name for your account or transfer control to another shinyapps.io account holder.</p></li><li><p><strong>&ldquo;What&rsquo;s New&rdquo; is New</strong> – your dashboard displays the latest enhancements to shinyapps.io under&hellip;you guessed it&hellip; &ldquo;What&rsquo;s New&rdquo;!</p></li></ul><hr></description></item><item><title>New Shiny cheat sheet and video tutorial</title><link>https://www.rstudio.com/blog/new-shiny-cheat-sheet-and-video-tutorial/</link><pubDate>Mon, 22 Jun 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/new-shiny-cheat-sheet-and-video-tutorial/</guid><description><p>We&rsquo;ve added two new tools that make it even easier to learn Shiny.</p><h2 id="video-tutorial">Video tutorial</h2><p><img src="https://rstudioblog.files.wordpress.com/2015/05/01-how-to-start-002.png" alt="01-How-to-start"></p><p>The <a href="https://shiny.rstudio.com/tutorial">How to Start with Shiny training video</a> provides a new way to teach yourself Shiny. The video covers everything you need to know to build your own Shiny apps. 
You&rsquo;ll learn:</p><ul><li><p>The architecture of a Shiny app</p></li><li><p>A template for making apps quickly</p></li><li><p>The basics of building Shiny apps</p></li><li><p>How to add sliders, drop down menus, buttons, and more to your apps</p></li><li><p>How to share Shiny apps</p></li><li><p>How to control reactions in your apps to</p><ul><li><p>update displays</p></li><li><p>trigger code</p></li><li><p>reduce computation</p></li><li><p>delay reactions</p></li></ul></li><li><p>How to add design elements to your apps</p></li><li><p>How to customize the layout of an app</p></li><li><p>How to style your apps with CSS</p></li></ul><p>Altogether, the video contains two hours and 25 minutes of material organized around a navigable table of contents.</p><p>Best of all, the video tutorial is completely free. The video is the result of our recent How to Start Shiny <a href="https://www.rstudio.com/resources/webinars/">webinar series</a>. Thank you to everyone who attended and made the series a success!</p><p>Watch the new video tutorial <a href="https://shiny.rstudio.com/tutorial">here</a>.</p><h2 id="new-cheat-sheet">New cheat sheet</h2><p>The <a href="https://www.rstudio.com/resources/cheatsheets/">new Shiny cheat sheet</a> provides an up-to-date reference to the most important Shiny functions.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/05/shiny-cheatsheet.png" alt="shiny-cheatsheet"></p><p>The cheat sheet replaces the previous cheat sheet, adding new sections on single-file apps, reactivity, CSS and more. 
The new sheet also gave us a chance to apply some of the things we&rsquo;ve learned about making cheat sheets since the original Shiny cheat sheet came out.</p><p>Get the new Shiny cheat sheet <a href="https://www.rstudio.com/resources/cheatsheets/">here</a>.</p></description></item><item><title>Shiny 0.12: Interactive Plots with ggplot2</title><link>https://www.rstudio.com/blog/shiny-0-12-interactive-plots-with-ggplot2/</link><pubDate>Tue, 16 Jun 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-0-12-interactive-plots-with-ggplot2/</guid><description><p>Shiny 0.12 has been released to CRAN!</p><p>Compared to version 0.11.1, the major changes are:</p><ul><li><p>Interactive plots with base graphics and ggplot2</p></li><li><p>Switch from RJSONIO to jsonlite</p></li></ul><p>For a full list of changes and bugfixes in this version, see the <a href="http://cran.r-project.org/web/packages/shiny/NEWS">NEWS</a> file.</p><p>To install the new version of Shiny, run:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">shiny&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">htmlwidgets&#34;</span>))</code></pre></div><p>htmlwidgets is not required, but shiny 0.12 will not work with older versions of htmlwidgets, so it&rsquo;s a good idea to install a fresh copy along with Shiny.</p><h2 id="interactive-plots-with-base-graphics-and-ggplot2">Interactive plots with base graphics and ggplot2</h2><p><img src="https://rstudioblog.files.wordpress.com/2015/06/exclude-points.gif" alt="Excluding points"></p><p>The major new feature in this version of Shiny is the ability to create interactive plots using R&rsquo;s base graphics or ggplot2. 
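As a minimal sketch of what an interactive plot looks like in practice (the output and input names here are illustrative, and <code>mtcars</code> ships with R):

```r
library(shiny)

ui <- fluidPage(
  # The `click` argument names the input slot that receives mouse events
  plotOutput("plot1", click = "plot_click"),
  verbatimTextOutput("info")
)

server <- function(input, output) {
  output$plot1 <- renderPlot({
    plot(mtcars$wt, mtcars$mpg)
  })
  output$info <- renderText({
    # input$plot_click is NULL until the user clicks the plot
    paste0("x=", input$plot_click$x, "\ny=", input$plot_click$y)
  })
}

shinyApp(ui, server)
```

Clicking the plot populates <code>input$plot_click</code> with the coordinates in data space, which the text output then displays.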
Adding interactivity is easy: it just requires using one option in <code>plotOutput()</code>, and then the information about mouse events will be available via the <code>input</code> object.</p><p>You can use mouse events to read mouse coordinates, select or deselect points, and implement zooming. Here are some example applications:</p><ul><li><p><a href="https://shiny.rstudio.com/gallery/plot-interaction-basic.html">Basic interactions</a></p></li><li><p><a href="https://shiny.rstudio.com/gallery/plot-interaction-zoom.html">Zooming</a></p></li><li><p><a href="https://shiny.rstudio.com/gallery/plot-interaction-advanced.html">Advanced interactions</a>: This demonstrates many advanced features of interactive plots.</p></li><li><p><a href="https://shiny.rstudio.com/gallery/plot-interaction-exclude.html">Excluding points</a> (as depicted in the screen capture above)</p></li></ul><p>For more information, see the Interactive Plots <a href="https://shiny.rstudio.com/articles/#interactive-plots">articles</a> in the Shiny Dev Center, and the demo apps in the <a href="https://shiny.rstudio.com/gallery/#interactive-plots">gallery</a>.</p><h2 id="switch-from-rjsonio-to-jsonlite">Switch from RJSONIO to jsonlite</h2><p>Shiny uses the JSON format to send data between the server (running R) and the client web browser (running JavaScript).</p><p>In previous versions of Shiny, the data was serialized to/from JSON using the <a href="http://cran.r-project.org/web/packages/RJSONIO/">RJSONIO</a> package. However, as of 0.12.0, Shiny switched from RJSONIO to <a href="http://cran.r-project.org/web/packages/jsonlite/">jsonlite</a>. 
The reasons for this are that jsonlite has better-defined conversion behavior, and it has better performance because much of it is now implemented in C.</p><p>For the vast majority of users, this will have no impact on existing Shiny apps.</p><p>The <a href="http://www.htmlwidgets.org/">htmlwidgets</a> package has also switched to jsonlite, and any Shiny apps that use htmlwidgets also require an upgrade to that package.</p><h2 id="a-note-about-data-tables">A note about Data Tables</h2><p>The version we just released to CRAN is actually 0.12.1; the previous version, 0.12.0, was released three weeks ago and deprecated Shiny&rsquo;s <a href="https://shiny.rstudio.com/articles/datatables.html"><code>dataTableOutput</code> and <code>renderDataTable</code></a> functions and instructed you to migrate to the nascent <a href="http://rstudio.github.io/DT/">DT</a> package instead. (We&rsquo;ll talk more about DT in a future blog post.)</p><p>User feedback has indicated this transition was too sudden and abrupt, so we&rsquo;ve undeprecated these functions in 0.12.1. 
We&rsquo;ll continue to support these functions until DT has had more time to mature.</p></description></item><item><title>Hadley Wickham's Master R Developer Workshop - Washington DC registration is open</title><link>https://www.rstudio.com/blog/hadley-wickhams-master-r-developer-workshop-washington-dc-registration-is-open/</link><pubDate>Fri, 12 Jun 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/hadley-wickhams-master-r-developer-workshop-washington-dc-registration-is-open/</guid><description><p>&ldquo;Master&rdquo; R in Washington DC this September!</p><p>Join RStudio Chief Data Scientist Hadley Wickham at the AMA – Executive Conference Center in Arlington, VA on September 14 and 15, 2015 for this rare opportunity to learn from one of the R community&rsquo;s most popular and innovative authors and package developers.</p><p>It will be at least another year before Hadley returns to teach his class on the East Coast, so don&rsquo;t miss this opportunity to learn from him in person. The venue is conveniently located next to Ronald Reagan Washington National Airport and a short distance from the Metro. Attendance is limited. Past events have sold out.</p><p><a href="http://www.eventbrite.com/e/master-r-developer-workshop-washington-dc-tickets-15220403637?aff=erelexporg">Register today!</a></p></description></item><item><title>testthat 0.10.0</title><link>https://www.rstudio.com/blog/testthat-0-10-0/</link><pubDate>Fri, 29 May 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/testthat-0-10-0/</guid><description><p>testthat 0.10.0 is now available on CRAN. Testthat makes it easy to turn the informal testing that you&rsquo;re already doing into formal automated tests. Learn more at <a href="http://r-pkgs.had.co.nz/tests.html">http://r-pkgs.had.co.nz/tests.html</a>. 
Install the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">testthat&#34;</span>)</code></pre></div><p>There are four big changes in this release:</p><ul><li><p><code>test_check()</code> uses a new reporter specifically designed for R CMD check. It displays a summary at the end of the tests, designed to be &lt;13 lines long so that test failures shown in R CMD check are as useful as possible.</p></li><li><p>New <code>skip_if_not_installed()</code> skips tests if a package isn&rsquo;t installed: this is useful when your tests depend on a suggested package.</p></li><li><p>The <code>expect_that(a, equals(b))</code> style of testing has been soft-deprecated in favour of <code>expect_equal(a, b)</code>. It will keep working, but it&rsquo;s no longer demonstrated in the documentation, and new expectations will only be available in <code>expect_equal(a, b)</code> style.</p></li><li><p><code>compare()</code> is now documented and exported: compare is used to display test failures for <code>expect_equal()</code>, and is designed to help you spot exactly where the failure occurred. It currently has methods for character and numeric vectors.</p></li></ul><p>There were a number of other minor improvements and bug fixes. 
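The new idioms read like this in practice (a self-contained sketch; "somepkg" is a placeholder for a suggested package):

```r
library(testthat)

test_that("means are computed correctly", {
  # Skip cleanly when an optional dependency is missing
  skip_if_not_installed("somepkg")

  # Preferred over the soft-deprecated expect_that(x, equals(y)) style
  expect_equal(mean(c(1, 2, 3)), 2)
})
```

When "somepkg" is not installed the test is reported as skipped rather than failed, which keeps R CMD check output clean on machines without optional packages.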
See the <a href="https://github.com/hadley/testthat/releases/tag/v0.10.0">release notes</a> for a complete list.</p></description></item><item><title>SparkR preview by Vincent Warmerdam</title><link>https://www.rstudio.com/blog/sparkr-preview-by-vincent-warmerdam/</link><pubDate>Thu, 28 May 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/sparkr-preview-by-vincent-warmerdam/</guid><description><p>This is a guest post by Vincent Warmerdam of <a href="http://koaning.io">koaning.io</a>.</p><h1 id="sparkr-preview-in-rstudio">SparkR preview in RStudio</h1><p>Apache Spark is the hip new technology on the block. It allows you to write scripts in a functional style and the technology behind it will allow you to run iterative tasks very quickly on a cluster of machines. It&rsquo;s benchmarked to be quicker than Hadoop for most machine learning use cases (by a factor of 10-100) and soon Spark will also have support for the R language. As of April 2015, SparkR has been merged into Apache Spark and will ship with the upcoming 1.4 release due early summer 2015. In the meantime, you can use this tutorial to go ahead and get familiar with the current version of SparkR.</p><p><strong>Disclaimer</strong>: although you will be able to use this tutorial to write Spark jobs right now with R, the new API due this summer will most likely have breaking changes.</p><h2 id="running-spark-locally">Running Spark Locally</h2><p>The main feature of Spark is the resilient distributed dataset (RDD), which is a dataset that can be queried in memory, in parallel on a cluster of machines. You don&rsquo;t need a cluster of machines to get started with Spark though. Even on a single machine, Spark is able to efficiently use any configured resources. To keep it simple we will ignore this configuration for now and do a quick one-click install. 
You can use devtools to download and install Spark with SparkR.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(devtools)
<span style="color:#06287e">install_github</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">amplab-extras/SparkR-pkg&#34;</span>, subdir<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">pkg&#34;</span>)</code></pre></div><p>This might take a while. But after the installation, the following R code will run Spark jobs for you:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(magrittr)
<span style="color:#06287e">library</span>(SparkR)
sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">sparkR.init</span>(master<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">local&#34;</span>)
sc <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">parallelize</span>(<span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">100000</span>) <span style="color:#666">%&gt;%</span>
  count</code></pre></div><p>This small program generates a list, gives it to Spark (which turns it into an RDD, Spark&rsquo;s Resilient Distributed Dataset structure) and then counts the number of items in it. SparkR exposes the RDD API of Spark as distributed lists in R, which plays very nicely with <strong>magrittr</strong>. 
As long as you follow the API, you don&rsquo;t need to worry much about parallelizing your programs for performance.</p><h3 id="a-more-elaborate-example">A more elaborate example</h3><p>Spark also allows for grouped operations, which might remind you a bit of dplyr.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">nums <span style="color:#666">=</span> <span style="color:#06287e">runif</span>(<span style="color:#40a070">100000</span>) <span style="color:#666">*</span> <span style="color:#40a070">10</span>
sc <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">parallelize</span>(nums) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">map</span>(<span style="color:#06287e">function</span>(x) <span style="color:#06287e">round</span>(x)) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">filterRDD</span>(<span style="color:#06287e">function</span>(x) x <span style="color:#666">%%</span> <span style="color:#40a070">2</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">map</span>(<span style="color:#06287e">function</span>(x) <span style="color:#06287e">list</span>(x, <span style="color:#40a070">1</span>)) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">reduceByKey</span>(<span style="color:#06287e">function</span>(x,y) x <span style="color:#666">+</span> y, <span style="color:#40a070">1L</span>) <span style="color:#666">%&gt;%</span>
  collect</code></pre></div><p>The Spark API will look very &lsquo;functional&rsquo; to programmers used to functional programming languages (which should come as no surprise if you know that Spark is written in Scala). 
This script will do the following:</p><ol><li><p>it will create an RDD Spark object from the original data</p></li><li><p>it will map each number to a rounded number</p></li><li><p>it will filter all even numbers out of the RDD</p></li><li><p>next it will create key/value pairs that can be counted</p></li><li><p>it then reduces the key/value pairs (the 1L is the number of partitions for the resulting RDD)</p></li><li><p>and it collects the results</p></li></ol><p>Spark will have started running services locally on your computer, which can be viewed at <code>http://localhost:4040/stages/</code>. You should be able to see all the jobs you&rsquo;ve run here. You will also see which jobs have failed with the error log.</p><h3 id="bootstrapping-with-spark">Bootstrapping with Spark</h3><p>These examples are nice, but you can also use the power of Spark for more common data science tasks. Let&rsquo;s sample a dataset to generate a large RDD, which we will then summarise via bootstrapping. Instead of parallelizing numbers, I will now parallelize dataframe samples.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">sc <span style="color:#666">&lt;-</span> <span style="color:#06287e">sparkR.init</span>(master<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">local&#34;</span>)
sample_cw <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>(n, s){
  <span style="color:#06287e">set.seed</span>(s)
  ChickWeight<span style="color:#06287e">[sample</span>(<span style="color:#06287e">nrow</span>(ChickWeight), n), ]
}
data_rdd <span style="color:#666">&lt;-</span> sc <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">parallelize</span>(<span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">200</span>, <span style="color:#40a070">20</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">map</span>(<span style="color:#06287e">function</span>(s) <span style="color:#06287e">sample_cw</span>(<span style="color:#40a070">250</span>, s))</code></pre></div><p>For the <code>parallelize</code> function we can set the number of partitions Spark can use for the resulting RDD. My <code>s</code> argument ensures that each partition will use a different random seed when sampling. This <code>data_rdd</code> is useful because it can be reused for multiple purposes.</p><p>You can use it to estimate the mean of the weight.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">data_rdd <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">map</span>(<span style="color:#06287e">function</span>(x) <span style="color:#06287e">mean</span>(x<span style="color:#666">$</span>weight)) <span style="color:#666">%&gt;%</span>
  collect <span style="color:#666">%&gt;%</span>
  as.numeric <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">hist</span>(<span style="color:#40a070">20</span>, main<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mean weight, bootstrap samples&#34;</span>)</code></pre></div><p>Or you can use it to perform bootstrapped regressions.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">train_lm <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>(data_in){
  <span style="color:#06287e">lm</span>(data<span style="color:#666">=</span>data_in, weight <span style="color:#666">~</span> Time)
}
coef_rdd <span style="color:#666">&lt;-</span> data_rdd <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">map</span>(train_lm) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">map</span>(<span style="color:#06287e">function</span>(x) x<span style="color:#666">$</span>coefficients)
get_coef <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>(k) {
  coef_rdd <span style="color:#666">%&gt;%</span>
    <span style="color:#06287e">map</span>(<span style="color:#06287e">function</span>(x) x[k]) <span style="color:#666">%&gt;%</span>
    collect <span style="color:#666">%&gt;%</span>
    as.numeric
}
df <span style="color:#666">&lt;-</span> <span style="color:#06287e">data.frame</span>(intercept <span style="color:#666">=</span> <span style="color:#06287e">get_coef</span>(<span style="color:#40a070">1</span>), time_coef <span style="color:#666">=</span> <span style="color:#06287e">get_coef</span>(<span style="color:#40a070">2</span>))
df<span style="color:#666">$</span>intercept <span style="color:#666">%&gt;%</span> <span style="color:#06287e">hist</span>(breaks <span style="color:#666">=</span> <span style="color:#40a070">30</span>, main<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">beta coef for intercept&#34;</span>)
df<span style="color:#666">$</span>time_coef <span style="color:#666">%&gt;%</span> <span style="color:#06287e">hist</span>(breaks <span style="color:#666">=</span> <span style="color:#40a070">30</span>, main<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">beta coef for time&#34;</span>)</code></pre></div><p>The slow part of this tree of operations is the creation of the data, because this has to occur locally through R. A more common use case for Spark would be to load a large dataset from S3 into a large EC2 cluster of Spark machines.</p><h3 id="more-power">More power?</h3><p>Running Spark locally is nice and will already allow for parallelism, but the real profit can be gained by running Spark on a cluster of computers. 
The nice thing is that Spark comes with a script that automates the provisioning of a Spark cluster on Amazon AWS.</p><p>To get a cluster started, start up an EC2 cluster with the supplied ec2 folder from <a href="https://github.com/apache/spark/">Apache&rsquo;s Spark GitHub repo</a>. A more elaborate tutorial can be found <a href="https://spark.apache.org/docs/latest/ec2-scripts.html">here</a>, but if you already are an Amazon user, provisioning a cluster on Amazon is as simple as calling a one-liner:</p><pre><code>./spark-ec2 \
--key-pair=spark-df \
--identity-file=/path/spark-df.pem \
--region=eu-west-1 \
-s 3 \
--instance-type c3.2xlarge \
launch my-spark-cluster</code></pre><p>If you ssh into the master node that has just been set up, you can run the following code:</p><pre><code>cd /root
git clone https://github.com/amplab-extras/SparkR-pkg.git
cd SparkR-pkg
SPARK_VERSION=1.2.1 ./install-dev.sh
cp -a /root/SparkR-pkg/lib/SparkR /usr/share/R/library/
/root/spark-ec2/copy-dir /root/SparkR-pkg
/root/spark/sbin/slaves.sh cp -a /root/SparkR-pkg/lib/SparkR /usr/share/R/library/</code></pre><h3 id="launch-sparkr-on-a-cluster">Launch SparkR on a Cluster</h3><p>Finally, to launch SparkR and connect to the Spark EC2 cluster, we run the following code on the master machine:</p><pre><code>MASTER=spark://:7077 ./sparkR</code></pre><p>The hostname can be retrieved using:</p><pre><code>cat /root/spark-ec2/cluster-url</code></pre><p>You can check on the status of your cluster via Spark&rsquo;s Web UI at <code>http://:8080</code>.</p><h2 id="the-future">The future</h2><p>Everything described in this document is subject to change with the next Spark release, but it should help you get familiar with how Spark works. There will be R support for Spark, less so for low-level RDD operations but more so for its distributed machine learning algorithms as well as DataFrame objects.</p><p>The support for R in the Spark universe might be a game changer. 
R has always been great for exploratory and interactive analysis on small to medium datasets. With the addition of Spark, R can become a more viable tool for big datasets.</p><p>June is the current planned release date for Spark 1.4, which will allow R users to run data frame operations in parallel on the distributed memory of a cluster of computers. All of which is completely open source.</p><p>It will be interesting to see what possibilities this brings for the R community.</p></description></item><item><title>New Version of RStudio (v0.99) Available Now</title><link>https://www.rstudio.com/blog/new-version-of-rstudio-v0-99/</link><pubDate>Tue, 26 May 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/new-version-of-rstudio-v0-99/</guid><description><p>We&rsquo;re pleased to announce that the final version of RStudio v0.99 is <a href="https://www.rstudio.com/ide/download">available for download</a> now. Highlights of the release include:</p><ul><li><p>A new <a href="https://support.rstudio.com/hc/en-us/articles/205175388-Using-the-Data-Viewer">data viewer</a> with support for large datasets, filtering, searching, and sorting.</p></li><li><p>Complete overhaul of R <a href="https://support.rstudio.com/hc/en-us/articles/205273297-Code-Completion">code completion</a> with many new features and capabilities.</p></li><li><p>The source editor now provides <a href="https://support.rstudio.com/hc/en-us/articles/205753617-Code-Diagnostics">code diagnostics</a> (errors, warnings, etc.) 
as you work.</p></li><li><p>User-customizable <a href="https://support.rstudio.com/hc/en-us/articles/204463668-Code-Snippets">code snippets</a> for automating common editing tasks.</p></li><li><p><a href="https://blog.rstudio.com/2015/04/14/rstudio-v0-99-preview-tools-for-rcpp/">Tools for Rcpp</a>: completion, diagnostics, code navigation, find usages, and automatic indentation.</p></li><li><p>Many additional <a href="https://blog.rstudio.com/2015/05/06/rstudio-v0-99-preview-more-editor-enhancements/">source editor improvements</a> including multiple cursors, tab re-ordering, and several new themes.</p></li><li><p>An <a href="https://blog.rstudio.com/2015/02/23/rstudio-0-99-preview-vim-mode-improvements/">enhanced Vim mode</a> with visual block selection, macros, marks, and a subset of : commands.</p></li></ul><p>There are also lots of smaller improvements and bug fixes across the product. Check out the <a href="https://www.rstudio.com/products/rstudio/release-notes/">v0.99 release notes</a> for details on all of the changes.</p><h3 id="data-viewer">Data Viewer</h3><p>We&rsquo;ve completely overhauled the data viewer with many new capabilities including live update, sorting and filtering, full text searching, and no row limit on viewed datasets.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/05/screen-shot-2015-05-06-at-12-01-14-pm.png" alt="data-viewer"></p><p>See the <a href="https://support.rstudio.com/hc/en-us/articles/205175388-Using-the-Data-Viewer">data viewer documentation</a> for more details.</p><h3 id="code-completion">Code Completion</h3><p>Previously RStudio only completed variables that already existed in the global environment. 
Now completion is based on source code analysis, so it is provided even for objects that haven&rsquo;t been fully evaluated:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/05/screen-shot-2015-05-06-at-11-50-41-am.png" alt="completion-scopes"></p><p>Completions are also provided for a wide variety of specialized contexts including dimension names in [ and [[:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/05/screen-shot-2015-05-06-at-11-54-22-am.png" alt="completion-bracket"></p><h3 id="code-diagnostics">Code Diagnostics</h3><p>We&rsquo;ve added a new inline code diagnostics feature that highlights various issues in your R code as you edit.</p><p>For example, here we&rsquo;re getting a diagnostic that notes that there is an extra parenthesis:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-08-at-12-04-14-pm.png" alt="Screen Shot 2015-04-08 at 12.04.14 PM"></p><p>Here the diagnostic indicates that we&rsquo;ve forgotten a comma within a shiny UI definition:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-28-at-11-29-46-am.png" alt="diagnostics-comma"></p><p>A wide variety of diagnostics are supported, including optional diagnostics for code style issues (e.g. the inclusion of unnecessary whitespace). Diagnostics are also available for several other languages including C/C++, JavaScript, HTML, and CSS. See the <a href="https://support.rstudio.com/hc/en-us/articles/205753617-Code-Diagnostics">code diagnostics documentation</a> for additional details.</p><h3 id="code-snippets">Code Snippets</h3><p>Code snippets are text macros that are used for quickly inserting common snippets of code. 
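Under the hood, each snippet is just a small text template. As a reference point, the stock <code>fun</code> snippet is defined roughly like this (an approximation from memory, not the verbatim stock definition; indentation must be a literal tab, and <code>${n:label}</code> marks the tab-stop placeholders):

```text
snippet fun
	${1:name} <- function(${2:variables}) {
		${0}
	}
```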
For example, the <code>fun</code> snippet inserts an R function definition:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-07-at-10-39-50-am.png" alt="Insert Snippet"></p><p>If you select the snippet from the completion list it will be inserted along with several text placeholders which you can fill in by typing and then pressing <strong>Tab</strong> to advance to the next placeholder:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-07-at-10-44-39-am.png" alt="Screen Shot 2015-04-07 at 10.44.39 AM"></p><p>Other useful snippets include:</p><ul><li><p><code>lib</code>, <code>req</code>, and <code>source</code> for the library, require, and source functions</p></li><li><p><code>df</code> and <code>mat</code> for defining data frames and matrices</p></li><li><p><code>if</code>, <code>el</code>, and <code>ei</code> for conditional expressions</p></li><li><p><code>apply</code>, <code>lapply</code>, <code>sapply</code>, etc. for the apply family of functions</p></li><li><p><code>sc</code>, <code>sm</code>, and <code>sg</code> for defining S4 classes/methods.</p></li></ul><p>See the <a href="https://support.rstudio.com/hc/en-us/articles/204463668-Code-Snippets">code snippets documentation</a> for additional details.</p><h3 id="try-it-out">Try it Out</h3><p>RStudio v0.99 is <a href="https://www.rstudio.com/products/rstudio/download/">available for download</a> now. 
We hope you enjoy the new release and as always please <a href="https://support.rstudio.com">let us know</a> how it&rsquo;s working and what else we can do to make the product better.</p></description></item><item><title>Hadley Wickham Master R Developer Workshop in Chicago – Space Limited</title><link>https://www.rstudio.com/blog/hadley-wickham-master-r-developer-workshop-in-chicago-space-limited/</link><pubDate>Tue, 12 May 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/hadley-wickham-master-r-developer-workshop-in-chicago-space-limited/</guid><description><p><img src="https://rstudioblog.files.wordpress.com/2014/08/hadleywickhamhs.png" alt="HadleyWickhamHS">Join RStudio Chief Data Scientist Hadley Wickham at the University of Illinois at Chicago, on Wednesday May 27th &amp; 28th for this rare opportunity to learn from one of the R community&rsquo;s most popular and innovative authors and package developers.</p><p>As of this post, the workshop is two-thirds sold out. If you&rsquo;re in or near Chicago and want to boost your R programming skills, this is Hadley&rsquo;s only Central US public workshop planned for 2015.</p><p>Register here: <a href="https://rstudio-chicago.eventbrite.com">https://rstudio-chicago.eventbrite.com</a></p></description></item><item><title>devtools 1.8.0</title><link>https://www.rstudio.com/blog/devtools-1-9-0/</link><pubDate>Mon, 11 May 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/devtools-1-9-0/</guid><description><p>Devtools 1.8 is now available on CRAN. Devtools makes it so easy to build a package that it becomes your default way to organise code, data and documentation. 
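As a minimal sketch of that workflow (assuming devtools 1.8 is installed; <code>demopkg</code> is a throwaway name written to a temporary directory, not part of the release notes):

```r
library(devtools)

# Scaffold a minimal package in a temporary directory.
pkg <- file.path(tempdir(), "demopkg")
create(pkg)

# The usual skeleton is now in place: DESCRIPTION, NAMESPACE, R/, ...
dir(pkg)
```

From there the usual cycle is edit, <code>load_all()</code>, <code>document()</code>, <code>test()</code>, <code>check()</code>.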
You can learn more about developing packages at <a href="http://r-pkgs.had.co.nz/">http://r-pkgs.had.co.nz/</a>.</p><p>Get the latest version of devtools with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">devtools&#34;</span>)</code></pre></div><p>There are three main improvements:</p><ul><li><p>More helpers to get you up and running with package development as quickly as possible.</p></li><li><p>Better tools for package installation (including checking that all dependencies are up to date).</p></li><li><p>Improved reverse dependency checking for CRAN packages.</p></li></ul><p>There were many other minor improvements and bug fixes. See the <a href="https://github.com/hadley/devtools/releases/tag/v1.8.0">release notes</a> for a complete list of changes. The last release announcement was for devtools 1.6 since there weren&rsquo;t many big changes in devtools 1.7. I&rsquo;ve included the most important points in this announcement labelled with [1.7].</p><h2 id="helpers">Helpers</h2><p>The number of functions designed to get you up and going with package development continues to grow. This version sees the addition of:</p><ul><li><p><code>dr_devtools()</code>, which runs some common diagnostics: are you using the latest version of R and devtools? Similarly, <code>dr_github()</code> checks for common git/github configuration problems.</p></li><li><p><code>lint()</code> runs <code>lintr::lint_package()</code> to check the style of package code [1.7].</p></li><li><p><code>use_code_of_conduct()</code> adds a contributor code of conduct from <a href="http://contributor-covenant.org">http://contributor-covenant.org</a>.</p></li><li><p><code>use_cran_badge()</code> adds a CRAN status badge that you can copy into a README file. Green indicates the package is on CRAN. 
Packages not yet submitted or accepted to CRAN get a red badge.</p></li><li><p><code>use_cran_comments()</code> creates a <code>cran-comments.md</code> template and adds it to <code>.Rbuildignore</code> to help with CRAN submissions. [1.7]</p></li><li><p><code>use_coveralls()</code> allows you to easily add test coverage with <a href="https://coveralls.io">coveralls</a>.</p></li><li><p><code>use_git()</code> sets up a package to use git, initialising the repo and checking the existing files.</p></li><li><p><code>use_test()</code> adds a new test file in <code>tests/testthat</code>.</p></li><li><p><code>use_readme_rmd()</code> sets up a template to generate a <code>README.md</code> from a <code>README.Rmd</code> with knitr. [1.7]</p></li></ul><h2 id="package-installation-and-info">Package installation and info</h2><p>When developing packages it&rsquo;s common to run into problems because you&rsquo;ve updated a package, but you&rsquo;ve forgotten to update its dependencies (<code>install.packages()</code> doesn&rsquo;t do this automatically). 
The new <code>package_deps()</code> solves this problem by finding all recursive dependencies of a package and determining if they&rsquo;re out of date:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># Find out which dependencies are out of date</span>
devtools<span style="color:#666">::</span><span style="color:#06287e">package_deps</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">devtools&#34;</span>)
<span style="color:#60a0b0;font-style:italic"># Update them</span>
<span style="color:#06287e">update</span>(devtools<span style="color:#666">::</span><span style="color:#06287e">package_deps</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">devtools&#34;</span>))</code></pre></div><p>This code is used in <code>install_deps()</code> and <code>revdep_check()</code>; devtools is now aggressive about updating packages, which should avoid potential problems in CRAN submissions. The new <code>update_packages()</code> uses these tools to install a package (and its dependencies) only if they&rsquo;re not already installed and current.</p><h2 id="reverse-dependency-checking">Reverse dependency checking</h2><p>Devtools 1.7 included considerable improvements to reverse dependency checking. This sort of checking is important if your package gets popular, and is used by other CRAN packages. Before submitting updates to CRAN, you need to make sure that you have not broken the CRAN packages that use your package. Read more about it in the <a href="http://r-pkgs.had.co.nz/release.html#release-deps">R packages book</a>. 
To get started, run <code>use_revdep()</code>, then run the code in <code>revdep/check.R</code>.</p></description></item><item><title>RStudio v0.99 Preview: More Editor Enhancements</title><link>https://www.rstudio.com/blog/rstudio-v0-99-preview-more-editor-enhancements/</link><pubDate>Wed, 06 May 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v0-99-preview-more-editor-enhancements/</guid><description><p>We&rsquo;ve blogged previously about various improvements we&rsquo;ve made to the source editor in RStudio v0.99 including enhanced <a href="https://blog.rstudio.com/2015/02/23/rstudio-v0-99-preview-code-completion/">code completion</a>, <a href="https://blog.rstudio.com/2015/04/13/rstudio-v0-99-preview-code-snippets/">snippets</a>, <a href="https://blog.rstudio.com/2015/04/28/rstudio-v0-99-preview-code-diagnostics/">diagnostics</a>, and an <a href="https://blog.rstudio.com/2015/02/23/rstudio-0-99-preview-vim-mode-improvements/">improved Vim mode</a>. Besides these larger scale features we&rsquo;ve made lots of smaller improvements that we also wanted to highlight. You can try out all of these features now in the RStudio v0.99 <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a>.</p><h3 id="multiple-cursors">Multiple Cursors</h3><p>You can now create and use multiple cursors within RStudio. 
Multiple cursors can be created in a variety of ways:</p><ul><li><p>Press <strong>Ctrl + Alt + {Up/Down}</strong> to create a new cursor in the pressed direction,</p></li><li><p>Press <strong>Ctrl + Alt + Shift + {Direction}</strong> to move a second cursor in the specified direction,</p></li><li><p>Use <strong>Alt</strong> and drag with the mouse to create a rectangular selection,</p></li><li><p>Use <strong>Alt + Shift</strong> and click to create a rectangular selection from the current cursor position to the clicked position.</p></li></ul><p>RStudio also makes use of multiple cursors in its Find / Replace toolbar now. After entering a search term, if you press the All button, all items matching that search term are selected.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/05/screen-shot-2015-05-05-at-2-58-17-pm.png" alt="Screen Shot 2015-05-05 at 2.58.17 PM"></p><p>You can then begin typing to replace each match with a new term—each matched entry will be updated as you type.</p><h3 id="rearrangeable-tabs">Rearrangeable Tabs</h3><p>You can (finally!) move tabs around in the Source pane by clicking and dragging. 
In the below example, the file &lsquo;file_4.R&rsquo; is currently selected and being dragged into place.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/05/rearrange-tabs2.png" alt="rearrange-tabs"></p><h3 id="new-improved-editor-themes">New, Improved Editor Themes</h3><p>A number of new editor themes have been added to RStudio, and older editor themes have been tweaked to ensure that brackets are given a distinct color from text for further legibility.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/05/themes.png" alt="themes"></p><h3 id="select--expand-selection">Select / Expand Selection</h3><p>You can use <strong>Ctrl + Shift + E</strong> to select everything within the nearest pair of opening and closing brackets, or use <strong>Ctrl + Alt + Shift + E</strong> to expand the selection up to the next closing bracket.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/05/screen-shot-2015-05-05-at-2-56-45-pm.png" alt="Screen Shot 2015-05-05 at 2.56.45 PM"></p><h3 id="fuzzy-navigation">Fuzzy Navigation</h3><p>You can use <strong>CTRL + .</strong> to quickly navigate between files and symbols within a project. Previously, this search utility performed prefix matching, and so it was difficult to use with long file / symbol names. 
Now, the <strong>CTRL + .</strong> navigator uses fuzzy matching to narrow the candidate set down based on subsequence matching, which makes it easier to navigate when many files share a common prefix—for example, to <strong>test-</strong> files for a project managing its tests with <a href="http://r-pkgs.had.co.nz/tests.html">testthat</a>.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/05/screen-shot-2015-05-05-at-2-41-11-pm.png" alt="Screen Shot 2015-05-05 at 2.41.11 PM"></p><h3 id="insert-roxygen-skeleton">Insert Roxygen Skeleton</h3><p>RStudio now provides a means for inserting a <a href="http://cran.r-project.org/web/packages/roxygen2/index.html">Roxygen</a> documentation skeleton above functions. The skeleton generator is smart enough to understand plain R functions, as well as S4 generics, methods and classes—it will automatically fill in documentation for available parameters and slots.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/05/roxygen-skeleton1.png" alt="roxygen-skeleton"></p><h3 id="more-languages">More Languages</h3><p>We&rsquo;ve also added syntax highlighting modes for many new languages including Clojure, CoffeeScript, C#, Graphviz, Go, Groovy, Haskell, Java, Julia, Lisp, Lua, Matlab, Perl, Ruby, Rust, Scala, and Stan. There&rsquo;s also some basic keyword and text based code completion for several languages including JavaScript, HTML, CSS, Python, and SQL.</p><h3 id="try-it-out">Try it Out</h3><p>You can try out all of the new editor features by downloading the latest <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a> of RStudio. 
As always, <a href="https://support.rstudio.com">let us know</a> how the new features are working as well as what else you&rsquo;d like to see us do.</p><h2 id="heading"></h2></description></item><item><title>stringr 1.0.0</title><link>https://www.rstudio.com/blog/stringr-1-0-0/</link><pubDate>Tue, 05 May 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/stringr-1-0-0/</guid><description><p>I&rsquo;m very excited to announce the 1.0.0 release of the stringr package. If you haven&rsquo;t heard of stringr before, it makes string manipulation easier by:</p><ul><li><p>Using consistent function and argument names: all functions start with <code>str_</code>, and the first argument is always the input string This makes stringr easier to learn and easy to use with <a href="http://github.com/smbache/magrittr/">the pipe</a>.</p></li><li><p>Eliminating options that you don&rsquo;t need 95% of the time.</p></li></ul><p>To get started with stringr, check out the <a href="http://cran.r-project.org/web/packages/stringr/vignettes/stringr.html">new vignette</a>.</p><h2 id="whats-new">What&rsquo;s new?</h2><p>The biggest change in this release is that stringr is now powered by the <a href="https://github.com/Rexamine/stringi">stringi</a> package instead of base R. This has two big benefits: stringr is now much faster, and has much better unicode support.</p><p>If you&rsquo;ve used stringi before, you might wonder why stringr is still necessary: stringi does everything that stringr does, and much much more. 
There are two reasons that I think stringr is still important:</p><ol><li><p>Lots of people use it already, so this update will give many people a performance boost for free.</p></li><li><p>The smaller API of stringr makes it a little easier to learn.</p></li></ol><p>That said, once you&rsquo;ve learned stringr, using stringi should be easy, so it&rsquo;s a great place to start if you need a tool that doesn&rsquo;t exist in stringr.</p><h2 id="new-features-and-functions">New features and functions</h2><ul><li><code>str_replace_all()</code> gains a convenient syntax for applying multiple pairs of pattern and replacement to the same vector:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">abc&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">def&#34;</span>)<span style="color:#06287e">str_replace_all</span>(x, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">[ad]&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">!&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">[cf]&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">?&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;!b?&#34; &#34;!e?&#34;</span></code></pre></div><ul><li><code>str_subset()</code> keeps values that match a pattern:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">x <span style="color:#666">&lt;-</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">abc&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">def&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">jhi&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">klm&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">nop&#34;</span>)<span style="color:#06287e">str_subset</span>(x, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">[aeiou]&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;abc&#34; &#34;def&#34; &#34;jhi&#34; &#34;nop&#34;</span></code></pre></div><ul><li><code>str_order()</code> and <code>str_sort()</code> sort and order strings in a specified locale. <code>str_conv()</code> to converts strings from specified encoding to UTF-8.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># The vowels come before the consonants in Hawaiian</span><span style="color:#06287e">str_sort</span>(<span style="color:#007020;font-weight:bold">letters</span>[1<span style="color:#666">:</span><span style="color:#40a070">10</span>], locale <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">haw&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;a&#34; &#34;e&#34; &#34;i&#34; &#34;b&#34; &#34;c&#34; &#34;d&#34; &#34;f&#34; &#34;g&#34; &#34;h&#34; &#34;j&#34;</span></code></pre></div><ul><li>New modifier <code>boundary()</code> allows you to count, locate and split by character, word, line and sentence boundaries.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">words <span style="color:#666">&lt;-</span> <span style="color:#06287e">c</span>(<span 
style="color:#4070a0">&#34;</span><span style="color:#4070a0">These are some words. Some more words.&#34;</span>)<span style="color:#06287e">str_count</span>(words, <span style="color:#06287e">boundary</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">word&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] 7</span><span style="color:#06287e">str_split</span>(words, <span style="color:#06287e">boundary</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">word&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [[1]]</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;These&#34; &#34;are&#34; &#34;some&#34; &#34;words&#34; &#34;Some&#34; &#34;more&#34; &#34;words&#34;</span></code></pre></div><p>There were two minor changes to make stringr a little more consistent:</p><ul><li><p><code>str_c()</code> now returns a zero length vector if any of its inputs are zero length vectors. This is consistent with all other functions, and standard R recycling rules. Similarly, using <code>str_c(&quot;x&quot;, NA)</code> now yields <code>NA</code>. If you want <code>&quot;xNA&quot;</code>, use <code>str_replace_na()</code> on the inputs.</p></li><li><p><code>str_match()</code> now returns NA if an optional group doesn&rsquo;t match (previously it returned &ldquo;&quot;). This is more consistent with <code>str_extract()</code> and other match failures.</p></li></ul><h2 id="development">Development</h2><p>Stringr is over five years old and is quite stable (the last release was over two years ago). Although I&rsquo;ve endeavoured to make the change to stringi as seemless as possible, it&rsquo;s likely that it has created some new bugs. 
If you have problems, please try the <a href="https://github.com/hadley/stringr">development version</a>, and if that doesn&rsquo;t help, <a href="https://github.com/hadley/stringr/issues">file an issue on github</a>.</p></description></item><item><title>RStudio v0.99 Preview: Graphviz and DiagrammeR</title><link>https://www.rstudio.com/blog/rstudio-v0-99-preview-graphviz-and-diagrammer/</link><pubDate>Fri, 01 May 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v0-99-preview-graphviz-and-diagrammer/</guid><description><p>Soon after the announcement of <a href="http://www.htmlwidgets.org/">htmlwidgets</a>, Rich Iannone released the <a href="http://rich-iannone.github.io/DiagrammeR/">DiagrammeR</a> package, which makes it easy to generate graph and flowchart diagrams using text in a Markdown-like syntax. The package is very flexible and powerful, and includes:</p><ol><li><p>Rendering of <a href="http://en.wikipedia.org/wiki/Graphviz">Graphviz</a> graph visualizations (via <a href="https://github.com/mdaines/viz.js/">viz.js</a>)</p></li><li><p>Creating diagrams and flowcharts using <a href="http://knsv.github.io/mermaid/">mermaid.js</a></p></li><li><p>Facilities for mapping R objects into graphs, diagrams, and flowcharts.</p></li></ol><p>We&rsquo;re very excited about the prospect of creating sophisticated diagrams using an easy to author plain-text syntax, and built some special authoring support for DiagrammeR into RStudio v0.99 (which you can download a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a> of now).</p><h3 id="graphviz-meets-r">Graphviz Meets R</h3><p>If you aren&rsquo;t familiar with Graphviz, it&rsquo;s a tool for rendering <a href="http://en.wikipedia.org/wiki/DOT_(graph_description_language)">DOT</a> (a plain text graph description language). DOT draws directed graphs as hierarchies. 
Its features include well-tuned layout algorithms for placing nodes and edge splines, edge labels, &ldquo;record&rdquo; shapes with &ldquo;ports&rdquo; for drawing data structures, and cluster layouts (see <a href="http://www.graphviz.org/pdf/dotguide.pdf">http://www.graphviz.org/pdf/dotguide.pdf</a> for an introductory guide).</p><p>DiagrammeR can render any DOT script. For example, with the following source file (&ldquo;boxes.dot&rdquo;):</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-30-at-12-35-17-pm.png" alt="Screen Shot 2015-04-30 at 12.35.17 PM"></p><p>You can render the diagram with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(DiagrammeR)<span style="color:#06287e">grViz</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">boxes.dot&#34;</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-30-at-12-31-11-pm.png" alt="grviz-viewer"></p><p>Since the diagram is an <a href="http://www.htmlwidgets.org">htmlwidget</a> it can be used at the R console, within R Markdown documents, and within Shiny applications. Within RStudio you can preview a Graphviz or mermaid source file the same way you source an R script via the <strong>Preview</strong> button or the <strong>Ctrl+Shift+Enter</strong> keyboard shortcut.</p><p>This simple example only scratches the surface of what&rsquo;s possible, see the <a href="http://rich-iannone.github.io/DiagrammeR/graphviz.html">DiagrammeR Graphviz documentation</a> for more details and examples.</p><h3 id="diagrams-with-mermaidjs">Diagrams with mermaid.js</h3><p>Support for <a href="http://rich-iannone.github.io/DiagrammeR/mermaid.html">mermaid.js</a> in DiagrammeR enables you to create several other diagram types not supported by Graphviz. 
For example, here&rsquo;s the code required to create a sequence diagram:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-30-at-1-31-11-pm.png" alt="sequence"></p><p>You can render the diagram with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(DiagrammeR)
<span style="color:#06287e">mermaid</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">sequence.mmd&#34;</span>)</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-30-at-2-08-27-pm1.png" alt="sequence-viewer"></p><p>See the <a href="http://rich-iannone.github.io/DiagrammeR/mermaid.html">DiagrammeR mermaid.js documentation</a> for additional details.</p><h3 id="generating-diagrams-from-r-code">Generating Diagrams from R Code</h3><p>Both of the examples above illustrate creating diagrams by directly editing DOT and mermaid scripts. The latest version of DiagrammeR (v0.6, just released to CRAN) also includes facilities for generating diagrams from R code. This can be done in a couple of ways:</p><ol><li><p>Using text substitution, whereby you create placeholders within the diagram script and substitute their values from R objects. 
See the documentation on <a href="https://github.com/rich-iannone/DiagrammeR#graphviz-substitution">Graphviz Substitution</a> for more details.</p></li><li><p>Using the <a href="https://github.com/rich-iannone/DiagrammeR#using-data-frames-to-define-graphviz-graphs">graphviz_graph</a> function, you can specify nodes and edges directly using a data frame.</p></li></ol><p>Future versions of DiagrammeR are expected to include additional features to support direct generation of diagrams from R.</p><h3 id="publishing-with-diagrammer">Publishing with DiagrammeR</h3><p>Diagrams created with DiagrammeR act a lot like R plots; however, there&rsquo;s an important difference: they are rendered as HTML content rather than using an R graphics device. This has the following implications for how they can be published and re-used:</p><ol><li><p>Within RStudio you can save diagrams as an image (PNG, BMP, etc.) or copy them to the clipboard for re-use in other applications.</p></li><li><p>For a more reproducible workflow, diagrams can be embedded within R Markdown documents just like plots (all of the required HTML and JS is automatically included). Note that because the diagrams depend on HTML and JavaScript for rendering they can only be used in HTML based output formats (they don&rsquo;t work in PDFs or MS Word documents).</p></li><li><p>From within RStudio you can also publish diagrams to <a href="http://www.rpubs.com">RPubs</a> or save them as standalone web pages.</p></li></ol><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-30-at-2-18-12-pm.png" alt="diagrammer-publish"></p><p>See the <a href="http://rich-iannone.github.io/DiagrammeR/io.html">DiagrammeR documentation on I/O</a> for additional details.</p><h3 id="try-it-out">Try it Out</h3><p>To get started with DiagrammeR, check out the excellent collection of demos and documentation on the <a href="http://rich-iannone.github.io/DiagrammeR/">project website</a>. 
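A quick way to confirm everything is wired up (assuming DiagrammeR is installed): <code>grViz()</code> also accepts DOT source inline rather than a file path, and returns an htmlwidget.

```r
library(DiagrammeR)

# grViz() accepts DOT source inline as well as a file path.
g <- grViz("
  digraph quick_check {
    node [shape = box]
    A -> B -> C
  }
")

# The result is an htmlwidget, so it renders in the RStudio Viewer,
# in R Markdown documents, and in Shiny apps.
class(g)
```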
To take advantage of the new RStudio features that support DiagrammeR, you should download the latest <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio v0.99 Preview Release</a>.</p></description></item><item><title>RStudio v0.99 Preview: Code Diagnostics</title><link>https://www.rstudio.com/blog/rstudio-v0-99-preview-code-diagnostics/</link><pubDate>Tue, 28 Apr 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v0-99-preview-code-diagnostics/</guid><description><p>In RStudio v0.99 we&rsquo;ve made a major investment in R source code analysis. This work resulted in significant improvements in <a href="https://blog.rstudio.com/2015/02/23/rstudio-v0-99-preview-code-completion/">code completion</a>, and in the latest <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a> enables a new inline code diagnostics feature that highlights various issues in your R code as you edit.</p><p>For example, here we&rsquo;re getting a diagnostic that notes that there is an extra parenthesis:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-08-at-12-04-14-pm.png" alt="Screen Shot 2015-04-08 at 12.04.14 PM"></p><p>Here the diagnostic indicates that we&rsquo;ve forgotten a comma within a shiny UI definition:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-28-at-11-29-46-am.png" alt="diagnostics-comma"></p><p>This diagnostic flags an unknown parameter to a function call:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-08-at-11-50-07-am.png" alt="Screen Shot 2015-04-08 at 11.50.07 AM"></p><p>This diagnostic indicates that we&rsquo;ve referenced a variable that doesn&rsquo;t exist and suggests a fix based on another variable in scope:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-08-at-4-23-49-pm.png" alt="Screen Shot 2015-04-08 at 4.23.49 PM"></p><p>A wide variety of
diagnostics are supported, including optional diagnostics for code style issues (e.g. the inclusion of unnecessary whitespace). Diagnostics are also available for several other languages including C/C++, JavaScript, HTML, and CSS.</p><h3 id="configuring-diagnostics">Configuring Diagnostics</h3><p>By default, code in the current source file is checked whenever it is saved, as well as if the keyboard is idle for a period of time. You can tweak this behavior using the <strong>Code</strong> -&gt; <strong>Diagnostics</strong> options:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-28-at-11-37-34-am.png" alt="diagnostics-options"></p><p>Note that several of the available diagnostics are disabled by default. This is because we&rsquo;re in the process of refining their behavior to eliminate &ldquo;false positives&rdquo; where correct code is flagged as having a problem. We&rsquo;ll continue to improve these diagnostics and enable them by default when we feel they are ready.</p><h3 id="trying-it-out">Trying it Out</h3><p>You can try out the new code diagnostics by downloading the latest <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview release</a> of RStudio. This feature is a work in progress and we&rsquo;re particularly interested in feedback on how well it works. Please also let us know if there are common coding problems which you think we should add new diagnostics for. We hope you try out the preview and <a href="https://support.rstudio.com/">let us know</a> how we can make it better.</p></description></item><item><title>Parse and process XML (and HTML) with xml2</title><link>https://www.rstudio.com/blog/xml2/</link><pubDate>Tue, 21 Apr 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/xml2/</guid><description><p>I&rsquo;m pleased to announce that the first version of xml2 is now available on CRAN.
Xml2 is a wrapper around the comprehensive <a href="http://xmlsoft.org">libxml2</a> C library that makes it easier to work with XML and HTML in R:</p><ul><li><p>Read XML and HTML with <code>read_xml()</code> and <code>read_html()</code>.</p></li><li><p>Navigate the tree with <code>xml_children()</code>, <code>xml_siblings()</code> and <code>xml_parent()</code>. Alternatively, use XPath to jump directly to the nodes you&rsquo;re interested in with <code>xml_find_one()</code> and <code>xml_find_all()</code>. Get the full path to a node with <code>xml_path()</code>.</p></li><li><p>Extract various components of a node with <code>xml_text()</code>, <code>xml_attrs()</code>, <code>xml_attr()</code>, and <code>xml_name()</code>.</p></li><li><p>Convert to list with <code>as_list()</code>.</p></li><li><p>Where appropriate, functions support namespaces with a global url -&gt; prefix lookup table. See <code>xml_ns()</code> for more details.</p></li><li><p>Convert relative URLs to absolute with <code>url_absolute()</code>, and transform in the opposite direction with <code>url_relative()</code>.
Escape and unescape special characters with <code>url_escape()</code> and <code>url_unescape()</code>.</p></li><li><p>Support for modifying and creating XML documents is planned for a future version.</p></li></ul><p>This package owes a debt of gratitude to <a href="http://www.stat.ucdavis.edu/~duncan/">Duncan Temple Lang</a>, whose XML package has made it possible to use XML with R for almost 15 years!</p><h2 id="usage">Usage</h2><p>You can install it by running:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">xml2&#34;</span>)</code></pre></div><p>(If you&rsquo;re on a Mac, you might need to wait a couple of days - CRAN is busy rebuilding all the packages for R 3.2.0, so it&rsquo;s running a bit behind.)</p><p>Here&rsquo;s a small example working with an inline XML document:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(xml2)
x <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_xml</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">&lt;foo&gt;</span>
<span style="color:#4070a0"> &lt;bar&gt;text &lt;baz id = &#39;a&#39; /&gt;&lt;/bar&gt;</span>
<span style="color:#4070a0"> &lt;bar&gt;2&lt;/bar&gt;</span>
<span style="color:#4070a0"> &lt;baz id = &#39;b&#39; /&gt;</span>
<span style="color:#4070a0">&lt;/foo&gt;&#34;</span>)
<span style="color:#06287e">xml_name</span>(x)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;foo&#34;</span>
<span style="color:#06287e">xml_children</span>(x)
<span style="color:#60a0b0;font-style:italic">#&gt; {xml_nodeset (3)}</span>
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &lt;bar&gt;text &lt;baz id=&#34;a&#34;/&gt;&lt;/bar&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; [2] &lt;bar&gt;2&lt;/bar&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; [3] &lt;baz id=&#34;b&#34;/&gt;</span>
<span style="color:#60a0b0;font-style:italic"># Find all baz nodes anywhere in the document</span>
baz <span style="color:#666">&lt;-</span> <span style="color:#06287e">xml_find_all</span>(x, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">.//baz&#34;</span>)
baz
<span style="color:#60a0b0;font-style:italic">#&gt; {xml_nodeset (2)}</span>
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &lt;baz id=&#34;a&#34;/&gt;</span>
<span style="color:#60a0b0;font-style:italic">#&gt; [2] &lt;baz id=&#34;b&#34;/&gt;</span>
<span style="color:#06287e">xml_path</span>(baz)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;/foo/bar[1]/baz&#34; &#34;/foo/baz&#34;</span>
<span style="color:#06287e">xml_attr</span>(baz, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">id&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;a&#34; &#34;b&#34;</span></code></pre></div><h2 id="development">Development</h2><p>Xml2 is still under active development. If you notice any problems (including crashes), please try the <a href="https://github.com/hadley/xml2">development version</a>, and if that doesn&rsquo;t work, <a href="https://github.com/hadley/xml2/issues">file an issue</a>.</p></description></item><item><title>Get data out of excel and into R with readxl</title><link>https://www.rstudio.com/blog/readxl-0-1-0/</link><pubDate>Wed, 15 Apr 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/readxl-0-1-0/</guid><description><p>I&rsquo;m pleased to announce that the first version of readxl is now available on CRAN. Readxl makes it easy to get tabular data out of Excel. It:</p><ul><li><p>Supports both the legacy <code>.xls</code> format and the modern XML-based <code>.xlsx</code> format.
<code>.xls</code> support is made possible with the <a href="http://sourceforge.net/projects/libxls/">libxls</a> C library, which abstracts away many of the complexities of the underlying binary format. To parse <code>.xlsx</code>, we use the insanely fast <a href="http://rapidxml.sourceforge.net">RapidXML</a> C++ library.</p></li><li><p>Has no external dependencies, so it&rsquo;s easy to use on all platforms.</p></li><li><p>Re-encodes non-ASCII characters to UTF-8.</p></li><li><p>Loads datetimes into POSIXct columns. Both Windows (1900) and Mac (1904) date specifications are processed correctly.</p></li><li><p>Blank columns are automatically dropped.</p></li><li><p>Returns output with class <code>c(&quot;tbl_df&quot;, &quot;tbl&quot;, &quot;data.frame&quot;)</code> so if you also use <a href="https://blog.rstudio.com/2015/01/09/dplyr-0-4-0/">dplyr</a> you&rsquo;ll get an enhanced print method (i.e. you&rsquo;ll see just the first ten rows, not the first 10,000!).</p></li></ul><p>You can install it by running:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">readxl&#34;</span>)</code></pre></div><p>There&rsquo;s not really much to say about how to use it:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(readxl)
<span style="color:#60a0b0;font-style:italic"># Use an Excel file included in the package</span>
sample <span style="color:#666">&lt;-</span> <span style="color:#06287e">system.file</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">extdata&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">datasets.xlsx&#34;</span>, package <span style="color:#666">=</span>
<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">readxl&#34;</span>)
<span style="color:#60a0b0;font-style:italic"># Read by position</span>
<span style="color:#06287e">head</span>(<span style="color:#06287e">read_excel</span>(sample, <span style="color:#40a070">2</span>))
<span style="color:#60a0b0;font-style:italic">#&gt; mpg cyl disp hp drat wt qsec vs am gear carb</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1</span>
<span style="color:#60a0b0;font-style:italic"># Or by name:</span>
<span style="color:#06287e">excel_sheets</span>(sample)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;iris&#34; &#34;mtcars&#34; &#34;chickwts&#34; &#34;quakes&#34;</span>
<span style="color:#06287e">head</span>(<span style="color:#06287e">read_excel</span>(sample, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mtcars&#34;</span>))
<span style="color:#60a0b0;font-style:italic">#&gt; mpg cyl disp hp drat wt qsec vs am gear carb</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1</span></code></pre></div><p>You can see the documentation for more info on the <code>col_names</code>, <code>col_types</code>, and <code>na</code> arguments.</p><p>Readxl is still under active development. If you have problems loading a dataset, please try the <a href="https://github.com/hadley/readxl">development version</a>, and if that doesn&rsquo;t work, <a href="https://github.com/hadley/readxl/issues">file an issue</a>.</p></description></item><item><title>Interactive time series with dygraphs</title><link>https://www.rstudio.com/blog/interactive-time-series-with-dygraphs/</link><pubDate>Tue, 14 Apr 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/interactive-time-series-with-dygraphs/</guid><description><p>The <a href="http://rstudio.github.io/dygraphs/">dygraphs package</a> is an R interface to the dygraphs JavaScript charting library. It provides rich facilities for charting time-series data in R, including:</p><ul><li><p>Automatically plots <a href="http://cran.rstudio.com/web/packages/xts/index.html">xts</a> time-series objects (or objects convertible to xts).</p></li><li><p>Rich interactive features including <a href="http://rstudio.github.io/dygraphs/gallery-range-selector.html">zoom/pan</a> and series/point <a href="http://rstudio.github.io/dygraphs/gallery-series-highlighting.html">highlighting</a>.</p></li><li><p>Highly configurable axis and series display (including optional 2nd Y-axis).</p></li><li><p>Display <a href="http://rstudio.github.io/dygraphs/gallery-upper-lower-bars.html">upper/lower bars</a> (e.g.
prediction intervals) around series.</p></li><li><p>Various graph overlays including <a href="http://rstudio.github.io/dygraphs/gallery-shaded-regions.html">shaded regions</a>, <a href="http://rstudio.github.io/dygraphs/gallery-event-lines.html">event lines</a>, and <a href="http://rstudio.github.io/dygraphs/gallery-annotations.html">annotations</a>.</p></li><li><p>Use at the R console just like conventional R plots (via RStudio Viewer).</p></li><li><p>Embeddable within <a href="http://rstudio.github.io/dygraphs/r-markdown.html">R Markdown</a> documents and <a href="http://rstudio.github.io/dygraphs/shiny.html">Shiny</a> web applications.</p></li></ul><p>The dygraphs package is available on CRAN now and can be installed with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">dygraphs&#34;</span>)</code></pre></div><h3 id="examples">Examples</h3><p>Here are some examples of interactive time series visualizations you can create with only a line or two of R code (the screenshots are static; click them to see the interactive version).</p><h4 id="panning-and-zooming">Panning and Zooming</h4><p>This code adds a range selector that can be used to pan and zoom around the series data:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">dygraph</span>(nhtemp, main <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">New Haven Temperatures&#34;</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">dyRangeSelector</span>()</code></pre></div><p><a href="http://rstudio.github.io/dygraphs/gallery-range-selector.html"><img
src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-09-at-1-01-35-pm.png" alt="Screen Shot 2015-04-09 at 1.01.35 PM"></a></p><h4 id="point-highlighting">Point Highlighting</h4><p>When you hover over the time-series, the values of all points at the location of the mouse are shown in the legend:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">lungDeaths <span style="color:#666">&lt;-</span> <span style="color:#06287e">cbind</span>(ldeaths, mdeaths, fdeaths)
<span style="color:#06287e">dygraph</span>(lungDeaths, main <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Deaths from Lung Disease (UK)&#34;</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">dyOptions</span>(colors <span style="color:#666">=</span> RColorBrewer<span style="color:#666">::</span><span style="color:#06287e">brewer.pal</span>(<span style="color:#40a070">3</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Set2&#34;</span>))</code></pre></div><p><a href="http://rstudio.github.io/dygraphs/gallery-series-options.html"><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-09-at-12-53-54-pm.png" alt="Screen Shot 2015-04-09 at 12.53.54 PM"></a></p><h4 id="shading-and-annotations">Shading and Annotations</h4><p>There are a wide variety of tools available to annotate time series.
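</p><p>Event lines are one of those tools. Here&rsquo;s a hedged sketch (it assumes the dygraphs package is installed; the year and label are arbitrary, and nested calls are used instead of the pipe):</p>

```r
library(dygraphs)
# Build the base chart, then add a labelled vertical event line at 1944
dg = dygraph(nhtemp, main = "New Haven Temperatures")
dyEvent(dg, "1944-1-1", "Example event", labelLoc = "bottom")
```

<p>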
Here we demonstrate creating shaded regions:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">dygraph</span>(nhtemp, main<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">New Haven Temperatures&#34;</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">dySeries</span>(label<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Temp (F)&#34;</span>, color<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">black&#34;</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">dyShading</span>(from<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1920-1-1&#34;</span>, to<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1930-1-1&#34;</span>, color<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">#FFE6E6&#34;</span>) <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">dyShading</span>(from<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1940-1-1&#34;</span>, to<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">1950-1-1&#34;</span>, color<span style="color:#666">=</span><span style="color:#4070a0">&#34;</span><span style="color:#4070a0">#CCEBD6&#34;</span>)</code></pre></div><p><a href="http://rstudio.github.io/dygraphs/gallery-shaded-regions.html"><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-09-at-1-11-31-pm1.png" alt="Screen Shot 2015-04-09 at 1.11.31 PM"></a></p><p>You can find additional examples and documentation on the <a href="http://rstudio.github.io/dygraphs/">dygraphs
for R</a> website.</p><h3 id="bringing-javascript-to-r">Bringing JavaScript to R</h3><p>One of the reasons we are excited about dygraphs is that it takes a mature and feature-rich visualization library formerly only accessible to web developers and makes it available to all R users.</p><p>This is part of a larger trend enabled by the <a href="http://www.htmlwidgets.org">htmlwidgets</a> package, and we expect that more and more libraries like dygraphs will emerge over the coming months to bring the best of JavaScript data visualization to R.</p></description></item><item><title>RStudio v0.99 Preview: Tools for Rcpp</title><link>https://www.rstudio.com/blog/rstudio-v0-99-preview-tools-for-rcpp/</link><pubDate>Tue, 14 Apr 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v0-99-preview-tools-for-rcpp/</guid><description><p>Over the past several years, the <a href="http://www.rcpp.org/">Rcpp</a> package has become an indispensable tool for creating high-performance R code. Its power and ease of use have made C++ a natural second language for many R users. There are over 400 packages on <a href="http://cran.r-project.org/">CRAN</a> and <a href="http://www.bioconductor.org/">Bioconductor</a> that depend on Rcpp and it is now the most downloaded R package.</p><p>In RStudio v0.99 we have added extensive additional tools to make working with Rcpp more pleasant, productive, and robust. These include:</p><ul><li><p>Code completion</p></li><li><p>Source diagnostics as you edit</p></li><li><p>Code snippets</p></li><li><p>Auto-indentation</p></li><li><p>Navigable list of compilation errors</p></li><li><p>Code navigation (go to definition)</p></li></ul><p>We think these features will go a long way to helping even more R users succeed with Rcpp.
You can try the new features out now by downloading the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>.</p><h3 id="code-completion">Code Completion</h3><p>RStudio v0.99 includes comprehensive code completion for C++ based on <a href="http://en.wikipedia.org/wiki/Clang">Clang</a> (the same underlying engine used by XCode and many other C/C++ tools):</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-07-at-12-13-31-pm.png" alt="Screen Shot 2015-04-07 at 12.13.31 PM"></p><p>Completions are provided for the C++ language, Rcpp, and any other libraries you have imported.</p><h3 id="diagnostics">Diagnostics</h3><p>As you edit C++ source files RStudio uses Clang to scan your code looking for errors, incomplete code, or other conditions worthy of warnings or informational notes. For example:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-07-at-12-16-38-pm.png" alt="Screen Shot 2015-04-07 at 12.16.38 PM"></p><p>Diagnostics alert you to the possibility of subtle problems and flag outright incorrect code as early as possible, substantially reducing iteration/debugging time.</p><h3 id="interactive-c">Interactive C++</h3><p>Rcpp includes some nifty tools to help make working with C++ code just as simple and straightforward as working with R code. You can &ldquo;source&rdquo; C++ code into R just like you&rsquo;d source an R script (no need to deal with Makefiles or build systems). 
Here&rsquo;s a Gibbs Sampler implemented with Rcpp:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-13-at-4-40-36-pm.png" alt="Screen Shot 2015-04-13 at 4.40.36 PM"></p><p>We can make this function available to R by simply sourcing the C++ file (much like we&rsquo;d source an R script):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">sourceCpp</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">gibbs.cpp&#34;</span>)
<span style="color:#06287e">gibbs</span>(<span style="color:#40a070">100</span>, <span style="color:#40a070">10</span>)</code></pre></div><p>Thanks to the abstractions provided by Rcpp, the code implementing the Gibbs Sampler in C++ is nearly identical to the code you&rsquo;d write in R, but runs <a href="http://gallery.rcpp.org/articles/gibbs-sampler/">20 times faster</a>. RStudio includes full support for Rcpp&rsquo;s <code>sourceCpp</code> via the <strong>Source</strong> button and <strong>Ctrl+Shift+Enter</strong> keyboard shortcut.</p><h3 id="try-it-out">Try it Out</h3><p>If you are new to C++ or Rcpp, you might be surprised at how easy it is to get started. There are lots of great resources available, including:</p><ul><li><p>Rcpp website: <a href="http://www.rcpp.org/">http://www.rcpp.org/</a></p></li><li><p>Rcpp book: <a href="http://www.rcpp.org/book/">http://www.rcpp.org/book/</a></p></li><li><p>Tutorial for users new to C++: <a href="http://adv-r.had.co.nz/Rcpp.html">http://adv-r.had.co.nz/Rcpp.html</a></p></li><li><p>Gallery of examples: <a href="http://gallery.rcpp.org/">http://gallery.rcpp.org/</a></p></li></ul><p>You can give the new Rcpp features a try now by downloading the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>.
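</p><p>If you don&rsquo;t have a <code>.cpp</code> file handy, Rcpp&rsquo;s <code>cppFunction()</code> offers an even smaller first step: it compiles a C++ function supplied as an R string. A minimal sketch (it assumes Rcpp and a working C++ toolchain are installed; the function itself is a toy example):</p>

```r
library(Rcpp)
# Compile a one-line C++ function from a string and call it from R
cppFunction("int timesTwo(int x) { return x * 2; }")
timesTwo(21)  # 42
```

<p>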
If you run into problems or have feedback on how we could make things better, let us know on our <a href="https://support.rstudio.com">Support Forum</a>.</p></description></item><item><title>RStudio v0.99 Preview: Code Snippets</title><link>https://www.rstudio.com/blog/rstudio-v0-99-preview-code-snippets/</link><pubDate>Mon, 13 Apr 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v0-99-preview-code-snippets/</guid><description><p>We&rsquo;re getting close to shipping the next version of RStudio (v0.99) and this week we&rsquo;ll continue our series of posts describing the major new features of the release (previous posts have already covered <a href="https://blog.rstudio.com/2015/02/23/rstudio-v0-99-preview-code-completion/">code completion</a>, the revamped <a href="https://blog.rstudio.com/2015/02/24/rstudio-v0-99-preview-data-viewer-improvements/">data viewer</a>, and improvements to <a href="https://blog.rstudio.com/2015/02/23/rstudio-0-99-preview-vim-mode-improvements/">vim mode</a>). Note that if you want to try out any of the new features now, you can do so by downloading the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>.</p><h3 id="code-snippets">Code Snippets</h3><p>Code snippets are text macros that are used for quickly inserting common snippets of code.
For example, the <code>fun</code> snippet inserts an R function definition:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-07-at-10-39-50-am.png" alt="Insert Snippet"></p><p>If you select the snippet from the completion list it will be inserted along with several text placeholders which you can fill in by typing and then pressing <strong>Tab</strong> to advance to the next placeholder:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-07-at-10-44-39-am.png" alt="Screen Shot 2015-04-07 at 10.44.39 AM"></p><p>Other useful snippets include:</p><ul><li><p><code>lib</code>, <code>req</code>, and <code>source</code> for the library, require, and source functions</p></li><li><p><code>df</code> and <code>mat</code> for defining data frames and matrices</p></li><li><p><code>if</code>, <code>el</code>, and <code>ei</code> for conditional expressions</p></li><li><p><code>apply</code>, <code>lapply</code>, <code>sapply</code>, etc. for the apply family of functions</p></li><li><p><code>sc</code>, <code>sm</code>, and <code>sg</code> for defining S4 classes/methods.</p></li></ul><p>Snippets are a great way to automate inserting common/boilerplate code and are available for R, C/C++, JavaScript, and several other languages.</p><h3 id="inserting-snippets">Inserting Snippets</h3><p>As illustrated above, code snippets show up alongside other code completion results and can be inserted by picking them from the completion list. By default the completion list will show up automatically when you pause typing for 250 milliseconds and can also be manually activated via the <strong>Tab</strong> key. 
In addition, if you have typed the character sequence for a snippet and want to insert it immediately (without going through the completion list), you can press <strong>Shift+Tab</strong>.</p><h3 id="customizing-snippets">Customizing Snippets</h3><p>You can edit the built-in snippet definitions and even add snippets of your own via the <strong>Edit Snippets</strong> button in <strong>Global Options</strong> -&gt; <strong>Code</strong>:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/04/screen-shot-2015-04-07-at-10-48-40-am.png" alt="Edit Snippets"></p><p>Custom snippets are defined using the <code>snippet</code> keyword. The contents of the snippet should be indented below using the <code>&lt;tab&gt;</code> key (rather than with spaces). Variables can be defined using the form <code>${1:varname}</code>. For example, here&rsquo;s the definition of the <code>setGeneric</code> snippet:</p><pre><code>snippet sg
	setGeneric(&quot;${1:generic}&quot;, function(${2:x, ...}) {
		standardGeneric(&quot;${1:generic}&quot;)
	})</code></pre><p>Once you&rsquo;ve customized snippets for a given language, they are written into the <code>~/.R/snippets</code> directory. For example, the customized versions of R and C/C++ snippets are written to:</p><pre><code>~/.R/snippets/r.snippets
~/.R/snippets/c_cpp.snippets</code></pre><p>You can edit these files directly to customize snippet definitions or you can use the <strong>Edit Snippets</strong> dialog as described above. If you need to move custom snippet definitions to another system, simply place them in <code>~/.R/snippets</code> and they&rsquo;ll be used in preference to the built-in snippet definitions.</p><h3 id="try-it-out">Try it Out</h3><p>You can give code snippets a try now by downloading the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>.
If you run into problems or have feedback on how we could make things better, let us know on our <a href="https://support.rstudio.com">Support Forum</a>.</p></description></item><item><title>readr 0.1.0</title><link>https://www.rstudio.com/blog/readr-0-1-0/</link><pubDate>Thu, 09 Apr 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/readr-0-1-0/</guid><description><p>I&rsquo;m pleased to announce that readr is now available on CRAN. Readr makes it easy to read many types of tabular data:</p><ul><li><p>Delimited files with <code>read_delim()</code>, <code>read_csv()</code>, <code>read_tsv()</code>, and <code>read_csv2()</code>.</p></li><li><p>Fixed-width files with <code>read_fwf()</code> and <code>read_table()</code>.</p></li><li><p>Web log files with <code>read_log()</code>.</p></li></ul><p>You can install it by running:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">readr&#34;</span>)</code></pre></div><p>Compared to the equivalent base functions, readr functions are around 10x faster. They&rsquo;re also easier to use because they&rsquo;re more consistent, they produce data frames that are easier to use (no more <code>stringsAsFactors = FALSE</code>!), they have a more flexible column specification, and any parsing problems are recorded in a data frame. Each of these features is described in more detail below.</p><h2 id="input">Input</h2><p>All readr functions work the same way. There are four important arguments:</p><ul><li><code>file</code> gives the file to read; a URL or local path. A local path can point to a zipped, bzipped, xzipped, or gzipped file - it&rsquo;ll be automatically uncompressed in memory before reading.
You can also pass in a connection or a raw vector.</li></ul><p>For small examples, you can also supply literal data: if <code>file</code> contains a new line, then the data will be read directly from the string. Thanks to <a href="https://github.com/Rdatatable/data.table">data.table</a> for this great idea!</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(readr)
<span style="color:#06287e">read_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x,y\n1,2\n3,4&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; x y</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 1 1 2</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 2 3 4</span></code></pre></div><ul><li><p><code>col_names</code>: describes the column names (equivalent to <code>header</code> in base R). It has three possible values:</p><ul><li><p><code>TRUE</code> will use the first row of data as column names.</p></li><li><p><code>FALSE</code> will number the columns sequentially.</p></li><li><p>A character vector to use as column names.</p></li></ul></li><li><p><code>col_types</code>: overrides the default column types (equivalent to <code>colClasses</code> in base R). More on that below.</p></li><li><p><code>progress</code>: By default, readr will display a progress bar if the estimated loading time is greater than 5 seconds. Use <code>progress = FALSE</code> to suppress the progress indicator.</p></li></ul><h2 id="output">Output</h2><p>The output has been designed to make your life easier:</p><ul><li><p>Characters are never automatically converted to factors (i.e. no more <code>stringsAsFactors = FALSE</code>!).</p></li><li><p>Column names are left as is, not munged into valid R identifiers (i.e. there is no <code>check.names = TRUE</code>). Use backticks to refer to variables with unusual names, e.g. 
<code>df$`Income ($000)`</code>.</p></li><li><p>The output has class <code>c(&quot;tbl_df&quot;, &quot;tbl&quot;, &quot;data.frame&quot;)</code> so if you also use <a href="https://blog.rstudio.com/2015/01/09/dplyr-0-4-0/">dplyr</a> you&rsquo;ll get an enhanced print method (i.e. you&rsquo;ll see just the first ten rows, not the first 10,000!).</p></li><li><p>Row names are never set.</p></li></ul><h2 id="column-types">Column types</h2><p>Readr heuristically inspects the first 100 rows to guess the type of each column. This is not perfect, but it&rsquo;s fast and it&rsquo;s a reasonable start. Readr can automatically detect these column types:</p><ul><li><p><code>col_logical()</code> [l], contains only <code>T</code>, <code>F</code>, <code>TRUE</code> or <code>FALSE</code>.</p></li><li><p><code>col_integer()</code> [i], integers.</p></li><li><p><code>col_double()</code> [d], doubles.</p></li><li><p><code>col_euro_double()</code> [e], &ldquo;Euro&rdquo; doubles that use <code>,</code> as the decimal separator.</p></li><li><p><code>col_date()</code> [D]: Y-m-d dates.</p></li><li><p><code>col_datetime()</code> [T]: ISO8601 date times.</p></li><li><p><code>col_character()</code> [c], everything else.</p></li></ul><p>You can manually specify other column types:</p><ul><li><p><code>col_skip()</code> [_], don&rsquo;t import this column.</p></li><li><p><code>col_date(format)</code> and <code>col_datetime(format, tz)</code>, dates or date times parsed with the given format string. 
Dates and times are rather complex, so they&rsquo;re described in more detail in the next section.</p></li><li><p><code>col_numeric()</code> [n], a sloppy numeric parser that ignores everything apart from 0-9, <code>-</code> and <code>.</code> (this is useful for parsing currency data).</p></li><li><p><code>col_factor(levels, ordered)</code>, parse a fixed set of known values into a (optionally ordered) factor.</p></li></ul><p>There are two ways to override the default choices with the <code>col_types</code> argument:</p><ul><li><p>Use a compact string: <code>&quot;dc__d&quot;</code>. Each letter corresponds to a column so this specification means: read first column as double, second as character, skip the next two and read the last column as a double. (There&rsquo;s no way to use this form with column types that need parameters.)</p></li><li><p>With a (named) list of col objects:</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">read_csv</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">iris.csv&#34;</span>, col_types <span style="color:#666">=</span> <span style="color:#06287e">list</span>(Sepal.Length <span style="color:#666">=</span> <span style="color:#06287e">col_double</span>(),Sepal.Width <span style="color:#666">=</span> <span style="color:#06287e">col_double</span>(),Petal.Length <span style="color:#666">=</span> <span style="color:#06287e">col_double</span>(),Petal.Width <span style="color:#666">=</span> <span style="color:#06287e">col_double</span>(),Species <span style="color:#666">=</span> <span style="color:#06287e">col_factor</span>(<span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">setosa&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">versicolor&#34;</span>, <span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">virginica&#34;</span>))))</code></pre></div><p>Any omitted columns will be parsed automatically, so the previous call is equivalent to:</p><pre><code>read_csv(&quot;iris.csv&quot;,
  col_types = list(Species = col_factor(c(&quot;setosa&quot;, &quot;versicolor&quot;, &quot;virginica&quot;))))</code></pre><h3 id="dates-and-times">Dates and times</h3><p>One of the most helpful features of readr is its ability to import dates and date times. It can automatically recognise the following formats:</p><ul><li><p>Dates in year-month-day form: <code>2001-10-20</code> or <code>2010/10/15</code> (or any non-numeric separator). It can&rsquo;t automatically recognise dates in m/d/y or d/m/y format because they&rsquo;re ambiguous: is <code>02/01/2015</code> the 2nd of January or the 1st of February?</p></li><li><p>Date times in <a href="http://en.wikipedia.org/wiki/ISO_8601">ISO8601</a> form: e.g. <code>2001-02-03 04:05:06.07 -0800</code>, <code>20010203 040506</code>, <code>20010203</code> etc. I don&rsquo;t support every possible variant yet, so please let me know if it doesn&rsquo;t work for your data (more details in <code>?parse_datetime</code>).</p></li></ul><p>If your dates are in another format, don&rsquo;t despair. You can use <code>col_date()</code> and <code>col_datetime()</code> to explicitly specify a format string. Readr implements its own <code>strptime()</code> equivalent which supports the following format strings:</p><ul><li><p>Year: <code>%Y</code> (4 digits). 
<code>%y</code> (2 digits); 00-69 -&gt; 2000-2069, 70-99 -&gt; 1970-1999.</p></li><li><p>Month: <code>%m</code> (2 digits), <code>%b</code> (abbreviated name in current locale), <code>%B</code> (full name in current locale).</p></li><li><p>Day: <code>%d</code> (2 digits), <code>%e</code> (optional leading space).</p></li><li><p>Hour: <code>%H</code></p></li><li><p>Minutes: <code>%M</code></p></li><li><p>Seconds: <code>%S</code> (integer seconds), <code>%OS</code> (partial seconds)</p></li><li><p>Time zone: <code>%Z</code> (as name, e.g. <code>America/Chicago</code>), <code>%z</code> (as offset from UTC, e.g. <code>+0800</code>)</p></li><li><p>Non-digits: <code>%.</code> skips one non-digit character, <code>%*</code> skips any number of non-digit characters.</p></li><li><p>Shortcuts: <code>%D</code> = <code>%m/%d/%y</code>, <code>%F</code> = <code>%Y-%m-%d</code>, <code>%R</code> = <code>%H:%M</code>, <code>%T</code> = <code>%H:%M:%S</code>, <code>%x</code> = <code>%y/%m/%d</code>.</p></li></ul><p>To practice parsing date times without having to load the file each time, you can use <code>parse_datetime()</code> and <code>parse_date()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">parse_date</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2015-10-10&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;2015-10-10&#34;</span>
<span style="color:#06287e">parse_datetime</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">2015-10-10 15:14&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;2015-10-10 15:14:00 UTC&#34;</span>
<span style="color:#06287e">parse_date</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">02/01/2015&#34;</span>, <span style="color:#4070a0">&#34;</span><span 
style="color:#4070a0">%m/%d/%Y&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;2015-02-01&#34;</span>
<span style="color:#06287e">parse_date</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">02/01/2015&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">%d/%m/%Y&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;2015-01-02&#34;</span></code></pre></div><h2 id="problems">Problems</h2><p>If there are any problems parsing the file, the <code>read_</code> function will throw a warning telling you how many problems there are. You can then use the <code>problems()</code> function to access a data frame that gives information about each problem:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">csv <span style="color:#666">&lt;-</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">x,y</span>
<span style="color:#4070a0">1,a</span>
<span style="color:#4070a0">b,2</span>
<span style="color:#4070a0">&#34;</span>
df <span style="color:#666">&lt;-</span> <span style="color:#06287e">read_csv</span>(csv, col_types <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">ii&#34;</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; Warning: 2 problems parsing literal data. See problems(...) 
for more</span>
<span style="color:#60a0b0;font-style:italic">#&gt; details.</span>
<span style="color:#06287e">problems</span>(df)
<span style="color:#60a0b0;font-style:italic">#&gt; row col expected actual</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 1 1 2 an integer a</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 2 2 1 an integer b</span>
df
<span style="color:#60a0b0;font-style:italic">#&gt; x y</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 1 1 NA</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 2 NA 2</span></code></pre></div><h2 id="helper-functions">Helper functions</h2><p>Readr also provides a handful of other useful functions:</p><ul><li><p><code>read_lines()</code> works the same way as <code>readLines()</code>, but is a lot faster.</p></li><li><p><code>read_file()</code> reads a complete file into a string.</p></li><li><p><code>type_convert()</code> attempts to coerce all character columns to their appropriate type. This is useful if you need to do some manual munging (e.g. with regular expressions) to turn strings into numbers. It uses the same rules as the <code>read_*</code> functions.</p></li><li><p><code>write_csv()</code> writes a data frame out to a CSV file. It&rsquo;s quite a bit faster than <code>write.csv()</code> and it never writes row.names. It also escapes <code>&quot;</code> embedded in strings in a way that <code>read_csv()</code> can read.</p></li></ul><h2 id="development">Development</h2><p>Readr is still under very active development. 
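As a quick recap of the column-specification options described above, here is a short sketch (the column names and values are invented for illustration) combining literal data with a compact <code>col_types</code> string:

```r
library(readr)

# Compact col_types string: d = double, c = character, _ = skip.
# Columns 1 and 5 are read as doubles, column 2 as character; 3-4 are skipped.
df <- read_csv("a,b,c,d,e\n1,x,9,9,2.5\n3,y,9,9,4.0", col_types = "dc__d")
df
```

Because the string contains a new line, readr treats it as literal data rather than a file path.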
If you have problems loading a dataset, please try the <a href="https://github.com/hadley/readr">development version</a>, and if that doesn&rsquo;t work, <a href="https://github.com/hadley/readr/issues">file an issue</a>.</p></description></item><item><title>Design patterns for action buttons</title><link>https://www.rstudio.com/blog/design-patterns-for-action-buttons/</link><pubDate>Tue, 07 Apr 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/design-patterns-for-action-buttons/</guid><description><p><img src="https://rstudioblog.files.wordpress.com/2015/04/action-button.png" alt="action-button"></p><p>Action buttons can be tricky to use in Shiny because they work differently than other widgets. Widgets like sliders and select boxes maintain a value that is easy to use in your code. But the value of an action button is arbitrary. What should you do with it? Did you know that you should almost always call the value of an action button from <code>observeEvent()</code> or <code>eventReactive()</code>?</p><p>The newest article at the <a href="https://shiny.rstudio.com/articles/action-buttons.html">Shiny Development Center</a> explains how action buttons work, and it provides five useful patterns for working with action buttons. 
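As a minimal sketch of the <code>eventReactive()</code> pattern mentioned above (the input and output IDs here are invented for illustration):

```r
library(shiny)

ui <- fluidPage(
  actionButton("go", "Run analysis"),
  textOutput("result")
)

server <- function(input, output) {
  # eventReactive() ignores the button's arbitrary value and simply
  # recomputes its expression each time the button is clicked
  result <- eventReactive(input$go, {
    paste("Button clicked", input$go, "time(s)")
  })
  output$result <- renderText(result())
}

# To run interactively: shinyApp(ui, server)
```

The key point is that the button's value is only used as a trigger, never as data.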
These patterns also work well with action links.</p><p>Read the article <a href="https://shiny.rstudio.com/articles/action-buttons.html">here</a>.</p></description></item><item><title>Data Visualization cheatsheet, plus Spanish translations</title><link>https://www.rstudio.com/blog/data-visualization-cheatsheet-plus-spanish-translations/</link><pubDate>Mon, 30 Mar 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/data-visualization-cheatsheet-plus-spanish-translations/</guid><description><p><img src="https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.png" alt="data visualization cheatsheet"></p><p>We&rsquo;ve added a new cheatsheet to our <a href="https://www.rstudio.com/resources/cheatsheets/">collection</a>. <em>Data Visualization with ggplot2</em> describes how to build a plot with ggplot2 and the grammar of graphics. You will find helpful reminders of how to use:</p><ul><li><p>geoms</p></li><li><p>stats</p></li><li><p>scales</p></li><li><p>coordinate systems</p></li><li><p>facets</p></li><li><p>position adjustments</p></li><li><p>legends, and</p></li><li><p>themes</p></li></ul><p>The cheatsheet also documents tips on zooming.</p><p>Download the cheatsheet <a href="https://www.rstudio.com/resources/cheatsheets/">here</a>.</p><p><strong>Bonus</strong> - Frans van Dunné of <a href="http://innovateonline.nl/">Innovate Online</a> has provided Spanish translations of the Data Wrangling, R Markdown, Shiny, and Package Development cheatsheets. 
Download them at the bottom of the <a href="https://www.rstudio.com/resources/cheatsheets/">cheatsheet gallery</a>.</p></description></item><item><title>Package Development cheatsheet, plus Chinese translations</title><link>https://www.rstudio.com/blog/package-development-cheatsheet-plus-chinese-translations/</link><pubDate>Thu, 12 Mar 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/package-development-cheatsheet-plus-chinese-translations/</guid><description><p><img src="https://www.rstudio.com/wp-content/uploads/2015/03/devtools-cheatsheet.png" alt="Cheatsheet"></p><p>We&rsquo;ve added a new cheatsheet to our <a href="https://www.rstudio.com/resources/cheatsheets/">collection</a>! <em>Package Development with devtools</em> will help you find the most useful functions for building packages in R. The cheatsheet will walk you through the steps of building a package from:</p><ul><li><p>Setting up the package structure</p></li><li><p>Adding a DESCRIPTION file</p></li><li><p>Writing code</p></li><li><p>Writing tests</p></li><li><p>Writing documentation with roxygen</p></li><li><p>Adding data sets</p></li><li><p>Building a NAMESPACE, and</p></li><li><p>Including vignettes</p></li></ul><p>The sheet focuses on Hadley Wickham&rsquo;s devtools package, and it is a useful supplement to Hadley&rsquo;s book <em>R Packages</em>, which you can read online at <a href="http://r-pkgs.had.co.nz">r-pkgs.had.co.nz</a>.</p><p>Download the sheet <a href="https://www.rstudio.com/resources/cheatsheets/">here</a>.</p><p><strong>Bonus</strong> - Vivian Zhang of <a href="http://supstat.com">SupStat Analytics</a> has kindly translated the existing Data Wrangling, R Markdown, and Shiny cheatsheets into Chinese. 
You can download the translations at the bottom of the <a href="https://www.rstudio.com/resources/cheatsheets/">cheatsheet gallery</a>.</p></description></item><item><title>haven 0.1.0</title><link>https://www.rstudio.com/blog/haven-0-1-0/</link><pubDate>Wed, 04 Mar 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/haven-0-1-0/</guid><description><p>I&rsquo;m pleased to announce that the new haven package is now available on CRAN. Haven makes it easy to read data from SAS, SPSS and Stata. Haven has the same goal as the <a href="http://cran.r-project.org/package=foreign">foreign</a> package, but it:</p><ul><li><p>Can read binary SAS7BDAT files.</p></li><li><p>Can read Stata 13 files.</p></li><li><p>Always returns a data frame.</p></li></ul><p>(Haven also has experimental support for writing SPSS and Stata data. This still has some rough edges but please try it out and <a href="https://github.com/hadley/haven/issues">report any problems</a> that you find.)</p><p>Haven is a binding to the excellent <a href="http://github.com/WizardMac/ReadStat/issues">ReadStat</a> C library by <a href="http://www.evanmiller.org">Evan Miller</a>. Haven wouldn&rsquo;t be possible without his hard work - thanks Evan! I&rsquo;d also like to thank Matt Shotwell who spent a lot of time reverse engineering the SAS binary data format, and Dennis Fisher who tested the SAS code with thousands of SAS files.</p><h2 id="usage">Usage</h2><p>Using haven is easy:</p><ul><li><p>Install it, <code>install.packages(&quot;haven&quot;)</code>,</p></li><li><p>Load it, <code>library(haven)</code>,</p></li><li><p>Then pick the appropriate read function:</p><ul><li><p>SAS: <code>read_sas()</code></p></li><li><p>SPSS: <code>read_sav()</code> or <code>read_por()</code></p></li><li><p>Stata: <code>read_dta()</code>.</p></li></ul></li></ul><p>These functions only need the path to the file. 
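For example, a typical session using the readers listed above might look like this (the file names are hypothetical; substitute your own paths):

```r
library(haven)

# Each reader returns a data frame
sas_data   <- read_sas("measurements.sas7bdat")
spss_data  <- read_sav("survey.sav")
stata_data <- read_dta("panel.dta")
```
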
(<code>read_sas()</code> optionally also takes the path to a catalog file.)</p><h2 id="output">Output</h2><p>All functions return a data frame:</p><ul><li><p>The output also has class <code>tbl_df</code> which will improve the default print method (to only show the first ten rows and the variables that fit on one screen) if you have dplyr loaded. If you don&rsquo;t use dplyr, it has no effect.</p></li><li><p>Variable labels are attached as an attribute to each variable. These are not printed (because they tend to be long), but if you have a <a href="https://www.rstudio.com/products/rstudio/download/preview/">preview version of RStudio</a>, you&rsquo;ll see them in the <a href="https://blog.rstudio.com/2015/02/24/rstudio-v0-99-preview-data-viewer-improvements/">revamped viewer pane</a>.</p></li><li><p>Missing values in numeric variables should be seamlessly converted. Missing values in character variables are converted to the empty string, <code>&quot;&quot;</code>: if you want to convert them to missing values, use <code>zap_empty()</code>.</p></li><li><p>Dates are converted into <code>Date</code>s, and datetimes to <code>POSIXct</code>s. Time variables are read into a new class called <code>hms</code> which represents an offset in seconds from midnight. It has <code>print()</code> and <code>format()</code> methods to nicely display times, but otherwise behaves like an integer vector.</p></li><li><p>Variables with labelled values are turned into a new <code>labelled</code> class, as described next.</p></li></ul><h3 id="labelled-variables">Labelled variables</h3><p>SAS, Stata and SPSS all have the notion of a &ldquo;labelled&rdquo; variable. 
These are similar to factors, but:</p><ul><li><p>Integer, numeric and character vectors can be labelled.</p></li><li><p>Not every value must be associated with a label.</p></li></ul><p>Factors, by contrast, are always integers and every integer value must be associated with a label.</p><p>Haven provides a <code>labelled</code> class to model these objects. It doesn&rsquo;t implement any common methods, but instead focuses on ways to turn a labelled variable into a standard R variable:</p><ul><li><p><code>as_factor()</code>: turns labelled integers into factors. Any values that don&rsquo;t have a label associated with them will become a missing value. (NB: there&rsquo;s no way to make <code>as.factor()</code> work with labelled variables, so you&rsquo;ll need to use this new function.)</p></li><li><p><code>zap_labels()</code>: turns any labelled values into missing values. This deals with the common pattern where you have a continuous variable that has missing values indicated by sentinel values.</p></li></ul><p>If you have a use case that&rsquo;s not covered by these functions, please let me know.</p><h2 id="development">Development</h2><p>Haven is still under very active development. 
If you have problems loading a dataset, please try the <a href="https://github.com/hadley/haven">development version</a>, and if that doesn&rsquo;t work, <a href="https://github.com/hadley/haven/issues">file an issue</a>.</p></description></item><item><title>Announcing shinyapps.io General Availability</title><link>https://www.rstudio.com/blog/announcing-shinyapps-io-general-availability/</link><pubDate>Thu, 26 Feb 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-shinyapps-io-general-availability/</guid><description><p>RStudio is excited to announce the general availability (GA) of <a href="https://www.rstudio.com/products/shinyapps/">shinyapps.io</a>.</p><p>Shinyapps.io is an easy to use, secure, and scalable hosted service already being used by thousands of professionals and students to deploy Shiny applications on the web. Effective today, shinyapps.io has completed <em>beta</em> testing and is generally available as a commercial service for anyone.</p><p>As regular readers of our blog know, <a href="https://shiny.rstudio.com/">Shiny</a> is a popular free and open source R package from RStudio that simplifies the creation of interactive web applications, dashboards, and reports. Until today, <a href="https://www.rstudio.com/products/shiny/shiny-server/">Shiny Server and Shiny Server Pro</a> were the most popular ways to share shiny apps. Now, there is a commercially supported alternative for individuals and groups who don&rsquo;t have the time or resources to install and manage their own servers.</p><p>We want to thank the nearly 8,000 people who created at least one shiny app and deployed it on shinyapps.io during its extensive alpha and beta testing phases! 
The service was improved for everyone because of your willingness to give us feedback and bear with us as we continuously added to its capabilities.</p><p>For R users developing shiny applications that haven&rsquo;t yet created a shinyapps.io account, we hope you&rsquo;ll give it a try soon! We did our best to keep the pricing simple and predictable with Free, Basic, Standard, and Professional plans. Each paid plan has features and functionality that we think will appeal to different users and can be purchased with a credit card by month or year. You can learn more about <a href="https://www.rstudio.com/pricing/#ShinyApp">shinyapps.io pricing plans</a> and product features on our website.</p><p>We hope to see your shiny app on shinyapps.io soon!</p></description></item><item><title>RStudio v0.99 Preview: Data Viewer Improvements</title><link>https://www.rstudio.com/blog/rstudio-v0-99-preview-data-viewer-improvements/</link><pubDate>Tue, 24 Feb 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v0-99-preview-data-viewer-improvements/</guid><description><p>RStudio&rsquo;s data viewer provides a quick way to look at the contents of data frames and other column-based data in your R environment. You invoke it by clicking on the grid icon in the Environment pane, or at the console by typing <code>View(mydata)</code>.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/screen-shot-2015-02-23-at-1-28-17-pm.png" alt="grid icon"></p><p>As part of the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>, we&rsquo;ve completely overhauled RStudio&rsquo;s data viewer with modern features provided in part by a new interface built on <a href="http://www.datatables.net/">DataTables</a>.</p><h3 id="no-row-limit">No Row Limit</h3><p>While the data viewer in 0.98 was limited to the first 1,000 rows, you can now view all the rows of your data set. 
RStudio loads just the portion of the data you&rsquo;re looking at into the user interface, so things won&rsquo;t get sluggish even when you&rsquo;re working with large data sets.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/screen-shot-2015-02-23-at-1-03-13-pm.png" alt="no row limit"></p><p>We&rsquo;ve also added fixed column headers, and support for column labels imported from SPSS and other systems.</p><h3 id="sorting-and-filtering">Sorting and Filtering</h3><p>RStudio isn&rsquo;t designed to act like a spreadsheet, but sometimes it&rsquo;s helpful to do a quick sort or filter to get some idea of the data&rsquo;s characteristics before moving into reproducible data analysis. Towards that end, we&rsquo;ve built some basic sorting and filtering into the new data viewer.</p><h4 id="sorting">Sorting</h4><p>Click a column once to sort data in ascending order, and again to sort in descending order. For instance, how big is the biggest diamond?</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/screen-shot-2015-02-23-at-11-53-35-am.png" alt="sorting"></p><p>To clear all sorts and filters on the data, click the upper-left column header.</p><h4 id="filtering">Filtering</h4><p>Click the new <em>Filter</em> button to enter Filter mode, then click the white filter value box to filter a column. 
You might, for instance, want to look only at smaller diamonds:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/screen-shot-2015-02-23-at-12-02-04-pm.png" alt="filter"></p><p>Not all data types can be filtered; at the moment, you can filter only numeric types, characters, and factors.</p><p>You can also stack filters; for instance, let&rsquo;s further restrict this view to small diamonds with a Very Good cut:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/screen-shot-2015-02-23-at-12-03-29-pm.png" alt="filter factor"></p><h4 id="full-text-search">Full-Text Search</h4><p>You can search the full text of your data frame using the new Search box in the upper right. This is useful for finding specific records; for instance, how many people named John were born in 2013?</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/screen-shot-2015-02-23-at-12-13-04-pm.png" alt="full-text search"></p><h3 id="live-update">Live Update</h3><p>If you invoke the data viewer on a variable as in <code>View(mydata)</code>, the data viewer will (in most cases) automatically refresh whenever data in the variable changes.</p><p>You can use this feature to watch data change as you manipulate it. It continues to work even when the data viewer is popped out, a configuration that combines well with multi-monitor setups.</p><p>We hope these improvements help you understand your data more quickly and easily. 
Try out the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a> and let us know what you think!</p></description></item><item><title>RStudio v0.99 Preview: Code Completion</title><link>https://www.rstudio.com/blog/rstudio-v0-99-preview-code-completion/</link><pubDate>Mon, 23 Feb 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v0-99-preview-code-completion/</guid><description><p>We&rsquo;re busy at work on the next version of RStudio (v0.99) and this week we will be blogging about some of the noteworthy new features. If you want to try out any of the new features now you can do so by downloading the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>.</p><p>The first feature to highlight is a fully revamped implementation of code completion for R. We&rsquo;ve always supported a limited form of completion; however, (a) it only worked on objects in the global environment; and (b) it only worked when expressly requested via the tab key. As a result, not nearly enough users discovered or benefitted from code completion. 
In this release, code completion is much more comprehensive.</p><h3 id="smarter-completion-engine">Smarter Completion Engine</h3><p>Previously RStudio only completed variables that already existed in the global environment; now completion is based on source code analysis, so it is provided even for objects that haven&rsquo;t been fully evaluated:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/document-inferred.png" alt="document-inferred"></p><p>Completions are also provided for a wide variety of specialized contexts including dimension names in [ and [[:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/bracket.png" alt="bracket"></p><p>RStudio now provides completions for function arguments within function chains using <a href="http://cran.r-project.org/web/packages/magrittr/index.html">magrittr&rsquo;s</a> %&gt;% operator, e.g. for <a href="http://cran.r-project.org/web/packages/dplyr/index.html">dplyr</a> data transformation pipelines. Extending this behavior, we also provide the appropriate completions for the various &lsquo;verbs&rsquo; used by dplyr:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/dplyr.png" alt="dplyr"> <img src="https://rstudioblog.files.wordpress.com/2015/02/dplyr_verb.png" alt="dplyr_verb"></p><p>In addition, certain functions, such as library() and require(), expect package names for completions. RStudio automatically infers whether a particular function expects a package name and provides those names as completion results:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/library.png" alt="library"></p><p>Completion is now also S3 and S4 aware. If RStudio is able to determine which method a particular function call will be dispatched to, it will attempt to retrieve completions from that method. For example, the sort.default() method provides an extra argument, na.last, not available in the sort() generic. 
RStudio will provide completions for that argument if S3 dispatch would choose sort.default().</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/s3.png" alt="s3"></p><p>Beyond what&rsquo;s described above, there are lots more new places where completions are provided:</p><ul><li><p>For Shiny applications, completions for ui.R + server.R pairs</p></li><li><p>Completions for knitr options, e.g. in <code>opts_chunk$get()</code>, are now supplied</p></li><li><p>Completions for dynamic symbols within .C, .Call, .Fortran, .External</p></li></ul><h3 id="additional-enhancements">Additional Enhancements</h3><h4 id="always-on-completion">Always On Completion</h4><p>Previously RStudio only displayed completions &ldquo;on-demand&rdquo; in response to the tab key. Now, RStudio will proactively display completions after a <code>$</code> or <code>::</code> as well as after a period of typing inactivity. All of this behavior is configurable via the new completion options panel:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/options.png" alt="options"></p><h4 id="file-completions">File Completions</h4><p>When within an RStudio project, completions will be applied recursively to all file names matching the current token. The enclosing parent directory is printed on the right:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/file.png" alt="file"></p><h4 id="fuzzy-narrowing">Fuzzy Narrowing</h4><p>Got a completion with an excessively long name, perhaps a Bioconductor package with a particularly long name, or another lengthy variable or function name? RStudio now uses &lsquo;fuzzy narrowing&rsquo; on the completion list, checking to see whether the text you&rsquo;ve typed matches a &lsquo;subsequence&rsquo; within each completion. By subsequence, we mean a sequence of characters not necessarily connected within the completion, so that for example, &lsquo;fpse&rsquo; could match &lsquo;file_path_sans_extension&rsquo;. 
We hope that users will quickly become accustomed to this behavior and find it very useful.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/fuzzy.png" alt="fuzzy"></p><h3 id="trying-it-out">Trying it Out</h3><p>We think that the new completion features make for a qualitatively better experience of writing R code for beginning and expert users alike. You can give the new features a try now by downloading the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>. If you run into problems or have feedback on how we could make things better, let us know on our <a href="https://support.rstudio.com">Support Forum</a>.</p></description></item><item><title>RStudio v0.99 Preview: Vim Mode Improvements</title><link>https://www.rstudio.com/blog/rstudio-0-99-preview-vim-mode-improvements/</link><pubDate>Mon, 23 Feb 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-0-99-preview-vim-mode-improvements/</guid><description><p>RStudio&rsquo;s code editor includes a set of lightweight <a href="http://en.wikipedia.org/wiki/Vim_%28text_editor%29">Vim</a> key bindings. You can turn these on in Tools | Global Options | Code | Editing:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/screen-shot-2015-02-23-at-9-33-49-am.png" alt="global options"></p><p>For those not familiar, Vim is a popular text editor built to enable efficient text editing. It can take some practice and dedication to master Vim-style editing, but those who have done so typically swear by it. RStudio&rsquo;s &ldquo;vim mode&rdquo; enables the use of many of the most common keyboard operations from Vim right inside RStudio.</p><p>As part of the <a href="https://www.rstudio.com/products/rstudio/download/preview/">0.99 preview release</a>, we&rsquo;ve included an upgraded version of the <a href="http://ace.c9.io/">ACE editor</a>, which has a completely revamped Vim mode. 
This mode extends the range of Vim key bindings that are supported, and implements a number of Vim &ldquo;power features&rdquo; that go beyond basic text motions and editing. These include:</p><ul><li><p><strong>Vertical block selection</strong> via <code>Ctrl + V</code>. This integrates with the new multiple cursor support in ACE and allows you to type in multiple lines at once.</p></li><li><p><strong>Macro playback and recording</strong>, using <code>q{register}</code> / <code>@{register}</code>.</p></li><li><p><strong>Marks</strong>, which allow you to drop markers in your source and jump back to them quickly later.</p></li><li><p><strong>A selection of Ex commands</strong>, such as <code>:wq</code> and <code>:%s</code>, that allow you to perform editor operations as you would in native Vim.</p></li><li><p><strong>Fast in-file search</strong> with e.g. <code>/</code> and <code>*</code>, and support for JavaScript regular expressions.</p></li></ul><p>We&rsquo;ve also added a Vim quick reference card to the IDE that you can bring up at any time to show the supported key bindings. To see it, switch your editor to Vim mode (as described above) and type <code>:help</code> in Command mode.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/02/screen-shot-2015-02-23-at-11-03-00-am.png" alt="vim quick reference card"></p><p>Whether you&rsquo;re a Vim novice or power user, we hope these improvements make the RStudio IDE&rsquo;s editor a more productive and enjoyable environment for you. 
You can try the new Vim features out now by downloading the <a href="https://www.rstudio.com/products/rstudio/download/preview/">RStudio Preview Release</a>.</p></description></item><item><title>Epoch.com sponsors RMySQL development</title><link>https://www.rstudio.com/blog/epoch-rmysql/</link><pubDate>Wed, 11 Feb 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/epoch-rmysql/</guid><description><p>I&rsquo;m very pleased to announce that Epoch.com has stepped up as a sponsor for the <a href="https://github.com/rstats-db/RMySQL">RMySQL</a> package.</p><p>For the last 20 years, <a href="http://epoch.com">Epoch.com</a> has built its Internet Payment Service Provider infrastructure on open source software. Their data team, led by <a href="https://www.linkedin.com/in/szilard">Szilard Pafka</a>, PhD, has been using R for nearly a decade, developing cutting-edge data visualization, machine learning and other analytical applications. According to Epoch, &ldquo;We have always believed in the value of R and in the importance of contributing to the open source community.&rdquo;</p><p>This sort of sponsorship is very important to me. While I already spend most of my time working on R packages, I don&rsquo;t have the skills to fix every problem. Sponsorship allows me to hire outside experts. In this case, Epoch.com&rsquo;s sponsorship allowed me to work with Jeroen Ooms to improve the build system for RMySQL so that a CRAN binary is available for every platform.</p><p>Is your company interested in sponsoring other infrastructure work that benefits the whole R community? 
If so, please <a href="mailto:hadley@rstudio.com">get in touch</a>.</p></description></item><item><title>Register now for RStudio Shiny Workshops in D.C., New York, Boston, L.A., San Francisco and Seattle</title><link>https://www.rstudio.com/blog/register-now-for-rstudio-shiny-workshops-in-d-c-new-york-boston-l-a-san-francisco-and-seattle/</link><pubDate>Wed, 28 Jan 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/register-now-for-rstudio-shiny-workshops-in-d-c-new-york-boston-l-a-san-francisco-and-seattle/</guid><description><p>Great news for Shiny and R Markdown enthusiasts!</p><p>An Interactive Reporting Workshop with Shiny and R Markdown is coming to a city near you. Act fast as only 20 seats are available for each workshop.</p><p><strong>You can find out more / register by clicking on the link for your city!</strong></p><table><thead><tr><th align="left">East Coast</th><th align="left">West Coast</th></tr></thead><tbody><tr><td align="left"><a href="http://info.rstudio.net/O0v0000N0200CNSH0d0Y00X">March 2 - Washington, DC</a></td><td align="left"><a href="http://info.rstudio.net/hS00K0X00N0000YvC020Ng0">April 15 - Los Angeles, CA</a></td></tr><tr><td align="left"><a href="http://info.rstudio.net/s0Iv0S000XCe0N000020Y0N">March 4 - New York, NY</a></td><td align="left"><a href="http://info.rstudio.net/R00S000X0Y002000NLNvhC0">April 17 - San Francisco, CA</a></td></tr><tr><td align="left"><a href="http://info.rstudio.net/MC0N0000Nv00Y00fS00X0J2">March 6 - Boston, MA</a></td><td align="left"><a href="http://info.rstudio.net/wX0NC0YNS000000i0002M0v">April 20 - Seattle, WA</a></td></tr></tbody></table><p><strong>You&rsquo;ll want to take this workshop if&hellip;</strong></p><p>You have some experience working with R already. 
You should have written a number of functions, and be comfortable with R&rsquo;s basic data structures (vectors, matrices, arrays, lists, and data frames).</p><p><strong>You will learn from&hellip;</strong></p><p>The workshop is taught by Garrett Grolemund. Garrett is the Editor-in-Chief of <a href="https://shiny.rstudio.com/">shiny.rstudio.com</a>, the development center for the Shiny R package. He is also the author of Hands-On Programming with R as well as Data Science with R, a forthcoming book by O&rsquo;Reilly Media. Garrett works as a Data Scientist and Chief Instructor for RStudio, Inc. <a href="http://info.rstudio.net/TgCNx00000000XI20S00YN0">GitHub</a></p></description></item><item><title>RStudio - Infoworld 2015 Technology of the Year Award Recipient!</title><link>https://www.rstudio.com/blog/rstudio-infoworld-2015-technology-of-the-year-award-recipient/</link><pubDate>Wed, 28 Jan 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-infoworld-2015-technology-of-the-year-award-recipient/</guid><description><p>Sometimes the universe surprises us. 
In this case, it was in a good way and we genuinely appreciated it.</p><p>Earlier this week, we learned that the Infoworld Testing Center staff selected RStudio as one of 32 recipients of the <a href="http://www.idgenterprise.com/press/infoworld-announces-the-2015-technology-of-the-year-award-recipients">2015 Technology of the Year Award</a>.</p><p>We thought it was cool because it was completely unsolicited; we&rsquo;re in very good company (some of our favorite technologies, like Docker, GitHub, and node.js&hellip;even my Dell XPS 15 Touch!&hellip;were also award winners); and the description of our products was surprisingly elegant: simple and accurate.</p><p>We know Infoworld wouldn&rsquo;t have known about us if our customers hadn&rsquo;t brought us to their attention.</p><p>Thank you.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/01/toy15-rstudio-100563580-orig.jpg" alt="toy15-rstudio-100563580-orig"></p></description></item><item><title>Shiny 0.11, themes, and dashboard</title><link>https://www.rstudio.com/blog/shiny-0-11-themes-and-dashboard/</link><pubDate>Fri, 23 Jan 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-0-11-themes-and-dashboard/</guid><description><p>Shiny version 0.11 is available now! Notable changes include:</p><ul><li><p>Shiny has migrated from Bootstrap 2 to Bootstrap 3 for its web front end. More on this below.</p></li><li><p>The old <a href="https://github.com/egorkhmelev/jslider">jsliders</a> have been replaced with <a href="https://github.com/IonDen/ion.rangeSlider">ion.rangeSlider</a>. 
These sliders look better, are easier for users to interact with, and support updating more fields from the server side.</p></li><li><p>There is a new <code>passwordInput()</code> which can be used to create password fields.</p></li><li><p>New <code>observeEvent()</code> and <code>eventReactive()</code> functions greatly streamline the use of <code>actionButton</code> and other inputs that act more like events than reactive inputs.</p></li></ul><p>For a full set of changes, see the <a href="http://cran.rstudio.com/web/packages/shiny/NEWS">NEWS</a> file. To install, run:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">shiny&#34;</span>)</code></pre></div><p>We&rsquo;ve also posted an <a href="http://shiny.rstudio-staging.com/articles/upgrade-0.11.html">article</a> with notes on upgrading to 0.11.</p><h4 id="bootstrap-3-migration">Bootstrap 3 migration</h4><p>In all versions of Shiny prior to 0.11, Shiny has used the Bootstrap 2 framework for its web front-end. Shiny generates HTML that is structured to work with Bootstrap, and this makes it easy to create pages with sidebars, tabs, dropdown menus, mobile device support, and so on.</p><p>The Bootstrap development team stopped development on the Bootstrap 2 series after version 2.3.2, which was released over a year ago, and has since focused their efforts on Bootstrap 3. The new version of Bootstrap builds on many of the same underlying ideas, but it also has many small changes – for example, many of the CSS class names have changed.</p><p>In Shiny 0.11, we&rsquo;ve moved to Bootstrap 3. 
For most Shiny users, the transition will be seamless; the only differences you&rsquo;ll see are slight changes to fonts and spacing.</p><p>If, however, you customized any of your code to use features specific to Bootstrap 2, then you may need to update your code to work with Bootstrap 3 (see the <a href="http://getbootstrap.com/migration/">Bootstrap migration guide</a> for details). If you don&rsquo;t want to update your code right away, you can use the <a href="https://github.com/rstudio/shinybootstrap2">shinybootstrap2</a> package for backward compatibility with Bootstrap 2 – using it requires adding just two lines of code. If you do use shinybootstrap2, we suggest using it just as an interim solution until you update your code for Bootstrap 3, because Shiny development going forward will use Bootstrap 3.</p><p>Why is Shiny moving to Bootstrap 3? One reason is support: as mentioned earlier, Bootstrap 2 is no longer developed and is no longer supported. Another reason is that there is a dynamic community of actively-developed Bootstrap 3 themes. (Themes for Bootstrap 2 also exist, but there is less development activity.) Using these themes will allow you to customize the appearance of a Shiny app so that it doesn&rsquo;t just look like&hellip; a Shiny app.</p><p>We&rsquo;ve also created a package that makes it easy to use Bootstrap themes: <a href="http://rstudio.github.io/shinythemes/">shinythemes</a>. Here&rsquo;s an example using the included Flatly theme:</p><p><img src="https://rstudioblog.files.wordpress.com/2015/01/flatly.png" alt="flatly"></p><p>See the <a href="http://rstudio.github.io/shinythemes/">shinythemes site</a> for more screenshots and instructions on how to use it.</p><p>We&rsquo;re also working on <a href="http://rstudio.github.io/shinydashboard/">shinydashboard</a>, a package that makes it easy to create dashboards. 
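The basic skeleton of a shinydashboard app is only a few lines. This is a hedged sketch based on the package's public API at the time (dashboardPage, dashboardHeader, dashboardSidebar, dashboardBody, box); the titles and body content are invented for the example:

```r
library(shiny)
library(shinydashboard)

# The smallest useful dashboard: a header, an (empty) sidebar, and a body.
ui = dashboardPage(
  dashboardHeader(title = "Example"),
  dashboardSidebar(),
  dashboardBody(
    box(title = "A box", "Content goes here")
  )
)

server = function(input, output) {}

# shinyApp(ui, server)  # uncomment to run the app locally
```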
Here&rsquo;s an example dashboard that also uses the <a href="http://rstudio.github.io/leaflet/">leaflet</a> package.</p><p><img src="https://rstudioblog.files.wordpress.com/2015/01/buses.png" alt="buses"></p><p>The shinydashboard package is still under development, but feel free to try it out and give us feedback.</p></description></item><item><title>Balancing the Load | What's New in RStudio Server Pro?</title><link>https://www.rstudio.com/blog/balancing-the-load-whats-new-in-rstudio-server-pro/</link><pubDate>Tue, 13 Jan 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/balancing-the-load-whats-new-in-rstudio-server-pro/</guid><description><p>As R users know, we&rsquo;re continuously improving the RStudio IDE. This includes RStudio Server Pro, where organizations who want to deploy the IDE at scale will find a growing set of features recently enhanced for them.</p><p>If you&rsquo;re not already familiar with RStudio Server Pro, here&rsquo;s an updated <a href="https://www.rstudio.com/products/rstudio-server-pro/">summary page</a> and a <a href="https://www.rstudio.com/products/rstudio/#RStudioServerVersionComparison">comparison</a> to RStudio Server worth checking out. Or you can skip all of that and <a href="https://www.rstudio.com/products/rstudio-server-pro/evaluation/">download</a> a free 45-day evaluation right now!</p><p><strong>WHAT&rsquo;S NEW IN RSTUDIO SERVER PRO (v0.98.1091)</strong></p><p>Naturally, the latest RStudio Server Pro has all of the new features found in the open source server version of the RStudio IDE. They include improvements to R Markdown document and Shiny app creation, making R package development easier, better debugging and source editing, and support for Internet Explorer 10 and 11 and RHEL 7.</p><p>Recently, we added even more powerful features exclusively for RStudio Server Pro:</p><ul><li><p><strong>Load balancing</strong> based on factors you control. 
Load balancing ensures R users are automatically assigned to the best available server in a cluster.</p></li><li><p><strong>Flexible resource allocation</strong> by user or group. Now you can allocate cores, set scheduler priority, control the version(s) of R and enforce memory and CPU limits.</p></li><li><p><strong>New security enhancements</strong>. Leverage PAM to issue Kerberos tickets, move Google Accounts support to OAuth 2.0, and allow administrators to disable access to various features.</p></li></ul><p>For a full list of what&rsquo;s changed in more depth, make sure to read the RStudio Server Pro <a href="https://s3.amazonaws.com/rstudio-server/rstudio-server-pro-0.98.1091-admin-guide.pdf">admin guide</a>.</p><p><strong>THE RSTUDIO SERVER PRO BASICS</strong></p><p>In addition to the newest features above there are many more that make RStudio Server Pro an upgrade to the open source IDE. Here&rsquo;s a quick list:</p><ul><li><p>An administrative dashboard that provides insight into active sessions, server health, and monitoring of system-wide and per-user performance and resources</p></li><li><p>Authentication using system accounts, ActiveDirectory, LDAP, or Google Accounts</p></li><li><p>Full support for the Pluggable Authentication Module (PAM)</p></li><li><p>HTTP enhancements add support for SSL and keep-alive for improved performance</p></li><li><p>Ability to restrict access to the server by IP</p></li><li><p>Customizable server health checks</p></li><li><p>Suspend, terminate, or assume control of user sessions for assistance and troubleshooting</p></li></ul><p>That&rsquo;s a lot to discover! 
Please <a href="https://www.rstudio.com/products/rstudio-server-pro/evaluation/">download the newest version of RStudio Server Pro</a> and as always let us know how it&rsquo;s working and what else you&rsquo;d like to see.</p></description></item><item><title>dplyr 0.4.0</title><link>https://www.rstudio.com/blog/dplyr-0-4-0/</link><pubDate>Fri, 09 Jan 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dplyr-0-4-0/</guid><description><p>I&rsquo;m very pleased to announce that dplyr 0.4.0 is now available from CRAN. Get the latest version by running:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">dplyr&#34;</span>)</code></pre></div><p>dplyr 0.4.0 includes over 80 minor improvements and bug fixes, which are described in detail in the <a href="https://github.com/hadley/dplyr/releases/tag/v0.4.0">release notes</a>. Here I wanted to draw your attention to two areas that have particularly improved since dplyr 0.3, two-table verbs and data frame support.</p><h2 id="two-table-verbs">Two table verbs</h2><p>dplyr now has full support for all two-table verbs provided by SQL:</p><ul><li><p>Mutating joins, which add new variables to one table from matching rows in another: <code>inner_join()</code>, <code>left_join()</code>, <code>right_join()</code>, <code>full_join()</code>. 
(Support for non-equi joins is planned for dplyr 0.5.0.)</p></li><li><p>Filtering joins, which filter observations from one table based on whether or not they match an observation in the other table: <code>semi_join()</code>, <code>anti_join()</code>.</p></li><li><p>Set operations, which combine the observations in two data sets as if they were set elements: <code>intersect()</code>, <code>union()</code>, <code>setdiff()</code>.</p></li></ul><p>Together, these verbs should allow you to solve 95% of data manipulation problems that involve multiple tables. If any of the concepts are unfamiliar to you, I highly recommend reading the <a href="http://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html">two-table vignette</a> (and if you still don&rsquo;t understand, please let me know so I can make it better.)</p><h2 id="data-frames">Data frames</h2><p>dplyr wraps data frames in a <code>tbl_df</code> class. These objects are structured in exactly the same way as regular data frames, but their behaviour has been tweaked a little to make them easier to work with. The new <a href="http://cran.r-project.org/web/packages/dplyr/vignettes/data_frames.html">data_frames vignette</a> describes how dplyr works with data frames in general, and below I highlight some of the features new in 0.4.0.</p><h3 id="printing">Printing</h3><p>The biggest difference is printing: <code>print.tbl_df()</code> doesn&rsquo;t try and print 10,000 rows! 
Printing got a lot of love in dplyr 0.4 and now:</p><ul><li><p>All <code>print()</code> methods invisibly return their input so you can interleave <code>print()</code> statements into a pipeline to see interim results.</p></li><li><p>If you&rsquo;ve managed to produce a 0-row data frame, dplyr won&rsquo;t try to print the data, but will tell you the column names and types:</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#06287e">numeric</span>(), y <span style="color:#666">=</span> <span style="color:#06287e">character</span>())<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [0 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; Variables not shown: x (dbl), y (chr)</span></code></pre></div><ul><li>dplyr never prints row names since no dplyr method is guaranteed to preserve them:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">&lt;-</span> <span style="color:#06287e">data.frame</span>(x <span style="color:#666">=</span> <span style="color:#06287e">c</span>(a <span style="color:#666">=</span> <span style="color:#40a070">1</span>, b <span style="color:#666">=</span> <span style="color:#40a070">2</span>, c <span style="color:#666">=</span> <span style="color:#40a070">3</span>))df<span style="color:#60a0b0;font-style:italic">#&gt; x</span><span style="color:#60a0b0;font-style:italic">#&gt; a 1</span><span style="color:#60a0b0;font-style:italic">#&gt; b 2</span><span style="color:#60a0b0;font-style:italic">#&gt; c 3</span>df <span style="color:#666">%&gt;%</span> <span style="color:#06287e">tbl_df</span>()<span 
style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [3 x 1]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3</span></code></pre></div><p>I don&rsquo;t think using row names is a good idea because it violates one of the principles of <a href="http://vita.had.co.nz/papers/tidy-data.html">tidy data</a>: every variable should be stored in the same way.</p><p>To make life a bit easier if you do have row names, you can use the new <code>add_rownames()</code> to turn your row names into a proper variable:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">df <span style="color:#666">%&gt;%</span><span style="color:#06287e">add_rownames</span>()<span style="color:#60a0b0;font-style:italic">#&gt; rowname x</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 a 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 b 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 c 3</span></code></pre></div><p>(But you&rsquo;re better off never creating them in the first place.)</p><ul><li><code>options(dplyr.print_max)</code> is now 20, so dplyr will never print more than 20 rows of data (previously it was 100). The best way to see more rows of data is to use <code>View()</code>.</li></ul><h3 id="coercing-lists-to-data-frames">Coercing lists to data frames</h3><p>When you have a list of vectors of equal length that you want to turn into a data frame, dplyr provides <code>as_data_frame()</code> as a simple alternative to <code>as.data.frame()</code>. 
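For instance, a minimal usage sketch, assuming dplyr 0.4.0 is loaded (the list and variable names are invented for the example):

```r
library(dplyr)

# A named list of equal-length vectors becomes a tbl_df directly,
# without the factor coercion and name munging of as.data.frame().
l = list(x = 1:3, y = c("a", "b", "c"))
df = as_data_frame(l)
dim(df)  # 3 rows, 2 columns
```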
<code>as_data_frame()</code> is considerably faster than <code>as.data.frame()</code> because it does much less:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">l <span style="color:#666">&lt;-</span> <span style="color:#06287e">replicate</span>(<span style="color:#40a070">26</span>, <span style="color:#06287e">sample</span>(<span style="color:#40a070">100</span>), simplify <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>)<span style="color:#06287e">names</span>(l) <span style="color:#666">&lt;-</span> <span style="color:#007020;font-weight:bold">letters</span>microbenchmark<span style="color:#666">::</span><span style="color:#06287e">microbenchmark</span>(<span style="color:#06287e">as_data_frame</span>(l),<span style="color:#06287e">as.data.frame</span>(l))<span style="color:#60a0b0;font-style:italic">#&gt; Unit: microseconds</span><span style="color:#60a0b0;font-style:italic">#&gt; expr min lq median uq max neval</span><span style="color:#60a0b0;font-style:italic">#&gt; as_data_frame(l) 101.856 112.0615 124.855 143.0965 254.193 100</span><span style="color:#60a0b0;font-style:italic">#&gt; as.data.frame(l) 1402.075 1466.6365 1511.644 1635.1205 3007.299 100</span></code></pre></div><p>It&rsquo;s difficult to precisely describe what <code>as.data.frame(x)</code> does, but it&rsquo;s similar to <code>do.call(cbind, lapply(x, data.frame))</code> - it coerces each component to a data frame and then <code>cbind()</code>s them all together.</p><p>The speed of <code>as.data.frame()</code> is not usually a bottleneck in interactive use, but can be a problem when combining thousands of lists into one tidy data frame (this is common when working with data stored in json or xml).</p><h3 id="binding-rows-and-columns">Binding rows and columns</h3><p>dplyr now provides <code>bind_rows()</code> and <code>bind_cols()</code> for binding data 
frames together. Compared to <code>rbind()</code> and <code>cbind()</code>, the functions:</p><ul><li>Accept either individual data frames, or a list of data frames:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">a <span style="color:#666">&lt;-</span> <span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">5</span>)b <span style="color:#666">&lt;-</span> <span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">6</span><span style="color:#666">:</span><span style="color:#40a070">10</span>)<span style="color:#06287e">bind_rows</span>(a, b)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [10 x 1]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 4</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 5</span><span style="color:#60a0b0;font-style:italic">#&gt; .. 
.</span><span style="color:#06287e">bind_rows</span>(<span style="color:#06287e">list</span>(a, b))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [10 x 1]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 4</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 5</span><span style="color:#60a0b0;font-style:italic">#&gt; .. .</span></code></pre></div><p>If <code>x</code> is a list of data frames, <code>bind_rows(x)</code> is equivalent to <code>do.call(rbind, x)</code>.</p><ul><li>Are much faster:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">dfs <span style="color:#666">&lt;-</span> <span style="color:#06287e">replicate</span>(<span style="color:#40a070">100</span>, <span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#06287e">runif</span>(<span style="color:#40a070">100</span>)), simplify <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>)microbenchmark<span style="color:#666">::</span><span style="color:#06287e">microbenchmark</span>(<span style="color:#06287e">do.call</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rbind&#34;</span>, dfs),<span style="color:#06287e">bind_rows</span>(dfs))<span style="color:#60a0b0;font-style:italic">#&gt; Unit: microseconds</span><span style="color:#60a0b0;font-style:italic">#&gt; expr min lq median uq max</span><span style="color:#60a0b0;font-style:italic">#&gt; do.call(&#34;rbind&#34;, dfs) 5344.660 6605.3805 6964.236 7693.8465 43457.061</span><span 
style="color:#60a0b0;font-style:italic">#&gt; bind_rows(dfs) 240.342 262.0845 317.582 346.6465 2345.832</span><span style="color:#60a0b0;font-style:italic">#&gt; neval</span><span style="color:#60a0b0;font-style:italic">#&gt; 100</span><span style="color:#60a0b0;font-style:italic">#&gt; 100</span></code></pre></div><p>(Generally you should avoid <code>bind_cols()</code> in favour of a join; otherwise check carefully that the rows are in a compatible order).</p><h3 id="list-variables">List-variables</h3><p>Data frames are usually made up of a list of atomic vectors that all have the same length. However, it&rsquo;s also possible to have a variable that&rsquo;s a list, which I call a list-variable. Because of <code>data.frame()</code>s complex coercion rules, the easiest way to create a data frame containing a list-column is with <code>data_frame()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span>, y <span style="color:#666">=</span> <span style="color:#06287e">list</span>(<span style="color:#40a070">1</span>), z <span style="color:#666">=</span> <span style="color:#06287e">list</span>(<span style="color:#06287e">list</span>(<span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">5</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">b&#34;</span>)))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [1 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y z</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 &lt;dbl[1]&gt; &lt;list[3]&gt;</span></code></pre></div><p>Note how list-variables are 
printed: a list-variable could contain a lot of data, so dplyr only shows a brief summary of the contents. List-variables are useful for:</p><ul><li>Working with summary functions that return more than one value:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">qs <span style="color:#666">&lt;-</span> mtcars <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(cyl) <span style="color:#666">%&gt;%</span><span style="color:#06287e">summarise</span>(y <span style="color:#666">=</span> <span style="color:#06287e">list</span>(<span style="color:#06287e">quantile</span>(mpg)))<span style="color:#60a0b0;font-style:italic"># Unnest input to collapse into rows</span>qs <span style="color:#666">%&gt;%</span> tidyr<span style="color:#666">::</span><span style="color:#06287e">unnest</span>(y)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [15 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; cyl y</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 4 21.4</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 4 22.8</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 4 26.0</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 4 30.4</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 4 33.9</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... 
...</span><span style="color:#60a0b0;font-style:italic"># To extract individual elements into columns, wrap the result in rowwise()</span><span style="color:#60a0b0;font-style:italic"># then use summarise()</span>qs <span style="color:#666">%&gt;%</span><span style="color:#06287e">rowwise</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">summarise</span>(q25 <span style="color:#666">=</span> y[2], q75 <span style="color:#666">=</span> y[4])<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [3 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; q25 q75</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 22.80 30.40</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 18.65 21.00</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 14.40 16.25</span></code></pre></div><ul><li>Keeping associated data frames and models together:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">by_cyl <span style="color:#666">&lt;-</span> <span style="color:#06287e">split</span>(mtcars, mtcars<span style="color:#666">$</span>cyl)models <span style="color:#666">&lt;-</span> <span style="color:#06287e">lapply</span>(by_cyl, lm, formula <span style="color:#666">=</span> mpg <span style="color:#666">~</span> wt)<span style="color:#06287e">data_frame</span>(cyl <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">4</span>, <span style="color:#40a070">6</span>, <span style="color:#40a070">8</span>), data <span style="color:#666">=</span> by_cyl, model <span style="color:#666">=</span> models)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [3 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; cyl data 
model</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 4 &lt;S3:data.frame&gt; &lt;S3:lm&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 6 &lt;S3:data.frame&gt; &lt;S3:lm&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 8 &lt;S3:data.frame&gt; &lt;S3:lm&gt;</span></code></pre></div><p>dplyr&rsquo;s support for list-variables continues to mature. In 0.4.0, you can join and row-bind list-variables, and you can create them in summarise and mutate.</p><p>My vision of list-variables is still partial and incomplete, but I&rsquo;m convinced that they will make pipeable APIs for modelling much easier. See the draft <a href="https://github.com/hadley/lowliner">lowliner</a> package for more explorations in this direction.</p><h2 id="bonus">Bonus</h2><p>My colleague, Garrett, helped me make a cheat sheet that summarizes the data wrangling features of dplyr 0.4.0. You can download it from RStudio&rsquo;s new <a href="https://www.rstudio.com/resources/cheatsheets/">gallery of R cheat sheets</a>.</p><p><a href="https://www.rstudio.com/resources/cheatsheets/"><img src="https://rstudioblog.files.wordpress.com/2015/01/dplyr-0-4-cheatsheet.png" alt="Data wrangling cheatsheet"></a></p></description></item><item><title>ggplot2 updates</title><link>https://www.rstudio.com/blog/ggplot2-updates/</link><pubDate>Fri, 09 Jan 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/ggplot2-updates/</guid><description><h2 id="ggplot2-100">ggplot2 1.0.0</h2><p>As you might have noticed, ggplot2 recently <a href="http://cran.r-project.org/web/packages/ggplot2/index.html">turned 1.0.0</a>. This release incorporated a handful of <a href="https://github.com/hadley/ggplot2/releases/tag/v1.0.0">new features and bug fixes</a>, but most importantly reflects that ggplot2 is now a mature plotting system and it will not change significantly in the future.</p><p>This does not mean ggplot2 is dead! 
The ggplot2 community is <a href="https://groups.google.com/forum/#!forum/ggplot2">rich</a> and <a href="http://stackoverflow.com/tags/ggplot2">vibrant</a> and the number of packages that build on top of ggplot2 continues to grow. We are committed to maintaining ggplot2 so that you can continue to rely on it for years to come.</p><h2 id="the-ggplot2-book">The ggplot2 book</h2><p>Since ggplot2 is now stable, and the <a href="http://ggplot2.org/book/">ggplot2 book</a> is over five years old and rather out of date, I&rsquo;m also happy to announce that I&rsquo;m working on a second edition. I&rsquo;ll be ably assisted in this endeavour by <a href="http://cpsievert.github.io">Carson Sievert</a>, who&rsquo;s so far done a great job of converting the source to Rmd and updating many of the examples to work with ggplot2 1.0.0. In the coming months we&rsquo;ll be rewriting the data chapter to reflect modern best practices (e.g. <a href="https://github.com/hadley/tidyr">tidyr</a> and <a href="https://github.com/hadley/dplyr">dplyr</a>), and adding sections about new features.</p><p>We&rsquo;d love your help! The source code for the book is available on <a href="https://github.com/hadley/ggplot2-book">github</a>. If you&rsquo;ve spotted any mistakes in the first edition that you&rsquo;d like to correct, we&rsquo;d really appreciate a <a href="https://github.com/hadley/ggplot2-book/pulls">pull request</a>. If there&rsquo;s a particular section of the book that you think needs an update (or is just plain missing), please let us know by filing an <a href="https://github.com/hadley/ggplot2-book/issues">issue</a>. 
Unfortunately we can&rsquo;t turn the book into a free website because of my agreement with the publisher, but at least you can now easily get to the source.</p></description></item><item><title>RMySQL 0.10.0</title><link>https://www.rstudio.com/blog/rmysql-0-1-0/</link><pubDate>Fri, 09 Jan 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rmysql-0-1-0/</guid><description><p><a href="http://jeroenooms.github.io">Jeroen Ooms</a> and I are very pleased to announce a new version of RMySQL, the R package that allows you to talk to MySQL (and MariaDB) databases. We have taken over maintenance from <a href="http://biostat.mc.vanderbilt.edu/wiki/Main/JeffreyHorner">Jeffrey Horner</a>, who has done a great job of maintaining the package over the last few years, but no longer has time to look after it. Thanks for all your hard work, Jeff!</p><h2 id="using-rmysql">Using RMySQL</h2><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(DBI)<span style="color:#60a0b0;font-style:italic"># Connect to a public database that I&#39;m running on Google&#39;s</span><span style="color:#60a0b0;font-style:italic"># cloud SQL service. 
It contains a copy of the data in the</span><span style="color:#60a0b0;font-style:italic"># datasets package.</span>con <span style="color:#666">&lt;-</span> <span style="color:#06287e">dbConnect</span>(RMySQL<span style="color:#666">::</span><span style="color:#06287e">MySQL</span>(),username <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">public&#34;</span>,password <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">F60RUsyiG579PeKdCH&#34;</span>,host <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">173.194.227.144&#34;</span>,port <span style="color:#666">=</span> <span style="color:#40a070">3306</span>,dbname <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">datasets&#34;</span>)<span style="color:#60a0b0;font-style:italic"># Run a query</span><span style="color:#06287e">dbGetQuery</span>(con, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">SELECT * FROM mtcars WHERE cyl = 4 AND mpg &lt; 23&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; row_names mpg cyl disp hp drat wt qsec vs am gear carb</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2</span><span style="color:#60a0b0;font-style:italic"># It&#39;s polite to let the database know when you&#39;re done</span><span style="color:#06287e">dbDisconnect</span>(con)<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span></code></pre></div><p>It&rsquo;s generally a bad 
idea to put passwords in your code, so instead of typing them directly, you can create a file called <code>~/.my.cnf</code> that contains</p><pre><code>[cloudSQL]
username=public
password=F60RUsyiG579PeKdCH
host=173.194.227.144
port=3306
database=datasets</code></pre><p>Then you can connect with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">con <span style="color:#666">&lt;-</span> <span style="color:#06287e">dbConnect</span>(RMySQL<span style="color:#666">::</span><span style="color:#06287e">MySQL</span>(), group <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">cloudSQL&#34;</span>)</code></pre></div><h2 id="changes-in-this-release">Changes in this release</h2><p>RMySQL 0.10.0 is mostly a cleanup release. RMySQL is one of the oldest packages on CRAN, and according to the timestamps, it is older than many recommended packages, and only slightly younger than MASS! That explains why a facelift was well overdue.</p><p>The most important change is an improvement to the build process so that CRAN binaries are now available for Windows and OS X Mavericks. This should make your life much easier if you&rsquo;re on one of these platforms. We&rsquo;d love your feedback on the new build scripts. There have been many problems in the past, so we&rsquo;d like to know that this client works well across platforms and versions of MySQL server.</p><p>Otherwise, the changes update RMySQL for DBI 0.3 compatibility:</p><ul><li><p>Internal <code>mysql*()</code> functions are no longer exported. Please use the corresponding DBI generics instead.</p></li><li><p>RMySQL gains transaction support with <code>dbBegin()</code>, <code>dbCommit()</code>, and <code>dbRollback()</code>. (But note that MySQL does not allow data definition language statements to be rolled back.)</p></li><li><p>Added method for <code>dbFetch()</code>. 
Please use this instead of <code>fetch()</code>. <code>dbFetch()</code> now returns a 0-row data frame (instead of a 0-column data frame) if there are no results.</p></li><li><p>Added methods for <code>dbIsValid()</code>. Please use these instead of <code>isIdCurrent()</code>.</p></li><li><p><code>dbWriteTable()</code> has been rewritten. It uses a better quoting strategy, throws errors on failure, and automatically adds row names only if they&rsquo;re strings. (NB: <code>dbWriteTable()</code> also has a method that allows you to load files directly from disk; this is likely to be faster if your file is in one of the supported formats.)</p></li></ul><p>For a complete list of changes, please see the full <a href="https://github.com/rstats-db/RMySQL/releases/tag/v0.10">release notes</a>.</p></description></item><item><title>Announcing shinyapps.io beta</title><link>https://www.rstudio.com/blog/announcing-shinyapps-io-beta/</link><pubDate>Tue, 06 Jan 2015 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-shinyapps-io-beta/</guid><description><p>RStudio is happy to announce the availability of the <a href="https://www.rstudio.com/products/shinyapps/">shinyapps.io <em>beta</em></a>.</p><p>Shinyapps.io is an easy-to-use, secure, and scalable hosted service already being used by thousands of professionals and students to deploy Shiny applications on the web. Today we are releasing a significant upgrade as we transition from <em>alpha</em> to <em>beta</em>, the final step before general availability (GA) later this quarter.</p><p>New Feature Highlights in shinyapps.io <em>beta</em></p><ul><li><p>Secure and manage authorized users with support for new authentication systems, including Google, GitHub, or a shinyapps.io account.</p></li><li><p>Tune application performance by controlling the resources available. 
Run multiple R processes per application instance and add application instances.</p></li><li><p>Track performance metrics and simplify application management in a new shinyapps.io dashboard. See an application&rsquo;s active connections, CPU, memory, and network usage. Review application logs, start, stop, restart, rebuild and archive applications all from one convenient place.</p></li></ul><p>During the beta period, these and all other features in shinyapps.io are available at no charge. At the end of the beta, users may subscribe to a plan of their choice or transition their applications to the free plan.</p><p>If you do not already have an account, we encourage anyone developing Shiny applications to consider shinyapps.io <em>beta</em> and appreciate any and all feedback on our features or <a href="https://www.rstudio.com/products/shinyapps/#ShinyApp">proposed packaging and pricing</a>.</p><p>Happy New Year!</p></description></item><item><title>Hadley Wickham Master R Developer Workshop - Space Limited</title><link>https://www.rstudio.com/blog/hadley-wickham-master-r-developer-workshop-space-limited/</link><pubDate>Tue, 23 Dec 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/hadley-wickham-master-r-developer-workshop-space-limited/</guid><description><p>Give yourself the gift of &ldquo;mastering&rdquo; R to start 2015!</p><p>Join RStudio Chief Data Scientist Hadley Wickham at the Westin San Francisco on January 19 and 20 for this rare opportunity to learn from one of the R community&rsquo;s most popular and innovative authors and package developers.</p><p>As of this post, the workshop is two-thirds sold out. 
If you&rsquo;re in or near California and want to boost your R programming skills, this is Hadley&rsquo;s only West Coast public workshop planned for 2015.</p><p>Register here: <a href="http://rstudio-sfbay.eventbrite.com/">http://rstudio-sfbay.eventbrite.com/</a></p></description></item><item><title>htmlwidgets: JavaScript data visualization for R</title><link>https://www.rstudio.com/blog/htmlwidgets-javascript-data-visualization-for-r/</link><pubDate>Thu, 18 Dec 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/htmlwidgets-javascript-data-visualization-for-r/</guid><description><p>Today we&rsquo;re excited to announce <a href="http://www.htmlwidgets.org">htmlwidgets</a>, a new framework that brings the best of JavaScript data visualization libraries to R. There are already several packages that take advantage of the framework (<a href="http://www.htmlwidgets.org/showcase_leaflet.html">leaflet</a>, <a href="http://www.htmlwidgets.org/showcase_dygraphs.html">dygraphs</a>, <a href="http://www.htmlwidgets.org/showcase_networkD3.html">networkD3</a>, <a href="http://www.htmlwidgets.org/showcase_datatables.html">DataTables</a>, and <a href="http://www.htmlwidgets.org/showcase_threejs.html">rthreejs</a>) with hopefully many more to come.</p><p>An <strong>htmlwidget</strong> works just like an R plot except it produces an interactive web visualization. A line or two of R code is all it takes to produce a D3 graphic or Leaflet map. Widgets can be used at the R console as well as embedded in <a href="http://rmarkdown.rstudio.com">R Markdown</a> reports and <a href="https://shiny.rstudio.com">Shiny</a> web applications. Here&rsquo;s an example of using leaflet directly from the R console:</p><p><img src="https://rstudioblog.files.wordpress.com/2014/12/rconsole-2x.png" alt="rconsole.2x"></p><p>When printed at the console the leaflet widget displays in the RStudio Viewer pane. 
All of the tools typically available for plots are also available for widgets, including history, zooming, and export to file/clipboard (note that when not running within RStudio, widgets will display in an external web browser).</p><p>Here&rsquo;s the same widget in an R Markdown report. Widgets automatically print as HTML within R Markdown documents and even respect the default knitr figure width and height.</p><p><img src="https://rstudioblog.files.wordpress.com/2014/12/rmarkdown-2x.png" alt="rmarkdown.2x"></p><p>Widgets also provide Shiny output bindings so they can be easily used within web applications. Here&rsquo;s the same widget in a Shiny application:</p><p><img src="https://rstudioblog.files.wordpress.com/2014/12/shiny-2x.jpg" alt="shiny.2x"></p><h3 id="bringing-javascript-to-r"><strong>Bringing JavaScript to R</strong></h3><p>The <strong>htmlwidgets</strong> framework is a collaboration between Ramnath Vaidyanathan (rCharts), Kenton Russell (Timely Portfolio), and RStudio. We&rsquo;ve all spent countless hours creating bindings between R and the web and were motivated to create a framework that made this as easy as possible for all R developers.</p><p>There are a plethora of libraries available that create attractive and fully interactive data visualizations for the web. However, the programming interface to these libraries is JavaScript, which places them outside the reach of nearly all statisticians and analysts. 
<strong>htmlwidgets</strong> makes it extremely straightforward to create an R interface for any JavaScript library.</p><p>Here are a few widget libraries that have been built so far:</p><ul><li><p><a href="http://www.htmlwidgets.org/showcase_leaflet.html">leaflet</a>, a library for creating dynamic maps that support panning and zooming, with various annotations like markers, polygons, and popups.</p></li><li><p><a href="http://www.htmlwidgets.org/showcase_dygraphs.html">dygraphs</a>, which provides rich facilities for charting time-series data and includes support for many interactive features including series/point highlighting, zooming, and panning.</p></li><li><p><a href="http://www.htmlwidgets.org/showcase_networkD3.html">networkD3</a>, a library for creating D3 network graphs including force directed networks, Sankey diagrams, and Reingold-Tilford tree networks.</p></li><li><p><a href="http://www.htmlwidgets.org/showcase_datatables.html">DataTables</a>, which displays R matrices or data frames as interactive HTML tables that support filtering, pagination, and sorting.</p></li><li><p><a href="http://www.htmlwidgets.org/showcase_threejs.html">rthreejs</a>, which features 3D scatterplots and globes based on WebGL.</p></li></ul><p>All of these libraries combine visualization with direct interactivity, enabling users to explore data dynamically. For example, time-series visualizations created with dygraphs allow dynamic panning and zooming:</p><p><a href="http://rstudio.github.io/dygraphs/gallery-range-selector.html"><img src="https://rstudioblog.files.wordpress.com/2014/12/newhaventemps.png" alt="NewHavenTemps"></a></p><h3 id="learning-more"><strong>Learning More</strong></h3><p>To learn more about the framework and see a showcase of the available widgets in action check out the <a href="http://www.htmlwidgets.org">htmlwidgets web site</a>. 
To learn more about building your own widgets, install the <strong>htmlwidgets</strong> package from CRAN and check out the <a href="http://www.htmlwidgets.org/develop_intro.html">developer documentation</a>.</p></description></item><item><title>httr 0.6.0</title><link>https://www.rstudio.com/blog/httr-0-6-0/</link><pubDate>Sun, 14 Dec 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/httr-0-6-0/</guid><description><p>httr 0.6.0 is now available on CRAN. The httr package makes it easy to talk to web APIs from R. Learn more in the <a href="http://cran.r-project.org/web/packages/httr/vignettes/quickstart.html">quick start</a> vignette.</p><p>This release is mostly bug fixes and minor improvements. The most important are:</p><ul><li><p><code>handle_reset()</code>, which allows you to reset the default handle if you get the error &ldquo;easy handle already used in multi handle&rdquo;.</p></li><li><p><code>write_stream()</code>, which lets you process the response from a server as a stream of raw vectors (#143).</p></li><li><p><code>VERB()</code>, which allows you to send a request with a custom HTTP verb.</p></li><li><p><code>httr_dr()</code> checks for common problems. It currently checks if your <code>libcurl</code> uses NSS. This is unlikely to work, so it gives you some advice on how to fix the problem (thanks to Dirk Eddelbuettel for debugging this problem and suggesting a remedy).</p></li><li><p>Added support for Google OAuth2 <a href="https://developers.google.com/accounts/docs/OAuth2ServiceAccount">service accounts</a> (#119, thanks to help from @siddharthab). See <code>?oauth_service_token</code> for details.</p></li></ul><p>I&rsquo;ve also switched from RC to R6 (which should make it easier to extend OAuth classes for non-standard OAuth implementations), and tweaked the use of the backend SSL certificate details bundled with httr. 
See the <a href="https://github.com/hadley/httr/releases/tag/v0.6">release notes</a> for complete details.</p></description></item><item><title>tidyr 0.2.0 (and reshape2 1.4.1)</title><link>https://www.rstudio.com/blog/tidyr-0-2-0/</link><pubDate>Mon, 08 Dec 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/tidyr-0-2-0/</guid><description><p>tidyr 0.2.0 is now available on CRAN. tidyr makes it easy to &ldquo;tidy&rdquo; your data, storing it in a consistent form so that it&rsquo;s easy to manipulate, visualise and model. Tidy data has variables in columns and observations in rows, and is described in more detail in the <a href="http://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html">tidy data</a> vignette. Install tidyr with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">tidyr&#34;</span>)</code></pre></div><p>There are three important additions to tidyr 0.2.0:</p><ul><li><code>expand()</code> is a wrapper around <code>expand.grid()</code> that allows you to generate all possible combinations of two or more variables. 
In conjunction with <code>dplyr::left_join()</code>, this makes it easy to fill in missing rows of data.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">sales <span style="color:#666">&lt;-</span> dplyr<span style="color:#666">::</span><span style="color:#06287e">data_frame</span>(year <span style="color:#666">=</span> <span style="color:#06287e">rep</span>(<span style="color:#06287e">c</span>(<span style="color:#40a070">2012</span>, <span style="color:#40a070">2013</span>), <span style="color:#06287e">c</span>(<span style="color:#40a070">4</span>, <span style="color:#40a070">2</span>)),quarter <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">1</span>, <span style="color:#40a070">2</span>, <span style="color:#40a070">3</span>, <span style="color:#40a070">4</span>, <span style="color:#40a070">2</span>, <span style="color:#40a070">3</span>),sales <span style="color:#666">=</span> <span style="color:#06287e">sample</span>(<span style="color:#40a070">6</span>) <span style="color:#666">*</span> <span style="color:#40a070">100</span>)<span style="color:#60a0b0;font-style:italic"># Missing sales data for 2013 Q1 &amp; Q4</span>sales<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [6 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year quarter sales</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2012 1 400</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2012 2 200</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2012 3 500</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2012 4 600</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2013 2 300</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 2013 3 100</span><span style="color:#60a0b0;font-style:italic"># 
Missing values are now explicit</span>sales <span style="color:#666">%&gt;%</span><span style="color:#06287e">expand</span>(year, quarter) <span style="color:#666">%&gt;%</span>dplyr<span style="color:#666">::</span><span style="color:#06287e">left_join</span>(sales)<span style="color:#60a0b0;font-style:italic">#&gt; Joining by: c(&#34;year&#34;, &#34;quarter&#34;)</span><span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [8 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year quarter sales</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2012 1 400</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2012 2 200</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2012 3 500</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2012 4 600</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2013 1 NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 2013 2 300</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 2013 3 100</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 2013 4 NA</span></code></pre></div><ul><li>In the process of data tidying, it&rsquo;s sometimes useful to have a column of a data frame that is a list of vectors. <code>unnest()</code> lets you simplify that column back down to an atomic vector, duplicating the original rows as needed. (NB: If you&rsquo;re working with data frames containing lists, I highly recommend using dplyr&rsquo;s <code>tbl_df</code>, which will display list-columns in a way that makes their structure more clear. 
Use <code>dplyr::data_frame()</code> to create a data frame wrapped with the <code>tbl_df</code> class.)</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">raw <span style="color:#666">&lt;-</span> dplyr<span style="color:#666">::</span><span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">3</span>,y <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">d,e,f&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">g,h&#34;</span>))<span style="color:#60a0b0;font-style:italic"># y is character vector containing comma separated strings</span>raw<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [3 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 d,e,f</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3 g,h</span><span style="color:#60a0b0;font-style:italic"># y is a list of character vectors</span>as_list <span style="color:#666">&lt;-</span> raw <span style="color:#666">%&gt;%</span> <span style="color:#06287e">mutate</span>(y <span style="color:#666">=</span> <span style="color:#06287e">strsplit</span>(y, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">,&#34;</span>))as_list<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [3 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 1 1 &lt;chr[1]&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 &lt;chr[3]&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3 &lt;chr[2]&gt;</span><span style="color:#60a0b0;font-style:italic"># y is a character vector; rows are duplicated as needed</span>as_list <span style="color:#666">%&gt;%</span> <span style="color:#06287e">unnest</span>(y)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [6 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 d</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2 e</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2 f</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 3 g</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 3 h</span></code></pre></div><ul><li><code>separate()</code> has a new <code>extra</code> argument that allows you to control what happens if a column doesn&rsquo;t always split into the same number of pieces.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">raw <span style="color:#666">%&gt;%</span> <span style="color:#06287e">separate</span>(y, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">trt&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">B&#34;</span>), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">,&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Error: Values not split into 2 pieces at 1, 2</span>raw <span style="color:#666">%&gt;%</span> <span style="color:#06287e">separate</span>(y, <span style="color:#06287e">c</span>(<span 
style="color:#4070a0">&#34;</span><span style="color:#4070a0">trt&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">B&#34;</span>), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">,&#34;</span>, extra <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">drop&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [3 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x trt B</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 d e</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3 g h</span>raw <span style="color:#666">%&gt;%</span> <span style="color:#06287e">separate</span>(y, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">trt&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">B&#34;</span>), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">,&#34;</span>, extra <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">merge&#34;</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [3 x 3]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x trt B</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 a NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 d e,f</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3 g h</span></code></pre></div><p>To read about the other minor changes and bug fixes, please consult the <a href="https://github.com/hadley/tidyr/releases/tag/v0.2.0">release notes</a>.</p><h2 id="reshape2-141">reshape2 1.4.1</h2><p>There&rsquo;s also a new version of reshape2, 1.4.1. 
It includes three bug fixes for <code>melt.data.frame()</code> contributed by <a href="https://github.com/kevinushey">Kevin Ushey</a>. Read all about them on the <a href="https://github.com/hadley/reshape/releases/tag/v1.4.1">release notes</a> and install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">reshape2&#34;</span>)</code></pre></div></description></item><item><title>magrittr 1.5</title><link>https://www.rstudio.com/blog/magrittr-1-5/</link><pubDate>Mon, 01 Dec 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/magrittr-1-5/</guid><description><p>(Posted on behalf of Stefan Milton Bache)</p><p>Sometimes it&rsquo;s the small things that make a big difference. For me, the introduction of our awkward looking friend, <code>%&gt;%</code>, was one such little thing. I&rsquo;d never suspected that it would have such an impact on the way quite a few people think and write <code>R</code> (including my own), or that pies would be baked (<a href="https://twitter.com/zevross/status/534703645703405570">see here</a>) and t-shirts printed (<a href="https://twitter.com/yokkuns/status/505679441381433344">e.g. here</a>) in honor of the successful three-char-long and slightly overweight operator. Of course a big part of the success is the very fruitful relationship with dplyr and its powerful verbs.</p><p>Quite some time went by without any changes to the CRAN version of magrittr. But many ideas have been evaluated and tested, and now we are happy to finally bring an update which brings both some optimization and a few nifty features — we hope that we have managed to strike a balance between simplicity and usefulness and that you will benefit from this update. 
You can install it now with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">magrittr&#34;</span>)</code></pre></div><p>The underlying evaluation model is more coherent in this release; this makes the new features more natural extensions and improves performance somewhat. Below I&rsquo;ll recap some of the important new features, which include functional sequences, a few specialized supplementary operators and better lambda syntax.</p><h2 id="functional-sequences">Functional sequences</h2><p>The basic (pseudo) usage of the pipe operator goes something like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">awesome_data <span style="color:#666">&lt;-</span>raw_interesting_data <span style="color:#666">%&gt;%</span><span style="color:#06287e">transform</span>(somehow) <span style="color:#666">%&gt;%</span><span style="color:#06287e">filter</span>(the_good_parts) <span style="color:#666">%&gt;%</span>finalize</code></pre></div><p>This statement has three parts: an input, an output, and a sequence of transformations. That&rsquo;s surprisingly close to the definition of a function, so in magrittr a pipeline is really just a convenient way of defining and applying a function. A really useful new feature of magrittr 1.5 makes that explicit: you can use <code>%&gt;%</code> to not only produce <em>values</em> but also to produce <em>functions</em> (or <em>functional sequences</em>)! It&rsquo;s really all the same, except sometimes the function is applied instantly and produces a result, and sometimes it is not, in which case the function itself is returned. In this case, there is no initial value, so we replace that with the dot placeholder. 
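To make the idea concrete, a small sketch of my own (the pipeline `f` is invented for illustration, not from the post):

```r
# A sketch (not from the post): with no initial value, a pipeline
# that starts from the dot placeholder becomes a function
library(magrittr)

f <- . %>% abs %>% sqrt

f(-16)       # apply it later to a concrete value
#> [1] 4

f[1](-16)    # `[` keeps a subset of the steps -- here just abs
#> [1] 16
```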
Here is how:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mae <span style="color:#666">&lt;-</span> . <span style="color:#666">%&gt;%</span> abs <span style="color:#666">%&gt;%</span> <span style="color:#06287e">mean</span>(na.rm <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)<span style="color:#06287e">mae</span>(<span style="color:#06287e">rnorm</span>(<span style="color:#40a070">10</span>))<span style="color:#60a0b0;font-style:italic">#&gt; [1] 0.5605</span></code></pre></div><p>That&rsquo;s equivalent to:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mae <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>(x) {<span style="color:#06287e">mean</span>(<span style="color:#06287e">abs</span>(x), na.rm <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)}</code></pre></div><p>Even for a short function, this is more compact, and is easier to read as it is defined linearly from left to right. There are some really cool use cases for this: <a href="http://adv-r.had.co.nz/Functionals.html">functionals</a>! Consider how clean it is to pass a function to <code>lapply</code> or <code>aggregate</code>!</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">info <span style="color:#666">&lt;-</span>files <span style="color:#666">%&gt;%</span><span style="color:#06287e">lapply</span>(. 
<span style="color:#666">%&gt;%</span> read_file <span style="color:#666">%&gt;%</span> <span style="color:#06287e">extract</span>(the_goodies))</code></pre></div><p>Functions made this way can be indexed with <code>[</code> to get a new function containing only a subset of the steps.</p><h2 id="lambda-expressions">Lambda expressions</h2><p>The new version makes it clearer that each step is really just a single-statement body of a unary function. What if we need a little more than one command to make a satisfactory &ldquo;step&rdquo; in a chain? Before, one might either define a function outside the chain, or even anonymously inside the chain, enclosing the entire definition in parentheses. Now extending that one command is like extending a standard one-command function: enclose whatever you&rsquo;d like in braces, and that&rsquo;s it:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">value <span style="color:#666">%&gt;%</span>foo <span style="color:#666">%&gt;%</span> {x <span style="color:#666">&lt;-</span> <span style="color:#06287e">bar</span>(.)y <span style="color:#666">&lt;-</span> <span style="color:#06287e">baz</span>(.)x <span style="color:#666">*</span> y} <span style="color:#666">%&gt;%</span>and_whatever</code></pre></div><p>As usual, the name of the argument to that unary function is <code>.</code>.</p><h2 id="nested-function-calls">Nested function calls</h2><p>In this release the dot (<code>.</code>) will work also in nested function calls on the right-hand side, e.g.:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">5</span> <span style="color:#666">%&gt;%</span><span style="color:#06287e">paste</span>(<span 
style="color:#007020;font-weight:bold">letters</span>[.])<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;1 a&#34; &#34;2 b&#34; &#34;3 c&#34; &#34;4 d&#34; &#34;5 e&#34;</span></code></pre></div><p>When you use <code>.</code> inside a function call, it&rsquo;s used in addition to, not instead of, <code>.</code> at the top-level. For example, the previous command is equivalent to:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">5</span> <span style="color:#666">%&gt;%</span><span style="color:#06287e">paste</span>(., <span style="color:#007020;font-weight:bold">letters</span>[.])<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;1 a&#34; &#34;2 b&#34; &#34;3 c&#34; &#34;4 d&#34; &#34;5 e&#34;</span></code></pre></div><p>If you don&rsquo;t want this behaviour, wrap the function call in <code>{</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">5</span> <span style="color:#666">%&gt;%</span> {<span style="color:#06287e">paste</span>(<span style="color:#007020;font-weight:bold">letters</span>[.])}<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;a&#34; &#34;b&#34; &#34;c&#34; &#34;d&#34; &#34;e&#34;</span></code></pre></div><h2 id="a-few-of-s-friends">A few of <code>%&gt;%</code>'s friends</h2><p>We also introduce a few supplementary operators that just make some situations more comfortable. The <strong>tee</strong> operator, <code>%T&gt;%</code>, enables temporary branching in a pipeline to apply a few side-effect commands to the current value, like plotting or logging, and is inspired by the Unix tee command. 
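For instance, logging a value mid-pipeline without disturbing it; a sketch of my own, where message() stands in for any side effect:

```r
# A sketch (not from the post): %T>% sends the value to a side
# effect (here message()) and then passes the *original* value on
library(magrittr)

result <- c(1, 2, 3) %T>%
  {message("processing ", length(.), " values")} %>%
  mean

result
#> [1] 2
```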
The only difference to <code>%&gt;%</code> is that <code>%T&gt;%</code> returns the left-hand side rather than the result of applying the right-hand side:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">value <span style="color:#666">%&gt;%</span>transform <span style="color:#666">%T&gt;%</span>plot <span style="color:#666">%&gt;%</span><span style="color:#06287e">transform</span>(even_more)</code></pre></div><p>This is a shortcut for:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">value <span style="color:#666">%&gt;%</span>transform <span style="color:#666">%&gt;%</span>{ <span style="color:#06287e">plot</span>(.); . } <span style="color:#666">%&gt;%</span><span style="color:#06287e">transform</span>(even_more)</code></pre></div><p>because <code>plot()</code> doesn&rsquo;t normally return anything that can be piped along! The <strong>exposition</strong> operator, <code>%$%</code>, is a wrapper around <code>with()</code>, which makes it easy to refer to the variables inside a data frame:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mtcars <span style="color:#666">%$%</span><span style="color:#06287e">plot</span>(mpg, wt)</code></pre></div><p>Finally, we also have <code>%&lt;&gt;%</code>, the <strong>compound assignment</strong> pipe operator. This must be the first operator in the chain, and it will assign the result of the pipeline to the left-hand side name or expression. 
Its purpose is to shorten expressions like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">data<span style="color:#666">$</span>some_variable <span style="color:#666">&lt;-</span>data<span style="color:#666">$</span>some_variable <span style="color:#666">%&gt;%</span>transform</code></pre></div><p>and turn them into something like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">data<span style="color:#666">$</span>some_variable <span style="color:#666">%&lt;&gt;%</span>transform</code></pre></div><p>Even a small example like <code>x %&lt;&gt;% sort</code> has its appeal! In summary, there are a few new things to get to know; but magrittr is like it always was. Just a little coolr!</p></description></item><item><title>rvest: easy web scraping with R</title><link>https://www.rstudio.com/blog/rvest-easy-web-scraping-with-r/</link><pubDate>Mon, 24 Nov 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rvest-easy-web-scraping-with-r/</guid><description><p>rvest is a new package that makes it easy to scrape (or harvest) data from html web pages, inspired by libraries like <a href="http://www.crummy.com/software/BeautifulSoup/">beautiful soup</a>. It is designed to work with <a href="https://github.com/smbache/magrittr">magrittr</a> so that you can express complex operations as elegant pipelines composed of simple, easily understood pieces. 
Install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rvest&#34;</span>)</code></pre></div><h2 id="rvest-in-action">rvest in action</h2><p>To see rvest in action, imagine we&rsquo;d like to scrape some information about <a href="http://www.imdb.com/title/tt1490017/">The Lego Movie</a> from IMDB. We start by downloading and parsing the file with <code>html()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(rvest)lego_movie <span style="color:#666">&lt;-</span> <span style="color:#06287e">html</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">http://www.imdb.com/title/tt1490017/&#34;</span>)</code></pre></div><p>To extract the rating, we start with <a href="http://selectorgadget.com">selectorgadget</a> to figure out which css selector matches the data we want: <code>strong span</code>. (If you haven&rsquo;t heard of <a href="http://selectorgadget.com/">selectorgadget</a>, make sure to read <code>vignette(&quot;selectorgadget&quot;)</code> - it&rsquo;s the easiest way to determine which selector extracts the data that you&rsquo;re interested in.) 
We use <code>html_node()</code> to find the first node that matches that selector, extract its contents with <code>html_text()</code>, and convert it to numeric with <code>as.numeric()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">lego_movie <span style="color:#666">%&gt;%</span><span style="color:#06287e">html_node</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">strong span&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">html_text</span>() <span style="color:#666">%&gt;%</span><span style="color:#06287e">as.numeric</span>()<span style="color:#60a0b0;font-style:italic">#&gt; [1] 7.9</span></code></pre></div><p>We use a similar process to extract the cast, using <code>html_nodes()</code> to find all nodes that match the selector:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">lego_movie <span style="color:#666">%&gt;%</span><span style="color:#06287e">html_nodes</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">#titleCast .itemprop span&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">html_text</span>()<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;Will Arnett&#34; &#34;Elizabeth Banks&#34; &#34;Craig Berry&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [4] &#34;Alison Brie&#34; &#34;David Burrows&#34; &#34;Anthony Daniels&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [7] &#34;Charlie Day&#34; &#34;Amanda Farinos&#34; &#34;Keith Ferguson&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [10] &#34;Will Ferrell&#34; &#34;Will Forte&#34; &#34;Dave Franco&#34;</span><span style="color:#60a0b0;font-style:italic">#&gt; [13] &#34;Morgan Freeman&#34; &#34;Todd Hansen&#34; &#34;Jonah 
Hill&#34;</span></code></pre></div><p>The titles and authors of recent message board postings are stored in the third table on the page. We can use <code>html_node()</code> and <code>[[</code> to find it, then coerce it to a data frame with <code>html_table()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">lego_movie <span style="color:#666">%&gt;%</span><span style="color:#06287e">html_nodes</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">table&#34;</span>) <span style="color:#666">%&gt;%</span>.[[3]] <span style="color:#666">%&gt;%</span><span style="color:#06287e">html_table</span>()<span style="color:#60a0b0;font-style:italic">#&gt; X 1 NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 this movie is very very deep and philosophical mrdoctor524</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 This got an 8.0 and Wizard of Oz got an 8.1... marr-justinm</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Discouraging Building? Laestig</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 LEGO - the plural neil-476</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Academy Awards browncoatjw</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 what was the funniest part? 
actionjacksin</span></code></pre></div><h2 id="other-important-functions">Other important functions</h2><ul><li><p>If you prefer, you can use xpath selectors instead of css: <code>html_nodes(doc, xpath = &quot;//table//td&quot;)</code>.</p></li><li><p>Extract the tag names with <code>html_tag()</code>, text with <code>html_text()</code>, a single attribute with <code>html_attr()</code> or all attributes with <code>html_attrs()</code>.</p></li><li><p>Detect and repair text encoding problems with <code>guess_encoding()</code> and <code>repair_encoding()</code>.</p></li><li><p>Navigate around a website as if you&rsquo;re in a browser with <code>html_session()</code>, <code>jump_to()</code>, <code>follow_link()</code>, <code>back()</code>, and <code>forward()</code>. Extract, modify and submit forms with <code>html_form()</code>, <code>set_values()</code> and <code>submit_form()</code>. (This is still a work in progress, so I&rsquo;d love your feedback.)</p></li></ul><p>To see these functions in action, check out package demos with <code>demo(package = &quot;rvest&quot;)</code>.</p></description></item><item><title>Introduction to Data Science with R video workshop</title><link>https://www.rstudio.com/blog/introduction-to-data-science-with-r-video-workshop/</link><pubDate>Thu, 06 Nov 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/introduction-to-data-science-with-r-video-workshop/</guid><description><p><a href="http://www.youtube.com/watch?v=Fa9gghVBlk4">http://www.youtube.com/watch?v=Fa9gghVBlk4</a></p><p>RStudio has teamed up with O&rsquo;Reilly media to create a new way to learn R!</p><p>The <a href="http://shop.oreilly.com/product/0636920034834.do">Introduction to Data Science with R video course</a> is a comprehensive introduction to the R language. 
It&rsquo;s ideal for non-programmers with no data science experience or for data scientists switching to R from Excel, SAS or other software.</p><p>Join RStudio Master Instructor Garrett Grolemund as he covers the three skill sets of data science: computer programming (with R), manipulating data sets (including loading, cleaning, and visualizing data), and modeling data with statistical methods. You&rsquo;ll learn R&rsquo;s syntax and grammar as well as how to load, save, and transform data, generate beautiful graphs, and fit statistical models to the data.</p><p>All of the techniques introduced in this video are motivated by real problems that involve real datasets. You&rsquo;ll get plenty of hands-on experience with R (and not just hear about it!), and lots of help if you get stuck.</p><p>You&rsquo;ll also learn how to use the ggplot2, reshape2, and dplyr packages.</p><p>The course contains over eight hours of instruction. You can access the first hour free from <a href="http://shop.oreilly.com/product/0636920034834.do">O&rsquo;Reilly&rsquo;s website</a>. The course covers the same content as our two day Introduction to Data Science with R workshop, right down to the same exercises. But unlike our workshops, the videos are self-paced, which can help you learn R in a more relaxed way.</p><p>To learn more, visit <a href="http://shop.oreilly.com/product/0636920034834.do">Introduction to Data Science with R</a>.</p></description></item><item><title>RSQLite 1.0.0</title><link>https://www.rstudio.com/blog/rsqlite-1-0-0/</link><pubDate>Sat, 25 Oct 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rsqlite-1-0-0/</guid><description><p>I&rsquo;m very pleased to announce a new version of RSQLite 1.0.0. 
RSQLite is <em>the</em> easiest way to use a SQL database from R:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(DBI)<span style="color:#60a0b0;font-style:italic"># Create an ephemeral in-memory RSQLite database</span>con <span style="color:#666">&lt;-</span> <span style="color:#06287e">dbConnect</span>(RSQLite<span style="color:#666">::</span><span style="color:#06287e">SQLite</span>(), <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">:memory:&#34;</span>)<span style="color:#60a0b0;font-style:italic"># Copy in the built-in mtcars data frame</span><span style="color:#06287e">dbWriteTable</span>(con, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mtcars&#34;</span>, mtcars, row.names <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">FALSE</span>)<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span><span style="color:#60a0b0;font-style:italic"># Fetch all results from a query:</span>res <span style="color:#666">&lt;-</span> <span style="color:#06287e">dbSendQuery</span>(con, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">SELECT * FROM mtcars WHERE cyl = 4 AND mpg &lt; 23&#34;</span>)<span style="color:#06287e">dbFetch</span>(res)<span style="color:#60a0b0;font-style:italic">#&gt; mpg cyl disp hp drat wt qsec vs am gear carb</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2</span><span style="color:#06287e">dbClearResult</span>(res)<span style="color:#60a0b0;font-style:italic">#&gt; 
[1] TRUE</span><span style="color:#60a0b0;font-style:italic"># Or fetch them a chunk at a time</span>res <span style="color:#666">&lt;-</span> <span style="color:#06287e">dbSendQuery</span>(con, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">SELECT * FROM mtcars WHERE cyl = 4&#34;</span>)<span style="color:#06287e">while</span>(<span style="color:#666">!</span><span style="color:#06287e">dbHasCompleted</span>(res)){chunk <span style="color:#666">&lt;-</span> <span style="color:#06287e">dbFetch</span>(res, n <span style="color:#666">=</span> <span style="color:#40a070">10</span>)<span style="color:#06287e">print</span>(<span style="color:#06287e">nrow</span>(chunk))}<span style="color:#60a0b0;font-style:italic">#&gt; [1] 10</span><span style="color:#60a0b0;font-style:italic">#&gt; [1] 1</span><span style="color:#06287e">dbClearResult</span>(res)<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span><span style="color:#60a0b0;font-style:italic"># Good practice to disconnect from the database when you&#39;re done</span><span style="color:#06287e">dbDisconnect</span>(con)<span style="color:#60a0b0;font-style:italic">#&gt; [1] TRUE</span></code></pre></div><p>RSQLite 1.0.0 is mostly a cleanup release. This means a lot of old functions have been deprecated and removed:</p><ul><li><p><code>idIsValid()</code> is deprecated; use <code>dbIsValid()</code> instead. <code>dbBeginTransaction()</code> is deprecated; use <code>dbBegin()</code> instead. Use <code>dbFetch()</code> instead of <code>fetch()</code>.</p></li><li><p><code>dbBuildTableDefinition()</code> is now <code>sqliteBuildTableDefinition()</code> (to avoid implying that it&rsquo;s a DBI generic).</p></li><li><p>Internal <code>sqlite*()</code> functions are no longer exported (#20). <code>safe.write()</code> is no longer exported.</p></li></ul><p>It also includes a few minor improvements and bug fixes. 
The most important are:</p><ul><li><p>Inlined <code>RSQLite.extfuns</code> - use <code>initExtension()</code> to load the many useful extension functions.</p></li><li><p>Methods no longer automatically clone the connection if there is an open result set. This was implemented inconsistently in a handful of places. RSQLite is now more forgiving if you forget to close a result set - it will close it for you, with a warning. It&rsquo;s still good practice to clean up after yourself with <code>dbClearResult()</code>, but you don&rsquo;t have to.</p></li><li><p><code>dbBegin()</code>, <code>dbCommit()</code> and <code>dbRollback()</code> throw errors on failure, rather than returning <code>FALSE</code>. They all gain a <code>name</code> argument to specify named savepoints.</p></li><li><p><code>dbWriteTable()</code> has been rewritten. It uses a better quoting strategy, throws errors on failure, and automatically adds row names only if they&rsquo;re strings. (NB: <code>dbWriteTable()</code> also has a method that allows you to load files directly from disk.)</p></li></ul><p>For a complete list of changes, please see the full <a href="https://github.com/rstats-db/RSQLite/releases/tag/v1.0.0">release notes</a>.</p></description></item><item><title>dplyr 0.3</title><link>https://www.rstudio.com/blog/dplyr-0-3-2/</link><pubDate>Mon, 13 Oct 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dplyr-0-3-2/</guid><description><p>I&rsquo;m very pleased to announce that dplyr 0.3 is now available from CRAN. 
Get the latest version by running:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">dplyr&#34;</span>)</code></pre></div><p>There are four major new features:</p><ul><li><p>Four new high-level verbs: <code>distinct()</code>, <code>slice()</code>, <code>rename()</code>, and <code>transmute()</code>.</p></li><li><p>Three new helper functions <code>between()</code>, <code>count()</code>, and <code>data_frame()</code>.</p></li><li><p>More flexible join specifications.</p></li><li><p>Support for row-based set operations.</p></li></ul><p>There are two new features of interest to developers. They make it easier to write packages that use dplyr:</p><ul><li><p>It&rsquo;s now much easier to program with dplyr (using standard evaluation).</p></li><li><p>Improved database backends.</p></li></ul><p>I describe each of these in turn below.</p><h2 id="new-verbs">New verbs</h2><p><code>distinct()</code> returns distinct (unique) rows of a table:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(nycflights13)<span style="color:#60a0b0;font-style:italic"># Find all origin-destination pairs</span>flights <span style="color:#666">%&gt;%</span><span style="color:#06287e">select</span>(origin, dest) <span style="color:#666">%&gt;%</span><span style="color:#06287e">distinct</span>()<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [224 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; origin dest</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 EWR IAH</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 LGA IAH</span><span 
style="color:#60a0b0;font-style:italic">#&gt; 3 JFK MIA</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 JFK BQN</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 LGA ATL</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ...</span></code></pre></div><p><code>slice()</code> allows you to select rows by position. It includes positive integers and drops negative integers:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># Get the first flight to each destination</span>flights <span style="color:#666">%&gt;%</span><span style="color:#06287e">group_by</span>(dest) <span style="color:#666">%&gt;%</span><span style="color:#06287e">slice</span>(<span style="color:#40a070">1</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [105 x 16]</span><span style="color:#60a0b0;font-style:italic">#&gt; Groups: dest</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year month day dep_time dep_delay arr_time arr_delay carrier tailnum</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2013 10 1 1955 -6 2213 -35 B6 N554JB</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2013 10 1 1149 -10 1245 -14 B6 N346JB</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2013 1 1 1315 -2 1413 -10 EV N13538</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2013 7 6 1629 14 1954 1 UA N587UA</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2013 1 1 554 -6 812 -25 DL N668DN</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ... ... ... ... 
...</span><span style="color:#60a0b0;font-style:italic">#&gt; Variables not shown: flight (int), origin (chr), dest (chr), air_time</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl), distance (dbl), hour (dbl), minute (dbl)</span></code></pre></div><p><code>transmute()</code> and <code>rename()</code> are variants of <code>mutate()</code> and <code>select()</code>. <code>transmute()</code> drops all columns that you didn&rsquo;t specifically mention; <code>rename()</code> keeps all columns that you didn&rsquo;t specifically mention. They complete this table:</p><table><thead><tr><th align="left"></th><th align="left">Drop others</th><th align="left">Keep others</th></tr></thead><tbody><tr><td align="left">Rename &amp; reorder variables</td><td align="left"><code>select()</code></td><td align="left"><code>rename()</code></td></tr><tr><td align="left">Compute new variables</td><td align="left"><code>transmute()</code></td><td align="left"><code>mutate()</code></td></tr></tbody></table><h2 id="new-helpers">New helpers</h2><p><code>data_frame()</code>, contributed by <a href="https://github.com/kevinushey">Kevin Ushey</a>, is a nice way to create data frames:</p><ul><li>It never changes the type of its inputs (i.e. 
no more <code>stringsAsFactors = FALSE</code>!)</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">data.frame</span>(x <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">letters</span>) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">sapply</span>(class)<span style="color:#60a0b0;font-style:italic">#&gt; x</span><span style="color:#60a0b0;font-style:italic">#&gt; &#34;factor&#34;</span><span style="color:#06287e">data_frame</span>(x <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">letters</span>) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">sapply</span>(class)<span style="color:#60a0b0;font-style:italic">#&gt; x</span><span style="color:#60a0b0;font-style:italic">#&gt; &#34;character&#34;</span></code></pre></div><ul><li>Or the names of variables:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">data.frame</span>(`crazy name` <span style="color:#666">=</span> <span style="color:#40a070">1</span>) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">names</span>()<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;crazy.name&#34;</span><span style="color:#06287e">data_frame</span>(`crazy name` <span style="color:#666">=</span> <span style="color:#40a070">1</span>) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">names</span>()<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;crazy name&#34;</span></code></pre></div><ul><li>It evaluates its arguments lazily and in order:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">data_frame</span>(x 
<span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">5</span>, y <span style="color:#666">=</span> x ^ <span style="color:#40a070">2</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [5 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; x y</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1 1</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2 4</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 3 9</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 4 16</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 5 25</span></code></pre></div><ul><li>It adds <code>tbl_df()</code> class to output, never adds <code>row.names()</code>, and only recycles vectors of length 1 (recycling is a frequent source of bugs in my experience).</li></ul><p>The <code>count()</code> function wraps up the common combination of <code>group_by()</code> and <code>summarise()</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># How many flights to each destination?</span>flights <span style="color:#666">%&gt;%</span> <span style="color:#06287e">count</span>(dest)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [105 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; dest n</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 ABQ 254</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 ACK 265</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 ALB 439</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 ANC 8</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 ATL 17215</span><span 
style="color:#60a0b0;font-style:italic">#&gt; .. ... ...</span><span style="color:#60a0b0;font-style:italic"># Which planes flew the most?</span>flights <span style="color:#666">%&gt;%</span> <span style="color:#06287e">count</span>(tailnum, sort <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [4,044 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; tailnum n</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 NA 2512</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 N725MQ 575</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 N722MQ 513</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 N723MQ 507</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 N711MQ 486</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ...</span><span style="color:#60a0b0;font-style:italic"># What&#39;s the total carrying capacity of planes by year of purchase</span>planes <span style="color:#666">%&gt;%</span> <span style="color:#06287e">count</span>(year, wt <span style="color:#666">=</span> seats)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [47 x 2]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year n</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 1956 102</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 1959 18</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 1963 10</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 1965 149</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 1967 9</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... 
...</span></code></pre></div><h2 id="better-joins">Better joins</h2><p>You can now join by different variables in each table:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">narrow <span style="color:#666">&lt;-</span> flights <span style="color:#666">%&gt;%</span> <span style="color:#06287e">select</span>(origin, dest, year<span style="color:#666">:</span>day)<span style="color:#60a0b0;font-style:italic"># Add destination airport metadata</span>narrow <span style="color:#666">%&gt;%</span> <span style="color:#06287e">left_join</span>(airports, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">dest&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">faa&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [336,776 x 11]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; dest origin year month day name lat lon</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 IAH EWR 2013 1 1 George Bush Intercontinental 29.98 -95.34</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 IAH LGA 2013 1 1 George Bush Intercontinental 29.98 -95.34</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 MIA JFK 2013 1 1 Miami Intl 25.79 -80.29</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 BQN JFK 2013 1 1 NA NA NA</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 ATL LGA 2013 1 1 Hartsfield Jackson Atlanta Intl 33.64 -84.43</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ... ... ... 
...</span><span style="color:#60a0b0;font-style:italic">#&gt; Variables not shown: alt (int), tz (dbl), dst (chr)</span><span style="color:#60a0b0;font-style:italic"># Add origin airport metadata</span>narrow <span style="color:#666">%&gt;%</span> <span style="color:#06287e">left_join</span>(airports, <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">origin&#34;</span> <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">faa&#34;</span>))<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [336,776 x 11]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; origin dest year month day name lat lon alt tz dst</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 EWR IAH 2013 1 1 Newark Liberty Intl 40.69 -74.17 18 -5 A</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 LGA IAH 2013 1 1 La Guardia 40.78 -73.87 22 -5 A</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 JFK MIA 2013 1 1 John F Kennedy Intl 40.64 -73.78 13 -5 A</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 JFK BQN 2013 1 1 John F Kennedy Intl 40.64 -73.78 13 -5 A</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 LGA ATL 2013 1 1 La Guardia 40.78 -73.87 22 -5 A</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ... ... ... ... ... .. 
...</span></code></pre></div><p>(<code>right_join()</code> and <code>outer_join()</code> implementations are planned for dplyr 0.4.)</p><h2 id="set-operations">Set operations</h2><p>You can use <code>intersect()</code>, <code>union()</code> and <code>setdiff()</code> with data frames, data tables and databases:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">jfk_planes <span style="color:#666">&lt;-</span> flights <span style="color:#666">%&gt;%</span><span style="color:#06287e">filter</span>(origin <span style="color:#666">==</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">JFK&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">select</span>(tailnum) <span style="color:#666">%&gt;%</span><span style="color:#06287e">distinct</span>()lga_planes <span style="color:#666">&lt;-</span> flights <span style="color:#666">%&gt;%</span><span style="color:#06287e">filter</span>(origin <span style="color:#666">==</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">LGA&#34;</span>) <span style="color:#666">%&gt;%</span><span style="color:#06287e">select</span>(tailnum) <span style="color:#666">%&gt;%</span><span style="color:#06287e">distinct</span>()<span style="color:#60a0b0;font-style:italic"># Planes that fly out of either JFK or LGA</span><span style="color:#06287e">nrow</span>(<span style="color:#06287e">union</span>(jfk_planes, lga_planes))<span style="color:#60a0b0;font-style:italic">#&gt; [1] 3592</span><span style="color:#60a0b0;font-style:italic"># Planes that fly out of both JFK and LGA</span><span style="color:#06287e">nrow</span>(<span style="color:#06287e">intersect</span>(jfk_planes, lga_planes))<span style="color:#60a0b0;font-style:italic">#&gt; [1] 1311</span><span style="color:#60a0b0;font-style:italic"># Planes that fly out of JFK but not LGA</span><span 
style="color:#06287e">nrow</span>(<span style="color:#06287e">setdiff</span>(jfk_planes, lga_planes))<span style="color:#60a0b0;font-style:italic">#&gt; [1] 647</span></code></pre></div><h2 id="programming-with-dplyr">Programming with dplyr</h2><p>You can now program with dplyr - every function that uses non-standard evaluation (NSE) also has a standard evaluation (SE) twin that ends in <code>_</code>. For example, the SE version of <code>filter()</code> is called <code>filter_()</code>. The SE version of each function has similar arguments, but they must be explicitly &ldquo;quoted&rdquo;. Usually the best way to do this is to use <code>~</code>:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">airport <span style="color:#666">&lt;-</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">ANC&#34;</span><span style="color:#60a0b0;font-style:italic"># NSE version</span><span style="color:#06287e">filter</span>(flights, dest <span style="color:#666">==</span> airport)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [8 x 16]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year month day dep_time dep_delay arr_time arr_delay carrier tailnum</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2013 7 6 1629 14 1954 1 UA N587UA</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2013 7 13 1618 3 1955 2 UA N572UA</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2013 7 20 1618 3 2003 10 UA N567UA</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2013 7 27 1617 2 1906 -47 UA N559UA</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2013 8 3 1615 0 2003 10 UA N572UA</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ... ... ... ... 
...</span><span style="color:#60a0b0;font-style:italic">#&gt; Variables not shown: flight (int), origin (chr), dest (chr), air_time</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl), distance (dbl), hour (dbl), minute (dbl)</span><span style="color:#60a0b0;font-style:italic"># Equivalent SE code:</span>criteria <span style="color:#666">&lt;-</span> <span style="color:#666">~</span>dest <span style="color:#666">==</span> airport<span style="color:#06287e">filter_</span>(flights, criteria)<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [8 x 16]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year month day dep_time dep_delay arr_time arr_delay carrier tailnum</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2013 7 6 1629 14 1954 1 UA N587UA</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2013 7 13 1618 3 1955 2 UA N572UA</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2013 7 20 1618 3 2003 10 UA N567UA</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2013 7 27 1617 2 1906 -47 UA N559UA</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2013 8 3 1615 0 2003 10 UA N572UA</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ... ... ... ... ...</span><span style="color:#60a0b0;font-style:italic">#&gt; Variables not shown: flight (int), origin (chr), dest (chr), air_time</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl), distance (dbl), hour (dbl), minute (dbl)</span></code></pre></div><p>To learn more, read the <a href="http://cran.r-project.org/web/packages/dplyr/vignettes/nse.html">Non-standard evaluation</a> vignette. This new approach is powered by the <a href="https://github.com/hadley/lazyeval">lazyeval package</a> which provides all the tools needed to implement NSE consistently and correctly. 
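</p><p>For example, you can also build the quoted expression programmatically with <code>lazyeval::interp()</code> (a small sketch; the column name and value here are arbitrary):</p><pre><code># Filter on a column whose name is stored in a string
library(lazyeval)
col &lt;- &quot;dest&quot;
criteria &lt;- interp(~ var == &quot;ANC&quot;, var = as.name(col))
filter_(flights, criteria)
</code></pre><p>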
I now understand how to implement NSE consistently and correctly, and I&rsquo;ll be using the same approach everywhere.</p><h2 id="database-backends">Database backends</h2><p>The database backend system has been completely overhauled in order to make it possible to add backends in other packages, and to support a much wider range of databases. If you&rsquo;re interested in implementing a new dplyr backend, please check out <code>vignette(&quot;new-sql-backend&quot;)</code> - it&rsquo;s really not that much work.</p><p>The first package to take advantage of this system is <a href="http://cran.r-project.org/web/packages/MonetDB.R">MonetDB.R</a>, which now provides the MonetDB backend for dplyr.</p><h2 id="other-changes">Other changes</h2><p>As well as the big new features described here, dplyr 0.3 also fixes many bugs and makes numerous minor improvements. See the <a href="https://github.com/hadley/dplyr/releases/tag/v0.3">release notes</a> for a complete list of the changes.</p></description></item><item><title>ggvis 0.4</title><link>https://www.rstudio.com/blog/ggvis-0-4/</link><pubDate>Mon, 13 Oct 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/ggvis-0-4/</guid><description><p>ggvis 0.4 is now available on CRAN. 
You can install it with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">ggvis&#34;</span>)</code></pre></div><p>The major features of this release are:</p><ul><li>Boxplots, with <code>layer_boxplots()</code></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">chickwts <span style="color:#666">%&gt;%</span> <span style="color:#06287e">ggvis</span>(<span style="color:#666">~</span>feed, <span style="color:#666">~</span>weight) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">layer_boxplots</span>()</code></pre></div><p><img src="https://rstudioblog.files.wordpress.com/2014/10/ggvis-0-4-boxplot.png" alt="ggvis box plot"></p><ul><li><p>Better stability when errors occur.</p></li><li><p>Better handling of empty data and malformed data.</p></li><li><p>More consistent handling of data in compute pipeline functions.</p></li></ul><p>Because of these changes, interactive graphics with dynamic data sources will work more reliably.</p><p>Additionally, there are many small improvements and bug fixes under the hood. You can see the full change log <a href="https://github.com/rstudio/ggvis/releases/tag/v0.4">here</a>.</p></description></item><item><title>devtools 1.6</title><link>https://www.rstudio.com/blog/devtools-1-6/</link><pubDate>Thu, 02 Oct 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/devtools-1-6/</guid><description><p>Devtools 1.6 is now available on CRAN. Devtools makes it so easy to build a package that it becomes your default way to organise code, data and documentation. Learn more at <a href="http://r-pkgs.had.co.nz/">http://r-pkgs.had.co.nz/</a>. 
You can get the latest version with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">devtools&#34;</span>)</code></pre></div><p>We&rsquo;ve made a lot of improvements to the install and release process:</p><ul><li><p>Installation functions now default to <code>build_vignettes = FALSE</code>, and only install required dependencies (not suggested). They also store a lot of useful metadata.</p></li><li><p><code>install_github()</code> got a lot of love. <code>install_github(&quot;user/repo&quot;)</code> is now the preferred way to install a package from github (older forms with explicit username parameter are now deprecated). You can supply the <code>host</code> argument to install packages from a local github enterprise installation. You can get the latest release with <code>user/repo@*release</code>.</p></li><li><p><code>session_info()</code> uses package installation metadata to show you exactly how every package was installed (locally, from CRAN, from github, &hellip;)</p></li><li><p><code>release()</code> uses the new webform-based submission process for CRAN, as implemented in <code>submit_cran()</code>.</p></li><li><p>You can add arbitrary extra questions to <code>release()</code> by defining a function <code>release_questions()</code> in your package. It should return a character vector of questions to ask.</p></li></ul><p>We&rsquo;ve also added a number of functions to make it easy to get started with various aspects of package development:</p><ul><li><p><code>use_data()</code> adds data to a package, either in <code>data/</code> (external data) or in <code>R/sysdata.rda</code> (internal data). 
<code>use_data_raw()</code> sets up <code>data-raw/</code> for your reproducible data generation scripts.</p></li><li><p><code>use_package()</code> sets dependencies and reminds you how to use them.</p></li><li><p><code>use_rcpp()</code> gets you ready to use <a href="http://www.rcpp.org">Rcpp</a>.</p></li><li><p><code>use_testthat()</code> sets up testing infrastructure with <a href="http://r-pkgs.had.co.nz/tests.html">testthat</a>.</p></li><li><p><code>use_travis()</code> adds a <code>.travis.yml</code> file and tells you how to get started with <a href="https://travis-ci.org">travis ci</a>.</p></li><li><p><code>use_vignette()</code> creates a draft vignette using <a href="http://rmarkdown.rstudio.com">Rmarkdown</a>.</p></li></ul><p>There were many other minor improvements and bug fixes. See the <a href="https://github.com/hadley/devtools/releases/tag/v1.6">release notes</a> for a complete list of changes.</p></description></item><item><title>Shiny 0.10.2</title><link>https://www.rstudio.com/blog/shiny-0-10-2/</link><pubDate>Thu, 02 Oct 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-0-10-2/</guid><description><p><a href="http://cran.rstudio.com/package=shiny">Shiny v0.10.2</a> has been released to CRAN. To install it:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">shiny&#39;</span>)</code></pre></div><p>This version of Shiny requires R 3.0.0 or higher (note the <a href="http://cran.rstudio.com/">current version of R</a> is 3.1.1). R 2.15.x is no longer supported.</p><p>Here are the most prominent changes:</p><ul><li><p>File uploading via <code>fileInput()</code> now works for Internet Explorer 8 and 9. Note, however, that IE 8/9 do not support multiple files from a single file input. 
If you need to upload multiple files, you must use one file input for each file. Unlike in modern web browsers, no progress bar will display when uploading files in IE 8/9.</p></li><li><p>Shiny now supports single-file applications: instead of needing two separate files, <code>server.R</code> and <code>ui.R</code>, you can now create an application with a single file named <code>app.R</code>. This also makes it easier to distribute example Shiny code, because you can run an entire app by simply copying and pasting the code for a single-file app into the R console. Here&rsquo;s a simple example of a single-file app:</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">## app.R</span>server <span style="color:#666">&lt;-</span> <span style="color:#06287e">function</span>(input, output) {output<span style="color:#666">$</span>distPlot <span style="color:#666">&lt;-</span> <span style="color:#06287e">renderPlot</span>({<span style="color:#06287e">hist</span>(<span style="color:#06287e">rnorm</span>(input<span style="color:#666">$</span>obs), col <span style="color:#666">=</span> <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">darkgray&#39;</span>, border <span style="color:#666">=</span> <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">white&#39;</span>)})}ui <span style="color:#666">&lt;-</span> <span style="color:#06287e">shinyUI</span>(<span style="color:#06287e">fluidPage</span>(<span style="color:#06287e">sidebarLayout</span>(<span style="color:#06287e">sidebarPanel</span>(<span style="color:#06287e">sliderInput</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">obs&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Number of observations:&#34;</span>,min <span style="color:#666">=</span> <span style="color:#40a070">10</span>, max 
<span style="color:#666">=</span> <span style="color:#40a070">500</span>, value <span style="color:#666">=</span> <span style="color:#40a070">100</span>)),<span style="color:#06287e">mainPanel</span>(<span style="color:#06287e">plotOutput</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">distPlot&#34;</span>)))))<span style="color:#06287e">shinyApp</span>(ui <span style="color:#666">=</span> ui, server <span style="color:#666">=</span> server)</code></pre></div><p>See the <a href="https://shiny.rstudio.com/articles/single-file.html">single-file app article</a> for more.</p><ul><li>We&rsquo;ve added progress bars, which allow you to indicate to users that something is happening when there&rsquo;s a long-running computation. The progress bar will show at the top of the browser window, as shown here:</li></ul><p><img src="https://rstudioblog.files.wordpress.com/2014/10/progress.png" alt="progress"></p><p>Read the <a href="https://shiny.rstudio.com/articles/progress.html">progress bar article</a> for more.</p><ul><li>We&rsquo;ve upgraded the DataTables Javascript library from 1.9.4 to 1.10.2. We&rsquo;ve tried to support backward compatibility as much as possible, but this might be a breaking change if you&rsquo;ve customized the DataTables options in your apps. This is because some option names have changed; for example, <code>aLengthMenu</code> has been renamed to <code>lengthMenu</code>. Please read <a href="https://shiny.rstudio.com/articles/datatables.html">the article on DataTables</a> on the Shiny website for more information about updating Shiny apps that use DataTables 1.9.4.</li></ul><p>In addition to the changes listed above, there are some smaller updates:</p><ul><li><p>Searching in DataTables is case-insensitive and the search strings are not treated as regular expressions by default now. 
If you want case-sensitive searching or regular expressions, you can use the configuration options <code>search$caseInsensitive</code> and <code>search$regex</code>, e.g. <code>renderDataTable(..., options = list(search = list(caseInsensitive = FALSE, regex = TRUE)))</code>.</p></li><li><p>Shiny has switched from reference classes to R6.</p></li><li><p>Reactive log performance has been greatly improved.</p></li><li><p>Exported <code>createWebDependency</code>. It takes an <code>htmltools::htmlDependency</code> object and makes it available over Shiny&rsquo;s built-in web server.</p></li><li><p>Custom output bindings can now render <code>htmltools::htmlDependency</code> objects at runtime using <code>Shiny.renderDependencies()</code>.</p></li></ul><p>Please read the <a href="https://github.com/rstudio/shiny/blob/master/NEWS">NEWS</a> file for a complete list of changes, and let us know if you have any <a href="https://groups.google.com/forum/#!forum/shiny-discuss">comments</a> or <a href="http://stackoverflow.com/questions/tagged/shiny">questions</a>.</p></description></item><item><title>Meet us at R Day and at the Strata+Hadoop World NYC Oct 15-17, 2014</title><link>https://www.rstudio.com/blog/meet-us-at-r-day-and-at-the-stratahadoop-world-nyc-oct-15-17-2014/</link><pubDate>Tue, 30 Sep 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/meet-us-at-r-day-and-at-the-stratahadoop-world-nyc-oct-15-17-2014/</guid><description><p>Are you headed to Strata? It&rsquo;s just around the corner!</p><p>We particularly hope to see you at <a href="http://strataconf.com/stratany2014/public/schedule/detail/37037">R Day</a> on October 15, where we will cover a raft of current topics that analysts and R users need to pay attention to. The R Day tutorials come from Hadley Wickham, Winston Chang, Garrett Grolemund, J.J. 
Allaire, and Yihui Xie who are all working on fascinating new ways to keep the R ecosystem apace of the challenges facing those who work with data.</p><p>If you plan to stay for the full <a href="http://strataconf.com/stratany2014?intcmp=il-strata-stny14-franchise-page">Strata Conference+Hadoop World</a> be sure to look us up in the Innovator Pavilion booth P14 during the Expo Hall hours. We&rsquo;ll have the latest books from RStudio authors and &ldquo;shiny&rdquo; t-shirts to win. Share with us what you&rsquo;re doing with RStudio and get your product and company questions answered by RStudio employees.</p><p>See you in New York City!</p></description></item><item><title>Data management with ShinyApps.io</title><link>https://www.rstudio.com/blog/data-management-with-shinyapps-io/</link><pubDate>Mon, 29 Sep 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/data-management-with-shinyapps-io/</guid><description><p><img src="https://www.shinyapps.io/assets/images/dashboard-screen.png" alt="ShinyApps.io dashboard"></p><p>Some of the most innovative Shiny apps share data across user sessions. Some apps share the results of one session to use in future sessions, others track user characteristics over time and make them available as part of the app.</p><p>This level of sophistication creates tricky design choices when you host your app on a server. A nimble server will open new instances of your app to speed up performance, or relaunch your app on a bigger server when it becomes popular. 
How should you ensure that your app can find and use its data trail along the way?</p><p>Shiny Server developer Jeff Allen explains the best ways to share data between sessions in <a href="https://shiny.rstudio.com/articles/share-data.html">Share data across sessions with ShinyApps.io</a>, a new article at the Shiny Dev Center.</p></description></item><item><title>Registration now open for Master R Developer workshop in San Francisco</title><link>https://www.rstudio.com/blog/registration-now-open-for-master-r-developer-workshop-in-san-francisco/</link><pubDate>Mon, 29 Sep 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/registration-now-open-for-master-r-developer-workshop-in-san-francisco/</guid><description><p>Registration is now open for the next Master R Developer workshop led by Hadley Wickham, author of over 30 R packages and the <a href="http://adv-r.had.co.nz/">Advanced R</a> book. The workshop will be held on January 19 and 20 in the San Francisco Bay Area.</p><p>The workshop is a two-day course on advanced R practices and package development. You&rsquo;ll learn the three main paradigms of R programming: functional programming, object oriented programming and metaprogramming, as well as how to make R packages, the key to well-documented, well-tested and easily-distributed R code.</p><p>To learn more or register, visit <a href="http://rstudio-sfbay.eventbrite.com">rstudio-sfbay.eventbrite.com</a>.</p></description></item><item><title>testthat 0.9</title><link>https://www.rstudio.com/blog/testthat-0-9/</link><pubDate>Tue, 23 Sep 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/testthat-0-9/</guid><description><p>testthat 0.9 is now available on CRAN. Testthat makes it easy to turn the informal testing that you&rsquo;re already doing into formal automated tests. 
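</p><p>A minimal formal test looks like this (a sketch; the expectations are arbitrary):</p><pre><code>library(testthat)
test_that(&quot;basic arithmetic works&quot;, {
  expect_equal(1 + 1, 2)
  expect_true(2 &gt; 1)
})
</code></pre><p>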
Learn more at <a href="http://r-pkgs.had.co.nz/tests.html">http://r-pkgs.had.co.nz/tests.html</a></p><p>This version of testthat has four important new features that bring testthat up to speed with unit testing frameworks in other languages:</p><ul><li>You can <code>skip()</code> tests with an informative message, if their prerequisites are not available. This is particularly useful for CRAN packages, since tests only have a limited amount of time to run. Use <code>skip_on_cran()</code> to skip selected tests when run on CRAN.</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">test_that</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">a complicated simulation takes a long time&#34;</span>, {<span style="color:#06287e">skip_on_cran</span>()<span style="color:#007020;font-weight:bold">...</span>})</code></pre></div><ul><li>Experiment with behaviour-driven development with the new <code>describe()</code> function contributed by <a href="https://github.com/dirkschumacher">Dirk Schumacher</a>:</li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">describe</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">matrix()&#34;</span>, {<span style="color:#06287e">it</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">can be multiplied by a scalar&#34;</span>, {m1 <span style="color:#666">&lt;-</span> <span style="color:#06287e">matrix</span>(<span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">4</span>, <span style="color:#40a070">2</span>, <span style="color:#40a070">2</span>)m2 <span style="color:#666">&lt;-</span> m1 <span style="color:#666">*</span> <span style="color:#40a070">2</span><span 
style="color:#06287e">expect_equivalent</span>(<span style="color:#06287e">matrix</span>(<span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">4</span> <span style="color:#666">*</span> <span style="color:#40a070">2</span>, <span style="color:#40a070">2</span>, <span style="color:#40a070">2</span>), m2)})})</code></pre></div><ul><li><p>Use <code>with_mock()</code> to &ldquo;mock&rdquo; functions, replacing slow, resource intensive or inconsistent functions with your own quick approximations. This is particularly useful when you want to test functions that call web APIs without being connected to the internet. Contributed by <a href="https://github.com/krlmlr">Kirill Müller</a>.</p></li><li><p>Sometimes it&rsquo;s difficult to figure out exactly what a function should return and instead you just want to make sure that it returned the same thing as the last time you ran it. A new expectation, <code>expect_equal_to_reference()</code>, makes this easy to do. Contributed by <a href="https://github.com/jonclayden">Jon Clayden</a>.</p></li></ul><p>Other changes of note: <code>auto_test_package()</code> is working again (and uses <code>devtools::load_all()</code> to load the code), random praise has been re-enabled (after being accidentally disabled), and <code>expect_identical()</code> works better with R-devel. 
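</p><p>As an illustration of the <code>with_mock()</code> helper described above (a sketch; <code>has_internet()</code> is a hypothetical package function standing in for a network-bound call):</p><pre><code>test_that(&quot;works without a network connection&quot;, {
  with_mock(
    has_internet = function() FALSE,
    expect_false(has_internet())
  )
})
</code></pre><p>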
See the <a href="https://github.com/hadley/testthat/releases/tag/v0.9">release notes</a> for a complete list of changes.</p></description></item><item><title>Track how visitors use your Shiny app with Google Analytics</title><link>https://www.rstudio.com/blog/track-how-visitors-use-your-shiny-app-with-google-analytics/</link><pubDate>Mon, 08 Sep 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/track-how-visitors-use-your-shiny-app-with-google-analytics/</guid><description><p>Want to see who is using your Shiny apps and what they are doing while they are there?</p><p>Google Analytics is a popular way to track traffic to your website. With Google Analytics, you can see what sort of person comes to your website, where they arrive from, and what they do while they are there.</p><p>Since Shiny apps are web pages, you can also use Google Analytics to keep an eye on who visits your app and how they use it.</p><p><a href="https://shiny.rstudio.com/articles/google-analytics.html">Add Google Analytics to a Shiny app</a> is a new article at the <a href="https://shiny.rstudio.com">Shiny dev center</a> that will show you how to set up analytics for a Shiny app. Some knowledge of jQuery is required.</p></description></item><item><title>Packrat on CRAN</title><link>https://www.rstudio.com/blog/packrat-on-cran/</link><pubDate>Fri, 05 Sep 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/packrat-on-cran/</guid><description><p>Packrat is now available <a href="http://cran.r-project.org/web/packages/packrat/">on CRAN</a>, with version 0.4.1-1! 
Packrat is an R package that helps you manage your project&rsquo;s R package dependencies in an isolated, reproducible and portable way.</p><p>Install packrat from CRAN with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">packrat&#34;</span>)</code></pre></div><p>In particular, this release provides better support for local repositories. Local repositories are just folders containing package sources (currently as folders themselves).</p><p>One can now specify local repositories on a per-project basis by using:</p><pre><code>packrat::set_opts(local.repos = &lt;pathToRepo&gt;)</code></pre><p>and then using</p><pre><code>packrat::install_local(&lt;pkgName&gt;)</code></pre><p>to install that package from the local repository.</p><p>There is also experimental support for a global &lsquo;cache&rsquo; of packages, which can be used across projects. If you wish to enable this feature, you can use (note that it is disabled by default):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">packrat<span style="color:#666">::</span><span style="color:#06287e">set_opts</span>(use.cache <span style="color:#666">=</span> <span style="color:#007020;font-weight:bold">TRUE</span>)</code></pre></div><p>in each project where you would utilize the cache.</p><p>By doing so, if one project installs or uses e.g. 
Shiny 0.10.1 from CRAN, and another project uses that same package, packrat will look up the installed version of that package in the cache. This should greatly speed up initialization for new projects whose large dependency chains overlap with those of other packrat projects.</p><p>In addition, this release provides a number of usability improvements and bug fixes for Windows.</p><p>Please visit <a href="http://rstudio.github.io/packrat/">rstudio.github.io/packrat</a> for more information and a guide to getting started with Packrat.</p></description></item><item><title>httr 0.5</title><link>https://www.rstudio.com/blog/httr-0-5/</link><pubDate>Wed, 03 Sep 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/httr-0-5/</guid><description><p>httr 0.5 is now available on CRAN. The httr package makes it easy to talk to web APIs from R. Learn more in the <a href="http://cran.r-project.org/web/packages/httr/vignettes/quickstart.html">quick start</a> vignette.</p><p>This release is mostly bug fixes and minor improvements, but there is one major new feature: you can now save response bodies directly to disk.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(httr)<span style="color:#60a0b0;font-style:italic"># Download the latest version of rstudio for windows</span>url <span style="color:#666">&lt;-</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">http://download1.rstudio.org/RStudio-0.98.1049.exe&#34;</span><span style="color:#06287e">GET</span>(url, <span style="color:#06287e">write_disk</span>(<span style="color:#06287e">basename</span>(url)), <span style="color:#06287e">progress</span>())</code></pre></div><p>There is also some preliminary support for HTTP caching (see <code>cache_info()</code> and <code>rerequest()</code>). 
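</p><p>As a sketch of this preliminary API (the URL below is just an example endpoint that sets a <code>Cache-Control</code> header; the behaviour may change):</p><pre><code class="language-r">library(httr)

r = GET("http://httpbin.org/cache/60")  # response is cacheable for 60 seconds
cache_info(r)    # reports whether, and for how long, the response can be cached
r2 = rerequest(r)  # returns the cached response while fresh, refetches otherwise</code></pre><p>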
See the <a href="https://github.com/hadley/httr/releases/tag/v0.5">release notes</a> for complete details.</p></description></item><item><title>Master R Developer Workshop - San Francisco, January 19-20</title><link>https://www.rstudio.com/blog/master-r-developer-workshop-san-francisco-january-19-20/</link><pubDate>Mon, 11 Aug 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/master-r-developer-workshop-san-francisco-january-19-20/</guid><description><p>RStudio is planning a new Master R Developer Workshop to be taught by Hadley Wickham in the San Francisco Bay Area on January 19-20. This will be the same workshop that Hadley is teaching in <a href="http://rstudionyc.eventbrite.com">September in New York City</a> to a sold out audience.</p><p>If you did not get a chance to register for the NYC workshop but wished to, consider attending the January Bay Area workshop. We will open registration once we have planned out all of the event details. If you would like to be notified when registration opens, leave a contact address <a href="http://pages.rstudio.net/Master-R-Workshop.html">here</a>.</p></description></item><item><title>Come see RStudio at JSM in Boston</title><link>https://www.rstudio.com/blog/come-see-rstudio-at-jsm-in-boston/</link><pubDate>Fri, 01 Aug 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/come-see-rstudio-at-jsm-in-boston/</guid><description><p>The Joint Statistical Meetings (JSM) start this weekend! We wanted to let you know we&rsquo;ll be there. 
Be sure to check out these sessions from RStudio and friends:</p><p>Sunday, August 3</p><ul><li>4:00 PM: A Web Application for Efficient Analysis of Peptide Libraries: Eric Hare*+ and Timo Sieber and Heike Hofmann</li><li>4:00 PM: Gravicom: A Web-Based Tool for Community Detection in Networks: Andrea Kaplan*+ and Heike Hofmann and Daniel Nordman</li></ul><p>Monday, August 4</p><ul><li>8:35 AM: Preparing Students for Big Data Using R and RStudio: Randall Pruim</li><li>8:55 AM: Thinking with Data in the Second Course: Nicholas J. Horton and Ben S. Baumer and Hadley Wickham</li><li>8:55 AM: Doing Reproducible Research Unconsciously: Higher Standard, but Less Work: Yihui Xie</li><li>10:30 AM: Interactive Web Application with Shiny: Bharat Bahadur</li><li>2:00 PM: Interactive Web Application with Shiny: Bharat Bahadur</li></ul><p>Tuesday, August 5</p><ul><li>2:00 PM: Give Me an Old Computer, a Blank DVD, and an Internet Connection and I&rsquo;ll Give You World-Class Analytics: Ty Henkaline</li></ul><p>Wednesday, August 6</p><ul><li>10:35 AM: Shiny: Easy Web Applications in R: Joseph Cheng</li><li>11:00 AM: ggvis: Moving Toward a Grammar of Interactive Graphics: Hadley Wickham</li></ul><p>For even more talks on R, we thought Joseph Rickert&rsquo;s &ldquo;Data Scientists and R Users Guide to the JSM&rdquo; was excellent: <a href="http://blog.revolutionanalytics.com/2014/07/a-data-scientists-and-r-users-guide-to-the-jsm.html">http://blog.revolutionanalytics.com/2014/07/a-data-scientists-and-r-users-guide-to-the-jsm.html</a></p><p>While you&rsquo;re at the conference, please come by our exhibition area (Booth #112) to say hello. J.J., Hadley and other members of the team will be there. 
We&rsquo;ve got enough space to talk about your plans for R and how RStudio Server Pro and Shiny Server Pro can provide enterprise-ready support and scalability for your RStudio IDE and Shiny Server deployments.</p><p>We hope to see you there!</p></description></item><item><title>Shiny 0.10.1</title><link>https://www.rstudio.com/blog/shiny-0-10-1/</link><pubDate>Fri, 01 Aug 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-0-10-1/</guid><description><p><a href="http://cran.rstudio.com/package=shiny">Shiny v0.10.1</a> has been released to CRAN. You can either install it from a CRAN mirror, or update it if you have installed a previous version.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">install.packages</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">shiny&#39;</span>, repos <span style="color:#666">=</span> <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">http://cran.rstudio.com&#39;</span>)<span style="color:#60a0b0;font-style:italic"># or update your installed packages</span><span style="color:#60a0b0;font-style:italic"># update.packages(ask = FALSE, repos = &#39;http://cran.rstudio.com&#39;)</span></code></pre></div><p>The most prominent change in this patch release is that we added full Unicode support on Windows. Shiny apps running on Windows must use the UTF-8 encoding for ui.R and server.R (also the optional global.R, README.md, and DESCRIPTION) if they contain non-ASCII characters. 
See this article for details and <a href="https://shiny.rstudio.com/gallery/unicode-characters.html">examples</a>: <a href="https://shiny.rstudio.com/articles/unicode.html">https://shiny.rstudio.com/articles/unicode.html</a></p><p><img src="https://shiny.rstudio.com/gallery/images/screenshots/unicode-characters.png" alt="Chinese characters in a shiny app"></p><p class="caption">Chinese characters in a shiny app</p><p>Please note that although we require UTF-8 for the app components, UTF-8 is not a general requirement for any other files. If you read/write text files in an app, you are free to use any encoding you want, e.g. you can <code>readLines('foo.txt', encoding = 'Windows-1252')</code>. The article above explains this in detail.</p><p>Other changes include:</p><ul><li><p><code>runGitHub()</code> also allows the <code>'username/repo'</code> syntax now, which is equivalent to <code>runGitHub('repo', 'username')</code>. (<a href="https://github.com/rstudio/shiny/issues/427">#427</a>)</p></li><li><p><code>navbarPage()</code> now accepts a <code>windowTitle</code> parameter to set the web browser page title to something other than the title displayed in the navbar.</p></li><li><p>Added an <code>inline</code> argument to <code>textOutput()</code>, <code>imageOutput()</code>, <code>plotOutput()</code>, and <code>htmlOutput()</code>. When <code>inline = TRUE</code>, these outputs will be put in <code>span()</code> instead of the default <code>div()</code>. This occurs automatically when these outputs are created via inline expressions (e.g. <code>r renderText(expr)</code>) in R Markdown documents. See an R Markdown example at <a href="https://shiny.rstudio.com/gallery/inline-output.html">https://shiny.rstudio.com/gallery/inline-output.html</a> (<a href="https://github.com/rstudio/shiny/pull/512">#512</a>)</p></li><li><p>Added support for option groups in the select/selectize inputs. 
When the <code>choices</code> argument for <code>selectInput()</code>/<code>selectizeInput()</code> is a list of sub-lists and any sub-list is of length greater than 1, the HTML tag <code>&lt;optgroup&gt;</code> will be used. See an example at <a href="https://shiny.rstudio.com/gallery/option-groups-for-selectize-input.html">here</a> (<a href="https://github.com/rstudio/shiny/pull/542">#542</a>)</p></li></ul><p>Please let us know if you have any <a href="https://groups.google.com/forum/#!forum/shiny-discuss">comments</a> or <a href="http://stackoverflow.com/questions/tagged/shiny">questions</a>.</p></description></item><item><title>The R Markdown Cheat Sheet</title><link>https://www.rstudio.com/blog/the-r-markdown-cheat-sheet/</link><pubDate>Fri, 01 Aug 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/the-r-markdown-cheat-sheet/</guid><description><p>R Markdown is a framework for writing versatile, reproducible reports from R. With R Markdown, you write a simple plain text report and then render it to create polished output. You can:</p><ol><li><p>Transform your file into a pdf, html, or Microsoft Word document—even a slideshow—at the click of a button.</p></li><li><p>Embed R code into your report. When you render the file, R will run the code and insert its results into your report. Use this feature to add graphs and tables to your report: if your data ever changes, you can update your figures by re-rendering the report.</p></li><li><p>Make interactive documents and slideshows. Your report becomes interactive when you embed Shiny code.</p></li></ol><p>We&rsquo;ve created a cheat sheet to help you master R Markdown. Download your copy <a href="https://shiny.rstudio.com/articles/rm-cheatsheet.html">here</a>. 
You can also learn more about R Markdown at <a href="http://rmarkdown.rstudio.com">rmarkdown.rstudio.com</a> and <a href="https://shiny.rstudio.com/articles/rmarkdown.html">Introduction to R Markdown</a>.</p><p><a href="https://shiny.rstudio.com/articles/rm-cheatsheet.html"><img src="https://rstudioblog.files.wordpress.com/2014/08/rm-cheatsheet.png" alt="RM-cheatsheet"></a></p></description></item><item><title>httr 0.4</title><link>https://www.rstudio.com/blog/httr-0-4/</link><pubDate>Thu, 31 Jul 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/httr-0-4/</guid><description><p>httr 0.4 is now available on CRAN. The httr package makes it easy to talk to web APIs from R.</p><p>The most important new features are two new vignettes to <a href="http://cran.r-project.org/web/packages/httr/vignettes/quickstart.html">help you get started</a> and to help you make wrappers for <a href="http://cran.r-project.org/web/packages/httr/vignettes/api-packages.html">web APIs</a>. Other important improvements include:</p><ul><li><p>New <code>headers()</code> and <code>cookies()</code> functions to extract headers and cookies from responses. <code>status_code()</code> returns HTTP status codes.</p></li><li><p><code>POST()</code> (and <code>PUT()</code> and <code>PATCH()</code>) now have an <code>encode</code> argument that determines how the <code>body</code> is encoded. 
Valid values are &ldquo;multipart&rdquo;, &ldquo;form&rdquo; or &ldquo;json&rdquo;, and the <code>multipart</code> argument is now deprecated.</p></li><li><p><code>GET(..., progress())</code> will display a progress bar, useful if you&rsquo;re doing large uploads or downloads.</p></li><li><p><code>verbose()</code> gives you considerably more control over the degree of verbosity, and defaults have been selected to be more helpful for the most common cases.</p></li><li><p>NULL <code>query</code> parameters are now dropped automatically.</p></li></ul><p>There are a number of other minor improvements and bug fixes, as described in the <a href="https://github.com/hadley/httr/releases/tag/v0.4">release notes</a>.</p></description></item><item><title>Announcing Shiny Server Pro 1.2</title><link>https://www.rstudio.com/blog/announcing-shiny-server-pro-1-2/</link><pubDate>Thu, 24 Jul 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-shiny-server-pro-1-2/</guid><description><p>RStudio is very pleased to announce the general availability of Shiny Server Pro 1.2.</p><p><a href="https://www.rstudio.com/products/shiny-server-pro/">Download a free 45-day evaluation of Shiny Server Pro 1.2</a></p><p>Shiny Server Pro 1.2 adds support for R Markdown Interactive Documents in addition to Shiny applications. 
Learn more about Interactive Documents by registering for the <a href="http://pages.rstudio.net/Webniar-Series-2-Essential-Tools-for-R.html">Reproducible Reporting webinar</a> August 13 and <a href="http://pages.rstudio.net/Webniar-Series-3-Essential-Tools-for-R.html">Interactive Reporting webinar</a> September 3.</p><p>We are excited about the new ways in which you can now share your data analysis in Shiny Server Pro along with the security, management and performance tuning capabilities you and your IT teams need to scale.</p><p>Uncover all the features of Shiny Server Pro 1.2 in the updated <a href="http://rstudio.github.io/shiny-server/latest/">Shiny Server admin guide</a>&hellip;then give it a try!</p></description></item><item><title>New data packages</title><link>https://www.rstudio.com/blog/new-data-packages/</link><pubDate>Wed, 23 Jul 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/new-data-packages/</guid><description><p>I&rsquo;ve released four new data packages to CRAN: <a href="https://github.com/hadley/babynames">babynames</a>, <a href="https://github.com/hadley/fueleconomy">fueleconomy</a>, <a href="https://github.com/hadley/nasaweather">nasaweather</a> and <a href="https://github.com/hadley/nycflights13">nycflights13</a>. The goal of these packages is to provide some interesting, and relatively large, datasets to demonstrate various data analysis challenges in R. The package source code (on github, linked above) is fully reproducible so that you can see some data tidying in action, or make your own modifications to the data.</p><p>Below, I&rsquo;ve listed the primary dataset found in each package. Most packages also include a number of supplementary datasets that provide additional information. Check out the docs for more details.</p><ul><li><p><code>babynames::babynames</code>: US baby name data for each year from 1880 to 2013, the number of children of each sex given each name. All names used 5 or more times are included. 
1,792,091 rows, 5 columns (year, sex, name, n, prop). (Source: <a href="http://www.ssa.gov/oact/babynames/limits.html">Social security administration</a>).</p></li><li><p><code>fueleconomy::vehicles</code>: Fuel economy data for all cars sold in the US from 1984 to 2015. 33,442 rows, 12 variables. (Source: <a href="http://www.fueleconomy.gov/feg/download.shtml">Environmental protection agency</a>)</p></li><li><p><code>nasaweather::atmos</code>: Data from the 2006 ASA data expo. Contains monthly atmospheric measurements from Jan 1995 to Dec 2000 on 24 x 24 grid over Central America. 41,472 observations, 11 variables. (Source: <a href="http://stat-computing.org/dataexpo/2006/">ASA data expo</a>)</p></li><li><p><code>nycflights13::flights</code>: This package contains information about all flights that departed from NYC (i.e., EWR, JFK and LGA) in 2013: 336,776 flights with 16 variables. To help understand what causes delays, it also includes a number of other useful datasets: <code>weather</code>, <code>planes</code>, <code>airports</code>, <code>airlines</code>. (Source: <a href="http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236">Bureau of transportation statistics</a>)</p></li></ul><p>NB: since the datasets are large, I&rsquo;ve tagged each data frame with the <code>tbl_df</code> class. If you don&rsquo;t use dplyr, this has no effect. If you do use dplyr, this ensures that you won&rsquo;t accidentally print thousands of rows of data. Instead, you&rsquo;ll just see the first 10 rows and as many columns as will fit on screen. 
This makes interactive exploration much easier.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(dplyr)<span style="color:#06287e">library</span>(nycflights13)flights<span style="color:#60a0b0;font-style:italic">#&gt; Source: local data frame [336,776 x 16]</span><span style="color:#60a0b0;font-style:italic">#&gt;</span><span style="color:#60a0b0;font-style:italic">#&gt; year month day dep_time dep_delay arr_time arr_delay carrier tailnum</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 2013 1 1 517 2 830 11 UA N14228</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 2013 1 1 533 4 850 20 UA N24211</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 2013 1 1 542 2 923 33 AA N619AA</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 2013 1 1 544 -1 1004 -18 B6 N804JB</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 2013 1 1 554 -6 812 -25 DL N668DN</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 2013 1 1 554 -4 740 12 UA N39463</span><span style="color:#60a0b0;font-style:italic">#&gt; 7 2013 1 1 555 -5 913 19 B6 N516JB</span><span style="color:#60a0b0;font-style:italic">#&gt; 8 2013 1 1 557 -3 709 -14 EV N829AS</span><span style="color:#60a0b0;font-style:italic">#&gt; 9 2013 1 1 557 -3 838 -8 B6 N593JB</span><span style="color:#60a0b0;font-style:italic">#&gt; 10 2013 1 1 558 -2 753 8 AA N3ALAA</span><span style="color:#60a0b0;font-style:italic">#&gt; .. ... ... ... ... ... ... ... ... 
...</span><span style="color:#60a0b0;font-style:italic">#&gt; Variables not shown: flight (int), origin (chr), dest (chr), air_time</span><span style="color:#60a0b0;font-style:italic">#&gt; (dbl), distance (dbl), hour (dbl), minute (dbl)</span></code></pre></div></description></item><item><title>Announcing Packrat v0.4</title><link>https://www.rstudio.com/blog/announcing-packrat-v0-4/</link><pubDate>Tue, 22 Jul 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-packrat-v0-4/</guid><description><p>We&rsquo;re excited to announce a new release of <a href="http://rstudio.github.io/packrat/">Packrat</a>, a tool for making R projects more isolated and reproducible by managing their package dependencies.</p><p>This release brings a number of exciting features to Packrat that significantly improve the user experience:</p><ul><li><p><strong>Automatic snapshots</strong> ensure that new packages installed in your project library are automatically tracked by Packrat.</p></li><li><p><strong>Bundle and share your projects</strong> with <code>packrat::bundle()</code> and <code>packrat::unbundle()</code> &ndash; whether you want to freeze an analysis or share it with collaborators.</p></li><li><p><strong>Packrat mode</strong> can now be turned on and off at will, allowing you to navigate between different Packrat projects in a single R session. Use <code>packrat::on()</code> to activate Packrat in the current directory, and <code>packrat::off()</code> to turn it off.</p></li><li><p><strong>Local repositories</strong> (i.e., directories containing R package sources) can now be specified for projects, allowing local source packages to be used in a Packrat project alongside CRAN, BioConductor and GitHub packages (see this and more with <code>?&quot;packrat-options&quot;</code>).</p></li></ul><p>In addition, Packrat is now <a href="http://rstudio.github.io/packrat/rstudio.html">tightly integrated with the RStudio IDE</a>, making it easier to manage project dependencies than ever. 
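</p><p>For instance (a sketch; the bundle file name below is invented for this example), freezing a project and restoring it on a collaborator&rsquo;s machine looks like this:</p><pre><code class="language-r"># In the project you want to share: writes a tarball of the project
# sources together with its packrat lockfile and packages
packrat::bundle()

# On the collaborator&#39;s machine: recreate the project from the tarball
packrat::unbundle("my-analysis-2014-07-22.tar.gz", where = "~/projects")</code></pre><p>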
Download today&rsquo;s <a href="https://www.rstudio.com/products/rstudio/download/">RStudio IDE 0.98.978 release</a> and try it out!</p><p><a href="http://rstudio.github.io/packrat/rstudio.html"><img src="https://rstudioblog.files.wordpress.com/2014/07/screen-shot-2014-07-22-at-10-32-12-am.png" alt="Packrat RStudio package pane integration"></a></p><p>You can install the latest version of Packrat from <a href="http://www.github.com/rstudio/packrat">GitHub</a> with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">devtools<span style="color:#666">::</span><span style="color:#06287e">install_github</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">rstudio/packrat&#34;</span>)</code></pre></div><p>Packrat will be coming to CRAN soon as well.</p><p>If you try it, we&rsquo;d love to get your feedback. Leave a comment here or post in the <a href="https://groups.google.com/forum/#!forum/packrat-discuss">packrat-discuss Google group</a>.</p></description></item><item><title>Introducing tidyr</title><link>https://www.rstudio.com/blog/introducing-tidyr/</link><pubDate>Tue, 22 Jul 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/introducing-tidyr/</guid><description><p>tidyr is a new package that makes it easy to &ldquo;tidy&rdquo; your data. Tidy data is data that&rsquo;s easy to work with: it&rsquo;s easy to munge (with dplyr), visualise (with ggplot2 or ggvis) and model (with R&rsquo;s hundreds of modelling packages). The two most important properties of tidy data are:</p><ul><li><p>Each column is a variable.</p></li><li><p>Each row is an observation.</p></li></ul><p>Arranging your data in this way makes it easier to work with because you have a consistent way of referring to variables (as column names) and observations (as row indices). 
When you use tidy data and tidy tools, you spend less time worrying about how to feed the output from one function into the input of another, and more time answering your questions about the data.</p><p>To tidy messy data, you first identify the variables in your dataset, then use the tools provided by tidyr to move them into columns. tidyr provides three main functions for tidying your messy data: <code>gather()</code>, <code>separate()</code> and <code>spread()</code>.</p><p><code>gather()</code> takes multiple columns, and gathers them into key-value pairs: it makes &ldquo;wide&rdquo; data longer. Other names for gather include melt (reshape2), pivot (spreadsheets) and fold (databases). Here&rsquo;s an example of how you might use <code>gather()</code> on a made-up dataset. In this experiment we&rsquo;ve given three people two different drugs and recorded their heart rate:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(tidyr)<span style="color:#06287e">library</span>(dplyr)messy <span style="color:#666">&lt;-</span> <span style="color:#06287e">data.frame</span>(name <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Wilbur&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Petunia&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">Gregory&#34;</span>),a <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">67</span>, <span style="color:#40a070">80</span>, <span style="color:#40a070">64</span>),b <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#40a070">56</span>, <span style="color:#40a070">90</span>, <span style="color:#40a070">50</span>))messy<span style="color:#60a0b0;font-style:italic">#&gt; name a 
b</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Wilbur 67 56</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Petunia 80 90</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Gregory 64 50</span></code></pre></div><p>We have three variables (name, drug and heartrate), but only name is currently in a column. We use <code>gather()</code> to gather the a and b columns into key-value pairs of drug and heartrate:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">messy <span style="color:#666">%&gt;%</span><span style="color:#06287e">gather</span>(drug, heartrate, a<span style="color:#666">:</span>b)<span style="color:#60a0b0;font-style:italic">#&gt; name drug heartrate</span><span style="color:#60a0b0;font-style:italic">#&gt; 1 Wilbur a 67</span><span style="color:#60a0b0;font-style:italic">#&gt; 2 Petunia a 80</span><span style="color:#60a0b0;font-style:italic">#&gt; 3 Gregory a 64</span><span style="color:#60a0b0;font-style:italic">#&gt; 4 Wilbur b 56</span><span style="color:#60a0b0;font-style:italic">#&gt; 5 Petunia b 90</span><span style="color:#60a0b0;font-style:italic">#&gt; 6 Gregory b 50</span></code></pre></div><p>Sometimes two variables are clumped together in one column. <code>separate()</code> allows you to tease them apart (<code>extract()</code> works similarly but uses regexp groups instead of a splitting pattern or position). Take this example from <a href="http://stackoverflow.com/questions/9684671">stackoverflow</a> (modified slightly for brevity). We have some measurements of how much time people spend on their phones, measured at two locations (work and home), at two times. 
Each person has been randomly assigned to either treatment or control.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">set.seed</span>(<span style="color:#40a070">10</span>)
messy <span style="color:#666">&lt;-</span> <span style="color:#06287e">data.frame</span>(
  id <span style="color:#666">=</span> <span style="color:#40a070">1</span><span style="color:#666">:</span><span style="color:#40a070">4</span>,
  trt <span style="color:#666">=</span> <span style="color:#06287e">sample</span>(<span style="color:#06287e">rep</span>(<span style="color:#06287e">c</span>(<span style="color:#4070a0">&#39;</span><span style="color:#4070a0">control&#39;</span>, <span style="color:#4070a0">&#39;</span><span style="color:#4070a0">treatment&#39;</span>), each <span style="color:#666">=</span> <span style="color:#40a070">2</span>)),
  work.T1 <span style="color:#666">=</span> <span style="color:#06287e">runif</span>(<span style="color:#40a070">4</span>),
  home.T1 <span style="color:#666">=</span> <span style="color:#06287e">runif</span>(<span style="color:#40a070">4</span>),
  work.T2 <span style="color:#666">=</span> <span style="color:#06287e">runif</span>(<span style="color:#40a070">4</span>),
  home.T2 <span style="color:#666">=</span> <span style="color:#06287e">runif</span>(<span style="color:#40a070">4</span>))</code></pre></div><p>To tidy this data, we first use <code>gather()</code> to turn columns <code>work.T1</code>, <code>home.T1</code>, <code>work.T2</code> and <code>home.T2</code> into a key-value pair of key and time. 
(Only the first eight rows are shown to save space.)</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">tidier <span style="color:#666">&lt;-</span> messy <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">gather</span>(key, time, <span style="color:#666">-</span>id, <span style="color:#666">-</span>trt)
tidier <span style="color:#666">%&gt;%</span> <span style="color:#06287e">head</span>(<span style="color:#40a070">8</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; id trt key time</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 1 1 treatment work.T1 0.08514</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 2 2 control work.T1 0.22544</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 3 3 treatment work.T1 0.27453</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 4 4 control work.T1 0.27231</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 5 1 treatment home.T1 0.61583</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 6 2 control home.T1 0.42967</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 7 3 treatment home.T1 0.65166</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 8 4 control home.T1 0.56774</span></code></pre></div><p>Next we use <code>separate()</code> to split the key into location and time, using a regular expression to describe the character that separates them.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">tidy <span style="color:#666">&lt;-</span> tidier <span style="color:#666">%&gt;%</span>
  <span style="color:#06287e">separate</span>(key, into <span style="color:#666">=</span> <span style="color:#06287e">c</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">location&#34;</span>, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">time&#34;</span>), sep <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">\\.&#34;</span>)
tidy <span style="color:#666">%&gt;%</span> <span style="color:#06287e">head</span>(<span style="color:#40a070">8</span>)
<span style="color:#60a0b0;font-style:italic">#&gt; id trt location time time</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 1 1 treatment work T1 0.08514</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 2 2 control work T1 0.22544</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 3 3 treatment work T1 0.27453</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 4 4 control work T1 0.27231</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 5 1 treatment home T1 0.61583</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 6 2 control home T1 0.42967</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 7 3 treatment home T1 0.65166</span>
<span style="color:#60a0b0;font-style:italic">#&gt; 8 4 control home T1 0.56774</span></code></pre></div><p>The last tool, <code>spread()</code>, takes two columns (a key-value pair) and spreads them into multiple columns, making &ldquo;long&rdquo; data wider. Spread is known by other names in other places: it&rsquo;s cast in reshape2, unpivot in spreadsheets and unfold in databases. <code>spread()</code> is used when you have variables that form rows instead of columns. You need <code>spread()</code> less frequently than <code>gather()</code> or <code>separate()</code>; to learn more, check out the documentation and the demos.</p><p>Just as reshape2 did less than reshape, tidyr does less than reshape2. It&rsquo;s designed specifically for tidying data, not general reshaping. In particular, existing methods only work for data frames, and tidyr never aggregates. This makes each function in tidyr simpler: each function does one thing well. 
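As a quick illustration of the <code>spread()</code> behavior described above, here is a minimal sketch using a made-up long-format data frame (not the messy data from the example):

```r
library(tidyr)

# Illustrative long-format data: one row per (id, key) pair
long <- data.frame(
  id    = c(1, 1, 2, 2),
  key   = c("height", "weight", "height", "weight"),
  value = c(1.7, 70, 1.8, 80)
)

# spread() moves each unique key into its own column, making the data wider
wide <- spread(long, key, value)
```

After spreading, `wide` has one row per id and one column per key (`id`, `height`, `weight`).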
For more complicated operations you can string together multiple simple tidyr and dplyr functions with <code>%&gt;%</code>.</p><p>You can learn more about the underlying principles in my <a href="http://vita.had.co.nz/papers/tidy-data.html">tidy data paper</a>. To see more examples of data tidying, read the vignette, <code>vignette(&quot;tidy-data&quot;)</code>, or check out the demos, <code>demo(package = &quot;tidyr&quot;)</code>. Alternatively, check out some of the <a href="http://stackoverflow.com/search?tab=votes&amp;q=%5br%5d%20tidyr">great stackoverflow answers</a> that use tidyr. Keep up-to-date with development at <a href="http://github.com/hadley/tidyr">http://github.com/hadley/tidyr</a>, report bugs at <a href="http://github.com/hadley/tidyr/issues">http://github.com/hadley/tidyr/issues</a> and get help with data manipulation challenges at <a href="https://groups.google.com/group/manipulatr">https://groups.google.com/group/manipulatr</a>. If you ask a question specifically about tidyr on <a href="http://stackoverflow.com">stackoverflow</a>, please tag it with tidyr and I&rsquo;ll make sure to read it.</p></description></item><item><title>Master interactive documents at the Shiny Dev Center</title><link>https://www.rstudio.com/blog/master-interactive-documents-at-the-shiny-dev-center/</link><pubDate>Mon, 21 Jul 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/master-interactive-documents-at-the-shiny-dev-center/</guid><description><p>We&rsquo;ve added a new section of articles to the <a href="https://shiny.rstudio.com">Shiny Development Center</a>. 
These articles explain how to create interactive documents with Shiny and <a href="http://rmarkdown.rstudio.com">R Markdown</a>.</p><p>You&rsquo;ll learn how to</p><ul><li><p><strong>Use R Markdown to create reproducible, dynamic reports.</strong> R Markdown offers one of the most efficient workflows for writing up your R results.</p></li><li><p><strong>Create interactive documents and slideshows by embedding Shiny elements into an R Markdown report.</strong> The Shiny + R Markdown combo does more than just enhance your reports; R Markdown provides one of the quickest ways to make lightweight Shiny apps.</p></li><li><p><strong>Take advantage of RStudio&rsquo;s built-in features that support R Markdown.</strong></p></li></ul><p><a href="https://shiny.rstudio.com/articles"><img src="https://rstudioblog.files.wordpress.com/2014/07/interactive-articles-001.png" alt="interactive-articles.001"></a></p><p>Learn more at <a href="https://shiny.rstudio.com/articles/">shiny.rstudio.com/articles</a></p></description></item><item><title>RStudio presents Essential Tools for Data Science with R</title><link>https://www.rstudio.com/blog/rstudio-presents-essential-tools-for-data-science-with-r/</link><pubDate>Wed, 16 Jul 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-presents-essential-tools-for-data-science-with-r/</guid><description><p>The RStudio team recently rolled out new capabilities in RStudio, shiny, ggvis, dplyr, knitr, R Markdown, and packrat. The &ldquo;Essential Tools for Data Science with R&rdquo; <strong>free webinar</strong> series is the perfect place to learn more about the power of these R packages from the authors themselves.</p><p><a href="http://pages.rstudio.net/Webniar-Series-Essential-Tools-for-R.html">Click to learn more and register</a> for one or more webinar sessions. You must register for each separately. 
If you miss a live webinar or want to review them, recorded versions will be available to registrants within 30 days.</p><p><strong>The Grammar and Graphics of Data Science</strong> Live! Wednesday, July 30 at 11am Eastern Time US <a href="http://pages.rstudio.net/Webniar-Series-Essential-Tools-for-R.html">Click to register</a></p><ul><li><p>dplyr: a grammar of data manipulation – <strong>Hadley Wickham</strong></p></li><li><p>ggvis: Interactive graphics in R - <strong>Winston Chang</strong></p></li></ul><p><strong>Reproducible Reporting</strong> Live! Wednesday, August 13 at 11am Eastern Time US <a href="http://pages.rstudio.net/Webniar-Series-2-Essential-Tools-for-R.html">Click to register</a></p><ul><li><p>The Next Generation of R Markdown – <strong>Jeff Allen</strong></p></li><li><p>Knitr Ninja – <strong>Yihui Xie</strong></p></li><li><p>Packrat – A Dependency Management System for R – <strong>J.J. Allaire &amp; Kevin Ushey</strong></p></li></ul><p><strong>Interactive Reporting</strong> Live! Wednesday, September 3 at 11am Eastern Time US <a href="http://pages.rstudio.net/Webniar-Series-3-Essential-Tools-for-R.html">Click to register</a></p><ul><li><p>Embedding Shiny Apps in R Markdown documents – <strong>Garrett Grolemund</strong></p></li><li><p>Shiny: R made interactive – <strong>Joe Cheng</strong></p></li></ul></description></item><item><title>R Day at Strata NYC</title><link>https://www.rstudio.com/blog/r-day-at-strata-nyc/</link><pubDate>Tue, 08 Jul 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-day-at-strata-nyc/</guid><description><p>RStudio will teach the new essentials for doing data science in R at this year&rsquo;s <a href="http://strataconf.com/stratany2014/public/schedule/detail/37037">Strata NYC conference</a>, Oct 15 2014.</p><p>R Day at Strata is a full day of tutorials that will cover some of the most useful topics in R. 
You&rsquo;ll learn how to manipulate and visualize data with R, as well as how to write reproducible, interactive reports that foster collaboration. Topics include:</p><p>9:00am – 10:30am: A Grammar of Data Manipulation with dplyr. Speaker: Hadley Wickham</p><p>11:00am – 12:30pm: A Reactive Grammar of Graphics with ggvis. Speaker: Winston Chang</p><p>1:30pm – 3:00pm: Analytic Web Applications with Shiny. Speaker: Garrett Grolemund</p><p>3:30pm – 5:00pm: Reproducible R Reports with Packrat and Rmarkdown. Speaker: JJ Allaire &amp; Yihui Xie</p><p>The tutorials are integrated into a cohesive day of instruction. Many of the tools that we&rsquo;ll cover did not exist six months ago, so you are almost certain to learn something new. You will get the most out of the day if you already know how to load data into R and have some basic experience visualizing and manipulating data.</p><p>Visit <a href="http://strataconf.com/stratany2014/public/schedule/detail/37037">strataconf.com/stratany2014</a> to learn more and register! <a href="https://en.oreilly.com/stratany2014/public/register">Early bird pricing</a> ends July 31.</p><p>Not available on October 15? Check out Hadley&rsquo;s <a href="http://rstudionyc.eventbrite.com">Advanced R Workshop</a> in New York City on September 8 and 9, 2014.</p></description></item><item><title>Shiny cheat sheet</title><link>https://www.rstudio.com/blog/shiny-cheat-sheet/</link><pubDate>Mon, 30 Jun 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-cheat-sheet/</guid><description><p>Shiny v0.10 comes with a quick, handy guide. Use the Shiny cheat sheet as a quick reference for building Shiny apps. 
The cheat sheet will guide you from structuring your app, to writing a reactive foundation with server.R, to laying out and deploying your app.</p><p><a href="https://shiny.rstudio.com/articles/cheatsheet.html"><img src="https://rstudioblog.files.wordpress.com/2014/06/cheatsheet.png" alt="cheatsheet"></a></p><p>You can find the <a href="https://shiny.rstudio.com/articles/cheatsheet.html">Shiny cheat sheet</a> along with many more resources for using Shiny at the Shiny Dev Center, <a href="https://shiny.rstudio.com">shiny.rstudio.com</a>.</p><p>(p.s. Visit the RStudio booth at useR! today for a free hard copy of the cheat sheet.)</p></description></item><item><title>Come see RStudio at UseR! 2014</title><link>https://www.rstudio.com/blog/come-see-rstudio-at-user-2014/</link><pubDate>Tue, 24 Jun 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/come-see-rstudio-at-user-2014/</guid><description><p><a href="http://user2014.stat.ucla.edu/">The R User Conference 2014</a> is coming up fast in Los Angeles!</p><p>RStudio will be there in force to share the latest enhancements to shiny, ggvis, knitr, dplyr, R Markdown, packrat, and more. Here&rsquo;s a quick snapshot of our scheduled sessions. 
We hope to see you in as many of them as you can attend!</p><p><strong>Monday, June 30</strong></p><p>Morning Tutorials</p><ul><li><p><em>Interactive graphics with ggvis</em> - Winston Chang</p></li><li><p><em>Dynamic Documents with R and knitr</em> - Yihui Xie</p></li></ul><p>Afternoon Tutorials</p><ul><li><p><em>Data manipulation with dplyr</em> - Hadley Wickham</p></li><li><p><em>Interactive data display with Shiny and R</em> - Garrett Grolemund</p></li></ul><p><strong>Tuesday, July 1</strong></p><p>Session 1, 10:30, Room - Palisades: <em>ggvis: Interactive graphics in R</em> - Winston Chang</p><p>Session 2, 13:00, Room - Palisades: <em>Shiny: R made interactive</em> - Joe Cheng</p><p>Session 3, 16:00, Room - Palisades: <em>dplyr: a grammar of data manipulation</em> - Hadley Wickham</p><p><strong>Wednesday, July 2</strong></p><p>Session 5, 16:00, Room - Palisades: <em>Packrat - A Dependency Management System for R</em> - J.J. Allaire</p><p><strong>Thursday, July 3</strong></p><p>Session 6, 10:00, Room - Palisades: <em>The Next Generation of R Markdown</em> - J.J. Allaire; <em>Knitr Ninja</em> - Yihui Xie; <em>Embedding Shiny Apps in R Markdown documents</em> - Garrett Grolemund</p><p><strong>Every Day</strong></p><p>Don&rsquo;t miss our table in the exhibition area during the conference. Come talk to us about your plans for R and learn how RStudio Server Pro and Shiny Server Pro can provide enterprise-ready support and scalability for your RStudio IDE and Shiny deployments.</p></description></item><item><title>Introducing ggvis</title><link>https://www.rstudio.com/blog/introducing-ggvis/</link><pubDate>Mon, 23 Jun 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/introducing-ggvis/</guid><description><p>Our first public release of <a href="http://ggvis.rstudio.com/">ggvis</a>, version 0.3, is now available on CRAN. What is ggvis? It&rsquo;s a new package for data visualization. 
Like ggplot2, it is built on concepts from the grammar of graphics, but it also adds interactivity, a new data pipeline, and it renders in a web browser. Our goal is to make an interface that&rsquo;s flexible, so that you can compose new kinds of visualizations, yet simple, so that it&rsquo;s accessible to all R users.</p><p><strong>Update:</strong> there was an issue affecting interactive plots in version 0.3. Version 0.3.0.1 fixes the issue. The updated source package is now on CRAN, and Windows and Mac binary packages will be available shortly.</p><p><img src="https://rstudioblog.files.wordpress.com/2014/06/ggvis_movies.gif" alt="ggvis_movies"></p><p>ggvis integrates with Shiny, so you can use dynamic, interactive ggvis graphics in Shiny applications. We hope that the combination of ggvis and Shiny will make it easy for you to create applications for interactive data exploration and presentation. ggvis plots are inherently reactive and they render in the browser, so they can take advantage of the capabilities provided by modern web browsers. You can use Shiny&rsquo;s interactive components for interactivity as well as more direct forms of interaction with the plot, such as hovering, clicking, and brushing.</p><p>ggvis works seamlessly with <a href="https://www.rstudio.com/blog/r-markdown-v2/">R Markdown v2</a> and <a href="https://www.rstudio.com/blog/interactive-documents-an-incredibly-easy-way-to-use-shiny/">interactive documents</a>, so you can easily add interactive graphics to your R Markdown documents:</p><p><img src="https://rstudioblog.files.wordpress.com/2014/06/shiny-doc-ggvis1.png" alt="shiny-doc-ggvis"> <img src="https://rstudioblog.files.wordpress.com/2014/06/ggvis_density.gif" alt="ggvis_density"></p><p>And don&rsquo;t worry &ndash; ggvis isn&rsquo;t only meant to be used with Shiny and interactive documents. 
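For a flavor of the grammar-of-graphics syntax described above, a basic static ggvis plot looks like this (a minimal sketch using the built-in mtcars data):

```r
library(ggvis)  # ggvis re-exports the %>% pipe

# A basic scatterplot: pipe a data frame into ggvis(), then add a layer
p <- mtcars %>%
  ggvis(~wt, ~mpg) %>%
  layer_points()
```

Printing `p` renders the plot in the viewer pane or a web browser.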
Because the RStudio IDE is also a web browser, ggvis plots can display in the IDE, like any other R graphics:</p><p><img src="https://rstudioblog.files.wordpress.com/2014/06/ggvis-screenshot.png" alt="ggvis in RStudio IDE"></p><p>There&rsquo;s much more to come with ggvis. To learn more, visit the <a href="http://ggvis.rstudio.com/">ggvis website</a>.</p><p>Please note that ggvis is still young, and lacks a number of important features from ggplot2. But we&rsquo;re working hard on ggvis and expect many improvements in the months to come.</p></description></item><item><title>Shiny 0.10</title><link>https://www.rstudio.com/blog/shiny-0-10/</link><pubDate>Fri, 20 Jun 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-0-10/</guid><description><p>Shiny 0.10 is now available on CRAN.</p><h2 id="interactive-documents">Interactive documents</h2><p>In this release, the biggest changes were under the hood to support the creation of <a href="https://rmarkdown.rstudio.com/authoring_shiny.html">interactive documents</a>. 
If you haven&rsquo;t had a chance to check out interactive documents, we really encourage you to do so&mdash;it may be the <a href="https://blog.rstudio.com/2014/06/19/interactive-documents-an-incredibly-easy-way-to-use-shiny/">easiest way to learn Shiny</a>.</p><h2 id="new-layout-functions">New layout functions</h2><p>Three new functions&mdash;<a href="https://shiny.rstudio.com/reference/shiny/latest/flowLayout.html"><code>flowLayout()</code></a>, <a href="https://shiny.rstudio.com/reference/shiny/latest/splitLayout.html"><code>splitLayout()</code></a>, and <a href="https://shiny.rstudio.com/reference/shiny/latest/inputPanel.html"><code>inputPanel()</code></a>&mdash;were added for putting UI elements side by side.</p><ul><li><p><code>flowLayout()</code> lays out its children in a left-to-right, top-to-bottom arrangement.</p></li><li><p><code>splitLayout()</code> evenly divides its horizontal space among its children (or unevenly divides if the cellWidths argument is provided).</p></li><li><p><code>inputPanel()</code> is like <code>flowLayout()</code>, but with a light grey background, and is intended for encapsulating small input controls wherever vertical space is at a premium.</p></li></ul><p>A new logical argument <code>inline</code> was also added to <code>checkboxGroupInput()</code> and <code>radioButtons()</code> to arrange check boxes and radio buttons horizontally.</p><h2 id="custom-validation-error-messages">Custom validation error messages</h2><p>Sometimes you don&rsquo;t want your reactive expressions or output renderers in server.R to proceed unless certain input conditions are satisfied, e.g. a select input value has been chosen, or a sensible combination of inputs has been provided. In these cases, you might want to stop the render function quietly, or you might want to give the user a custom message. 
In shiny 0.10.0, we introduced the functions <a href="https://shiny.rstudio.com/reference/shiny/latest/validate.html"><code>validate()</code> and <code>need()</code></a> which you can use to enforce validation conditions. This won&rsquo;t be the last word on input validation in Shiny, but it should be a lot safer and more convenient than how most of us have been doing it.</p><p>See the article <a href="https://shiny.rstudio.com/articles/validation.html">Write error messages for your UI with validate</a> for details and examples.</p><h2 id="sever-side-processing-for-selectize-input">Server-side processing for Selectize input</h2><p>In the previous release of Shiny, we added support for <a href="http://brianreavis.github.io/selectize.js/">Selectize</a>, a powerful select box widget. At that time, our implementation passed all of the data to the web page and used JavaScript to do any paging, filtering, and sorting. It worked great for small numbers of items but didn&rsquo;t scale well beyond a few thousand items.</p><p>For Shiny 0.10, we greatly improved the performance of our existing client-side Selectize binding, but also added a new mode that allows the paging, filtering, and sorting to all happen on the server. Only the results that are actually displayed are downloaded to the client. This approach works well for hundreds of thousands or millions of rows.</p><p>For more details and examples, see the article <a href="https://shiny.rstudio.com/articles/selectize.html">Using selectize input</a> on <a href="https://shiny.rstudio.com/">shiny.rstudio.com</a>.</p><h2 id="htmltools">htmltools</h2><p>We also split off Shiny&rsquo;s HTML-generating library (<code>tags</code> and friends) into a separate <a href="http://cran.rstudio.com/web/packages/htmltools/index.html">htmltools</a> package. If you&rsquo;re writing a package that needs to generate HTML programmatically, it&rsquo;s far easier and safer to use htmltools than to paste HTML strings together yourself. 
We&rsquo;ll have more to share about htmltools in the months to come.</p><h2 id="other-changes">Other changes</h2><ul><li><p>New <a href="https://shiny.rstudio.com/reference/shiny/latest/actionButton.html"><code>actionLink()</code></a> input control: behaves like <code>actionButton()</code> but looks like a link</p></li><li><p><code>renderPlot()</code> now calls <code>print()</code> on its result if it&rsquo;s visible&ndash;no more explicit <code>print()</code> required for ggplot2</p></li><li><p>Sliders and select boxes now use a fixed horizontal size instead of filling up all available horizontal space; pass <code>width=&quot;100%&quot;</code> if you need the old behavior</p></li><li><p>The <code>session</code> object that can be passed into a server function is now documented: see <code>?session</code></p></li><li><p>New <a href="https://shiny.rstudio.com/reference/shiny/latest/domains.html">reactive domains</a> feature makes it easy to get callbacks when the current session ends, without having to pass <code>session</code> everywhere</p></li><li><p>Thanks to reactive domains, by default, observers now automatically stop executing when the Shiny session that created them ends</p></li><li><p><code>shinyUI</code> and <code>shinyServer</code></p></li></ul><p>For the full list, you can take a look at the <a href="https://raw.githubusercontent.com/rstudio/shiny/v/0/10/0/NEWS">NEWS file</a>. 
Please let us know if you have any comments or questions.</p></description></item><item><title>Interactive documents: An incredibly easy way to use Shiny</title><link>https://www.rstudio.com/blog/interactive-documents-an-incredibly-easy-way-to-use-shiny/</link><pubDate>Thu, 19 Jun 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/interactive-documents-an-incredibly-easy-way-to-use-shiny/</guid><description><p>R Markdown&rsquo;s new <a href="https://rmarkdown.rstudio.com/authoring_shiny.html">interactive documents</a> provide a quick, light-weight way to use <a href="https://shiny.rstudio.com">Shiny</a>. An interactive document embeds Shiny elements in an <a href="http://rmarkdown.rstudio.com">R Markdown</a> report. The report becomes &ldquo;live&rdquo;, a choose your own adventure that readers can control and explore. Interactive documents are easy to create and easy to share.</p><h2 id="create-an-interactive-document">Create an interactive document</h2><p>To create an interactive document use RStudio to create a new R Markdown file, choose the Shiny document template, then click &ldquo;Run Document&rdquo; to show a preview:</p><p><img src="https://rstudioblog.files.wordpress.com/2014/06/storms-002.png" alt="storms.002"></p><p><a href="https://rmarkdown.rstudio.com/authoring_rcodechunks.html">Embed R code chunks</a> in your report where you like. Interactive documents use the same syntax as R Markdown and <a href="http://yihui.name/knitr/">knitr</a>. Set <code>echo = FALSE</code>. Your reader won&rsquo;t see the code, just its results.</p><p><img src="https://rstudioblog.files.wordpress.com/2014/06/storms2-001.png" alt="storms2.001"></p><p>Include <a href="https://shiny.rstudio.com/gallery/widgets-gallery.html">Shiny widgets</a> and <a href="https://rmarkdown.rstudio.com/authoring_shiny.html#inputs-and-outputs">outputs</a> in your code chunks. R Markdown will insert the widgets directly into your final document. 
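For example, a code chunk in an interactive document might pair a widget with a reactive output like this (an illustrative sketch, not taken from the storms document shown above):

```r
library(shiny)  # loaded automatically in a Shiny document; shown here for clarity

# A widget and a reactive output, as they would appear inside a chunk
# with echo = FALSE:
sliderInput("bins", "Number of bins:", min = 1, max = 50, value = 30)

renderPlot({
  hist(faithful$eruptions, breaks = input$bins)
})
```

When the document runs, moving the slider re-renders the histogram automatically.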
When a reader toggles a widget, the parts of the document that depend on it will update instantly.</p><p><img src="https://rstudioblog.files.wordpress.com/2014/06/storms-003.png" alt="storms.003"></p><p>That&rsquo;s it! No extra files are needed.</p><p>Note that in order to use interactive documents you should be running the <a href="https://www.rstudio.com/products/rstudio/">latest version</a> of RStudio (v0.98.932 or higher). Alternatively, if you are not using RStudio, be sure to follow the directions <a href="https://rmarkdown.rstudio.com/authoring_shiny.html#prerequisites">here</a> to install all of the required components.</p><h2 id="share-your-document">Share your document</h2><p>Interactive documents can be run locally on the desktop or deployed to Shiny Server v1.2 or <a href="http://shinyapps.io/">ShinyApps</a> just like any other Shiny application. See the RMarkdown v2 website for more details on <a href="https://rmarkdown.rstudio.com/authoring_shiny.html#deployment">deploying interactive documents</a>.</p><h2 id="use-pre-packaged-tools">Use pre-packaged tools</h2><p>Interactive documents make it easy to insert powerful tools into a report. For example, you can insert a kmeans clustering tool into your document with one line of code, as below. <code>kmeans_cluster</code> is a widget built from a Shiny app and intended for use in interactive documents.</p><p><img src="https://rstudioblog.files.wordpress.com/2014/06/storms-004.png" alt="storms.004"></p><p>You can <a href="https://rmarkdown.rstudio.com/authoring_shiny_widgets.html">build your own widgets</a> with <code>shinyApp</code>, a new function that repackages Shiny apps as functions. <code>shinyApp</code> is easy to use. Its first argument takes the code that appears in an app&rsquo;s ui.R file. The second argument takes the code that appears in the app&rsquo;s server.R file. 
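A minimal sketch of that pattern (a hypothetical app, not the <code>kmeans_cluster</code> source):

```r
library(shiny)

# shinyApp() takes ui.R-style code and server.R-style code as arguments
# and returns an app object that can be printed, run, or embedded.
app <- shinyApp(
  ui = fluidPage(
    sliderInput("n", "Number of points", min = 10, max = 100, value = 50),
    plotOutput("scatter")
  ),
  server = function(input, output) {
    output$scatter <- renderPlot(plot(rnorm(input$n), rnorm(input$n)))
  }
)
```

Printing `app` (or passing it to `runApp()`) launches the application.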
The <a href="https://github.com/rstudio/rmdexamples/blob/master/R/kmeans_cluster.R">source</a> of <code>kmeans_cluster</code> reveals how simple this is.</p><h2 id="be-a-hero">Be a hero</h2><p>Ready to be a hero? You can use the <code>shinyApp</code> function to make out-of-the-box widgets that students, teachers, and data scientists will use every day. Widgets can</p><ul><li><p>fit models</p></li><li><p>compare distributions</p></li><li><p>visualize data</p></li><li><p>demonstrate teaching examples</p></li><li><p>act as quizzes or multiple choice questions</p></li><li><p>and more</p></li></ul><p>These widgets have not been made yet; they are low-hanging fruit for any Shiny developer. If you know how to program with Shiny (or want to learn), and would like to make your mark on R, consider authoring a package that makes widgets available for interactive documents.</p><h2 id="get-started">Get started!</h2><p>To learn more about interactive documents visit <a href="https://rmarkdown.rstudio.com/authoring_shiny.html">https://rmarkdown.rstudio.com/authoring_shiny.html</a>.</p></description></item><item><title>New Version of RStudio: R Markdown v2 and More</title><link>https://www.rstudio.com/blog/new-version-of-rstudio-r-markdown-v2-and-more/</link><pubDate>Wed, 18 Jun 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/new-version-of-rstudio-r-markdown-v2-and-more/</guid><description><p>Today we&rsquo;re very pleased to announce a new version of RStudio (v0.98.932) which is <a href="https://www.rstudio.com/products/rstudio/">available for download</a> now. 
New features in this release include:</p><ul><li><p>A next generation implementation of <a href="http://rmarkdown.rstudio.com">R Markdown</a> with a raft of new features including support for HTML, PDF, and Word output, many new options for customizing document appearance, and the ability to create presentations (Beamer or HTML5).</p></li><li><p><a href="https://rmarkdown.rstudio.com/authoring_shiny.html">Interactive Documents</a> (Shiny meets R Markdown). Readers can now change the parameters underlying your analysis and see the results immediately. Interactive Documents make it easier than ever to use Shiny!</p></li><li><p>Shareable notebooks from R scripts. Notebooks include all R code and generated output, and can be rendered in HTML, PDF, and Word formats.</p></li><li><p>Enhanced debugging including support for the new R 3.1 debugging commands to step into function calls and finish the current loop or function.</p></li><li><p>Various source editor enhancements including new syntax highlighting modes for XML, YAML, SQL, Python, and shell scripts. 
You can also execute Python and shell scripts directly from the editor using Ctrl+Shift+Enter.</p></li><li><p>Integrated tools for <a href="https://shiny.rstudio.com/">Shiny</a> development including the ability to run applications within an IDE pane as well as Run/Reload applications with a keyboard shortcut (Ctrl+Shift+Enter).</p></li><li><p>A new <a href="https://github.com/hadley/devtools">devtools</a> mode for package development (uses devtools for check, document, test, build, etc.)</p></li><li><p>Contextual Git/SVN menu that enables quick access to per-file revision history and selection-aware View/Blame for projects hosted on <a href="https://github.com">GitHub</a>.</p></li><li><p>Fast lookup of shortcuts using the new keyboard shortcut quick-reference card (Alt+Shift+K)</p></li></ul><p>See the <a href="https://www.rstudio.com/products/rstudio/release-notes/">release notes</a> for a full list of what&rsquo;s changed and see Yihui Xie&rsquo;s post on <a href="https://blog.rstudio.com/2014/06/18/r-markdown-v2/">R Markdown v2</a> for more on what&rsquo;s new there.</p><p>We&rsquo;ll be posting additional articles over the next few days that describe the new features in more depth. In the meantime we hope you <a href="https://www.rstudio.com/products/rstudio/">download</a> the new version and as always <a href="http://support.rstudio.com">let us know</a> how it&rsquo;s working and what else you&rsquo;d like to see.</p></description></item><item><title>R Markdown v2</title><link>https://www.rstudio.com/blog/r-markdown-v2/</link><pubDate>Wed, 18 Jun 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/r-markdown-v2/</guid><description><p>People rarely agree on a best authoring tool or language. Some people cannot live without <code>\LaTeX{}</code> because of the beauty and quality of its PDF output. Some \feel{} \uncomfortable{} \with{} \backslashes{}, and would rather live in another <del>World</del> Word. 
We have also witnessed the popularity of Markdown, an incredibly simple language (seriously? a LANGUAGE?) that has made reproducible research <a href="http://rpubs.com">much easier</a>.</p><p>Thinking of all these tools and languages, every developer will dream about &ldquo;<em>One ring to rule them all</em>&rdquo;. <code>\section{}</code>, <code>&lt;h1&gt;&lt;/h1&gt;</code>, <code>===</code>, <code>#</code>, &hellip; Why can&rsquo;t we write the first-level section header in a single way? Yes, we are aware of <a href="http://xkcd.com/927/">the danger</a> of &ldquo;adding yet another so-called universal standard that covers all the previous standards&rdquo;. However, we believe <a href="http://johnmacfarlane.net/pandoc/">Pandoc</a> has done a fairly good job in terms of &ldquo;yet another Markdown standard&rdquo;. Standing on the shoulders of Pandoc, today we are excited to announce the second episode of our journey into the development of the tools for authoring dynamic documents:</p><p><em>The Return of R Markdown</em>!</p><p>The R package <a href="http://cran.rstudio.com/package=markdown"><strong>markdown</strong></a> (plus <a href="http://cran.rstudio.com/package=knitr"><strong>knitr</strong></a>) was our first version of R Markdown. The primary output format was HTML, which certainly could not satisfy all users in the <del>World</del> Word. It did not have features like citations, footnotes, or metadata (title, author, and date, etc.), either. When we were asked how one could convert Markdown to PDF/Word, we used to tell users to try Pandoc. The problem is that Pandoc&rsquo;s great power comes with a lot of command-line options (more than 70), and <strong>knitr</strong> has the same problem of too many options. 
That is why we created the second generation of R Markdown, represented by the <a href="https://rmarkdown.rstudio.com/"><strong>rmarkdown</strong></a> package, to provide reasonably good defaults and an R-friendly interface to customize Pandoc options.</p><p>The <a href="https://www.rstudio.com/products/rstudio/download/">new version of RStudio</a> (v0.98.932) includes everything you need to use R Markdown v2 (including pandoc and the <strong>rmarkdown</strong> package). If you are not using RStudio, you can install rmarkdown and pandoc separately as described <a href="https://github.com/rstudio/rmarkdown#installation">here</a>. To get started with a &ldquo;Hello Word&rdquo; example, simply click the menu <code>File -&gt; New File -&gt; R Markdown</code> in the RStudio IDE. You can choose the output format from the drop-down menu on the toolbar.</p><p><img src="https://rstudioblog.files.wordpress.com/2014/06/r-markdown-formats.png" alt="R Markdown Formats"></p><p>The built-in output formats include HTML, LaTeX/PDF, Word, Beamer slides, HTML5 presentations, and so on. <a href="http://johnmacfarlane.net/pandoc/README.html#pandocs-markdown">Pandoc&rsquo;s Markdown</a> allows us to write richer content such as tables, citations, and footnotes. Power users who understand LaTeX/HTML can even embed raw LaTeX/HTML code in Markdown, and Pandoc is smart enough to process these raw fragments. If you cannot remember the possible options for a certain output format in the YAML metadata (data between <code>---</code> and <code>---</code> in the beginning of a document), you can use the <code>Settings</code> button on the toolbar.</p><p>Extensive documentation for R Markdown v2 and all of its supported output formats is available on the new R Markdown website at <a href="http://rmarkdown.rstudio.com">http://rmarkdown.rstudio.com</a>.</p><p>We understand users will never be satisfied by our default templates, regardless of how hard we try to make them appealing. 
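The YAML metadata block mentioned above sits at the very top of the .Rmd file; an illustrative (made-up) example choosing PDF output might look like:

```yaml
---
title: "My Report"
author: "Jane Doe"
output:
  pdf_document:
    toc: true
---
```

The keys under `output` select and configure the output format that `render()` will use.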
The <strong>rmarkdown</strong> package is fully customizable and extensible in the sense that you can define your custom templates and output formats. You want to contribute an article to The R Journal, or JSS (Journal of Statistical Software), but prefer writing in Markdown instead of LaTeX? <a href="https://rmarkdown.rstudio.com/developer_document_templates.html">No problem!</a> Pandoc also supports many other output formats, and you want EPUB books, or a different type of HTML5 slides? <a href="https://rmarkdown.rstudio.com/developer_custom_formats.html">No problem!</a> Not satisfied with one single static output document? You can embed interactive widgets into R Markdown documents as well! <a href="https://rmarkdown.rstudio.com/authoring_shiny.html">Let there be Shiny!</a> The more you learn about <strong>rmarkdown</strong> and Pandoc, the more freedom you will get.</p><p>For a brief video introduction, you may <a href="http://vimeo.com/94181521">watch the talk</a> below (jump to 18:30 if you only want to see the demos):</p><p>[vimeo 94181521 w=500 h=281]</p><p>The <strong>rmarkdown</strong> package is open-source (GPL-3) and is both included in the <a href="https://www.rstudio.com/products/rstudio/">RStudio IDE</a> and <a href="https://github.com/rstudio/rmarkdown">available on GitHub</a>. The package is not on CRAN yet, but will be there as soon as we make all the improvements requested by early users.</p><p>To clarify the relationship between <strong>rmarkdown</strong> and RStudio IDE, our IDE is absolutely not the only way to compile R Markdown documents. You are free to call functions in <strong>rmarkdown</strong> in any environment. Please check out the R package documentation, in particular, the <em>render()</em> function in <strong>rmarkdown</strong>.</p><p>Please let us know if you have any questions or comments, and your feedback is greatly appreciated. 
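<p>For example, to compile a document outside of RStudio you can call <code>render()</code> directly from the R console (a minimal sketch; <code>example.Rmd</code> is a placeholder filename):</p>

```r
library(rmarkdown)

# Render to the format declared in the document's YAML metadata
render("example.Rmd")

# Or override the output format explicitly
render("example.Rmd", output_format = "pdf_document")
```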
We hope you will enjoy R Markdown v2.</p><p><img src="https://rstudioblog.files.wordpress.com/2014/06/keep-calm-and-markdown.png" alt="Keep Calm and Markdown"></p></description></item><item><title>Comment sections and help instructions at the Shiny Dev Center</title><link>https://www.rstudio.com/blog/comment-sections-and-help-instructions-at-the-shiny-dev-center/</link><pubDate>Wed, 28 May 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/comment-sections-and-help-instructions-at-the-shiny-dev-center/</guid><description><p>We&rsquo;ve added two new features that make it easier to learn Shiny with the <a href="https://shiny.rstudio.com">Shiny Dev Center</a>.</p><ol><li><p><strong>Disqus comments</strong> - Each lesson and article on the Dev Center now has its own comments section. Use the comments section to start a discussion or to leave feedback about the articles.</p></li><li><p><strong>How to get help with Shiny</strong> - We&rsquo;ve added a new article, <a href="https://shiny.rstudio.com/articles/help.html">How to get help with Shiny</a>, which explains the best ways to get help with Shiny and R. As an open source language, R doesn&rsquo;t have a paid support team, which makes getting help with R (and Shiny) a little different than for paid software.</p></li></ol><p>Are you curious about the Shiny Dev Center? The Dev Center is located at <a href="https://shiny.rstudio.com">shiny.rstudio.com</a>, a central repository of information on Shiny. 
At the Dev Center, you will find a <a href="https://shiny.rstudio.com/tutorial/">tutorial</a>, <a href="https://shiny.rstudio.com/reference/shiny/latest/">documentation</a>, <a href="https://shiny.rstudio.com/articles/">articles</a> on Shiny, and <a href="https://shiny.rstudio.com/gallery/">example Shiny apps</a>.</p></description></item><item><title>dplyr 0.2</title><link>https://www.rstudio.com/blog/dplyr-0-2/</link><pubDate>Wed, 21 May 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dplyr-0-2/</guid><description><p>I&rsquo;m very excited to announce dplyr 0.2. It has three big features:</p><ul><li><p>improved piping courtesy of the <a href="https://github.com/smbache/magrittr">magrittr</a> package</p></li><li><p>a vastly more useful implementation of <code>do()</code></p></li><li><p>five new verbs: <code>sample_n()</code>, <code>sample_frac()</code>, <code>summarise_each()</code>, <code>mutate_each()</code> and <code>glimpse()</code>.</p></li></ul><p>These features are described in more detail below. To learn more about the 35 new minor improvements and bug fixes, please read the <a href="https://github.com/hadley/dplyr/releases/tag/v0.2.0">full release notes</a>.</p><h2 id="improved-piping">Improved piping</h2><p>dplyr now imports <code>%&gt;%</code> from the <a href="https://github.com/smbache/magrittr">magrittr</a> package by <a href="http://www.stefanbache.dk/">Stefan Milton Bache</a>. I recommend that you use this instead of <code>%.%</code> because it is easier to type (since you can hold down the shift key) and is more flexible. With <code>%&gt;%</code>, you can control which argument on the RHS receives the LHS with the pronoun <code>.</code>. This makes <code>%&gt;%</code> more useful with base R functions because they don&rsquo;t always take the data frame as the first argument. 
For example, you could pipe <code>mtcars</code> to <code>xtabs()</code> with:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mtcars <span style="color:#666">%&gt;%</span> <span style="color:#06287e">xtabs</span>( <span style="color:#666">~</span> cyl <span style="color:#666">+</span> vs, data <span style="color:#666">=</span> .)</code></pre></div><p>dplyr only exports <code>%&gt;%</code> from magrittr, but magrittr contains many other useful functions. To use them, load magrittr explicitly with <code>library(magrittr)</code>. For more details, see <code>vignette(&quot;magrittr&quot;)</code>. <code>%.%</code> will be deprecated in a future version of dplyr, but it won&rsquo;t happen for a while. I&rsquo;ve deprecated <code>chain()</code> to encourage a single style of dplyr usage: please use <code>%&gt;%</code> instead.</p><h2 id="do">Do</h2><p><code>do()</code> has been completely overhauled, and <code>group_by()</code> + <code>do()</code> is now equivalent in power to <code>plyr::dlply()</code>. There are two ways to use <code>do()</code>: either with multiple named arguments or with a single unnamed argument. If you use named arguments, each argument becomes a list-variable in the output. 
A list-variable can contain any arbitrary R object, which makes this form of <code>do()</code> useful for storing models:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(dplyr)
models <span style="color:#666">&lt;-</span> mtcars <span style="color:#666">%&gt;%</span> <span style="color:#06287e">group_by</span>(cyl) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">do</span>(model <span style="color:#666">=</span> <span style="color:#06287e">lm</span>(mpg <span style="color:#666">~</span> wt, data <span style="color:#666">=</span> .))
models <span style="color:#666">%&gt;%</span> <span style="color:#06287e">summarise</span>(rsq <span style="color:#666">=</span> <span style="color:#06287e">summary</span>(model)<span style="color:#666">$</span>r.squared)</code></pre></div><p>If you use an unnamed argument, the result should be a data frame. This allows you to apply arbitrary functions to each group.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">mtcars <span style="color:#666">%&gt;%</span> <span style="color:#06287e">group_by</span>(cyl) <span style="color:#666">%&gt;%</span> <span style="color:#06287e">do</span>(<span style="color:#06287e">head</span>(., <span style="color:#40a070">1</span>))</code></pre></div><p>Note the use of the pronoun <code>.</code> to refer to the data in the current group. <code>do()</code> also has an automatic progress bar. It appears if the computation takes longer than 2 seconds and estimates how long the job will take to complete.</p><h2 id="new-verbs">New verbs</h2><p><code>sample_n()</code> randomly samples a fixed number of rows from a tbl; <code>sample_frac()</code> randomly samples a fixed fraction of rows. 
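<p>As a quick illustration of the two sampling verbs, here is a sketch using the built-in <code>mtcars</code> data (the rows returned are random, so your output will differ from run to run):</p>

```r
library(dplyr)

# Draw 5 rows at random
mtcars %>% sample_n(5)

# Draw 10% of the rows, with replacement
mtcars %>% sample_frac(0.1, replace = TRUE)
```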
They currently only work for local data frames and data tables. <code>summarise_each()</code> and <code>mutate_each()</code> make it easy to apply one or more functions to multiple columns in a tbl. These work for all srcs that <code>summarise()</code> and <code>mutate()</code> work for. <code>glimpse()</code> makes it possible to see all the columns in a tbl, displaying as much data for each variable as can fit on a single line.</p></description></item><item><title>roxygen2 4.0.1</title><link>https://www.rstudio.com/blog/roxygen2-4-0-1/</link><pubDate>Mon, 19 May 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/roxygen2-4-0-1/</guid><description><p>We&rsquo;re pleased to announce a new version of roxygen2. Roxygen2 allows you to write documentation comments that are automatically converted to R&rsquo;s standard Rd format, saving you time and reducing duplication. This release is a major update that provides enhanced error handling and considerably safer default behaviour. Roxygen2 now adds a comment to all generated files so that you know they shouldn&rsquo;t be edited by hand. This also ensures that roxygen2 will never overwrite a file that it did not create, and can automatically remove files that are no longer needed.</p><p>I&rsquo;ve also written some vignettes to help you understand how to use roxygen2. Six new vignettes provide a comprehensive overview of using roxygen2 in practice. Run <code>browseVignettes(&quot;roxygen2&quot;)</code> to read them. In an effort to make roxygen2 easier to use and more consistent between package authors, I&rsquo;ve made parsing considerably stricter, and made sure that all errors give you the line number of the associated roxygen block. Every input is now checked to make sure that it is balanced (e.g. every <code>{</code> has a matching <code>}</code>). This should prevent frustrating errors that require careful reading of <code>.Rd</code> files. 
Similarly, <code>@section</code> titles and <code>@export</code> tags can now only span a single line, as this prevents a number of common bugs.</p><p>Other features include two new tags, <code>@describeIn</code> and <code>@field</code>, and you can document objects (like datasets) by documenting their name as a string. For example, to document a dataset called <code>mydata</code>, you can do:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic">#&#39; Mydata set</span>
<span style="color:#60a0b0;font-style:italic">#&#39;</span>
<span style="color:#60a0b0;font-style:italic">#&#39; Some data I collected about myself</span>
<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">mydata&#34;</span></code></pre></div><p>To see a complete list of all bug fixes and improvements, please see the release notes for <a href="https://github.com/klutometis/roxygen/releases/tag/v4.0.0">roxygen2 4.0.0</a>. <a href="https://github.com/klutometis/roxygen/releases/tag/v4.0.1">Roxygen2 4.0.1</a> fixed a couple of minor bugs and greatly improved the upgrade process.</p></description></item><item><title>reshape2 1.4; Kevin Ushey joins RStudio</title><link>https://www.rstudio.com/blog/reshape2-1-4/</link><pubDate>Fri, 09 May 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/reshape2-1-4/</guid><description><p>reshape2 1.4 is now available on CRAN. This version adds a number of useful arguments and messages, but most importantly it gains a C++ implementation of <code>melt.data.frame()</code>. This new method should be much, much faster (&gt;10x) and does a better job of preserving existing attributes. 
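<p>For a sense of what the new method handles, here is a minimal <code>melt()</code> call (a sketch with made-up data; <code>id.vars</code> names the columns to keep fixed while the remaining measure columns are stacked into long format):</p>

```r
library(reshape2)

df <- data.frame(id = c("a", "b"), x = c(1, 2), y = c(3, 4))

# Dispatches to the new C++ melt.data.frame() method
melt(df, id.vars = "id")
```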
For full details, see the <a href="https://github.com/hadley/reshape/releases/tag/v1.4">release notes</a> on github.</p><p>The C++ implementation of melt was contributed by <a href="http://kevinushey.github.io/">Kevin Ushey</a>, who we&rsquo;re very pleased to announce has joined RStudio. You may be familiar with Kevin from his contributions to Rcpp, or his CRAN packages Kmisc and timeit.</p></description></item><item><title>New Shiny article: Style your apps with CSS</title><link>https://www.rstudio.com/blog/new-shiny-article-style-your-apps-with-css/</link><pubDate>Wed, 07 May 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/new-shiny-article-style-your-apps-with-css/</guid><description><p>Shiny apps use an HTML interface, which means that you can change the visual appearance of your apps quickly and simply with CSS files. Would you like to know how? I posted a new article that will step you through the options at the <a href="https://shiny.rstudio.com">Shiny Dev Center</a>. Check it out <a href="https://shiny.rstudio.com/articles/css.html">here</a>.</p><p><img src="https://rstudioblog.files.wordpress.com/2014/05/css-side-by-side.png" alt="Image"></p></description></item><item><title>Announcing RStudio Shiny Server Pro v1.1</title><link>https://www.rstudio.com/blog/announcing-rstudio-shiny-server-pro-v1-1/</link><pubDate>Tue, 22 Apr 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rstudio-shiny-server-pro-v1-1/</guid><description><p>We are happy to announce the availability of v1.1 of <a href="https://www.rstudio.com/shiny/server/">RStudio Shiny Server Pro</a>, our commercial server for deploying <a href="https://shiny.rstudio.com/">Shiny</a> applications. In this release we took your feedback and made it easier for you to integrate Shiny Server Pro into your production environments. 
With Shiny Server Pro v1.1 you now can:</p><ul><li><p>Control access to your applications with Google Authentication (OAuth2).</p></li><li><p>Create sessions and authenticate with PAM (<a href="http://rstudio.github.io/shiny-server/latest/#pam-authentication">auth_pam</a> and <a href="http://rstudio.github.io/shiny-server/latest/#pam-sessions">pam_sessions_profile</a>).</p></li><li><p>Set the version of R that is used per application and/or per user</p></li><li><p>Customize page templates for directory listings and error pages.</p></li><li><p>Monitor service health and get additional metrics with a new health check endpoint.</p></li><li><p>Provide custom environment variables to a Shiny process using Bash profiles</p></li><li><p>Configure apps to run using the authenticated user&rsquo;s account with custom environment variables from Bash or PAM</p></li><li><p>Launch Shiny apps with a prefix command such as &lsquo;nice&rsquo; allowing you to prioritize compute resources per application or per user</p></li></ul><p>If you haven&rsquo;t tried Shiny Server Pro yet, download a copy <a href="https://www.rstudio.com/shiny/server/pro">here</a>.</p></description></item><item><title>Announcing our new training web pages</title><link>https://www.rstudio.com/blog/announcing-our-new-training-web-pages/</link><pubDate>Thu, 10 Apr 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-our-new-training-web-pages/</guid><description><p>We&rsquo;ve redesigned our training pages to make it even easier for you to learn R or Shiny. Visit our new training web page, <a href="https://www.rstudio.com/training/">www.rstudio.com/training</a>, to see:</p><ul><li><p>A curated list of free materials for learning R. We think that these are some of the most helpful resources on the web. 
They would make an effective starting place if you want to improve your R skills.</p></li><li><p>Announcements for upcoming RStudio public workshops, like the <a href="https://rstudio-sfbay.eventbrite.com">Introduction to R</a> course that we&rsquo;re holding on April 28 &amp; 29 in San Francisco.</p></li><li><p>A database of well known R instructors, who can provide on-site &ndash; as well as online &ndash; R training.</p></li><li><p>Links to the new Shiny Dev Center, which includes articles, examples, and a tutorial, all designed to help you master Shiny.</p></li><li><p>Links to the preview sites for R Markdown, an easy option for writing reproducible reports with R, and ggvis, an R package that creates interactive plots with the grammar of graphics.</p></li><li><p>Links to books that we have written (or are writing) about R and its tools.</p></li></ul><p>Why are we so excited about training? We think that learning R and Shiny is the best investment that a data user can make. These two free tools can streamline how you analyze data and deliver results. Browse through the links at <a href="https://www.rstudio.com/training/">www.rstudio.com/training</a> and see for yourself.</p></description></item><item><title>devtools 1.5</title><link>https://www.rstudio.com/blog/devtools-1-5/</link><pubDate>Tue, 08 Apr 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/devtools-1-5/</guid><description><p>devtools 1.5 is now available on CRAN. 
It includes four new functions to make it easier to add useful infrastructure to packages:</p><ul><li><p><code>add_test_infrastructure()</code> will create testthat infrastructure when needed.</p></li><li><p><code>add_rstudio_project()</code> adds an RStudio project file to your package.</p></li><li><p><code>add_travis()</code> adds a basic template for <a href="https://travis-ci.org/">travis-ci</a>.</p></li><li><p><code>add_build_ignore()</code> makes it easy to add files to <code>.Rbuildignore</code>, escaping special characters as needed.</p></li></ul><p>We&rsquo;ve also bumped two dependencies: devtools now requires R 3.0.2 and roxygen2 3.0.0. We&rsquo;ve included many minor improvements and bug fixes, particularly for package installation. For example, <code>install_github()</code> now prefers the safer GitHub personal access token, and does a better job of installing the dependencies that you actually need. We also provide versions of <code>help()</code>, <code>?</code> and <code>system.file()</code> that work with all packages, regardless of how they&rsquo;re loaded. See a complete list of changes in the <a href="https://github.com/hadley/devtools/releases/tag/v1.5">full release notes</a>.</p></description></item><item><title>Introduction to Data Science with R, April 28-29 San Francisco</title><link>https://www.rstudio.com/blog/introduction-to-data-science-with-r-april-28-29-san-francisco/</link><pubDate>Thu, 03 Apr 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/introduction-to-data-science-with-r-april-28-29-san-francisco/</guid><description><p>Please join us for our popular Introduction to R course for data scientists and data analysts in San Francisco on April 28 and 29. This is a two-day workshop, designed to provide a comprehensive introduction to R that will have you analyzing and modeling data with R in no time. We will cover practical skills for visualizing, transforming, and modeling data in R. 
You will learn how to explore and understand data as well as how to build linear and non-linear models in R.</p><p>The course will be led by RStudio Master Instructor and author Dr. Garrett Grolemund.</p><p>We offer introductory R training only a few times a year. The Boston course in January sold out quickly. Space is limited. We encourage you to <a href="http://rstudio-sfbay.eventbrite.com">register</a> (rstudio-sfbay.eventbrite.com) as soon as you can.</p><p>&ldquo;The instructor was amazing. He knew so much and could answer any questions. His expertise was obvious and he was also very clear about how to explain it to a varied audience.&rdquo; - Workshop Student, January 2014</p><p>&ldquo;Very well organized and at a good pace. The example datasets were very helpful. Excellent teachers!&rdquo; - Workshop Student, January 2014</p></description></item><item><title>New Shiny website launched; Shiny 0.9 released</title><link>https://www.rstudio.com/blog/shiny-website-and-0-9/</link><pubDate>Thu, 27 Mar 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-website-and-0-9/</guid><description><p>We&rsquo;re excited to introduce to you our new website for Shiny: <a href="https://shiny.rstudio.com/">shiny.rstudio.com</a>!</p><p><a href="https://shiny.rstudio.com/"><img src="https://rstudioblog.files.wordpress.com/2014/03/shiny-rstudio-com.gif" alt="shiny-rstudio-com"></a></p><p>We&rsquo;ve included <a href="https://shiny.rstudio.com/articles/">articles</a> on many Shiny-related topics, dozens of <a href="https://shiny.rstudio.com/gallery/">example applications</a>, and an all-new <a href="https://shiny.rstudio.com/tutorial/">tutorial</a> for getting started.</p><p>Whether you&rsquo;re a beginner or expert at Shiny, we hope that having these resources available in one place will help you find the information you need.</p><p><strong>We&rsquo;d also like to announce Shiny 0.9, now available on CRAN.</strong> This release includes many bug fixes and new features, including:</p><h2 id="new-application-layout-options">New application layout options</h2><p>Until now, the vast majority of Shiny apps have used a sidebar-style layout. Shiny 0.9 introduces new layout features to:</p><ol><li><p>Make it easy to create custom page layouts using the Bootstrap grid system. 
See our new <a href="https://shiny.rstudio.com/articles/layout-guide.html">application layout guide</a> or a <a href="https://shiny.rstudio.com/gallery/plot-plus-three-columns.html">live example</a>.</p></li><li><p>Provide navigation bars and lists for separating your application into different pages. See <a href="https://shiny.rstudio.com/reference/shiny/latest/navbarPage.html">navbarPage</a> and <a href="https://shiny.rstudio.com/reference/shiny/latest/navlistPanel.html">navlistPanel</a>, and <a href="https://shiny.rstudio.com/gallery/navbar-example.html">this example</a>.</p></li><li><p>Enhance <a href="https://shiny.rstudio.com/reference/shiny/latest/tabsetPanel.html">tabsetPanel</a> to allow pill-style tabs, and to let tabs be placed above, below, or to either side of tab content.</p></li><li><p>Create floating panels and place them relative to the sides of the page, optionally making them draggable. See <a href="https://shiny.rstudio.com/reference/shiny/latest/absolutePanel.html">absolutePanel</a> or <a href="https://shiny.rstudio.com/gallery/absolutely-positioned-panels.html">this example</a>.</p></li><li><p>Use <a href="https://www.google.com/webhp?ion=1&amp;espv=2&amp;ie=UTF-8#q=bootstrap+themes">Bootstrap themes</a> to easily modify the fonts and colors of your application. <a href="https://shiny.rstudio.com/gallery/retirement-simulation.html">Example</a></p></li></ol><p>You can see many of these features in action together in <a href="https://shiny.rstudio.com/gallery/superzip-example.html">our reimplementation</a> of the Washington Post&rsquo;s <a href="http://www.washingtonpost.com/sf/local/2013/11/09/washington-a-world-apart/">interactive article on Super Zips</a>.</p><h2 id="selectizejs-integration">Selectize.js integration</h2><p>The JavaScript library <a href="https://github.com/brianreavis/selectize.js">selectize.js</a> provides a much more flexible interface compared to the basic select input. 
It allows you to type and search in the options, use placeholders, control the number of options/items to show/select, and so on.</p><p><img src="https://rstudioblog.files.wordpress.com/2014/03/selectize.png" alt="selectize"></p><p>We have integrated selectize.js in shiny 0.9, and <code>selectInput</code> now creates selectize inputs by default. (You can revert back to plain select inputs by passing <code>selectize=FALSE</code> to <code>selectInput</code>.) For more advanced uses, we have included a new <code>selectizeInput</code> function that lets you pass options to selectize.</p><p>Please check out <a href="https://demo.shinyapps.io/013-selectize/">this example</a> to see a subset of features of the selectize input. There is also <a href="https://demo.shinyapps.io/017-select-vs-selectize/">an example</a> comparing the select and selectize input.</p><h2 id="showcase-mode">Showcase mode</h2><p>Shiny apps can now (optionally) run in a &ldquo;showcase&rdquo; mode in which the app&rsquo;s R code can be automatically displayed within the app. Most of the Shiny example apps in our new <a href="https://shiny.rstudio.com/gallery/">gallery</a> use showcase mode.</p><p><a href="https://shiny.rstudio.com/gallery/kmeans-example.html"><img src="https://rstudioblog.files.wordpress.com/2014/03/kmeans.png" alt="Showcase example"></a></p><p>As you interact with the application, reactive expressions and outputs in server.R will light up as they execute. 
This can be helpful in visualizing the reactivity in your app.</p><p>See <a href="https://shiny.rstudio.com/articles/display-modes.html">this article</a> to learn more.</p><p>As always, you can install the latest release of Shiny by running this command at the R console:</p><p><code>install.packages(&quot;shiny&quot;)</code></p><p>The complete list of bug fixes and features is available in the <a href="http://cran.r-project.org/web/packages/shiny/NEWS">NEWS file</a>.</p><p>We hope you&rsquo;ll find these new features helpful in exploring and understanding your data!</p></description></item><item><title>httr 0.3</title><link>https://www.rstudio.com/blog/httr-0-3/</link><pubDate>Fri, 21 Mar 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/httr-0-3/</guid><description><p>We&rsquo;re very pleased to announce the release of httr 0.3. httr makes it easy to work with modern web APIs so that you can work with web data almost as easily as local data. For example, this code shows how you might find the most recently asked question about R on stackoverflow:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#60a0b0;font-style:italic"># install.packages(&#34;httr&#34;)</span>
<span style="color:#06287e">library</span>(httr)
<span style="color:#60a0b0;font-style:italic"># Find the most recent R questions on stackoverflow</span>
r <span style="color:#666">&lt;-</span> <span style="color:#06287e">GET</span>(<span style="color:#4070a0">&#34;</span><span style="color:#4070a0">http://api.stackexchange.com&#34;</span>,
  path <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">questions&#34;</span>,
  query <span style="color:#666">=</span> <span style="color:#06287e">list</span>(site <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">stackoverflow.com&#34;</span>, tagged <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">r&#34;</span>))
<span style="color:#60a0b0;font-style:italic"># Check the request succeeded</span>
<span style="color:#06287e">stop_for_status</span>(r)
<span style="color:#60a0b0;font-style:italic"># Automatically parse the json output</span>
questions <span style="color:#666">&lt;-</span> <span style="color:#06287e">content</span>(r)
questions<span style="color:#666">$</span>items[[1]]<span style="color:#666">$</span>title
<span style="color:#60a0b0;font-style:italic">#&gt; [1] &#34;Remove NAs from data frame without deleting entire rows/columns&#34;</span></code></pre></div><p>OAuth support received a major overhaul in httr 0.3. OAuth is a modern standard for authentication used when you want to allow a service (i.e., an R package) access to your account on a website. This version of httr provides an improved initial authentication experience and supports caching so that you only need to authenticate once per project. A big thanks goes to Craig Citro (Google), who contributed a lot of code and ideas to make this possible.</p><p>httr 0.3 also includes many other bug fixes and minor improvements. You can read about these in the <a href="https://github.com/hadley/httr/releases/tag/v0.3">GitHub release notes</a>.</p></description></item><item><title>dplyr 0.1.3</title><link>https://www.rstudio.com/blog/dplyr-0-1-3/</link><pubDate>Sun, 16 Mar 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dplyr-0-1-3/</guid><description><p>dplyr 0.1.3 is now on CRAN. It fixes an incompatibility with the latest version of Rcpp, and a number of other bugs that were causing dplyr to crash R. 
See the full details in the <a href="https://github.com/hadley/dplyr/releases/tag/v0.1.3">release notes</a>.</p></description></item><item><title>Announcing Shiny Server Pro general availability</title><link>https://www.rstudio.com/blog/announcing-shiny-server-pro-general-availability/</link><pubDate>Wed, 26 Feb 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-shiny-server-pro-general-availability/</guid><description><p>We are excited to announce the general availability of <a href="https://www.rstudio.com/shiny/server/">RStudio Shiny Server Pro</a>.</p><p>Shiny Server Pro is the simplest way for data scientists and R users in the enterprise to share their work with colleagues. With Shiny Server Pro you can:</p><ul><li><p>Secure access to Shiny applications with authentication systems such as LDAP and Active Directory</p></li><li><p>Configure a Shiny application to use more than one process</p></li><li><p>Control the number of concurrent users per application</p></li><li><p>Gain insight into your applications&rsquo; CPU and memory use</p></li><li><p>Get help directly from our team at RStudio</p></li></ul><p>If you&rsquo;re interested in finding out more, download a free 45 day evaluation <a href="https://www.rstudio.com/shiny/server/pro">here</a>.</p></description></item><item><title>dplyr 0.1.2</title><link>https://www.rstudio.com/blog/dplyr-0-1-2/</link><pubDate>Tue, 25 Feb 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dplyr-0-1-2/</guid><description><p>We&rsquo;re pleased to announce a new minor version of dplyr. This fixes a number of bugs that crashed R, and considerably improves the functionality of <code>select()</code>. You can now use named arguments to rename existing variables, and use new functions <code>starts_with()</code>, <code>ends_with()</code>, <code>contains()</code>, <code>matches()</code> and <code>num_range()</code> to select variables based on their names. 
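<p>A brief sketch of the new <code>select()</code> features, using the built-in <code>iris</code> data (<code>head()</code> just trims the printed output):</p>

```r
library(dplyr)

# Keep columns whose names start with "Petal",
# and rename Species to species via a named argument
head(select(iris, starts_with("Petal"), species = Species))
```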
Finally, <code>select()</code> now makes a shallow copy, substantially reducing its memory impact. I&rsquo;ve also added the <code>summarize()</code> alias for people from countries who don&rsquo;t spell correctly ;)</p><p>For a complete list of changes, please see the <a href="https://github.com/hadley/dplyr/releases/tag/v0.1.2">github release</a>, and as always, you can install the latest version with <code>install.packages(&quot;dplyr&quot;)</code>.</p></description></item><item><title>testthat 0.8 (and 0.8.1)</title><link>https://www.rstudio.com/blog/testthat-0-8/</link><pubDate>Tue, 25 Feb 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/testthat-0-8/</guid><description><p>We&rsquo;re pleased to announce a new major version of testthat. Version 0.8 comes with a new recommended structure for storing your tests. To better meet CRAN recommended practices, we now recommend that tests live in <code>tests/testthat</code>, instead of <code>inst/tests</code>. This makes it possible for users to choose whether or not to install tests. With this new structure, you&rsquo;ll need to use <code>test_check()</code> instead of <code>test_package()</code> in the test file (usually <code>tests/testthat.R</code>) that runs all testthat unit tests.</p><p>Another big improvement comes from <a href="https://github.com/kforner">Karl Forner</a>. He contributed code which provides line numbers in test errors so you can see exactly where the problems are. There are also four new expectations (<code>expect_null()</code>, <code>expect_named()</code>, <code>expect_more_than()</code>, <code>expect_less_than()</code>) and many other minor improvements and bug fixes. For a complete list of changes, please see the <a href="https://github.com/hadley/testthat/releases/tag/v0.8">github release</a>. After release of 0.8 to CRAN, we discovered two small bugs. 
These were fixed in <a href="https://github.com/hadley/testthat/releases/tag/v0.8.1">0.8.1</a>.</p><p>As always, you can install the latest version with <code>install.packages(&quot;testthat&quot;)</code>.</p></description></item><item><title>dplyr 0.1.1</title><link>https://www.rstudio.com/blog/dplyr-0-1-1/</link><pubDate>Thu, 30 Jan 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/dplyr-0-1-1/</guid><description><p>We&rsquo;re pleased to announce a new minor version of dplyr. This fixes a few bugs that crashed R, adds a few minor new features (like a <code>sort</code> argument to <code>tally()</code>), and uses shallow copying in a few more places. There is one backward incompatible change: <code>explain_tbl()</code> has been renamed to <code>explain()</code>. For a complete list of changes, please see the <a href="https://github.com/hadley/dplyr/releases/tag/v0.1.1">github release</a> notice.</p><p>As always, you can install the latest version with <code>install.packages(&quot;dplyr&quot;)</code>.</p></description></item><item><title>roxygen2 3.1.0</title><link>https://www.rstudio.com/blog/roxygen2-3-1-0/</link><pubDate>Thu, 30 Jan 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/roxygen2-3-1-0/</guid><description><p>We&rsquo;re pleased to announce a new version of roxygen2. The biggest news is that roxygen2 now recognises reference class method docstrings and will automatically add them to the documentation. 
3.1.0 also offers a number of minor improvements and bug fixes, as listed on the <a href="https://github.com/klutometis/roxygen/releases/tag/v3.1.0">github release</a> notice.</p><p>As always, you can install the latest version with <code>install.packages(&quot;roxygen2&quot;)</code>.</p></description></item><item><title>Introducing dplyr</title><link>https://www.rstudio.com/blog/introducing-dplyr/</link><pubDate>Fri, 17 Jan 2014 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/introducing-dplyr/</guid><description><p><code>dplyr</code> is a new package which provides a set of tools for efficiently manipulating datasets in R. <code>dplyr</code> is the next iteration of <code>plyr</code>, focusing only on data frames. <code>dplyr</code> is faster, has a more consistent API and should be easier to use. There are three key ideas that underlie <code>dplyr</code>:</p><ol><li><p>Your time is important, so <a href="http://romainfrancois.blog.free.fr/">Romain Francois</a> has written the key pieces in <a href="http://www.rcpp.org/">Rcpp</a> to provide blazing fast performance. Performance will only get better over time, especially once we figure out the best way to make the most of multiple processors.</p></li><li><p>Tabular data is tabular data regardless of where it lives, so you should use the same functions to work with it. With <code>dplyr</code>, anything you can do to a local data frame you can also do to a remote database table. PostgreSQL, MySQL, SQLite and Google BigQuery support is built-in; adding a new backend is a matter of implementing a handful of S3 methods.</p></li><li><p>The bottleneck in most data analyses is the time it takes for you to figure out what to do with your data, and dplyr makes this easier by having individual functions that correspond to the most common operations (<code>group_by</code>, <code>summarise</code>, <code>mutate</code>, <code>filter</code>, <code>select</code> and <code>arrange</code>). 
Each function does only one thing, but does it well.</p></li></ol><p>Let&rsquo;s compare <code>plyr</code> and <code>dplyr</code> with a little example, using the <code>Batting</code> dataset from the fantastic <a href="http://cran.us.r-project.org/web/packages/Lahman/"><code>Lahman</code></a> package, which makes the complete Lahman baseball database easily accessible from R. Pretend we want to find the five players who have batted in the most games in all of baseball history.</p><p>In <code>plyr</code>, we might write code like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(Lahman)
<span style="color:#06287e">library</span>(plyr)
games <span style="color:#666">&lt;-</span> <span style="color:#06287e">ddply</span>(Batting, <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">playerID&#34;</span>, summarise, total <span style="color:#666">=</span> <span style="color:#06287e">sum</span>(G))
<span style="color:#06287e">head</span>(<span style="color:#06287e">arrange</span>(games, <span style="color:#06287e">desc</span>(total)), <span style="color:#40a070">5</span>)</code></pre></div><p>We use <code>ddply()</code> to break up the <code>Batting</code> data frame into pieces according to the <code>playerID</code> variable, then apply <code>summarise()</code> to reduce the player data to a single row. Each row in <code>Batting</code> represents one year of data for one player, so we figure out the total number of games with <code>sum(G)</code> and save it in a new variable called <code>total</code>. 
We sort the result so the most games come at the top and then use <code>head()</code> to pull off the first five.</p><p>In <code>dplyr</code>, the code is similar:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">library</span>(Lahman)
<span style="color:#06287e">library</span>(dplyr)
players <span style="color:#666">&lt;-</span> <span style="color:#06287e">group_by</span>(Batting, playerID)
games <span style="color:#666">&lt;-</span> <span style="color:#06287e">summarise</span>(players, total <span style="color:#666">=</span> <span style="color:#06287e">sum</span>(G))
<span style="color:#06287e">head</span>(<span style="color:#06287e">arrange</span>(games, <span style="color:#06287e">desc</span>(total)), <span style="color:#40a070">5</span>)</code></pre></div><p>But now grouping is a top-level operation performed by <code>group_by()</code>, and <code>summarise()</code> works directly on the grouped data, rather than being called from inside another function. The other big difference is speed. <code>plyr</code> took about 7s on my computer, and <code>dplyr</code> took 0.2s, a 35x speedup. This is common when switching from plyr to dplyr, and for many operations you&rsquo;ll see a 20x-1000x speedup.</p><p><code>dplyr</code> provides another innovation over <code>plyr</code>: the ability to chain operations together from left to right with the <code>%.%</code> operator. 
This makes <code>dplyr</code> behave a little like a grammar of data manipulation:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">Batting <span style="color:#666">%.%</span>
  <span style="color:#06287e">group_by</span>(playerID) <span style="color:#666">%.%</span>
  <span style="color:#06287e">summarise</span>(total <span style="color:#666">=</span> <span style="color:#06287e">sum</span>(G)) <span style="color:#666">%.%</span>
  <span style="color:#06287e">arrange</span>(<span style="color:#06287e">desc</span>(total)) <span style="color:#666">%.%</span>
  <span style="color:#06287e">head</span>(<span style="color:#40a070">5</span>)</code></pre></div><p>Read more about it in the help, <code>?&quot;%.%&quot;</code>.</p><p>If this small example has whetted your interest, you can learn more from the built-in vignettes. First install <code>dplyr</code> with <code>install.packages(&quot;dplyr&quot;)</code>, then run:</p><ul><li><p><a href="http://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html"><code>vignette(&quot;introduction&quot;, package = &quot;dplyr&quot;)</code></a> to learn how the main verbs of <code>dplyr</code> work with data frames.</p></li><li><p><a href="http://cran.rstudio.com/web/packages/dplyr/vignettes/databases.html"><code>vignette(&quot;databases&quot;, package = &quot;dplyr&quot;)</code></a> to learn how to work with databases from dplyr.</p></li></ul><p>You can track development progress at <a href="http://github.com/hadley/dplyr">http://github.com/hadley/dplyr</a>, report bugs at <a href="http://github.com/hadley/dplyr/issues">http://github.com/hadley/dplyr/issues</a> and get help with data manipulation challenges at <a href="https://groups.google.com/group/manipulatr">https://groups.google.com/group/manipulatr</a>. 
If you ask a question specifically about <code>dplyr</code> on StackOverflow, please tag it with <code>dplyr</code> and I&rsquo;ll make sure to read it.</p></description></item><item><title>roxygen2 3.0.0</title><link>https://www.rstudio.com/blog/roxygen2-3-0-0/</link><pubDate>Mon, 09 Dec 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/roxygen2-3-0-0/</guid><description><p>We&rsquo;re pleased to announce a new version of roxygen2. The biggest news is that you can painlessly document your S4 classes, S4 methods and RC classes with roxygen2 - you can safely remove workarounds that used <code>@alias</code> and <code>@usage</code>, and simply rely on roxygen2 to do the right thing. Roxygen2 is also much smarter when it comes to S3: you can remove existing uses of <code>@method</code>, and can replace <code>@S3method</code> with <code>@export</code>.</p><p>Version 3.0 also includes many other improvements including better generation of usage, the ability to turn off wrapping in your Rd files and choose default roclets for a package, a safer <code>roxygenise()</code> (or <code>roxygenize()</code> if you prefer) and many other bug fixes and improvements. See the full list on the <a href="https://github.com/klutometis/roxygen/releases/tag/v3.0.0">github release</a>.</p><p>As always, you can install the latest version with <code>install.packages(&quot;roxygen2&quot;)</code>.</p></description></item><item><title>New Version of RStudio (v0.98)</title><link>https://www.rstudio.com/blog/new-version-of-rstudio-v0-98/</link><pubDate>Tue, 03 Dec 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/new-version-of-rstudio-v0-98/</guid><description><p>We&rsquo;re pleased to announce that the final version of RStudio v0.98 is <a href="https://www.rstudio.com/ide/download">available for download</a> now. 
Highlights of the new release include:</p><ul><li><p>An <a href="https://www.rstudio.com/ide/docs/debugging/overview">interactive debugger</a> for R that is tightly integrated with base R debugging tools (browser, recover, etc.).</p></li><li><p>Numerous improvements to the Workspace pane (which is now called the Environment pane).</p></li><li><p><a href="https://www.rstudio.com/ide/docs/presentations/overview">R Presentations</a> for easy authoring of HTML5 presentations that include R code, output, and graphics.</p></li><li><p>A new <a href="https://www.rstudio.com/ide/docs/advanced/viewer_pane">Viewer pane</a> for displaying local web content (e.g. graphical output from packages like googleVis).</p></li><li><p>Additional support for developing and running <a href="https://www.rstudio.com/shiny">Shiny</a> web applications.</p></li><li><p>Substantially improved UI performance on Mac OS X.</p></li><li><p>A <a href="https://www.rstudio.com/ide/server/">Professional Edition</a> of RStudio Server with many new capabilities for enterprise deployment.</p></li></ul><p>There are also lots of smaller improvements and bug fixes across the product; check out the <a href="https://www.rstudio.com/ide/docs/release_notes_v0.98.html">release notes</a> for full details.</p><h3 id="debugging-tools">Debugging Tools</h3><p>The feature we&rsquo;re most excited about is the addition of a full interactive debugger to the IDE. 
Noteworthy capabilities of the debugger include:</p><ul><li><p>Setting breakpoints within the source editor, both inside and outside functions</p></li><li><p>Stepping through code line by line</p></li><li><p>Inspecting object values and the call stack during debugging</p></li><li><p>An error inspector for quick access to tracebacks and the debugger after runtime errors</p></li><li><p>Tight integration with traditional R debugging tools, such as <code>browser()</code> and <code>debug()</code></p></li></ul><p>Here&rsquo;s a screenshot of the IDE after hitting an editor breakpoint:</p><p><img src="https://rstudioblog.files.wordpress.com/2013/09/rstudiodebugger.png" alt="RStudioDebugger"></p><p>For more details on how to take advantage of the new debugging tools, see <a href="https://www.rstudio.com/ide/docs/debugging/overview">Debugging with RStudio</a>.</p><h3 id="environment-pane">Environment Pane</h3><p>The Workspace pane is now called the Environment pane and has numerous improvements, including:</p><ul><li><p>Browse any environment on the search path</p></li><li><p>Filtering by name/value</p></li><li><p>Expand lists, data frames, and S4 objects inline</p></li><li><p>Use <code>str()</code> to display object values</p></li><li><p>Optional grid view sortable by various attributes</p></li><li><p>Many other small correctness and robustness enhancements</p></li></ul><h3 id="r-presentations">R Presentations</h3><p>R Presentations enable easy authoring of HTML5 presentations. 
R Presentations are based on <a href="https://www.rstudio.com/ide/docs/authoring/using_markdown.html">R Markdown</a> and include the following features:</p><ul><li><p>Easy authoring of HTML5 presentations based on <a href="https://www.rstudio.com/ide/docs/authoring/using_markdown.html">R Markdown</a></p></li><li><p>Extensive support for authoring and previewing inside the IDE</p></li><li><p>Many options for customizing layout and appearance</p></li><li><p>Publishing as either a standalone HTML file or to <a href="http://rpubs.com/">RPubs</a></p></li></ul><p>Here&rsquo;s a screenshot showing a simple presentation being authored and previewed within the IDE:</p><p><img src="https://rstudioblog.files.wordpress.com/2013/09/rpresentations1.png" alt="RPresentations"></p><p>For more details see the documentation on <a href="https://www.rstudio.com/ide/docs/presentations/overview">Authoring R Presentations</a>.</p><h3 id="viewer-pane">Viewer Pane</h3><p>RStudio now includes a Viewer pane that can be used to view local web content. This includes static web content as well as local web applications created using <a href="https://www.rstudio.com/shiny">Shiny</a>, <a href="http://cran.rstudio.com/web/packages/Rook/index.html">Rook</a>, or <a href="https://public.opencpu.org/">OpenCPU</a>. This is especially useful for packages that have R bindings to JavaScript data visualization libraries.</p><p>The <a href="http://lamages.blogspot.com/2013/11/googlevis-047-with-rstudio-integration.html">googleVis</a> and <a href="http://www.youtube.com/watch?v=wi2fUKqHtpM">rCharts</a> packages have already been updated to take advantage of the Viewer pane. Here&rsquo;s a screenshot of the googleVis integration:</p><p><img src="https://rstudioblog.files.wordpress.com/2013/11/googlevis1.png" alt="googleVis"></p><p>We&rsquo;re hopeful that there will be many more compelling uses of the Viewer. 
For more details see the article <a href="https://www.rstudio.com/ide/docs/advanced/viewer_pane">Extending RStudio with the Viewer Pane</a>.</p><h3 id="shiny-integration">Shiny Integration</h3><p>We&rsquo;ve added a number of features to support development of <a href="https://www.rstudio.com/shiny/">Shiny</a> web applications, including:</p><ul><li><p>The ability to develop and run Shiny applications on RStudio Server (localhost and websocket proxying is handled automatically)</p></li><li><p>Running Shiny applications within an IDE pane (see the discussion of the Viewer pane above for details)</p></li><li><p>Create a new Shiny application from within the New Project dialog</p></li><li><p>Debugging of Shiny applications using the new RStudio debugging tools.</p></li></ul><h3 id="mac-ui-framework">Mac UI Framework</h3><p>In RStudio v0.98 we also migrated our Mac <a href="http://en.wikipedia.org/wiki/WebKit">WebKit</a> engine from a cross-platform framework (Qt) to <a href="http://en.wikipedia.org/wiki/Cocoa_(API)">Cocoa</a>. The original motivation for this was compatibility problems between Qt and OS X Mavericks, but as it turned out the move to Cocoa WebKit yielded substantially faster editor, scrolling, layout, and graphics performance across the board. If you are a Mac user, you&rsquo;ll find everything about the product snappier in v0.98.</p><p>In the next major version of RStudio we&rsquo;re hoping to make comparable improvements in performance on both Linux and Windows by using a more modern WebKit on those platforms as well.</p><h3 id="rstudio-server-professional-edition">RStudio Server Professional Edition</h3><p>Over the years we&rsquo;ve gotten lots of feedback from larger organizations deploying RStudio Server on the features they&rsquo;d like to see for production deployments of the server. With RStudio v0.98 we&rsquo;re introducing a new Professional Edition of RStudio Server that incorporates much of this feedback. 
Highlights include:</p><ul><li><p>An administrative dashboard that provides insight into active sessions, server health, and monitoring of system-wide and per-user performance and resource metrics.</p></li><li><p>Authentication using system accounts, Active Directory, LDAP, or Google Accounts.</p></li><li><p>Full support for PAM (including PAM sessions for dynamically provisioning user resources).</p></li><li><p>Ability to establish per-user or per-group CPU priorities and memory limits.</p></li><li><p>HTTP enhancements including support for SSL and keep-alive for improved performance.</p></li><li><p>Ability to restrict access to the server by IP.</p></li><li><p>Customizable server health checks.</p></li><li><p>Suspend, terminate, or assume control of user sessions.</p></li><li><p>Impersonate users for assistance and troubleshooting.</p></li></ul><p>The RStudio Server product page has <a href="https://www.rstudio.com/ide/server/">full details</a> on the Professional Edition, and an <a href="https://www.rstudio.com/ide/download/server-pro-evaluation.html">evaluation version</a> of the server is also available for download.</p><h3 id="new-support-site">New Support Site</h3><p>With this release we&rsquo;re also introducing a brand new support and documentation website; please <a href="http://support.rstudio.com">visit us</a> there with questions, feedback, and suggestions for other improvements you&rsquo;d like to see in the product.</p></description></item><item><title>Shiny Server 0.4</title><link>https://www.rstudio.com/blog/shiny-server-0-4/</link><pubDate>Tue, 03 Dec 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-server-0-4/</guid><description><p>Today, we&rsquo;re excited to announce the release of <a href="http://rstudio.com/shiny/server/">Shiny Server version 0.4</a> as well as the availability of a beta version of Shiny Server Professional Edition.</p><p>Shiny Server is a platform for hosting Shiny Applications over the Web and has undergone 
substantial work in the past few months. We have fixed many bugs, added stability enhancements, and have created <a href="https://www.rstudio.com/shiny/server/">pre-built installers</a> for Ubuntu 12.04 (and later) and RedHat/CentOS 5 and 6. The new installers will drastically simplify the process of installing and configuring Shiny Server on these distributions. For other platforms you can use the updated <a href="https://github.com/rstudio/shiny-server/wiki/Building-Shiny-Server-from-Source">instructions to build from source</a>.</p><p>Important note for current Shiny Server users: We are no longer relying on npm to distribute the software. If you had previously installed version 0.3.x using npm, you must uninstall the old version before upgrading. Follow <a href="http://rstudio.github.io/shiny-server/latest/#upgrading-from-shiny-server-0.3.5">these instructions</a> to uninstall the old version before upgrading to the new.</p><p>We hope this new version will allow you to deploy your Shiny applications even more efficiently. Please reach out on the <a href="https://groups.google.com/forum/?fromgroups=#!forum/shiny-discuss">mailing list</a> to let us know what you think or if you have any problems.</p><h3 id="shiny-server-pro-beta">Shiny Server Pro beta</h3><p>We&rsquo;ve recently begun beta testing of Shiny Server Professional Edition. This product adds features that make it easier for an enterprise to scale, tune, monitor and receive support for production environments. Shiny Server Pro will include the ability to configure a Shiny application with more than one process, and control the number of concurrent users per application. 
It adds an administrative dashboard to monitor and gain insight into your applications, and includes integrations with a variety of authentication systems including LDAP and Active Directory.</p><p>If you&rsquo;re interested in finding out more about Shiny Server Pro, or being a participant in our beta, please register <a href="https://www.rstudio.com/shiny/server/pro">here</a>.</p><p>Thank you for your help in making Shiny Server a better product. We hope you enjoy Shiny Server 0.4 and look forward to getting your feedback.</p></description></item><item><title>Upcoming courses: Dec 2013</title><link>https://www.rstudio.com/blog/upcoming-courses-dec-2013/</link><pubDate>Mon, 02 Dec 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/upcoming-courses-dec-2013/</guid><description><p>We&rsquo;re pleased to announce two upcoming in-person training opportunities:</p><ul><li><p><a href="http://goo.gl/n5REEj">Advanced R programming</a>. SF, Dec 16-17. Learn the most important topics from <a href="http://adv-r.had.co.nz">advanced R programming</a> in person. On day one, you&rsquo;ll learn about metaprogramming, functional programming and object-oriented programming in R, as well as general best practices for programming. Taught by Hadley Wickham, RStudio&rsquo;s Chief Scientist.</p></li><li><p><a href="http://goo.gl/6PHJpS">Public workshop</a>. Boston, Jan 27-28. In this two-day workshop, you&rsquo;ll get a comprehensive introduction to R, and you&rsquo;ll be visualizing, manipulating and modeling data in no time. Taught by Garrett Grolemund, RStudio Master Instructor.</p></li></ul><p>We have discounts available for students (66%) and academics (33%). 
Please contact <a href="mailto:josh@rstudio.com">Josh Paulson</a> for details.</p></description></item><item><title>devtools 1.4 now available</title><link>https://www.rstudio.com/blog/devtools-1-4/</link><pubDate>Wed, 27 Nov 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/devtools-1-4/</guid><description><p>We&rsquo;re very pleased to announce the release of devtools 1.4. This version brings many improvements to package installation, including automated vignette building, and a better way of referring to repos on github, <code>install_github(&quot;hadley/devtools&quot;)</code>. There are also many other bug fixes and minor improvements; to see them all, please read the <a href="https://github.com/hadley/devtools/releases/tag/devtools-1.4">release notes</a> file on github.</p></description></item><item><title>Shiny 0.8.0 released; Yihui Xie joins RStudio</title><link>https://www.rstudio.com/blog/shiny-0-8-0-released/</link><pubDate>Fri, 15 Nov 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-0-8-0-released/</guid><description><p>We&rsquo;re very pleased to announce <a href="http://rstudio.com/shiny/">Shiny 0.8.0</a> (which actually went up on CRAN about two weeks ago). This release features a vastly better way to display tabular data, and new debugging tools that make it much easier to fix errors in your app.</p><h2 id="datatables-support">DataTables support</h2><p><img src="https://rstudioblog.files.wordpress.com/2013/11/datatables1.png" alt="datatables"></p><p>We now support much more attractive and powerful displays of tabular data using the popular <a href="http://datatables.net/">DataTables</a> library. Our DataTables integration features pagination, searching/filtering, sorting, and more. 
Check out <a href="http://glimmer.rstudio.com/yihui/12_datatables/">this demo</a> to see it in action, and learn more about how to use it in your own apps by visiting the tutorial&rsquo;s <a href="http://rstudio.github.io/shiny/tutorial/#datatables">chapter on DataTables</a>.</p><h2 id="debugging-tools">Debugging tools</h2><p>In <a href="http://cran.rstudio.com/web/packages/shiny/">version 0.8.0</a> of the <a href="http://rstudio.com/shiny/">Shiny</a> package, we&rsquo;ve greatly improved the set of debugging tools you can use with your Shiny apps. It&rsquo;s now much easier to figure out what&rsquo;s happening when things go wrong, thanks to two new features:</p><ul><li><p>Integration with the new visual debugger that&rsquo;s available with <a href="https://www.rstudio.com/ide/download/preview">RStudio v0.98</a>. You can set breakpoints and step through your code much more easily than before.</p></li><li><p>A new option &lsquo;shiny.error&rsquo; which can take a function as an error handler. It is called when an error occurs in a reactive observer (e.g. when running an output rendering function). You can use options(shiny.error=traceback) to simply print a traceback, options(shiny.error=recover) for debugging from a regular R console, or options(shiny.error=browser) to jump into the RStudio visual debugger.</p></li></ul><p>There have also been a few smaller tweaks and bug fixes. 
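</p><p>For example, the &lsquo;shiny.error&rsquo; handlers described above are set before launching an app (a sketch using base R functions, as listed in the bullet points):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(shiny)
# print a traceback whenever a reactive observer errors
options(shiny.error = traceback)
# or drop into an interactive debugger at the point of failure
options(shiny.error = recover)</code></pre></div><p>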
For the full list, you can take a look at our <a href="http://cran.rstudio.com/web/packages/shiny/NEWS">NEWS file</a>.</p><h2 id="welcome-yihui-xie">Welcome, Yihui Xie!</h2><p>If you&rsquo;re reading this, there&rsquo;s a good chance you have heard of <a href="http://yihui.name">Yihui Xie</a> or have used his software; during his time as a PhD student at Iowa State University, he created the <a href="http://yihui.name/knitr/">knitr</a>, <a href="http://cranvas.org/">cranvas</a>, and <a href="http://cran.r-project.org/web/packages/animation/index.html">animation</a> packages, among others.</p><p>We&rsquo;re thrilled to announce that Yihui has joined the RStudio team! He will be one of the primary maintainers of the Shiny package and has already contributed some great improvements in the short time he has been with us.</p></description></item><item><title>Announcing Packrat</title><link>https://www.rstudio.com/blog/announcing-packrat/</link><pubDate>Thu, 14 Nov 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-packrat/</guid><description><p>We&rsquo;re excited to announce <a href="http://rstudio.github.io/packrat/">Packrat</a>, a new tool for managing the packages your R project depends on.</p><p>If you&rsquo;ve ever been frustrated by package dependencies, whether juggling the packages needed by your own projects or getting someone else&rsquo;s project to work, Packrat is for you. Similar in spirit to <a href="http://bundler.io/">Bundler</a>, Packrat understands package dependencies and manages them inside a private, project-specific library.</p><p>Packrat makes your project more isolated, portable, and reproducible. Because your project&rsquo;s package dependencies travel with it, you control the environment in which your code runs. 
Your results are easy to duplicate on other machines, whether your own or your collaborators&rsquo;.</p><p><a href="http://vimeo.com/79537844">http://vimeo.com/79537844</a></p><p>We built Packrat to help us create self-sufficient R projects for deployment, but we think it has many other use cases. Lots more information, including installation instructions, can be found at the Packrat project page:</p><p><a href="http://rstudio.github.io/packrat/">Packrat: Reproducible package management for R</a>.</p><p>If you try it, we&rsquo;d love to get your feedback. Leave a comment here or post in the <a href="https://groups.google.com/forum/#!forum/packrat-discuss">packrat-discuss Google group</a>.</p></description></item><item><title>RStudio OS X Mavericks Issues Resolved</title><link>https://www.rstudio.com/blog/rstudio-os-x-mavericks-issues-resolved/</link><pubDate>Tue, 12 Nov 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-os-x-mavericks-issues-resolved/</guid><description><p>When OS X Mavericks was released last month we were very disappointed to discover a compatibility issue between <a href="http://qt-project.org/">Qt </a>(our cross-platform user interface toolkit) and OS X Mavericks that resulted in extremely poor graphics performance.</p><p>We now have an updated preview version of RStudio for OS X (v0.98.475) that not only overcomes these issues, but also improves editor, scrolling, and layout performance across the board on OS X (more details below if you are curious):</p><p><a href="https://www.rstudio.com/ide/download/preview">https://www.rstudio.com/ide/download/preview</a></p><p>We were initially optimistic that we could patch Qt to overcome the problems but even with some help from Digia (the organization behind Qt) we never got acceptable performance. Running out of viable options based on Qt, we decided to bypass Qt entirely by implementing the RStudio desktop frame as a native <a href="https://developer.apple.com/technologies/mac/cocoa.html">Cocoa</a> application.</p><p>OS X Mavericks issues aside, we are thrilled with the result of using Cocoa rather than a cross-platform toolkit. 
RStudio desktop uses <a href="http://www.webkit.org/">WebKit</a> to render its user interface, and the Cocoa <a href="https://developer.apple.com/library/mac/documentation/cocoa/reference/webkit/objc_classic/_index.html">WebKit Framework</a> is substantially faster than the one in Qt.</p><p>Please try out the updated preview and let us know if you encounter any issues on our <a href="http://support.rstudio.org">support forum</a>. For those who prefer to wait, we expect the final release of v0.98 sometime during the next couple of weeks.</p></description></item><item><title>RStudio and OS X 10.9 Mavericks</title><link>https://www.rstudio.com/blog/rstudio-and-os-x-10-9-mavericks/</link><pubDate>Tue, 22 Oct 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-and-os-x-10-9-mavericks/</guid><description><p><strong>UPDATE</strong>: <a href="https://blog.rstudio.com/2013/11/12/rstudio-os-x-mavericks-issues-resolved/">RStudio OS X Mavericks Issues Resolved</a></p><p>This post is now out of date (see link above for information on getting a version of RStudio that works with OS X Mavericks).</p><hr><p>Today Apple released <a href="http://www.apple.com/osx/">OS X 10.9 &ldquo;Mavericks&rdquo;</a>. If you are a Mac user considering updating to the new OS, there are some RStudio compatibility issues to consider before you update.</p><p>As a result of a problem between Mavericks and the user interface toolkit underlying RStudio (<a href="http://qt-project.org/">Qt</a>), the RStudio IDE is very slow in painting and user interactions when running under Mavericks. We are following up with both Qt and Apple on resolving the compatibility issue. 
In the meantime, there is a workaround available in the v0.98.443 release of RStudio that can be downloaded here:</p><p><a href="https://www.rstudio.com/ide/download/preview">https://www.rstudio.com/ide/download/preview</a></p><p>This version of RStudio detects when it is running on OS X Mavericks and in that case bypasses the use of Qt. Rather, a version of RStudio Server is run locally and connected to by a special RStudio IDE browser window. There are several differences you&rsquo;ll notice when running in this mode:</p><ol><li><p>Only one instance of RStudio can be run at a time.</p></li><li><p>The Mac native menubar is not used. Rather, the main menu appears inside the RStudio frame.</p></li><li><p>Mac native file open and save dialogs are not used. Rather, internal versions of the dialogs are used.</p></li><li><p>Finder file associations activate RStudio; however, they don&rsquo;t open the targeted file(s).</p></li><li><p>The copy plot to clipboard function is not available.</p></li><li><p>During a shutdown of Mac OS X while RStudio is running, the current project&rsquo;s workspace is not saved automatically (source files, however, are).</p></li></ol><p>We&rsquo;re hoping that the underlying problem in OS X 10.9 is resolved in a future update, or alternatively that the Qt toolkit is updated to address the issue. If and when that occurs we&rsquo;ll release a new version of RStudio that restores the previous RStudio behavior on OS X 10.9.</p></description></item><item><title>RStudio v0.98 Preview (Debugging Tools and More)</title><link>https://www.rstudio.com/blog/rstudio-v0-98-preview/</link><pubDate>Tue, 24 Sep 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v0-98-preview/</guid><description><p>We&rsquo;re very pleased to announce that a <a href="https://www.rstudio.com/ide/download/preview">preview release</a> of RStudio IDE v0.98 is available for download now. 
Major highlights of the new release include debugging tools, many improvements to environment/workspace browsing, and a new way to create HTML5 presentations using R Markdown. As usual there are also many small improvements and bug fixes. We&rsquo;ll talk about some of the more interesting new features below, otherwise check out the <a href="https://www.rstudio.com/ide/docs/release_notes_v0.98.html">release notes</a> for full details.</p><h3 id="debugging-tools">Debugging Tools</h3><p>We&rsquo;ve done lots of work to add R debugging tools to the IDE, including:</p><ul><li><p>Setting breakpoints within the source editor, both inside and outside functions</p></li><li><p>Stepping through code line by line</p></li><li><p>Inspecting object values and the call stack during debugging</p></li><li><p>An error inspector for quick access to tracebacks and the debugger after runtime errors</p></li><li><p>Tight integration with traditional R debugging tools, such as <code>browser()</code> and <code>debug()</code></p></li></ul><p>Here&rsquo;s a screenshot of the IDE after hitting an editor breakpoint:</p><p><img src="https://rstudioblog.files.wordpress.com/2013/09/rstudiodebugger.png" alt="RStudioDebugger"></p><p>Note that execution is stopped at the specified breakpoint, the environment is updated to show the objects within the context where execution was stopped, and commands for line by line stepping, continuing, and aborting the debug session appear in the console.</p><p>For more details on how to take advantage of the new tools, see <a href="https://www.rstudio.com/ide/docs/debugging/overview">Debugging with RStudio</a>.</p><h3 id="r-presentations">R Presentations</h3><p>R Presentations enable easy authoring of HTML5 presentations. 
R Presentations are based on <a href="https://www.rstudio.com/ide/docs/authoring/using_markdown.html">R Markdown</a>, and include the following features:</p><ul><li><p>Easy authoring of HTML5 presentations based on <a href="https://www.rstudio.com/ide/docs/authoring/using_markdown.html">R Markdown</a></p></li><li><p>Extensive support for authoring and previewing inside the IDE</p></li><li><p>Many options for customizing layout and appearance</p></li><li><p>Publishing as either a standalone HTML file or to <a href="http://rpubs.com/">RPubs</a></p></li></ul><p>Here&rsquo;s a screenshot showing a simple presentation being authored and previewed within the IDE:</p><p><img src="https://rstudioblog.files.wordpress.com/2013/09/rpresentations1.png" alt="RPresentations"></p><p>For more details see the documentation on <a href="https://www.rstudio.com/ide/docs/presentations/overview">Authoring R Presentations</a>.</p><h3 id="rstudio-server-pro">RStudio Server Pro</h3><p>With RStudio v0.98 we&rsquo;ve added a new Professional Edition of RStudio Server. 
New features in RStudio Server Pro include:</p><ul><li><p>An administrative dashboard that provides insight into active sessions, server health, and monitoring of system-wide and per-user performance and resource metrics.</p></li><li><p>Authentication using system accounts, ActiveDirectory, LDAP, or Google Accounts.</p></li><li><p>Full support for PAM (including PAM sessions for dynamically provisioning user resources).</p></li><li><p>Ability to establish per-user or per-group CPU priorities and memory limits.</p></li><li><p>HTTP enhancements including support for SSL and keep-alive for improved performance.</p></li><li><p>Ability to restrict access to the server by IP.</p></li><li><p>Customizable server health checks.</p></li><li><p>Suspend, terminate, or assume control of user sessions.</p></li><li><p>Impersonate users for assistance and troubleshooting.</p></li></ul><p>The Professional Edition also includes priority support and a commercial license. You can get more details as well as download a free 45-day evaluation version from the <a href="https://www.rstudio.com/ide/download/pro-preview">RStudio Server Professional Preview</a> page.</p><h3 id="whats-next">What&rsquo;s Next</h3><p>The <a href="https://www.rstudio.com/ide/download/preview">preview release</a> is feature complete and we expect to release the final version of v0.98 during the next few weeks. After that we&rsquo;ll be focusing on adding features to make it easier to develop and deploy <a href="https://www.rstudio.com/shiny/">Shiny web applications</a> and expect another release with those features before the end of the year.</p></description></item><item><title>The RStudio CRAN mirror</title><link>https://www.rstudio.com/blog/rstudio-cran-mirror/</link><pubDate>Mon, 10 Jun 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-cran-mirror/</guid><description><p>RStudio maintains its own CRAN mirror, <a href="http://cran.rstudio.com">http://cran.rstudio.com</a>. 
The server itself is a virtual machine run by Amazon&rsquo;s EC2 service, and it syncs with the main CRAN mirror in Austria once per day. When you contact <a href="http://cran.rstudio.com">http://cran.rstudio.com</a>, however, you&rsquo;re probably not talking to our CRAN mirror directly. That&rsquo;s because we use <a href="http://aws.amazon.com/cloudfront/">Amazon CloudFront</a>, a <a href="http://en.wikipedia.org/wiki/Content_delivery_network">content delivery network</a>, which automatically distributes the content to locations <a href="http://aws.amazon.com/cloudfront/#details">all over the world</a>. When you try to download a package from the RStudio cloud mirror, it&rsquo;ll be retrieved from a local CloudFront cache instead of the CRAN mirror itself. That means that, no matter where you are in the world, the data doesn&rsquo;t need to travel very far, and so is fast to download.</p><p>To back this up with some data, we asked some friends to time downloads from all the CRAN mirrors. The RStudio mirror was not always the fastest (especially if you have a mirror nearby), but it was consistently fast around the world. (If you think you could improve on our testing methodology, the scripts and raw data are available at <a href="https://gist.github.com/hadley/5420147">https://gist.github.com/hadley/5420147</a> - let us know what you come up with!)</p><p>You can use our mirror, even if you don&rsquo;t use RStudio. (If you haven&rsquo;t deliberately chosen a CRAN mirror in RStudio, we&rsquo;ll use ours by default). 
It&rsquo;s the first one in the list of mirrors (&ldquo;0-Cloud&rdquo;), or if you don&rsquo;t want to select it every time you install a package, you can set it as the default in your .Rprofile:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">options</span>(repos <span style="color:#666">=</span> <span style="color:#06287e">c</span>(CRAN <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">http://cran.rstudio.com&#34;</span>))</code></pre></div><p>Of course, speed isn&rsquo;t the only factor you want to consider when choosing a mirror. Another important factor is reliability: is the mirror always available, and how often is it updated? CRAN provides the useful <a href="http://cran.r-project.org/mirmon_report.html">mirror monitoring report</a>. Running a mirror is easy (it&rsquo;s just a simple script run every few hours), so it&rsquo;s a warning flag if a mirror has any non-green squares. We care about the availability of our mirror, and if it ever does go down, we&rsquo;ll endeavour to fix it as quickly as possible.</p><p>Finally, because every download from a CRAN mirror is logged, CRAN mirrors provide a rich source of data about R and package usage. To date, it&rsquo;s been hard to get access to this data. We wanted to change that, so you can now download our anonymised log data from <a href="http://cran-logs.rstudio.com">cran-logs.rstudio.com</a>. We&rsquo;ve tried to strike a balance between utility and privacy. We&rsquo;ve parsed the raw log data into fields that mean something to R users (like R version, architecture, and OS). The IP address is potentially revealing, so we&rsquo;ve replaced it with a combination of country and a unique id within each day. 
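A day&rsquo;s worth of these logs can be pulled straight into R. A minimal sketch, assuming the year-directory/YYYY-MM-DD.csv.gz naming used on cran-logs.rstudio.com:

```r
# Download and read one day of anonymised download logs.
# The URL pattern below is an assumption based on cran-logs.rstudio.com.
day  <- "2013-01-01"
url  <- sprintf("http://cran-logs.rstudio.com/2013/%s.csv.gz", day)
path <- tempfile(fileext = ".csv.gz")
download.file(url, path, mode = "wb")

# read.csv() can read through a gzfile() connection directly
logs <- read.csv(gzfile(path), stringsAsFactors = FALSE)

# For example: the ten most-downloaded packages that day
head(sort(table(logs$package), decreasing = TRUE), 10)
```

The column names (`date`, `package`, `country`, `ip_id`, and so on) are the ones shown in the sample below.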
This should make it possible to explore download patterns without undermining the privacy of the mirror users.</p><pre><code>        date     time    size r_version r_arch        r_os
1 2013-01-01 00:18:22  551371    2.15.2 x86_64 darwin9.8.0
2 2013-01-01 00:43:47  220277    2.15.2 x86_64     mingw32
3 2013-01-01 00:43:51 3505851    2.15.2 x86_64     mingw32
4 2013-01-01 00:43:53  761107    2.15.2 x86_64     mingw32
5 2013-01-01 00:31:15  187381    2.15.2   i686   linux-gnu
6 2013-01-01 00:59:46 2388932    2.15.2 x86_64     mingw32
    package version country ip_id
1     knitr     0.9      RU     1
2 R.devices   2.1.3      US     2
3     PSCBS  0.30.0      US     2
4      R.oo  1.11.4      US     2
5     akima   0.5-8      US     3
6 spacetime   1.0-3      VN     4</code></pre><p>Altogether, there&rsquo;s currently around 150 megs of gzipped log files, representing over 7,000,000 package downloads. We&rsquo;re looking forward to seeing what the R community does with this data, and we&rsquo;ll highlight particularly interesting analyses in a future blog post. If you have any problems using the data, or you&rsquo;d like to highlight a particularly interesting result, please feel free to <a href="mailto:hadley@rstudio.com">email me</a>.</p></description></item><item><title>Version 1.2 of devtools released</title><link>https://www.rstudio.com/blog/devtools-1-2/</link><pubDate>Wed, 17 Apr 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/devtools-1-2/</guid><description><p>We&rsquo;re very pleased to announce the release of devtools 1.2. This version continues to make working with packages easier by increasing installation speed (skipping the build step unless <code>local = FALSE</code>), enhancing vignette handling (to support the non-Sweave vignettes available in R 3.0.0), and providing better default compiler flags for C and C++ code.</p><p>Also new in this release is the <code>sha</code> argument to <code>source_url</code> and <code>source_gist</code>. 
If provided, this checks that the file you download is what you expected, and is an important safety feature when running scripts over the web.</p><p>Devtools 1.2 contains many other bug fixes and minor improvements; to see them all, please read the <a href="https://github.com/hadley/devtools/blob/master/NEWS">NEWS</a> file on GitHub.</p></description></item><item><title>Shiny 0.4.0 now available</title><link>https://www.rstudio.com/blog/shiny-0-4-0-now-available/</link><pubDate>Fri, 22 Feb 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-0-4-0-now-available/</guid><description><p>Shiny version 0.4.0 is now available on <a href="http://cran.r-project.org/web/packages/shiny/index.html">CRAN</a>. The most visible change is that the API has been slightly simplified. Your existing code will continue to work, although Shiny will print messages about how to migrate your code. Migration should be straightforward, as described below. It will take a bit of work to switch to the new API, but we think it&rsquo;s worth it in the long run, because the new interface is somewhat simpler, and because it offers a better mapping between function names and reactive programming concepts.</p><p>We&rsquo;ve also updated the <a href="http://rstudio.github.com/shiny/tutorial/">Shiny tutorial</a> to reflect the changes, and added some new content explaining Shiny&rsquo;s reactive programming model in depth. If you want a better understanding of how Shiny works, see the sections under <em>Understanding Reactivity</em>, starting with the <a href="http://rstudio.github.com/shiny/tutorial/#reactivity-overview">Reactivity Overview</a>.</p><p>Another new feature is that Shiny now suspends outputs when they aren&rsquo;t visible in the user&rsquo;s web browser. For example, if your Shiny application has multiple tabs or conditional panels, Shiny will only run the calculations and send data for the currently-visible tabs and panels. 
This new feature will reduce network traffic and computational load on the server, resulting in a faster application.</p><!-- more --><p>Here&rsquo;s what has changed, and how to migrate to the new API:</p><ul><li><p><code>reactive()</code> takes expressions as input, instead of functions. <strong>Old style:</strong> <code>reactive(function() { ... })</code> <strong>New style:</strong> <code>reactive({ ... })</code></p></li><li><p><code>reactiveText</code>, <code>reactivePlot</code>, and so on, have been replaced with <code>renderText</code>, <code>renderPlot</code>, etc. They also now take expressions instead of functions. <strong>Old style:</strong> <code>reactiveText(function() { ... })</code> <strong>New style:</strong> <code>renderText({ ... })</code></p></li><li><p><code>observe()</code> also takes expressions instead of functions: <strong>Old style:</strong> <code>observe(function() { ... })</code> <strong>New style:</strong> <code>observe({ ... })</code></p></li></ul><p>If for some reason you want to save an unevaluated expression in a variable and then give it to <code>reactive()</code>, <code>renderText()</code>, and so on, you can quote the expression and then use the <code>quote=TRUE</code> option:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">my_expr <span style="color:#666">&lt;-</span> <span style="color:#06287e">quote</span>({ input<span style="color:#666">$</span>num <span style="color:#666">+</span> <span style="color:#40a070">1</span> })
<span style="color:#06287e">renderText</span>(my_expr, quote<span style="color:#666">=</span><span style="color:#007020;font-weight:bold">TRUE</span>)</code></pre></div><p>If you still have any issues migrating, please feel free to ask questions on the <a href="https://groups.google.com/forum/?fromgroups#!forum/shiny-discuss">Shiny-discuss</a> mailing list.</p></description></item><item><title>Shiny 0.3.0 
released</title><link>https://www.rstudio.com/blog/shiny-0-3-0-released-2/</link><pubDate>Fri, 25 Jan 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-0-3-0-released-2/</guid><description><p>Version 0.3.0 of Shiny is now available on CRAN. This version of Shiny has several new features and bug fixes. Some of the changes are under the hood: for example, Shiny now uses a more efficient algorithm for scheduling the execution of reactive functions. There are also some user-facing changes: for example, the new <code>runGitHub()</code> function lets you download and run applications directly from a GitHub repository.</p><p>We&rsquo;ve updated the <a href="http://rstudio.github.com/shiny/tutorial/">tutorial page</a> with documentation about these and other features. For a full list of changes, see the <a href="http://cran.r-project.org/web/packages/shiny/NEWS">NEWS file</a>.</p><p>You can install the new version of Shiny with <code>install.packages('shiny')</code>.</p></description></item><item><title>Version 1.0 of devtools released!</title><link>https://www.rstudio.com/blog/devtools-1-0/</link><pubDate>Wed, 23 Jan 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/devtools-1-0/</guid><description><p>We&rsquo;re very pleased to announce the release of devtools 1.0. We&rsquo;ve given devtools the 1.0 marker because it now works with the vast majority of packages in the wild, with this version adding support for S4 and Rcpp. Devtools also has completely revamped code for finding Rtools on Windows, including much better error messages if something is wrong with your setup. 
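For readers who haven&rsquo;t used devtools, the core development loop it supports looks roughly like this; a sketch using functions exported by the devtools package, run from inside a package source directory:

```r
library(devtools)

# Inside a package source directory:
load_all()   # simulate installing and reloading the package in-session
document()   # rebuild Rd files and NAMESPACE from roxygen2 comments
test()       # run the package's testthat suite, if it has one
check()      # run R CMD check, much as CRAN would
install()    # build and install the package into your library
```

`document()` assumes the package uses roxygen2 comments; the other functions work on any package source tree.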
In celebration of reaching 1.0, devtools now has its <a href="https://www.rstudio.com/projects/devtools/">own webpage</a>, which provides a bit more information about why you might want to use it.</p><p>Devtools 1.0 also contains many other bug fixes and minor improvements, as listed in the <a href="https://github.com/hadley/devtools/blob/master/NEWS">NEWS</a> file on GitHub.</p></description></item><item><title>Shiny Server now available</title><link>https://www.rstudio.com/blog/shiny-server-now-available/</link><pubDate>Tue, 22 Jan 2013 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-server-now-available/</guid><description><p><a href="http://rstudio.com/shiny/">Shiny</a> makes it easy to develop interactive web applications that run on your own machine. But by itself, it isn&rsquo;t designed to make your applications available to all comers over the internet (or intranet). You can&rsquo;t run more than one Shiny application on the same port, and if your R process crashes or exits for any reason, your service becomes unavailable.</p><p>Our solution is Shiny Server, the application server for Shiny. Using Shiny Server, you can host multiple Shiny applications, as well as static web content, on a Linux server and make them available over the internet. You can specify what applications are available at what URL, or configure Shiny Server to let anyone with a user account on the server deploy their own Shiny applications. 
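Configuration is file-based; the fragment below is an illustrative sketch only (directive names follow the shiny-server.conf format described in the project&rsquo;s documentation and may differ in the beta):

```
# /etc/shiny-server/shiny-server.conf (illustrative sketch)
run_as shiny;

server {
  listen 3838;

  # Serve every app found under /srv/shiny-server at its own URL path
  location / {
    site_dir /srv/shiny-server;
    log_dir  /var/log/shiny-server;
  }
}
```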
For more details, see our <a href="https://blog.rstudio.com/2012/12/04/shiny-update/">previous blog post</a>.</p><p><strong>Shiny Server is available as a public beta today.</strong> Follow the instructions on <a href="https://github.com/rstudio/shiny-server#shiny-server">our GitHub project page</a> to get started now!</p></description></item><item><title>ggplot2 0.9.3 and plyr 1.8 have been released!</title><link>https://www.rstudio.com/blog/ggplot2-plyr-release/</link><pubDate>Thu, 06 Dec 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/ggplot2-plyr-release/</guid><description><p>We&rsquo;re pleased to announce new versions of ggplot2 (0.9.3) and plyr (1.8). To get up and running with the new versions, start a clean R session without ggplot2 or plyr loaded, and run <code>install.packages(c(&quot;ggplot2&quot;, &quot;gtable&quot;, &quot;scales&quot;, &quot;plyr&quot;))</code>. Read on to find out what&rsquo;s new.</p><h2 id="ggplot2-093">ggplot2 0.9.3</h2><p>Most of the changes in version 0.9.3 are bug fixes. Perhaps the most visible change is that ggplot will now print out warning messages when you use <code>stat=&quot;bin&quot;</code> and also map a variable to y. 
For example, these are valid:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mtcars, <span style="color:#06287e">aes</span>(wt, mpg)) <span style="color:#666">+</span> <span style="color:#06287e">geom_bar</span>(stat <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">identity&#34;</span>)
<span style="color:#06287e">ggplot</span>(mtcars, <span style="color:#06287e">aes</span>(cyl)) <span style="color:#666">+</span> <span style="color:#06287e">geom_bar</span>(stat <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">bin&#34;</span>)</code></pre></div><p>But this will result in some warnings:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"><span style="color:#06287e">ggplot</span>(mtcars, <span style="color:#06287e">aes</span>(wt, mpg)) <span style="color:#666">+</span> <span style="color:#06287e">geom_bar</span>(stat <span style="color:#666">=</span> <span style="color:#4070a0">&#34;</span><span style="color:#4070a0">bin&#34;</span>)
<span style="color:#60a0b0;font-style:italic"># The default stat for geom_bar is &#34;bin&#34;, so this is the same as above:</span>
<span style="color:#06287e">ggplot</span>(mtcars, <span style="color:#06287e">aes</span>(wt, mpg)) <span style="color:#666">+</span> <span style="color:#06287e">geom_bar</span>()</code></pre></div><p>The reason for this change is to make behavior more consistent – <code>stat_bin</code> generates a y value, and so should not work when you also map a value to y.</p><p>For a full list of changes, please see the <a href="http://cran.r-project.org/web/packages/ggplot2/NEWS">NEWS file</a>.</p><h2 id="plyr-18">plyr 1.8</h2><p>Version 1.8 has 28 improvements and bug fixes. 
Among the most prominent:</p><ul><li><p>All parallel plyr functions gain a <code>.paropts</code> argument, a list of options that is passed onto <code>foreach</code>, which allows you to control parallel execution.</p></li><li><p><code>progress_time</code> is a new progress bar, contributed by Mike Lawrence, that estimates the amount of time remaining before a job is complete.</p></li><li><p>The <code>summarise()</code> function now calculates columns sequentially, so you can calculate new columns from other new columns, like this:</p></li></ul><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"> <span style="color:#06287e">summarise</span>(mtcars, x <span style="color:#666">=</span> disp<span style="color:#666">/</span><span style="color:#40a070">10</span>, y <span style="color:#666">=</span> x<span style="color:#666">/</span><span style="color:#40a070">10</span>)</code></pre></div><p>This behavior is similar to the <code>mutate()</code> function. Please be aware that this could change the behavior of existing code, if any columns of the output have the same name but different values as columns in the input. 
For example, this will result in different behavior in plyr 1.7 and 1.8:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"> <span style="color:#06287e">summarise</span>(mtcars, disp <span style="color:#666">=</span> disp<span style="color:#666">/</span><span style="color:#40a070">10</span>, y <span style="color:#666">=</span> disp<span style="color:#666">*</span><span style="color:#40a070">10</span>)</code></pre></div><p>In the old version, the y column would equal <code>mtcars$disp * 10</code>, and in the new version, it would equal <code>mtcars$disp</code>.</p><ul><li>There are a number of performance improvements: <code>a*ply</code> uses more efficient indexing so should be more competitive with <code>apply</code>; <code>d*ply</code>, <code>quickdf_df</code> and <code>idata.frame</code> all have performance tweaks which will help a few people out a lot, and a lot of people a little.</li></ul><p>For a full list of changes, please see the <a href="http://cran.r-project.org/web/packages/plyr/NEWS">NEWS file</a>.</p></description></item><item><title>An update on Shiny</title><link>https://www.rstudio.com/blog/shiny-update/</link><pubDate>Tue, 04 Dec 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/shiny-update/</guid><description><p>Last month we released <a href="https://www.rstudio.com/shiny/">Shiny</a>, our new R package for creating interactive web applications. The response from the community has been extremely encouraging&ndash;we&rsquo;ve received a lot of great feedback that has helped us to make significant improvements to the framework already!</p><h4 id="shiny-023-on-cran">Shiny 0.2.3 on CRAN</h4><p>Starting with Shiny 0.2.3, you can install the latest stable version of Shiny directly from CRAN. 
Since the initial release, we&rsquo;ve added some interesting features to Shiny, most notably the ability to offer <a href="http://rstudio.github.com/shiny/tutorial/#downloads">on-the-fly file downloads</a>. We&rsquo;ve also fixed some bugs, including an issue with runGist that caused it to fail on many Windows systems.</p><p>Install or upgrade now by running: <code>install.packages('shiny')</code></p><h4 id="coming-soon-shiny-server">Coming soon: Shiny Server</h4><p>While Shiny works great today for running apps on your own machine, we indicated in our original blog post that for web-based deployment we&rsquo;d be offering hosting services and a software package for deploying Shiny applications on a server.</p><p>Today we have more details to share about Shiny Server, the software package which will allow you to deploy Shiny applications on your own server:</p><ul><li><p>Free and open source (<a href="http://www.gnu.org/licenses/agpl-3.0.txt">AGPLv3</a> license)</p></li><li><p>Host multiple applications on the same port, with a different URL path per application</p></li><li><p>Allows Shiny applications to work with Internet Explorer 8 and 9</p></li><li><p>Automatically starts and stops R sessions as needed</p></li><li><p>Detects and recovers from crashed R sessions</p></li><li><p>Designed to serve applications directly to browsers, or be proxied behind another web server like Apache/Nginx</p></li><li><p>Works across network gateways and proxies that don&rsquo;t support websockets</p></li></ul><p>Our goal is to begin beta testing by the end of January. Shiny Server will require Linux at launch, though we will likely add Windows and Mac support later.</p><p>While we previously said that Shiny Server would be commercial software, we&rsquo;ve decided to make it free and open source instead. 
Later in 2013 we hope to introduce a paid edition of Shiny Server that will include additional features targeted at larger organizations.</p><p>That&rsquo;s all we have on the Shiny front for now. If you have questions, leave us a comment, or drop by our active and growing community at <a href="https://groups.google.com/group/shiny-discuss">shiny-discuss</a>!</p></description></item><item><title>RStudio and Rcpp</title><link>https://www.rstudio.com/blog/rstudio-and-rcpp/</link><pubDate>Thu, 29 Nov 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-and-rcpp/</guid><description><p>Earlier this month a new version of the Rcpp package by <a href="http://dirk.eddelbuettel.com/">Dirk Eddelbuettel</a> and <a href="http://romainfrancois.blog.free.fr">Romain François</a> was released to CRAN, and today we&rsquo;re excited to announce a <a href="https://www.rstudio.com/ide/download/">new version of RStudio</a> that integrates tightly with Rcpp.</p><p>First, though, more about some exciting new features in <a href="http://dirk.eddelbuettel.com/blog/2012/11/27/#rcpp_0.10.1">Rcpp 0.10.1</a>. This release includes <a href="http://cran.rstudio.com/web/packages/Rcpp/vignettes/Rcpp-attributes.pdf">Rcpp attributes</a>, which are simple annotations that you add to C++ source files to streamline calling C++ from R. This makes it possible to write C++ functions and simply source them into R just as you&rsquo;d source an R script. 
Here&rsquo;s an example:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-cpp" data-lang="cpp">#include &lt;Rcpp.h&gt;
using namespace Rcpp;

// [[Rcpp::export]]
NumericMatrix gibbs(int N, int thin) {
  NumericMatrix mat(N, 2);
  double x = 0, y = 0;
  RNGScope scope;
  for(int i = 0; i &lt; N; i++) {
    for(int j = 0; j &lt; thin; j++) {
      x = R::rgamma(3.0, 1.0 / (y * y + 4));
      y = R::rnorm(1.0 / (x + 1), 1.0 / sqrt(2 * x + 2));
    }
    mat(i, 0) = x;
    mat(i, 1) = y;
  }
  return(mat);
}</code></pre></div><p>By annotating the gibbs function with the <code>Rcpp::export</code> attribute, we indicate we&rsquo;d like that function to be callable from R. As a result we can now call the function like this:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">sourceCpp(&#34;gibbs.cpp&#34;)
gibbs(100, 10)</code></pre></div><p>Thanks to the abstractions provided by Rcpp, the code implementing gibbs in C++ is nearly identical to the code you&rsquo;d write in R, but runs <a href="http://dirk.eddelbuettel.com/blog/2011/07/14/">20 times faster</a>.</p><p>The <code>sourceCpp</code> function makes it much easier to use C++ within interactive R sessions. In the new version of RStudio we did a few things to support this workflow. Here&rsquo;s a screenshot showing the RStudio C++ editing mode:</p><p><img src="https://www.rstudio.com/images/docs/rcpp_sourcecpp.png" alt=""></p><p>In RStudio you can now source a C++ file in the same way as an R script, using the source button on the toolbar or Cmd+Shift+Enter. 
If errors occur during compilation then RStudio parses the GCC error log and presents the errors as a navigable list.</p><p>When using <code>sourceCpp</code> it&rsquo;s also possible to embed R code within a C++ source file using a special block comment. RStudio treats this code as an R code chunk (similar to Sweave or R Markdown code chunks):</p><p><img src="https://www.rstudio.com/images/docs/rcpp_sourcecpp_rchunks.png" alt=""></p><p>RStudio also includes extensive support for package development with Rcpp. For more details see the <a href="https://www.rstudio.com/ide/docs/advanced/using_rcpp">Using Rcpp with RStudio</a> document on our website.</p><p>Note that if you want to try out the new features be sure you are running <a href="https://www.rstudio.com/ide/download/">RStudio v0.97.237</a> as well as the very latest version of <a href="http://cran.rstudio.com/web/packages/Rcpp/">Rcpp</a> (0.10.1).</p></description></item><item><title>Introducing Shiny: Easy web applications in R</title><link>https://www.rstudio.com/blog/introducing-shiny/</link><pubDate>Thu, 08 Nov 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/introducing-shiny/</guid><description><p>Say hello to <a href="https://www.rstudio.com/shiny/">Shiny</a>, a new R package that we&rsquo;re releasing for public beta testing today.</p><p><strong>Shiny makes it super simple for R users to turn analyses into interactive web applications that anyone can use.</strong> These applications let you specify input parameters using friendly controls like sliders, drop-downs, and text fields; and they can easily incorporate any number of outputs like plots, tables, and summaries.</p><p>No HTML or JavaScript knowledge is necessary. 
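</p><p>To give a concrete flavor, here is a minimal two-file sketch of a Shiny app (an illustration only: the function names reflect Shiny&rsquo;s general API rather than code from this announcement):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r"># ui.R -- declares the inputs and outputs
library(shiny)
shinyUI(pageWithSidebar(
  headerPanel(&#34;Histogram demo&#34;),
  sidebarPanel(
    sliderInput(&#34;n&#34;, &#34;Number of observations:&#34;, min = 10, max = 500, value = 100)
  ),
  mainPanel(plotOutput(&#34;histPlot&#34;))
))

# server.R -- recomputes outputs whenever an input changes
library(shiny)
shinyServer(function(input, output) {
  output$histPlot &lt;- renderPlot({
    hist(rnorm(input$n))
  })
})
</code></pre></div><p>Dragging the slider re-renders the plot automatically; that reactive wiring is what Shiny manages for you.</p><p>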
If you have some experience with R, you&rsquo;re just minutes away from combining the statistical power of R with the simplicity of a web page:</p><p><img src="https://rstudioblog.files.wordpress.com/2012/11/heightweight.png" alt="Shiny application screenshot"></p><p>More details, including live examples and a link to an extensive tutorial, can be found on the <a href="https://www.rstudio.com/shiny/">Shiny homepage</a>.</p><p>The Shiny package is free and open source, and is designed primarily to run Shiny applications locally. To share Shiny applications with others, you can send them your application source as a GitHub gist, R package, or zip file (see <a href="http://rstudio.github.com/shiny/tutorial/#deployment">details</a>). We&rsquo;re also working on a Shiny server that is designed to provide enterprise-grade application hosting, which we&rsquo;ll offer as a subscription-based hosting service and/or commercial software package.</p><p>We&rsquo;re really excited about Shiny, and look forward to seeing what kind of applications you come up with!</p><p>(Special thanks to <a href="http://illposed.net">Bryan Lewis</a> for authoring the <a href="http://cran.r-project.org/web/packages/websockets/index.html">websockets</a> package, which is used heavily by Shiny.)</p><p><a href="https://www.rstudio.com/shiny/">Shiny homepage</a></p></description></item><item><title>New version of RStudio (v0.97)</title><link>https://www.rstudio.com/blog/new-version-of-rstudio-v0-97/</link><pubDate>Thu, 01 Nov 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/new-version-of-rstudio-v0-97/</guid><description><p>Today a new version of RStudio (v0.97) is <a href="https://www.rstudio.com/ide/download">available for download</a> from our website. The principal focus of this release was creating comprehensive tools for R package development. 
We also implemented many other frequently requested enhancements including a new <a href="http://en.wikipedia.org/wiki/Vim_(text_editor)">Vim</a> editing mode and a much improved Find and Replace pane. Here&rsquo;s a summary of what&rsquo;s new in the release:</p><h4 id="package-development">Package Development</h4><ul><li><p>A new Build tab with package development commands and a view of build output and errors</p></li><li><p>Build and Reload command that rebuilds the package and reloads it in a fresh R session</p></li><li><p>Create a new package using existing source files via New Project</p></li><li><p>R documentation tools including previewing, spell-checking, and <a href="https://github.com/klutometis/roxygen">Roxygen</a> aware editing</p></li><li><p>Integration with <a href="https://github.com/hadley/devtools">devtools</a> package development functions</p></li><li><p>Support for <a href="http://dirk.eddelbuettel.com/code/rcpp.html">Rcpp</a> including syntax highlighting for C/C++ and gcc error navigation</p></li></ul><h4 id="source-editor">Source Editor</h4><ul><li><p><a href="http://en.wikipedia.org/wiki/Vim_(text_editor)">Vim</a> editing mode</p></li><li><p><a href="https://github.com/chriskempson/tomorrow-theme#readme">Tomorrow</a> suite of editor themes</p></li><li><p>Find and replace: incremental search, find/replace in selection, and backwards find</p></li><li><p>Auto-indenting: improved intelligence and new options to customize indenting behavior</p></li><li><p>New options: show whitespace, show indent guides, non-blinking cursor, focus console after executing code</p></li></ul><h4 id="more">More</h4><ul><li><p>New Restart R and Terminate R commands</p></li><li><p>More intelligent console history navigation with up/down arrow keys</p></li><li><p>View plots within a separate window/monitor.</p></li><li><p>Ability to set a global UI zoom-level</p></li><li><p>RStudio CRAN mirror (via Amazon CloudFront) for fast package downloads</p></li></ul><p>There are 
also many more small improvements and bug fixes. Check out the <a href="https://www.rstudio.com/ide/docs/release_notes_v0.97.html">v0.97 release notes</a> for details on all of the changes.</p></description></item><item><title>RStudio training</title><link>https://www.rstudio.com/blog/training/</link><pubDate>Tue, 23 Oct 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/training/</guid><description><p>At RStudio, we want you to be effective R users. As well as <a href="http://rstudio.com/ide/">creating</a> <a href="http://ggplot2.org">great</a> <a href="http://github.com/hadley/devtools">software</a>, we want to make it easier for you to master R. To this end, we&rsquo;re very happy to announce our new <a href="http://rstudio.com/training">training offerings</a>.</p><p>We&rsquo;re kicking off with two public courses:</p><ul><li><p><a href="http://rstudio.com/training/curriculum/effective-data-visualization.html">Effective data visualisation</a> and <a href="http://rstudio.com/training/curriculum/reports-and-reproducible-research.html">reports and reproducible research</a> in <a href="http://rstudio-sf.eventbrite.com/">San Francisco</a>, Dec 3-4.</p></li><li><p><a href="http://rstudio.com/training/curriculum/advanced-r-programming.html">Advanced R programming</a> and <a href="http://rstudio.com/training/curriculum/package-development.html">package development</a> in <a href="http://rstudio-dc.eventbrite.com/">Washington DC</a>, Dec 10-12.</p></li></ul><p>We&rsquo;ve also planned a number of <a href="http://rstudio.com/training/">other courses</a>, based on our experience with the R community, seeing what&rsquo;s hard to learn and what people are struggling with. 
These courses are available now if you&rsquo;d like us to come to <a href="http://rstudio.com/training/on-site.html">your company</a>, and based on <a href="http://rstudio.com/training/public-courses.html">your feedback</a> we&rsquo;ll offer public versions in the near future.</p><p>You can also read about <a href="http://rstudio.com/training/trainers.html">our instructors</a> and <a href="http://rstudio.com/training/philosophy.html">our philosophy</a>.</p></description></item><item><title>New version of httr: 0.2</title><link>https://www.rstudio.com/blog/httr-0-2/</link><pubDate>Sun, 14 Oct 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/httr-0-2/</guid><description><p>We&rsquo;re happy to announce a new version of httr, a package designed to make it easy to work with web APIs. Httr is a wrapper around <a href="http://www.omegahat.org/RCurl/">RCurl</a>, and provides:</p><ul><li><p>functions for the most important HTTP verbs: <code>GET</code>, <code>HEAD</code>, <code>PATCH</code>, <code>PUT</code>, <code>DELETE</code> and <code>POST</code>.</p></li><li><p>automatic cookie handling across requests, connection sharing, and standard SSL config.</p></li><li><p>a request object which captures the body of the request along with request status, cookies, headers, timings and other useful information.</p></li><li><p>easy ways to access the response as a raw vector, a character vector, or parsed into an R object (for html, xml, json, png and jpeg).</p></li><li><p>wrapper functions for the most common configuration options: <code>set_cookies</code>, <code>add_headers</code>, <code>authenticate</code>, <code>use_proxy</code>, <code>verbose</code>, <code>timeout</code>.</p></li><li><p>support for OAuth 1.0 and 2.0. Use <code>oauth1.0_token</code> and <code>oauth2.0_token</code> to get user tokens, and <code>sign_oauth1.0</code> and <code>sign_oauth2.0</code> to sign requests. 
The demos directory has six demos of using OAuth: three for 1.0 (linkedin, twitter and vimeo) and three for 2.0 (facebook, github, google).</p></li></ul><p>Track httr&rsquo;s development on <a href="https://github.com/hadley/httr">github</a>, and see what&rsquo;s <a href="https://github.com/hadley/httr/blob/master/NEWS">new in this version</a>.</p></description></item><item><title>lubridate 1.2.0 now on CRAN</title><link>https://www.rstudio.com/blog/lubridate-1-2-0-now-on-cran/</link><pubDate>Mon, 08 Oct 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/lubridate-1-2-0-now-on-cran/</guid><description><p>The latest version of lubridate offers some powerful new features and huge speed improvements. Some areas, such as date parsing, are more than 50 times faster. lubridate 1.2.0 also fixes those pesky NA bugs in 1.1.0. Here&rsquo;s some of what you&rsquo;ll find:</p><p>Parsers can now handle a wider variety of date formats, even within the same vector:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">dates &lt;- c(&#34;January 31, 2010&#34;, &#34;2-28-2010&#34;, &#34;03/31/2000&#34;)
dates &lt;- mdy(dates)
## [1] &#34;2010-01-31 UTC&#34; &#34;2010-02-28 UTC&#34; &#34;2000-03-31 UTC&#34;
</code></pre></div><p><code>stamp</code> lets you display dates however you like, by emulating an example date:</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">stamper &lt;- stamp(&#34;1 March 1999&#34;)
stamper(dates)
## [1] &#34;31 January 2010&#34;  &#34;28 February 2010&#34; &#34;31 March 2000&#34;
</code></pre></div><p>New methods add months without rolling past the end of short months. It&rsquo;s hard to find a satisfactory way to implement addition with months, but the <code>%m+%</code> and <code>%m-%</code> operators provide a new option that wasn&rsquo;t available before.</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">ymd(&#34;2010-01-31&#34;) %m+% months(1:3)
## [1] &#34;2010-02-28 UTC&#34; &#34;2010-03-31 UTC&#34; &#34;2010-04-30 UTC&#34;
</code></pre></div><p>lubridate 1.2.0 includes many awesome ideas and patches submitted by lubridate users, so check out what is new. 
For a complete list of new features and contributors, see the package <a href="https://github.com/hadley/lubridate/blob/master/NEWS">NEWS</a> file on github.</p></description></item><item><title>Where in the world is R and RStudio</title><link>https://www.rstudio.com/blog/where-in-the-world/</link><pubDate>Mon, 01 Oct 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/where-in-the-world/</guid><description><p>Using the web logs collected when users download RStudio, we&rsquo;ve prepared the following two maps showing where RStudio is being used, over the whole globe and just within the continental USA. Obviously this data is somewhat biased, as it reflects the number of downloads of RStudio, rather than the number of users of R (which we&rsquo;d really love to know!). However, based on a month&rsquo;s worth of data, we think the broad patterns are pretty interesting.</p><p><img src="https://rstudioblog.files.wordpress.com/2012/10/us.png" alt=""><img src="https://rstudioblog.files.wordpress.com/2012/10/world.png" alt=""></p><p>We made the maps by translating IP addresses to latitude and longitude with the free <a href="http://www.maxmind.com/app/geolite">GeoIP</a> databases provided by <a href="http://www.maxmind.com/">MaxMind</a>. To make it easier to see the main patterns for each map, we used k-means clustering to group the original locations into 300 clusters for the world and 100 clusters for the US, then used ggplot2 to display the number of users in each cluster with the area of each bubble.</p></description></item><item><title>New version of devtools: 0.8</title><link>https://www.rstudio.com/blog/devtools-0-8/</link><pubDate>Sun, 16 Sep 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/devtools-0-8/</guid><description><p>We&rsquo;re pleased to announce a new version of devtools, the package that makes R package development easy. 
The main features in this version are:</p><ul><li><p>A complete rewrite of the code loading system which simulates namespace loading much more accurately - this means using <code>load_all</code> is much closer to installing and loading the package. It also compiles and loads C, C++ and Fortran code in the <code>src/</code> directory.</p></li><li><p>All devtools commands now take only a path to a package and default to using the working directory if no path is supplied.</p></li><li><p>All R commands are run in <code>--vanilla</code> mode and print the console command that&rsquo;s run.</p></li><li><p><code>install_github</code> now allows you to install from pull requests and private repositories.</p></li></ul><p>Plus much, much more - for a complete list of changes, see the <a href="https://github.com/hadley/devtools/blob/master/NEWS">NEWS</a> on github. If you&rsquo;re interested in package development with devtools you may also want to join the <a href="http://groups.google.com/group/rdevtools">rdevtools</a> mailing list.</p></description></item><item><title>ggplot2 0.9.2 has been released!</title><link>https://www.rstudio.com/blog/ggplot2-0-9-2/</link><pubDate>Fri, 07 Sep 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/ggplot2-0-9-2/</guid><description><p>The main changes in this version are to the theming system. A number of enhancements make it easier to <a href="https://github.com/wch/ggplot2/wiki/New-theme-system">modify themes</a>, and we&rsquo;ve renamed a number of functions to have more informative names. Your existing code should continue to work, although you may receive warnings about functions that have been deprecated. Replacing them with new versions is easy. Here are the changes you are likely to encounter:</p><ul><li><p><code>opts()</code> is deprecated. 
You can simply replace it with <code>theme()</code> in your code.</p></li><li><p><code>theme_blank()</code>, <code>theme_text()</code>, <code>theme_rect()</code>, <code>theme_line()</code>, and <code>theme_segment()</code> are deprecated. You can replace them with <code>element_blank()</code>, <code>element_text()</code>, <code>element_rect()</code>, and <code>element_line()</code> (both <code>theme_line()</code> and <code>theme_segment()</code> are replaced by <code>element_line()</code>).</p></li><li><p>Previously, the way to set the title of a plot was <code>opts(title=&quot;Title text&quot;)</code>. In the new version, use <code>ggtitle(&quot;Title text&quot;)</code> or <code>labs(title=&quot;Title text&quot;)</code>.</p></li></ul><p>Other improvements include the addition of <code>stat_ecdf</code>, defaulting to the colour bar legend for continuous colour scales, nicer default breaks, better documentation and much more (including many bug fixes). You can read the complete list of changes on the <a href="https://github.com/hadley/ggplot2/blob/ggplot2-0.9.2/NEWS">development site</a>.</p></description></item><item><title>Welcome Hadley, Winston, and Garrett!</title><link>https://www.rstudio.com/blog/welcome-hadley-winston-and-garrett/</link><pubDate>Mon, 20 Aug 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/welcome-hadley-winston-and-garrett/</guid><description><p>RStudio&rsquo;s mission from the beginning has been to create powerful tools that support the practices and techniques required for creating trustworthy, high quality analysis. For many years <a href="http://had.co.nz/">Hadley Wickham</a> has been teaching and working on his own set of tools for R with many of the same core goals. 
We&rsquo;ve been collaborating quite a bit with Hadley over the past couple of years and today we&rsquo;re excited to announce that Hadley, Winston Chang, and Garrett Grolemund are joining RStudio so we can continue to work together much more closely.</p><p>You probably know Hadley from his work on <a href="http://had.co.nz/ggplot2/">ggplot2</a>, <a href="http://plyr.had.co.nz/">plyr</a>, and many other packages. Garrett was a PhD student of Hadley&rsquo;s at Rice, and you might also know him from the <a href="http://www.r-statistics.com/2012/03/do-more-with-dates-and-times-in-r-with-lubridate-1-1-0/">lubridate</a> package, which makes dealing with dates and time easier; he&rsquo;s also been working on new tools for visualisation and new ways of thinking about the process of data analysis. Winston has been working full-time on ggplot2 for the last couple of months, squashing many bugs and repaying a lot of the technical debt that&rsquo;s accumulated over the years. Winston&rsquo;s also writing an <a href="http://www.amazon.com/R-Graphics-Cookbook-Winston-Chang/dp/1449316956">R Graphics Cookbook</a> for O&rsquo;Reilly that should be available in the near future.</p><p>What does this mean for RStudio? We&rsquo;ll of course continue developing open-source software like the RStudio IDE, ggplot2, and plyr (among many other projects). One of Hadley&rsquo;s core focuses at RStudio will also be expanding our mission to include education, which we plan to offer in a variety of formats ranging from in-person training to some innovative new online courses. 
We&rsquo;ll also be working on hosted services (like <a href="http://rpubs.com/">RPubs</a>) as well as some new products that address the challenges of deploying R within larger organizations.</p><p>We&rsquo;re all excited to begin this next phase of work together and will have lots more details to announce later this fall!</p></description></item><item><title>Announcing RPubs: A New Web Publishing Service for R</title><link>https://www.rstudio.com/blog/announcing-rpubs/</link><pubDate>Mon, 04 Jun 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/announcing-rpubs/</guid><description><p>Today we&rsquo;re very excited to announce <a href="http://www.rpubs.com/">RPubs</a>, a free service that makes it easy to publish documents to the web from R. RPubs is a quick and easy way to disseminate data analysis and R code and do ad-hoc collaboration with peers.</p><p><a href="http://rpubs.com/jjallaire/friday-demo"><img src="https://rstudioblog.files.wordpress.com/2012/06/rpubs_document.png" alt=""></a></p><p>RPubs documents are based on <a href="http://rstudio.org/docs/authoring/using_markdown">R Markdown</a>, a new feature of knitr 0.5 and RStudio 0.96. 
To publish to RPubs within RStudio, you simply create an R Markdown document then click the <strong>Publish</strong> button within the HTML Preview window:</p><p><img src="https://rstudioblog.files.wordpress.com/2012/06/publish_to_rpubs.png" alt=""></p><p>RPubs documents include a moderated comment stream for feedback and dialog with readers, and can be updated with changes by publishing again from within RStudio.</p><p>Note that you&rsquo;ll only see the Publish button if you update to the latest version of RStudio (v0.96.230, <a href="http://www.rstudio.org/download">available for download</a> today).</p><h2 id="the-markdown-package">The markdown package</h2><p>RStudio has integrated support for working with R Markdown and publishing to RPubs, but we also want to make sure that no matter what tools you use it&rsquo;s still possible to get the same results. To that end we&rsquo;ve also been working on a new version of the <a href="http://cran.r-project.org/web/packages/markdown/index.html">markdown</a> package (v0.5, available now on CRAN).</p><p>The markdown package provides a standalone implementation of R Markdown rendering that can be integrated with other editors and IDEs. The package includes a function to upload to RPubs, but is also flexible enough to support lots of other web publishing scenarios. 
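</p><p>For instance, turning a Markdown file into a standalone HTML page takes a single call (a sketch; consult the package documentation for the full set of options):</p><div class="highlight"><pre style="background-color:#f0f0f0;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-r" data-lang="r">library(markdown)

# Render a Markdown document (e.g. one produced by knitr)
# to a self-contained HTML file
markdownToHTML(&#34;report.md&#34;, output = &#34;report.html&#34;)
</code></pre></div><p>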
We&rsquo;ve been working with Jeff Horner on this and he has a more detailed write-up on the <a href="http://jeffreyhorner.tumblr.com/post/24404112057/announcing-the-r-markdown-package">capabilities of the markdown package</a> on his blog.</p><h2 id="gallery-of-examples">Gallery of examples</h2><p>We&rsquo;ve also published a <a href="http://www.rpubs.com/gallery">gallery of example documents</a> on RPubs—the gallery illustrates some of the most useful techniques for getting the most out of R Markdown, and includes the following articles:</p><ul><li><p><a href="http://rpubs.com/gallery/equations">MathJax and Writing Equations</a></p></li><li><p><a href="http://rpubs.com/gallery/googleVis">Dynamic Graphics with the googleVis Package</a></p></li><li><p><a href="http://rpubs.com/gallery/options">Customizing Chunk Options</a></p></li><li><p><a href="http://rpubs.com/gallery/cache">Caching Code Chunks</a></p></li></ul><p>Let us know what additional examples you&rsquo;d like to see—we&rsquo;ll be adding more in the weeks ahead.</p></description></item><item><title>MathJax Syntax Change</title><link>https://www.rstudio.com/blog/mathjax-syntax-change/</link><pubDate>Fri, 25 May 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/mathjax-syntax-change/</guid><description><p>We&rsquo;ve just made a change to the syntax for embedding MathJax equations in R Markdown documents. The change was made to eliminate some parsing ambiguities and to support future extensibility to additional formats.</p><p>The revised syntax adds a &ldquo;latex&rdquo; qualifier to the <code>$</code> or <code>$$</code> opening equation delimiter. It looks like this:</p><p><img src="https://rstudioblog.files.wordpress.com/2012/05/mathjax_latex_syntax.png" alt=""></p><p>This change was the result of a few considerations:</p><ol><li><p>Some users encountered situations where the <code>$equation$</code> syntax recognized standard text as an equation. 
There was an escape sequence (<code>\$</code>) to avoid this, but for users not explicitly aware of MathJax semantics it was too hard to discover.</p></li><li><p>The requirement to have no space between equation delimiters (<code>$</code>) and the equation body (intended to reduce parsing ambiguity) was also confusing for users.</p></li><li><p>We want to eventually support <a href="http://www1.chapman.edu/~jipsen/mathml/asciimath.html">ASCIIMath</a>, and this will require an additional qualifier to indicate the equation format.</p></li></ol><p>RStudio v0.96.227 implements the new MathJax syntax and is <a href="http://www.rstudio.org/download">available for download</a> now.</p></description></item><item><title>NYC Meetup: What's Next for R Markdown</title><link>https://www.rstudio.com/blog/nyc-meetup-r-markdown/</link><pubDate>Thu, 24 May 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/nyc-meetup-r-markdown/</guid><description><p>There&rsquo;s been lots of excitement about the new <a href="http://www.rstudio.org/docs/authoring/using_markdown">R Markdown</a> feature introduced as part of knitr 0.5 and RStudio 0.96. People see R Markdown as both a simpler way to do reproducible research and as a great way to publish to the web from R. Jeromy Anglim has a nice write-up on <a href="http://jeromyanglim.blogspot.com.au/2012/05/getting-started-with-r-markdown-knitr.html">getting started with R Markdown</a> and Marcus Gesmann describes how to <a href="http://lamages.blogspot.com.au/2012/05/interactive-reports-in-r-with-knitr-and.html">embed Google Visualizations</a> using his googleVis package.</p><p>We are just as excited about R Markdown and think there is lots more that can be done with it. 
We&rsquo;ll be talking about this along with Yihui Xie (author of knitr) and Jeff Horner (author of R/Apache and Rook) on Tuesday June 5th in New York:</p><p><a href="http://www.meetup.com/nyhackr/events/64279002/">http://www.meetup.com/nyhackr/events/64279002/</a></p><p>At the meetup we&rsquo;ll be showing the latest versions of knitr and RStudio and will be announcing some new R Markdown stuff—if you are in New York we&rsquo;d love to see you there!</p></description></item><item><title>RStudio v0.96.225 Update</title><link>https://www.rstudio.com/blog/rstudio-v0-96-225-update/</link><pubDate>Thu, 24 May 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v0-96-225-update/</guid><description><p>There&rsquo;s an updated release of RStudio v0.96 available that includes some small enhancements and bugfixes, including:</p><ul><li><p>Comment/uncomment for Sweave and LaTeX</p></li><li><p>Additional in-product documentation for R Markdown</p></li><li><p>Offline support for MathJax previews</p></li><li><p>More flexible handling of MathJax inline equations</p></li></ul><p>The <a href="http://www.rstudio.org/docs/release_notes_v0.96.html#225">release notes</a> include a full list of all the changes.</p><p>We&rsquo;ve also published some additional documentation on using the new <a href="http://www.rstudio.org/docs/using/code_folding">code folding and code sections</a> features.</p><p>The updated version is <a href="http://www.rstudio.org/download">available for download</a> from our site now.</p></description></item><item><title>New Version of RStudio (v0.96)</title><link>https://www.rstudio.com/blog/rstudio-v096/</link><pubDate>Mon, 14 May 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v096/</guid><description><p>Today a new version of RStudio (v0.96) is <a href="http://www.rstudio.org/download/">available for download</a> from our website. 
The main focus of this release is improved tools for authoring, reproducible research, and web publishing. This means lots of new <a href="http://www.statistik.lmu.de/~leisch/Sweave/">Sweave</a> features as well as tight integration with the <a href="http://yihui.name/knitr/">knitr</a> package (including support for creating dynamic web reports with the new R Markdown and R HTML formats).</p><p>We&rsquo;ve also added some other frequently requested editing features including code folding. Here&rsquo;s a short video demo of the new authoring and web publishing features:</p><p><a href="http://vimeo.com/79723940">http://vimeo.com/79723940</a></p><p>We&rsquo;re particularly excited about the new possibilities opened up by R Markdown, which makes it easier than ever to create web content with R. On June 5th in New York we&rsquo;ll be talking about the latest releases of knitr and RStudio with Yihui Xie (knitr) and Jeff Horner (R/Apache and Rook):</p><p><a href="http://www.meetup.com/nyhackr/events/64279002/">http://www.meetup.com/nyhackr/events/64279002/</a></p><p>We&rsquo;ll also be announcing some more new stuff at the meetup—hope to see you there!</p><p>You can <a href="http://www.rstudio.org/download/">download RStudio 0.96</a> from our website now. 
Here&rsquo;s a list of all the new features:</p><p><strong>Sweave / knitr</strong></p><ul><li><p>Spell checking for Sweave and TeX documents.</p></li><li><p>Integrated PDF previewer that supports two-way synchronization (<a href="http://mactex-wiki.tug.org/wiki/index.php/SyncTeX">SyncTeX</a>) between the editor and PDF view.</p></li><li><p>Support for weaving Rnw files using the <a href="http://yihui.name/knitr/">knitr</a> package (requires knitr version 0.5 or higher).</p></li><li><p>Parsing of TeX error logs to extract errors, warnings, and bad boxes and present them in a navigable list.</p></li><li><p>Chunk option auto-complete, chunk folding, jump to chunk, and iterative execution of chunks.</p></li><li><p>Compilation based on multiple input files (support for specifying a root TeX document) .</p></li><li><p>TeX formatting commands, block comment/uncomment, and various new compilation options.</p></li></ul><p><strong>Web Publishing</strong></p><ul><li><p>Editing and previewing R Markdown and R HTML files (like Sweave except for web pages).</p></li><li><p>Creation of easy to distribute standalone HTML files (with embedded images).</p></li><li><p>Support for including LaTeX, ASCIIMath, and MathML equations in web pages using <a href="http://www.mathjax.org/">MathJax</a>.</p></li></ul><p><strong>Source Editing</strong></p><ul><li><p>Find in files with regular expressions.</p></li><li><p>Code folding (expanding and collapsing regions of code).</p></li><li><p>Automatic comment reflowing (Cmd+Shift+/).</p></li><li><p>Smart editing of Roxygen comments.</p></li><li><p>Syntax highlighting for Markdown, HTML, Javascript, and CSS files.</p></li><li><p>New font customization options.</p></li></ul><p><strong>Miscellaneous</strong></p><ul><li><p>Fixed incompatibility with Winbind for PAM authentication.</p></li><li><p>Fixed editor cursor off by one line problem that occurred after rapid scrolling.</p></li></ul></description></item><item><title>RStudio v0.95 
Released</title><link>https://www.rstudio.com/blog/rstudio-v0-95-released/</link><pubDate>Wed, 25 Jan 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v0-95-released/</guid><description><p>The final version of RStudio v0.95 is now <a href="http://www.rstudio.org/download">available for download</a> from our website (thanks to everyone who put the preview release through its paces over the last couple of weeks!). Highlights of the new release include:</p><ul><li><p><strong>Projects</strong> — A new system for managing R projects that enables easy switching between working directories and per-project contexts for source documents, workspaces, and history.</p></li><li><p><strong>Code Navigation</strong> — Typeahead navigation by file or function name (Ctrl+.) and the ability to navigate directly to the definition of any function (F2 or Ctrl+Click).</p></li><li><p><strong>Version Control</strong> — Integrated support for Git and Subversion, including changelist management, diffing/staging, and project history.</p></li></ul><p>Quite a bit has been added to RStudio since the initial v0.92 release a year ago. We&rsquo;ve put together a new screencast that includes a quick tour of the product and also highlights some of the new features in v0.95:</p><p><a href="http://www.rstudio.org#screencast"><img src="https://rstudioblog.files.wordpress.com/2012/01/screencap-play.png" alt=""></a></p><p>There is also an <a href="http://www.decisionstats.com/interview-jj-allaire-founder-rstudio/">interview</a> with RStudio founder JJ Allaire over on DecisionStats that has a more in-depth discussion of the release and the RStudio project in general.</p><p>The evolution of RStudio is a direct result of the many in-depth conversations we&rsquo;ve had with users at meetups, conferences, and on our support forum. We realize that there&rsquo;s plenty more to do and hope we can keep up with all of the great feedback! 
In that spirit we hope to see lots of folks this Thursday night at the <a href="http://www.meetup.com/ChicagoRUG/events/47339512/">Chicago RUG meetup</a> as well as in February in <a href="http://www.meetup.com/houstonr/events/48026172/">Houston</a> and <a href="http://www.meetup.com/LAarea-R-usergroup/">Los Angeles</a>.</p></description></item><item><title>RStudio v0.95 Preview Available</title><link>https://www.rstudio.com/blog/rstudio-v0-95-preview/</link><pubDate>Tue, 10 Jan 2012 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-v0-95-preview/</guid><description><p>The next version of RStudio (v0.95) is now available as a <a href="http://www.rstudio.org/download/preview">preview release</a>. Highlights include:</p><ul><li><p><strong>Projects</strong> &ndash; A new system for managing R projects that enables easy switching between working directories, running multiple instances of RStudio with different projects, and per-project contexts for source documents, workspaces, and history.</p></li><li><p><strong>Code Navigation</strong> &ndash; Typeahead navigation by file or function name (Ctrl+.) and the ability to navigate directly to the definition of any function (F2 or Ctrl+Click).</p></li><li><p><strong>Version Control</strong> &ndash; Integrated support for Git and Subversion, including changelist management, diffing/staging, and project history.</p></li></ul><p>Detailed documentation on the new features will be available along with the final release of v0.95, which we expect to make available by the end of January.</p><p>We&rsquo;re also planning on being at the Chicago, Houston, and Los Angeles R User Groups over the next few weeks. We&rsquo;ll be talking about the new release as well as the general state of the project and where people would like to see us go in the future. 
Meeting dates are:</p><ul><li><p><a href="http://www.meetup.com/ChicagoRUG/events/47339512/">Chicago</a> (Thursday, January 26th)</p></li><li><p><a href="http://www.meetup.com/houstonr/">Houston</a> (Tuesday, February 7th)</p></li><li><p><a href="http://www.meetup.com/LAarea-R-usergroup/events/40337912/">Los Angeles</a> (Thursday, February 9th)</p></li></ul><p>Thanks in advance to everyone who tries out the preview release (you can <a href="http://www.rstudio.org/download/preview">download it here</a>). Let us know what works, what doesn&rsquo;t, and what else you&rsquo;d like to see us do.</p></description></item><item><title>RStudio Update</title><link>https://www.rstudio.com/blog/rstudio-update/</link><pubDate>Thu, 27 Oct 2011 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-update/</guid><description><p>With R 2.14 slated to be <a href="https://stat.ethz.ch/pipermail/r-announce/2011/000542.html">released next week</a> we wanted to encourage everyone planning to upgrade to also update to the latest release of RStudio (<a href="http://www.rstudio.org/download/">v0.94.110</a>). For R 2.14 users this release includes tweaks related to compatibility with the R 2.14 graphics engine as well as compatibility with the new parallel package. There are also a number of other bug fixes which make this a worthwhile update even for users not running R 2.14 (see the <a href="http://www.rstudio.org/docs/release_notes_v0.94">release notes</a> for details).</p><p>In the meantime we&rsquo;ve also been busy at work on the next release of RStudio (v0.95). This release will include some major new features including a project system, code navigation, as well as an integrated version control UI (for <a href="http://subversion.apache.org/">subversion</a> and <a href="http://git-scm.com/">git</a>). 
We&rsquo;ll be announcing a preview of this release on our blog within the next few weeks.</p><p>Finally, we also wanted to mention that O&rsquo;Reilly has published a book by John Verzani on using RStudio. The book has lots of good insights on learning and getting the most out of RStudio, and also covers some more advanced topics like authoring packages. More info on the book is available <a href="http://shop.oreilly.com/product/0636920021278.do">here</a>.</p></description></item><item><title>RStudio Beta 3 (v0.94)</title><link>https://www.rstudio.com/blog/rstudio-beta-3-v0-94/</link><pubDate>Tue, 14 Jun 2011 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-beta-3-v0-94/</guid><description><p>RStudio Beta 3 (v0.94) is <a href="http://www.rstudio.org/download/">available for download</a> today. The goal for this release was to refine and improve our core features based on the feedback we&rsquo;ve gotten on our first two betas. Highlights of the new release include:</p><ul><li><p><strong>Source editor enhancements</strong> — New editor features include brace/paren/quote matching, more intelligent cursor placement after newlines, function navigation, regex find and replace, run to/from the current line, and a command to re-run the last code region. There&rsquo;s naturally still lots more we&rsquo;d like to do in the editor and we plan to keep improving it with each beta release.</p></li><li><p><strong>New plot export</strong> <strong>features</strong> — We now have a much more flexible plot export UI that supports several formats including PDF, JPEG, TIFF, SVG, BMP, Metafile, and Postscript. The new UI also includes resizable image preview with the ability to maintain the current aspect ratio.</p></li><li><p><strong>Package installation and management</strong>— We&rsquo;ve added many more options to the install packages dialog including support for local archives and multiple target libraries. 
There is also a new check for package updates dialog as well as the ability to filter the packages listing by name and/or description.</p></li><li><p><strong>Dozens of other small improvements</strong> — We&rsquo;ve also made many smaller enhancements including context-aware F1 for help, sorting of file listings, resizable plot zoom window, custom PDF export sizes, removing items from history, additional working directory commands, optional syntax highlighting for console input, and .zip and .tar.gz packages for users installing without admin privileges.</p></li></ul><p>Full details on the various new features, enhancements, and bug fixes in v0.94 are in the <a href="http://www.rstudio.org/docs/release_notes_v0.94">release notes</a>.</p><p>Thanks again to everyone for the thorough and thoughtful feedback on our previous betas, please keep it coming!</p></description></item><item><title>RStudio Beta 2 (v0.93)</title><link>https://www.rstudio.com/blog/rstudio-beta2/</link><pubDate>Mon, 11 Apr 2011 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-beta2/</guid><description><p>RStudio Beta 2 (v0.93) is <a href="http://www.rstudio.org/download/">available for download</a> today. 
We&rsquo;ve gotten incredibly helpful input from the R community and this release reflects a lot of that feedback.</p><p>The <a href="http://www.rstudio.org/docs/release_notes_v0.93.html">release notes</a> have the full details on what&rsquo;s new. Some of the highlights include:</p><h4 id="source-editor-enhancements">Source Editor Enhancements</h4><ul><li><p>Highlight all instances of selected text</p></li><li><p>Insert spaces for tabs (soft-tabs)</p></li><li><p>Customizable print margin line</p></li><li><p>Selected line highlight</p></li><li><p>Toggle line numbers on/off</p></li><li><p>Optional soft-wrapping for R source files</p></li></ul><h4 id="customizable-layout-and-appearance">Customizable Layout and Appearance</h4><ul><li><p>The layout of panes and tabs is now configurable (enabling side-by-side source and console view, among others).</p></li><li><p>Support for a variety of editing themes, including TextMate, Eclipse, and others.</p></li></ul><p><a href="http://www.rstudio.org/docs/using/customizing"><img src="https://rstudioblog.files.wordpress.com/2011/04/options-appearance.png" alt=""></a></p><h4 id="interactive-plotting">Interactive Plotting</h4><p>This release features <a href="http://www.rstudio.org/docs/advanced/manipulate">manipulate</a>, a new interactive plotting feature that enables you to create plots with inputs bound to custom controls (e.g. slider, picker, etc.) rather than hard-coded to a single value. 
For example:</p><pre><code class="language-r">manipulate(
  # plot expression
  plot(cars, xlim = c(0, x.max), type = type, ann = label),
  # controls
  x.max = slider(10, 25, step = 5, initial = 25),
  type = picker(&#34;Points&#34; = &#34;p&#34;, &#34;Line&#34; = &#34;l&#34;, &#34;Step&#34; = &#34;s&#34;),
  label = checkbox(TRUE, &#34;Draw Labels&#34;)
)</code></pre><p><a href="http://www.rstudio.org/docs/advanced/manipulate"><img 
src="https://rstudioblog.files.wordpress.com/2011/04/manipulate_reduced_noborder3.png" alt=""></a></p><h4 id="more">More</h4><ul><li><p>RStudio now works with versions of R installed from source (either via make install or packaged by MacPorts, Homebrew, etc.).</p></li><li><p>Enhanced support for Unicode and non-ASCII character encodings.</p></li><li><p>Improved working directory management including new options for default behavior, support for shell &ldquo;open with&rdquo; context menus, and optional file associations for common R file types (.RData, .R, .Rnw).</p></li><li><p>Many other small enhancements and bug fixes (see the <a href="http://www.rstudio.org/docs/release_notes_v0.93.html">release notes</a> for full details).</p></li></ul><p>We hope you try out the new release and keep talking to us on our <a href="http://support.rstudio.org">support forum</a> about what works, what doesn&rsquo;t, and what else you&rsquo;d like RStudio to do.</p></description></item><item><title>RStudio, new open-source IDE for R</title><link>https://www.rstudio.com/blog/rstudio-new-open-source-ide-for-r/</link><pubDate>Mon, 28 Feb 2011 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/rstudio-new-open-source-ide-for-r/</guid><description><p><a href="http://www.rstudio.org/">RStudio</a> is a new open-source IDE for R which we&rsquo;re excited to announce the availability of today. 
RStudio has interesting features for both new and experienced R developers including code completion, execute from source, searchable history, and support for authoring Sweave documents.</p><p>RStudio runs on all major desktop platforms (Windows, Mac OS X, Ubuntu, or Fedora) and can also run as a server which enables multiple users to access the IDE using a web browser.</p><p>A couple of screenshots (click here for <a href="http://www.rstudio.org/screenshots/">more screenshots</a>):</p><p><img src="https://rstudioblog.files.wordpress.com/2011/02/rstudio-windows500.png" alt=""></p><p><img src="https://rstudioblog.files.wordpress.com/2011/02/rstudio-ubuntu500.png" alt=""></p><p>The version of RStudio available today is a beta (v0.92) and is released under the GNU AGPL license. We&rsquo;re hoping for lots of input and dialog with the R community to help make the product as good as it can be!</p><p>More details on the project as well as download links can be found at: <a href="http://www.rstudio.org">http://www.rstudio.org</a>.</p></description></item><item><title>About the RStudio Project</title><link>https://www.rstudio.com/blog/about-the-rstudio-project/</link><pubDate>Sun, 27 Feb 2011 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/about-the-rstudio-project/</guid><description><p>We started the RStudio project because we were excited and inspired by R. The <a href="http://www.r-project.org/contributors.html">creators of R</a> provided a flexible and powerful foundation for statistical computing; then made it free and open so that it could be improved collaboratively and its benefits could be shared by the widest possible audience.</p><p>It&rsquo;s better for everyone if the tools used for research and science are free and open. 
Reproducibility, widespread sharing of knowledge and techniques, and the leveling of the playing field by eliminating cost barriers are but a few of the shared benefits of free software in science.</p><p>RStudio is an integrated development environment (IDE) for R which works with the standard version of R available from CRAN. Like R, RStudio is available under a free software license. Our goal is to develop a powerful tool that supports the practices and techniques required for creating trustworthy, high quality analysis. At the same time, we want RStudio to be as straightforward and intuitive as possible to provide a friendly environment for new and experienced R users alike. RStudio is also a company, and we plan to sell services (support, training, consulting, hosting) related to the open-source software we distribute.</p><p>We&rsquo;re looking forward to joining the R community, learning from users, growing the product, and hopefully making a meaningful contribution to the practice of research and science.</p></description></item><item><title>Welcome to our Weblog</title><link>https://www.rstudio.com/blog/welcome-to-our-weblog/</link><pubDate>Sun, 27 Feb 2011 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/welcome-to-our-weblog/</guid><description><p>Welcome to the RStudio weblog! We&rsquo;ll use the weblog to talk about both the product and its features as well as broader issues that concern the R community.</p></description></item><item><title>About RStudio Blog</title><link>https://www.rstudio.com/blog/about/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/about/</guid><description><div style="margin-top:-30px;">Welcome to the RStudio blog! 
Follow the blog for the latest on:<div class="row pt-4"><div class="col-md-6"><ul><li class="pt-1"><a href="https://www.rstudio.com/blog/categories/company-news-and-events/">Company News and Events</a></li><li class="pt-1"><a href="https://www.rstudio.com/blog/categories/data-science-leadership/">Data Science Leadership</a></li><li class="pt-1"><a href="https://www.rstudio.com/blog/categories/industry/">Industry</a></li></ul></div><div class="col-md-6"><ul><li class="pt-1"><a href="https://www.rstudio.com/blog/categories/open-source/">Open Source</a></li><li class="pt-1"><a href="https://www.rstudio.com/blog/categories/products-and-technology/">Products and Technology</a></li><li class="pt-1"><a href="https://www.rstudio.com/blog/categories/training-and-education/">Training and Education</a></li></ul></div></div><p>Be sure to follow these blogs to find out more about the great work happening across RStudio:</p><div class="row text-center pt-4"><div class="col-md-4 pb-4"><a href="http://blogs.rstudio.com/ai"><img class="pb-4" src="https://www.rstudio.com/assets/img/ai-blog.jpg"></a><a class="text-dark" href="http://blogs.rstudio.com/ai">AI Blog</a><p class="p-80 pt-1">Learn about AI-related technologies in R</p></div><div class="col-md-4 pb-4"><a href="https://www.tidyverse.org/blog/"><img class="pb-4" src="https://www.rstudio.com/assets/img/tidyverse-blog.jpg"></a><a class="text-dark" href="https://www.tidyverse.org/blog/">Tidyverse Blog</a><p class="p-80 pt-1">Explore what’s happening in the tidyverse</p></div><div class="col-md-4 pb-4"><a href="http://rviews.rstudio.com/"><img class="pb-4" src="https://www.rstudio.com/assets/img/rviews-blog-2.jpg"></a><a class="text-dark" href="http://rviews.rstudio.com/">RViews</a><p class="p-80 pt-1">Read highlights from the R community</p></div></div><p>Want to stay connected? 
You can follow us on <a href="https://www.linkedin.com/company/rstudio-pbc">LinkedIn</a>, <a href="https://www.facebook.com/rstudiopbc/">Facebook</a>, and <a href="https://twitter.com/rstudio">Twitter</a>, and <a href="https://www.rstudio.com/blog/subscribe/">subscribe here</a> for email updates and RSS feeds. These posts are syndicated on <a href="https://www.r-bloggers.com/">R-Bloggers.com</a> and <a href="https://python-bloggers.com/">Python-Bloggers.com</a>.</p><div><ul class="social-buttons d-inline pl-0"><li class="d-inline"><a target="_blank" rel="noopener noreferrer" href="https://www.linkedin.com/company/rstudio-pbc" class="btn btn-icon btn-neutral btn-linkedin btn-lg"><i class="fab fa-linkedin-in circle"></i></a></li><li class="d-inline"><a target="_blank" rel="noopener noreferrer" href="https://www.facebook.com/rstudiopbc/" class="btn btn-icon btn-neutral btn-facebook btn-lg"><i class="fab fa-facebook-f circle"></i></a></li><li class="d-inline"><a target="_blank" rel="noopener noreferrer" href="https://twitter.com/rstudio" class="btn btn-icon btn-neutral btn-twitter btn-lg"><i class="fab fa-twitter circle"></i></a></li><li class="d-inline"><a target="_blank" rel="noopener noreferrer" href="https://github.com/rstudio" class="btn btn-icon btn-neutral btn-github btn-lg"><i style="font-size:19px;" class="fab fa-github circle"></i></a></li></ul></div></div></description></item><item><title>Subscribe to Blog Updates</title><link>https://www.rstudio.com/blog/subscribe/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://www.rstudio.com/blog/subscribe/</guid><description><style type="text/css">legend {font-size: 26px;}.mktoFormRow legend {display: block !important;text-align: left;margin-left: 0px !important;padding-top: 20px !important;}.mktoForm input[type="text"], .mktoForm input[type="url"], .mktoForm input[type="email"], .mktoForm input[type="tel"], .mktoForm input[type="number"], .mktoForm input[type="date"], .mktoForm select.mktoField, 
.mktoForm textarea.mktoField {box-shadow: none !important;}.mktoForm input[type=url],.mktoForm input[type=text],.mktoForm input[type=date],.mktoForm input[type=tel],.mktoForm input[type=email],.mktoForm input[type=number],.mktoForm textarea.mktoField {padding: 2px 8px !important;height: 44px;font-size: 14px;width: 100% !important;box-shadow: 0px 0px 0px 0px rgba(20, 57, 94, 0.1) !important;}.mktoForm .mktoLabel {padding-top: 0;line-height: 1.6em;padding-right: 25px;}.mktoForm .mktoFormRow {padding-top: 1.3em;}.mktoForm .mktoRadioList > label, .mktoForm .mktoCheckboxList > label {font-size: 14px;}.mktoForm .mktoRadioList > input {margin-top: .14em;}.mktoForm .mktoRadioList, .mktoForm .mktoCheckboxList {padding: 0.2em;}@media only screen and (max-width: 480px) {.mktoForm input[type="url"],.mktoForm input[type="text"],.mktoForm input[type="date"],.mktoForm input[type="tel"],.mktoForm input[type="email"],.mktoForm input[type="number"],.mktoForm textarea.mktoField,.mktoForm select.mktoField {width: 100% !important;height: 3em !important;line-height: 1.5em !important;font-size: 16px !important;}.mktoForm fieldset {padding: 0 !important;}.mktoForm {padding: 10px 0px !important;}.mktoFormCol {width: 100%;}.mktoForm .mktoCheckboxList {width: 11% !important;}.mktoForm .mktoFormCol .mktoLabel {width: 89% !important;padding-right: 10px;line-height: 22px;}.mktoMobileShow .mktoForm, .mktoForm * {padding: 0px;}}@media (max-width: 800px) {#computer {margin-top: -70px !important;}#top-text {padding-left: 20px;padding-right: 20px;}.mktoLabel.mktoHasWidth {width: 77% !important;padding-right: 15px;}}#arrow img {height: 25px;}#main {padding-bottom: 0px !important;}.post-content {margin-bottom: 0px !important;}.mktoFormCol {padding-right: 0px;}.mktoHtmlText.mktoHasWidth {width: 100% !important;}.mktoFormRow legend {display: none;}.mktoForm {width: 100% !important;}.mktoFormRow,.mktoButtonRow,.mktoButtonRow {text-align: center !important;}.mktoButtonWrap {margin-left: 0 
!important;}.mktoFieldWrap label {width: 77% !important;}.mktoButton {background-color: #4287c7 !important;background-image: none !important;border: 1px #4287c7 !important;color: #fff !important;width: 100% !important;height: 48px !important;margin-bottom: 7px;text-transform: uppercase !important;font-weight: 600 !important;font-family: source sans pro;font-size: 18px !important;}.mktoAsterix {display: none !important;}div.mktoOffset {display: none !important;}.mktoFieldDescriptor.mktoFormCol {width: 100%;}.mktoForm select.mktoField {height: 34px !important;width: 100%;padding-top: 5px;}#emailPause {width: 100% !important;margin-top: 8px !important;}button.mktoButton {width: 100% !important;line-height:20px;}</style><div style="margin-top:-100px;"><script src="//pages.rstudio.net/js/forms2/js/forms2.min.js"></script><form id="mktoForm_3539" data-toggle="toggle"></form><script>MktoForms2.loadForm("//pages.rstudio.net", "709-NXN-706", 3539);</script></div></description></item></channel></rss>
