Posit AI Blog: Getting started with Keras from R

If you’ve been thinking about diving into deep learning for a while – using R, preferentially –, now is a good time. For TensorFlow / Keras, one of the predominant deep learning frameworks on the market, last year was a year of substantial changes; for users, this sometimes would mean ambiguity and confusion about the “right” (or: recommended) way to do things. By now, TensorFlow 2.0 has been the current stable release for about two months; the mists have cleared away, and patterns have emerged, enabling leaner, more modular code that accomplishes a lot in just a few lines.

To give the new features the space they deserve, and assemble central contributions from related packages all in one place, we have significantly remodeled the TensorFlow for R website. So this post really has two objectives.

First, it would like to do exactly what is suggested by the title: Point new users to resources that make for an effective start into the subject.

Second, it could be read as a “best of new website content”. Thus, as an existing user, you might still be interested in giving it a quick skim, checking for pointers to new features that appear in familiar contexts. To make this easier, we’ll add side notes to highlight new features.

Overall, the structure of what follows is this. We start from the core question: How do you build a model?, then frame it from both sides; i.e.: What comes before? (data loading / preprocessing) and What comes after? (model saving / deployment).

After that, we quickly go into creating models for different types of data: images, text, tabular.

Then, we touch on where to find background information, such as: How do I add a custom callback? How do I create a custom layer? How can I define my own training loop?

Finally, we round up with something that looks like a tiny technical addition but has far greater impact: integrating modules from TensorFlow (TF) Hub.

Getting started

How to build a model?

If linear regression is the Hello World of machine learning, non-linear regression has to be the Hello World of neural networks. The Basic Regression tutorial shows how to train a dense network on the Boston Housing dataset. This example uses the Keras Functional API, one of the two “classical” model-building approaches – the one that tends to be used when some sort of flexibility is required. In this case, the desire for flexibility comes from the use of feature columns – a nice new addition to TensorFlow that allows for convenient integration of e.g. feature normalization (more about this in the next section).

This introduction to regression is complemented by a tutorial on multi-class classification using “Fashion MNIST”. It is equally suited for a first encounter with Keras.

A third tutorial in this section is dedicated to text classification. Here too, there is a hidden gem in the current version that makes text preprocessing a lot easier: layer_text_vectorization, one of the brand new Keras preprocessing layers. If you’ve used Keras for NLP before: No more messing with text_tokenizer!

These tutorials are nice introductions explaining code as well as concepts. What if you’re familiar with the basic procedure and just need a quick reminder (or: something to quickly copy-paste from)? The ideal document to consult for those purposes is the Overview.

Now – knowledge how to build models is fine, but as in data science overall, there is no modeling without data.

Data ingestion and preprocessing

Two detailed, end-to-end tutorials show how to load csv data and
images, respectively.

In current Keras, two mechanisms are central to data preparation. One is the use of tfdatasets pipelines. tfdatasets lets you load data in a streaming fashion (batch-by-batch), optionally applying transformations as you go. The other handy device here is feature specs andfeature columns. Together with a matching Keras layer, these allow for transforming the input data without having to think about what the new format will mean to Keras.

While there are other types of data not discussed in the docs, the principles – pre-processing pipelines and feature extraction – generalize.

Model saving

The best-performing model is of little use if ephemeral. Straightforward ways of saving Keras models are explained in a dedicated tutorial.

And unless one’s just tinkering around, the question will often be: How can I deploy my model?
There is a complete new section on deployment, featuring options like plumber, Shiny, TensorFlow Serving and RStudio Connect.

After this workflow-oriented run-through, let’s see about different types of data you might want to model.

Neural networks for different kinds of data

No introduction to deep learning is complete without image classification. The “Fashion MNIST” classification tutorial mentioned in the beginning is a good introduction, but it uses a fully connected neural network to make it easy to remain focused on the overall approach. Standard models for image recognition, however, are commonly based on a convolutional architecture. Here is a nice introductory tutorial.

For text data, the concept of embeddings – distributed representations endowed with a measure of similarity – is central. As in the aforementioned text classification tutorial, embeddings can be learned using the respective Keras layer (layer_embedding); in fact, the more idiosyncratic the dataset, the more recommendable this approach. Often though, it makes a lot of sense to use pre-trained embeddings, obtained from large language models trained on enormous amounts of data. With TensorFlow Hub, discussed in more detail in the last section, pre-trained embeddings can be made use of simply by integrating an adequate hub layer, as shown in one of the Hub tutorials.

As opposed to images and text, “normal”, a.k.a. tabular, a.k.a. structured data often seems like less of a candidate for deep learning. Historically, the mix of data types – numeric, binary, categorical –, together with different handling in the network (“leave alone” or embed) used to require a fair amount of manual fiddling. In contrast, the Structured data tutorial shows the, quote-unquote, modern way, again using feature columns and feature specs. The consequence: If you’re not sure that in the area of tabular data, deep learning will lead to improved performance – if it’s as easy as that, why not give it a try?

Before rounding up with a special on TensorFlow Hub, let’s quickly see where to get more information on immediate and background-level technical questions.

The Guide section has lots of additional information, covering specific questions that will come up when coding Keras models

as well as background knowledge and terminology: What are tensors, Variables, how does automatic differentiation work in TensorFlow?

Like for the basics, above we pointed out a document called “Quickstart”, for advanced topics here too is a Quickstart that in one end-to-end example, shows how to define and train a custom model. One especially nice aspect is the use of tfautograph, a package developed by T. Kalinowski that – among others – allows for concisely iterating over a dataset in a for loop.

Power users: Check out the custom training Quickstart featuring custom models, GradientTapes and tfautograph.

Finally, let’s talk about TF Hub.

A special highlight: Hub layers

One of the most interesting aspects of contemporary neural network architectures is the use of transfer learning. Not everyone has the data, or computing facilities, to train big networks on big data from scratch. Through transfer learning, existing pre-trained models can be used for similar (but not identical) applications and in similar (but not identical) domains.

Depending on one’s requirements, building on an existing model could be more or less cumbersome. Some time ago, TensorFlow Hub was created as a mechanism to publicly share models, or modules, that is, reusable building blocks that could be made use of by others.
Until recently, there was no convenient way to incorporate these modules, though.

Starting from TensorFlow 2.0, Hub modules can now seemlessly be integrated in Keras models, using layer_hub. This is demonstrated in two tutorials, for text and images, respectively. But really, these two documents are just starting points: Starting points into a journey of experimentation, with other modules, combination of modules, areas of applications…

In sum, we hope you have fun with the “new” (TF 2.0) Keras and find the documentation useful.
Thanks for reading!