Babbel Bytes

Hackday 5.0 - our review


Powered by lightning-fast internet, lasagne and tons of Club Mate, our 5th Hackday was a success! So here we are with our own review of a day full of projects, sun and caffeine.

This year’s Hackday took place at Colonia Nova, a proper Berlin-style venue in the famous Neukölln district. Let’s be honest, the huge and sunny rooftop terrace on the 5th floor made our day even better. It was great to enjoy the sunshine while hacking with our colleagues - because yes, we set up the wifi there too!

5th Babbel Hackday


If you are one of our aficionados, you might already know that the 10 main ingredients for a perfect Hackday are:

  • THE people - our hackers, 100 people from our Engineering, Product, Marketing, Didactics and Design teams;
  • The projects - 15 ideas made it into live demos, with outcomes spanning artificial intelligence, role-playing games, hardware, musical instruments and much more;
  • Timeframe - we hacked from 10 am to 7.30 pm, with a countdown projected on the wall reminding us how time flies when you're having fun;
  • Yummy food - from breakfast snacks to a Mediterranean lunch feast and proper pizza from Naples in the evening, it felt like Christmas at Grandma's;
  • Caffeine - coffee, Club Mate, coffee, Club Mate and repeat; ideas need power to make it into projects;
  • Demos - each team had 3 minutes to present their project with a demo and without talking; that's always a challenge, but that's also when creativity comes into play;
  • Good vibes - smiles, sun, good food, drinks, a guitar, flip-flops, bean bags, sofas, music, some freshly baked bread rolls: these are the good vibes we know!
  • Competition - Hackdays are fun, yet really competitive; some teams might have similar projects, so that little detail always makes the difference in the final vote;
  • Prizes & T-shirts - no competition comes without a prize! The winning crew won a laser-game team-building experience, while all participants received the black Hackday 5 T-shirt with a really cool logo;
  • Afterparty - Hackdays can be intense and frantic; there's nothing better than a Berliner Eckkneipe down the street for celebrating the great success together. Zauber Insel, you are our new favourite spot!

5th Babbel Hackday


What can we say to wrap up our review?

Hackdays now play a huge part in our working culture. Some people are already thinking about their next projects, others are still recovering. Days like these are surely fun, challenging and rewarding… because in the end, Hackdays are all about learning: learning how to work with new technologies and tools, learning how to work in a new team, learning how to work quickly, and learning how to build awesome stuff.

What will we learn next time? Well, check out our open positions; you might be part of it!

View the video on YouTube.


How I got the job I wanted (even though I wasn’t qualified)


At Babbel, we’re constantly looking for new and unconventional ways to get our customers over the hurdle of speaking in a foreign language. As a result, finding talented people to help move our organization forward can also come in unexpected ways. This week, we invited Jeanny (Director of Product) and Clyde (Product Specialist) to share one of our more amusing hiring stories.


Jeanny: So, after almost six months here at Babbel, we thought it’d be fun to tell your whole story from the beginning. What led you to seek out a job here in the first place?

Clyde: Well, I had just finished a master’s degree in International Relations, but was having doubts about my career options. At the time, I was also using Babbel to try and learn Spanish, and since I’m a huge language nerd, I liked the product but also saw tons of room for improvement. I realized then that I wanted to pursue this passion of creating an exceptional language learning product. There was only one problem: I had no experience in product management.

Jeanny: It’s funny you mention that, because I remember when I started out in the field over a decade ago, product managers were predominantly engineers and business managers. I myself came from a liberal arts background and had to get creative in order to break into the tech industry. I immersed myself in the field by plowing through books, listening to almost every single IT Conversations talk, going to local meetups and teaching myself to code. The deeper I delved into the topic, the more excited I got! It was intense!

Clyde: It’s slightly embarrassing to admit that, in contrast to your approach, my journey started by pure chance =P. A friend sent me an article entitled How to Get Any Job You Want (even if you’re unqualified) which explained how to create an application that would stand out. Using this as inspiration, I created a survey on how to improve Babbel and found creative ways to send it to hundreds of fellow users. After a few weeks, I got a surprise email from Customer Service which made me realize that I had triggered some questions within the company. What actually happened?

Who was this guy?? Why was he surveying our users?

Jeanny: I still remember Chris from Customer Service approaching us to ask if we were conducting a customer survey, which we do regularly to help improve our product. However, we didn’t have any initiatives running at that time, so we dug deeper into your case. We quickly found out that you were from the US, studied law in Boston, and somehow ended up in Vienna, so you obviously weren’t trying very hard to cover your tracks ;). Conspiracy theories flew: Who was this guy?? Why was he surveying our users? We figured you must have a good reason, so we told Chris to get in touch with you and hook you up with our User Research team. If anything, the field of language learning is broad, and we’re always keen on exchanging insights and ideas with those interested.

Clyde: I did worry slightly that I might’ve ruffled a few feathers, thereby torpedoing my chances, but at that point, what could I do? By the time I got Chris’ email, I’d already received enough responses, so I stopped the survey and created a slide deck summarizing my results. I sent it to the company and eventually, it found its way to you?

Jeanny: Right, you never did get back in touch with our User Research team. But, a month later, your application ended up in my inbox. While your CV wasn’t a fit for the product management role, I looked through your presentation on how to improve our customer experience, and I was pleasantly surprised. Your resourcefulness, the way you conducted your research, the questions you asked, and how that tied in with your proposal – you showed qualities I was looking for in a PM.

Clyde: You have no idea how thrilled I was to hear from you; I’d started to think maybe my email had gotten lost in the vast netherworld of the “Inbox.”

Jeanny: Yeah, I could tell you were pretty thrilled when we first contacted you. But you also put me in a dilemma! I saw the potential, but I was still growing the team, and my priority at that point was to look for experienced PMs. I decided to send you our challenge anyway, giving you a tough deadline and a few basic instructions. To be frank, I wasn’t sure you’d pull through.

Clyde: Seriously, when I first looked at it, I despaired a little. User Flows? Behavior Driven Development?? In the end, after a fair amount of googling, I just tried to send in something logical and cohesive. To my surprise, you wanted to interview me, and after getting through a few rounds, I thought I’d made it through. But then you started our final call by saying you were impressed but that hiring me would be a huge risk. I felt my heart sink because I thought that was the end of the road! Luckily, you went on with “let’s give it a shot.” The next few months were filled with telling everyone I’d landed my dream job, so as you can imagine, my expectations were high =D.

Jeanny: So, how has your Babbel experience been so far?

Clyde: Honestly, it’s been demanding to combine joining a new company, moving to a new city, and starting a new role all at once! Keeping up with the technical lingo, as well as learning to ask the right questions, have all been part of the challenge. Even when I make mistakes, the most important thing has been to learn from them and remain open to being corrected – not so different from learning a language, in fact. Sometimes you gotta just say “let’s do this” and make it happen.

Jeanny: Yes, your unconventional and bold ways definitely got you noticed! But, it was your undeniable passion for languages, your problem solving skills and ability to learn quickly that got you through the door. And, nearly six months in, it looks like I made the right choice!

Hackday 6.0 - our review


Lots of ideas, people, Club Mate and pizzas: hello again, with our own review of another successful Hackday.

The 6th Hackday was held at Magazine in der Heeresbäckerei, an impressive industrial monument with a prominent presence in Kreuzberg, directly next to the Spree. It is extremely important to have an amazing location for such a long, intensive day, and we’re happy that we could spend it in this wonderful venue! To make sure nobody went hungry, we had lunch from Knofi and delicious Italian pizzas from our good friends at LaPausa.

Unlike the participants of the Hackday, our organizing team had a little longer than ten hours to prepare their ‘product’: the first steps towards the event had been taken a couple of months earlier. It is very satisfying to see how all the small parts we had been working on came together like puzzle pieces, creating one big event that so many people could enjoy.

Enough background, let’s talk about the actual Hackday!

The 6th Babbel Hackday started at 8 o’clock in the morning - which is considered the middle of the night by some Babbelonians - with coffee and sleepy faces, and people happily putting on their work uniforms: this year’s Babbel Hackday T-shirts. With every sip of hot beverage, more and more smiles appeared, and our day with ten hours of hacking was ready to begin. The energy levels were rising by the minute, and if this Hackday had had background music, it would definitely have been Reel 2 Real’s ‘I Like to Move It’!

6th Babbel Hackday


Our hackers came up with 18 different creative projects - of course touching the obvious subject of language learning, but also very different, innovative ideas - all of that in only ten hours of hacking. One of the most interesting facts about the 6th Babbel Hackday was that the winning team did not consist solely of programmers and computer scientists, as one might expect, but of seven people from different departments, including Marketing, Didactics, Product and IT Support.

All of the teams (including those working from home) presented their kick-ass ideas in the evening, when everyone from the company was welcome to drink some beers and check out the results. This year’s winners created a very cool RPG and went home with their very own VR headsets. For Veronika, who was part of the winning team, the 6th Babbel Hackday was very successful:

It was literally the first time I’ve ever won something this cool. And it’s not about the prize, I’m talking about the event in general. It felt amazing, the fact that there were so many great projects and the audience decided to vote for our project. I can’t express how happy I am.

See you at the next Hackday!

6th Babbel Hackday


Well, check out our open positions; you might be part of it!

View the video on YouTube.

Android Modular Project - Organizing your library dependencies


As your project grows you might have found the need to split it into several modules. This becomes even more prominent when you’re working in a company where several teams develop the same app, but different features.

It’s not so uncommon to have the same library dependencies across the modules. Usually, modules tend to use the same library to achieve similar things and keep the number of used libraries to a minimum. However, it’s also not so uncommon to accidentally use different versions of the same library in this setup.

Here at Babbel we recently faced this issue and in this blog post I’ll share with you a possible solution to overcome this.


Approaching the problem

Suppose you’re developing an app that for some reason needs to be split into several modules. Perhaps there’s a certain functionality that you want to isolate in a single module, or maybe you want to split features across teams and this setup helps you with that.

Let’s think of a concrete example. Say you wish to split tracking into a single module. Given the following project structure:

MyApp
 ├── app
 └── tracking

Here MyApp is the project you’re working on. app is the main application code and tracking is the separated module.

Each of these modules has its own build script. Likewise, the main project also has its own build script. Here’s a diagram showing this:

MyApp
 ├── app
 │    └── build.gradle
 ├── tracking
 │    └── build.gradle
 └── build.gradle

This means that each module has its own place for setting compile dependencies. It’s then fair to assume that it’s not so hard to fall into the situation where the app module uses version 1.0.0 of library xyz and the tracking module uses version 1.0.1 of the same library.

Here at Babbel we ran into the same situation quite recently and applied the solution described in the next section.

Keep your libraries organised

Once you look at the problem it’s easy to understand that it exists because there are multiple places where you can specify compile dependencies with string literals.

MyApp/app/build.gradle

dependencies {
    compile 'com.babbel:xyz:1.0.0'
}

MyApp/tracking/build.gradle

dependencies {
    compile 'com.babbel:xyz:1.0.1'
}

If there were a common place from which you could take these string literals, the problem would be mitigated. We opted to use the project extra properties extension. In the end, our build files look something like this:

MyApp/app/build.gradle

dependencies {
    compile project.libraries.xyz
}

MyApp/tracking/build.gradle

dependencies {
    compile project.libraries.xyz
}

We create the extra property libraries in the project-wide build.gradle file like so:

MyApp/build.gradle

class Libraries extends Expando {}

project.ext.libraries = new Libraries()
project.ext.libraries.xyz = 'com.babbel:xyz:1.0.1'

The Expando class lets you attach attributes to the class simply by setting them with dot notation, as shown in the last line. We went a step further and avoided polluting the project-wide build.gradle file by creating a separate file called libraries.gradle at the same level as the project-wide build.gradle, and used the following to apply it to the entire project:

MyApp/build.gradle

apply from: 'libraries.gradle'
// ...
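
For illustration, here is a minimal sketch of what a grown libraries.gradle might look like; only the xyz entry comes from the snippets above, the other coordinates are hypothetical examples:

MyApp/libraries.gradle

class Libraries extends Expando {}

project.ext.libraries = new Libraries()

// single source of truth for every module's dependency versions
project.ext.libraries.xyz = 'com.babbel:xyz:1.0.1'
project.ext.libraries.junit = 'junit:junit:4.12'                    // hypothetical example
project.ext.libraries.annotations = 'com.android.support:support-annotations:25.0.1' // hypothetical example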

The Pros and Cons

The obvious advantage of this solution is that all your dependencies are now in a single place, accessible by every module in your project. At the same time, this place becomes a potential point of conflict, so one must be careful when updating the libraries.

Another point I consider an advantage is that you can now easily force all modules to be updated once a new library version comes out, without any of them trailing behind. A lot of updates are backwards compatible, so the impact won’t even be noticeable. For updates that are not backwards compatible, it might take a bit longer to update the modules in a single go, but at least in the end you’ll have the guarantee that all your modules are up to date.

Last but not least, we saw a great improvement in build times. Since you now have fewer libraries to download and compile, it’s very likely that your build time will be shorter than before, even with Gradle’s incremental builds.

Summary

In this blog post I showed you a possible solution for organising your library dependencies in a modular Android project. It’s important to note that this is not the only solution and that there might be others that better fit your needs.

On the Road to Gender Parity


Jana Rekittke from Engineering gave us a recap of the EWIT16* conference she attended, presenting some of its topics and ideas. Enjoy reading the summary of her talk!

Tech workplaces are often male-dominated. Women might find a “Brogrammer” culture in which they may feel out of place, uncomfortable, or less valued. An unappealing workplace could also be a reason for women not to consider it in the first place. Yet changes in the workplace tend to happen only when more women start working there. So the question is: how do we achieve gender parity? And do we actually want that? Does it make sense for a company? Quite probably so: a mixed workplace (think “not only white, middle-aged and male”) gives the company access to a more diverse pool of ideas. If everyone who makes decisions has the same background and similar ideas and opinions, how can a company/country/… evolve and adapt to ever-changing requirements?

How about women in leadership? Do we need more women in middle/upper management and on company boards? Several studies (see links below) indicate that companies with women in upper management and on boards are more successful than their male-run counterparts, achieving greater productivity, creativity, and profitability. However, companies founded by women receive less funding - which is odd, considering that a company run by a woman is more likely to succeed than one run by a man.

Women in Tech

Sadly, only around 70% of companies (big companies, first world) have women in upper management at all, and the share of women compared to men in management positions in these companies lies between 0 and 30%. The percentage of women in tech teams is just as low. We are therefore far from parity - and those who say that feminism or gender studies are outdated and unneeded should reconsider their position. There is obviously still work to be done.

For further reading on this broad and interesting topic, please consider these:

Introduction to Test Automation


At Babbel, we revised our test automation strategy about 1.5 years ago. Since then, our focus has been on frontend testing (browser and mobile) for crucial parts of our business - which, in our case, mostly means that a user can register or log in and navigate through the language lessons after subscribing.

I’ll give a short introduction to test automation here, so if you’re a test automation engineer, what I’ll be covering is probably not new to you. If, however, you’re new to the topic, read on!


What is test automation?

Basically, instead of a person (usually a manual QA) interacting with a browser or mobile device, you have a driver doing it for you. This driver is a piece of software. There are several different browser/mobile device drivers out there, and the ones we are using are called Selenium and Appium. These drivers execute commands that are specified in code which, at Babbel, is written by us test automation engineers.

Test Code

The test code is usually divided into three layers, of which the topmost is the easiest to understand for a non-tester and can be written by anyone with sufficient domain knowledge. The middle layer is a bit more abstract and maps the topmost layer to the corresponding implementations that make up the bottom layer. This is where you find the programming code, the actual implementations.

First layer

The first layer is written in regular human sentences and stored in a so-called feature file, where each tested potential problem is called a scenario, and each line/sentence in it is a step. The language is called Gherkin. Here is an example of a scenario:

@android @3
 Scenario: Try to login with wrong email (babbel login)
   Given I visit the start page
     And I click on the login button
    Then I am on the login page
    When I click on the email button
    Then I am on the login with email page
    When I fill in an invalid username / password combination
     And I click on the login button
    Then I should see a red error box with the text "Invalid email or password."

Second layer

The next layer is composed of files that contain the step definitions. Regular expressions are used to match a step with its definition. Our definitions are written in Ruby where regular expressions are marked with either // or %r{} as you can see in the examples below (the definitions match some of the steps from our example scenario).

Then(/^I am on the login with email page$/) do
  @current_page = Pages::LoginWithEmail.new
  assert @current_page.page_detected?, "I am not on the login with email page."
end

When(%r{^I fill in an invalid username / password combination$}) do
  @current_page.fill_in_email_password_combination(true)
end

When(/^I click on the login button$/) do
  @current_page.click_login_button
end

Then(/^I should see a red error box with the text "([^"]*)"$/) do |text|
  assert @current_page.red_error_box_text_correct?(text)
end

The use of variables is possible, as can be seen in the last step definition above, where text is a variable that can be set to a specific value when the step is used. In our example scenario the value of text is Invalid email or password. The test will fail if there is no red error box or if the red error box contains a different text.

In the last step definition, we are asserting that on the @current_page the function red_error_box_text_correct? with the input text returns true. The definition of this function can be found in the code in the next section.

Third layer

Finally, everything that is used in the step definition files needs to be coded somewhere. The actual code files make up the lowest and most complicated layer of the test code. At Babbel, we use the same feature and step definition files for all platforms (Android, iOS, web), but the actual implementations vary, so they are split into different folders. The code can be written in any desired programming language as long as there is a client library written for it - or you write your own (see the current Selenium bindings and Appium bindings). We are using Ruby.

Let’s have a look at the implementation of the email login example above for Android. Here is a small excerpt:

require_relative "android_base"

module Pages

  class LoginWithEmail < AndroidBase


    [...]
    LOGINEMAIL_PASSWDFIELD_XPATH = "#{Pages::Base::ANDROID_XPATH_PREFIX}passwd_text']".freeze
    TOP_RED_ERROR_TEXT_XPATH = "#{Pages::Base::ANDROID_XPATH_PREFIX}errorLabel']".freeze

    [...]

    def click_forgot_password_button
      wait_for_element(:xpath, LOGINEMAIL_FORGOT_XPATH).click
    end

    def red_error_box_text_correct?(assumed_text)
      actual_text = wait_for_element(:xpath, TOP_RED_ERROR_TEXT_XPATH).text
      if assumed_text == actual_text
        return true
      end
      false
    end

  end  # end class

end  # end module

To find out whether the assumed_text is shown, the top red error box needs to be found first. So in the function red_error_box_text_correct? we wait for an element to appear in the app that is identified by its xpath "#{Pages::Base::ANDROID_XPATH_PREFIX}errorLabel']". There are other ways (so called locator strategies) to identify elements, but here xpath is used. Once we have the element, we have access to its text and can compare actual_text to the assumed_text and return true or false.

If you’re wondering about the #{Pages::Base::ANDROID_XPATH_PREFIX}, it is defined in the pages’ base class and translates to //*[@resource-id='com.babbel.mobile.android.en:id/".

Finding elements

There are several locator strategies. The input for a locator strategy is a selector. For example, we might want to find an element by its ID “green_button”. “Find by ID” would be the locator strategy and “green_button” the selector.

So how do we find the appropriate locator strategy and selectors? Keep in mind that if we can’t identify the elements that are displayed on a page, we cannot interact with that page.

The easiest way is to identify any element by its ID, name, value, or text. And if all of that is neither useful nor given, by its xpath or css selector. With an xpath, anything can be located, but the xpath itself might not be easily comprehensible for a human.
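
To make this concrete, here is a minimal sketch using the selenium-webdriver Ruby gem; the green_button element is the hypothetical example from above:

require 'selenium-webdriver'

driver = Selenium::WebDriver.for :chrome

# locator strategy "find by ID" with the selector "green_button" (hypothetical element)
button = driver.find_element(:id, 'green_button')

# the same element located via a css selector or an xpath instead
button = driver.find_element(:css, '#green_button')
button = driver.find_element(:xpath, "//*[@id='green_button']")

button.click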

automation

Consider the example above. We are using the Appium inspector on an Android phone which is running our example scenario. It is trying to log in with an invalid username/password combination, and the desired red error box has shown up. In the inspector, the top red error box is selected and we can read its ID (called “resource-id”).

Luckily, the error box has the ID “errorLabel”. If it didn’t, we might have to use any of the other attributes. Look at the image above and consider them from top to bottom.

  • The type android.widget.TextView is unspecific; on any page, there might be 10 TextViews
  • The text Invalid email or password. depends on the phone’s locale (location and display language setting) and might differ between phones. It could also change if the product owner felt a different text might be better
  • Finally, the xpath ‘android.widget.LinearLayout[1]/…./’ is incomprehensible and can easily change if the layout of the page is altered

An ID, however, usually does not change. It depends neither on the locale nor on the layout.

If you find yourself in a situation where identifying an element is not straightforward or is risky (in the sense that the test might fail soon), ask your developers to add an identifier for you.

When automating browser interaction, finding IDs/xpaths/css selectors is made very easy for us. Chrome and Firefox offer easy-to-use right-click inspect element options. In this example, we are looking at an element with the xpath “div#siteSub”, or simply the ID “siteSub”. The client libraries for Selenium and Appium offer methods to find elements by their ID/xpath/name/value/.. Take a look at the Python library for a nice example.

automation-2

Also, this cheat sheet is quite helpful.

The driver

automation-3

Let’s go back and think about the driver again. What is it doing? When we run a test script on a computer, we first try to establish a connection to the driver, and the driver to the browser or phone.

Consider using a mobile phone for testing. We’d send a configuration to the driver (in this case, Appium) telling it what phone to use. And we’d also tell it which locale to use, which timeout, layout, and many other things. The configuration we’re sending is called ‘desired capabilities’. Selenium for web testing has them as well, just with settings that apply to browsers.

The driver will then try to connect to the desired phone with the given settings. If successful, we’ll receive a session ID, if not, an error message. Once a session has been established, we can start sending our commands. The driver translates our commands to commands that the phone understands and can execute. For every step in a scenario, messages are being sent back and forth, and finally, we receive either an OK or an error message.
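
As a rough sketch, sending such desired capabilities with the appium_lib Ruby gem could look like this; the device name and app path are hypothetical:

require 'appium_lib'

# desired capabilities tell the driver which phone and settings to use
options = {
  caps: {
    platformName:      'Android',
    deviceName:        'Nexus 5X',             # hypothetical device
    app:               '/path/to/the/app.apk', # hypothetical app path
    language:          'en',
    newCommandTimeout: 120
  }
}

driver = Appium::Driver.new(options, true) # true: register as the global driver
driver.start_driver                        # establishes the session or raises an error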

Since so many messages are being sent between our computer, the driver, and the phone, the execution of such tests is usually slow (minutes to hours).

automation-4

This is the runtime of all our current iOS tests run on an iPhone simulator on my computer: a local run, 244 steps, 83 minutes. The closer the computer, driver, and phone are physically located, the faster the test will run. This is usually the case with a local setup, or with server-side test execution when using external providers. A server-side test is also (similar to) a local setup - just not in our office, but in the provider’s.

We use an external provider (a so-called ‘mobile farm’) to be able to run tests in parallel and to have access to a much greater choice of phone types than we have in our office. Until now, we have only been able to use client-side execution of tests, but server-side execution will follow later.

Client-side execution means that the test runs on a computer within our network or, for example, on a CI server like Travis or Bitrise, while the Appium server is located somewhere else, and the phones again somewhere else, although probably close to the Appium server, considering that it’s the same company running both server and mobile devices. The physical distance introduces lag time to the tests. The test execution time is roughly twice as long when using this setup. To learn more about client/server-side execution, you might want to read this article.

Why are we doing test automation?

Presumably, a business wants to make sure its product works as expected before releasing (a new version of) it. To some extent, the testing can be done by humans, but as a product gets more complex with more scenarios to test, or releases become very frequent, you’d have to hire an army of testers. Also, repeating the same tests over and over again is not an interesting job and might lead to careless mistakes.

Computers, on the other hand, neither get bored, nor forget things, nor make human errors; they just do as they are told. Also, as mentioned before, automated tests can run in parallel and off-site. How many devices can a manual QA operate in parallel? Probably not as many as a computer can.

Another often forgotten advantage of writing test cases (scenarios) - which is the first step in test automation - is that they help in understanding the complexity of a user story. Imagine the product owner comes up with a new feature: there is often not just one test case, but several positive as well as negative ones.

The typical example is the rather simple login field, where the positive test case would be ‘user can log in’, and the negative ‘user cannot log in’. But there isn’t just one scenario where the user cannot log in, but many, for example ‘wrong input’, ‘empty fields’, ‘wrong username/password combination’.

Writing the test cases into the user story gives the developers an idea of what to keep in mind and reduces the number of questions that usually come back while implementing a story. It makes estimation easier, and the test cases can be used as acceptance criteria for the user story. Which test cases to automate can be decided on by the product team.

All automated tests take work away from the manual QAs, who can then be assigned to test things that cannot or should not be automated, such as exploratory testing or very complex test cases. They also have more time to speak with team members, take care of tickets, etc., hopefully leaving them happier than if they were being used as test-monkeys.

automation-5

Since well-written frontend tests do what a typical user does, we can be pretty sure that our product is working correctly when we have an extensive test suite and all tests are passing. So… we should automate all the things, right?

Why not to automate all the things

As mentioned before, browser tests (and mobile tests even more so), are slow. Running the login scenario manually might take me 30 seconds. Running it with Appium, however, might take five minutes, four and a half of which are needed to set up the session, start up the mobile phone, apply the settings, and take care of the messages back and forth.

Besides this overhead, the tests are also flaky. There are often issues with timeouts. A manual tester can easily deal with a situation where an element appears too quickly or too late and can still execute all remaining steps, but an automated test will fail if any step fails.

From time to time, there will be incompatibilities between the driver and phone/browser versions that will make all the tests fail, requiring a couple of days’ work until the issues have been resolved by the owners of the respective driver/phone/browser software.

Frontend tests are also quite dependent on browser or phone models. The same test might pass on Firefox, but fail on Chrome, or work on Android Phone A, but not Android Phone B. A manual tester, however, can run tests on all of them.

So with an increasing number of tests, we would see an increasing number of false negatives, not to mention the increasing duration of the test run. Even when done in parallel, it would take a lot of time, and developers are not happy to wait for test results, especially not on a pull request level. If you are unfamiliar with pull requests and the usual git workflow, please read this easy introduction.

Your development team might fail to trust the tests if they are constantly failing for reasons not related to the developers’ code changes.

Also, managers might be pressured to release quickly and be tempted to release without testing if there is no fast running test suite available, creating undesirable precedents or conventions.

Another pressing matter is the question of maintenance. Who writes the tests, and who maintains them? Technical QAs are often moved around teams to help where test coverage is low, so they leave teams and hand over the tests to the developers, who then might be reluctant to change the test code together with their other code changes.

I would recommend keeping the automated user tests to an absolute minimum, and to not run all of them all the time, but to find a setup in which some tests - and not necessarily the same ones - run

  • On PR level (with green tests being mandatory to merge the branch) in a development environment
  • In a staging environment before releasing to production
  • On live during a nightly run with some form of built-in alarm system, because otherwise no one will look at the build results

Generally, in the development process, there is more than one good opportunity to run tests!

There are also ways to speed up tests and make them more stable, but going into that would be too much detail for this blog entry. See, for example, this short guideline by saucelabs.

Keep in mind that before you automate every scenario you can come up with, it’s better to have a prioritization session with your team or product owner and to devise a suitable test strategy.

There are many other layers in the software architecture where we can test the correctness of code and services, and many things should be tested on lower levels. The so-called testing pyramid can aid with this.

In fact, everything that can be tested on a lower level, should be, and browser/mobile level tests should cover only those features that are built only/mostly on the user interface, and those that are critical for the business. Think in terms of money, company brand, and reputation. What is unacceptable if it doesn’t work? That’s what you should automate.

WTM Berlin Android Study Jam at Babbel


Do you want to learn how to create your own Android app, but you don’t have any development experience? What about joining your local Android Study Jam?

A Study Jam is a community-run study group in which members of local Google Developer Groups help beginners learn Android development.

And this week we hosted the first session of the “WTM Berlin Android Study Jam” season 3 in our offices!


Study Jam at Babbel

Organised by Women Techmakers Berlin, this is the third season of the Android Study Jam. We are following Udacity’s “Android Basics” course.

By taking part in the Android Study Jam, you will learn to develop your first Android app! No previous programming experience is needed to complete this course!

During the sessions, experienced Android developers will be available to help you. You will also follow and work through the Udacity Android course at home.

Study Jams are an opportunity for everyone to receive free support and tutoring on the Udacity course in a fun classroom environment!

Study Jam at Babbel

All genders are welcome!

WTM Berlin Android Study Jam follows the Berlin Code Of Conduct

You can join the upcoming WTM Berlin Android Study Jam session by following this link!

Improving the Performance of Complex Angular Applications


At Babbel, our learning content is created and maintained using a custom-made content authoring system based on Angular 1.x. The application has become quite complex by now, counting about 80 custom directives, 35 services and 10 filters, covered by about 1,400 unit tests (just to give you an idea of the sheer size). The application has been under continuous development for two years now, and recently we experienced some serious performance issues for the first time. Our most complex view consists of a spreadsheet-like layout containing, on average, about 50-80 rows filled with content. When users switched between content packages, causing the view to update, there was a noticeable lag of 2-3 seconds, which quickly became annoying.

This article will walk you through our process and explain in detail how we cut the rendering time in half.


Recognizing the Problem is the First Step to Recovery

We used several tools to drill down to the root cause of the issue. Unfortunately, Chrome’s profiler is not very helpful here: when we looked at a CPU profile of our application, we saw that $digest and $apply took up most of the execution time, but not which of the functions called during the digest cycle took the majority of that time.

Screenshot of CPU profile

ng-stats

In order to get more details on why Angular is spending so much time inside the digest loop, we used a small script called ng-stats. It can be run as a snippet inside Chrome Dev Tools. Make sure to run Angular in debug mode, otherwise the script will not work. You can do this by either calling angular.reloadWithDebugInfo() from the console or configuring your main module like this:

.config(['$compileProvider', function($compileProvider) {
  $compileProvider.debugInfoEnabled(true);
}])

Reload your Angular application, run the snippet and call the function showAngularStats() to get the statistics. It will add a small visualization in the upper left corner showing the total number of expressions currently being watched by Angular (number on the left) and the average duration of the last digest cycles in milliseconds (number on the right). To identify the views that slow down your application, click through your application and observe how the stats change. As a rule of thumb, there should be no more than 2,000-3,000 watchers in total, but preferably we should aim for significantly fewer. The average digest cycle should take no more than 100ms; if this time is exceeded, most users will no longer perceive the computation as instantaneous (source).

Determining Your Point of Attack

Now that you have identified the areas of your application which need improvement, it is time to dig deeper into the problem. This script will allow you to count the watchers for the directives which you suspect of slowing your application down. Once again, you need to run Angular in debug mode. Afterwards, paste the code into a snippet in Google Chrome and run it using cmd/ctrl + Enter. The console will print an object containing the total number of watchers as well as a list of the expressions being watched. At first, this might be a little overwhelming. I recommend using Chrome’s inspect tool to select a single directive inside the DOM and to run getWatchers($0); from the console ($0 represents the selected DOM element). This will provide you with a more digestible result, and you will know exactly which module you need to look at in order to reduce your number of watchers. If you are not sure where to start, try looking at directives that are located inside an ng-repeat, because reducing the footprint of these directives will have the biggest impact.

List of Angular Watchers

Culprits

Watchers

Angular sets up watchers to enable its magical two-way data binding. Whenever you use an expression inside your Angular templates with double curly braces, like {{ value }}, a watcher is set up that will update the template in case the value changes inside your JavaScript models. Furthermore, Angular directives like ng-if and ng-class rely on watchers to provide their dynamic behavior. Especially problematic is ng-repeat, because it sets up a watcher for every single element as well as for the collection. Each watcher is dirty-checked during every digest cycle, which can occur as often as several times per second (usually after a user interaction).

Scope Functions and Filters

Calling functions directly from your templates can be quite performance heavy. It basically means they are executed on every digest cycle, even if the user interaction that triggered the digest cycle is not related to your binding in any way. Angular does not know that the result might be the same, so it has to check every time.

<div>{{ computeValue() }}</div>

The same is true for filters defined in your templates. Since they are also functions called during each digest cycle (or even several times during each cycle), even if the value did not change, they have the same detrimental effect to your app performance as the scope functions mentioned above.

<div>{{ value | uppercase }}</div>

How to Fix It

Bind Once

Since Angular 1.3 values can be bound once, which means that after they have been rendered, they will not be updated anymore and therefore do not have to be watched by Angular. The Angular Documentation defines it like this:

An expression that starts with :: is considered a one-time expression. One-time expressions will stop recalculating once they are stable, which happens after the first digest if the expression result is a non-undefined value.

Basically, every read-only data point that will not change during the lifetime of the directive should be bound once. Examples include dropdown lists or navigation bars.

<div>{{ ::value }}</div>

The same syntax can also be used to reduce the footprint of ng-repeat.

<div ng-repeat="user in ::userCollection">
  {{ ::user.name }}
</div>

Pre-Compute Scope Variables and Filters

Whenever possible try to pre-compute your values and assign the already computed value to the scope. The variable will still be checked by Angular during the digest cycle, but at least the function used for the computation is not called every single time.

<div>{{ computedValue }}</div>

You can do the same for filters. The $filter provider allows us to achieve the same effect by applying the filter inside our controllers. It will only be called once - or, if you set it up inside a $watch, whenever the value actually changes - and not on every digest cycle like the equivalent inside a template.

$scope.uppercaseValue = $filter('uppercase')($scope.value);
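
If the value can change at runtime, the $watch variant mentioned above might look roughly like this; value and uppercaseValue are the names from the example:

$scope.$watch('value', function(newValue) {
  // the filter runs only when 'value' actually changes, not on every digest cycle
  $scope.uppercaseValue = $filter('uppercase')(newValue);
});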

Utilize ngModelOptions

The ngModelOptions directive was introduced in Angular 1.3 and allows us to specify exactly when our model will be updated (and thus when the next digest cycle occurs). Using this feature makes sense when dealing with forms and input fields, because inputs trigger a digest cycle every time the value changes, which means that every keystroke of the user results in a (potentially heavy) computation of all our watched expressions. The example below highlights the use of the debounce option, which specifies that the model is only updated after 300ms have passed. The timer is restarted if another change occurs within the 300ms.

<input type="text"
       name="userName"
       ng-model="user.name"
       ng-model-options="{ debounce: 300 }">

Another feature of ngModelOptions is updateOn, which defines which user interaction triggers a model update. For input fields, it makes sense to update the model after a blur event has occurred, that is, after the field has lost focus and the user has quite likely settled on an input value. In theory, though, any DOM event that can occur on an input field can be used.

<input type="text"
       name="userName"
       ng-model="user.name"
       ng-model-options="{ updateOn: 'blur' }">

For a full list of options available with ngModelOptions consult the Angular Documentation.

ng-if Instead of ng-show

This little trick did wonders for us. The difference between ng-if and ng-show is that ng-if removes elements from the DOM, whereas ng-show only hides them. This means that if you are using ng-show to hide a complex component containing many watched expressions, all these watchers will still be active, even though users do not benefit from updates to expressions that are invisible to them. As a rule of thumb, we try to use ng-if wherever possible, as long as we do not expect the state of the element to change more than once during the lifetime of a directive. Dropdown lists are a good example of where you actually want to use ng-show, because inserting and removing the whole dropdown menu each time it is used gets expensive quickly.
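
To illustrate the difference, a minimal sketch (isVisible and heavyExpression are hypothetical):

<!-- ng-if: the element and all watchers inside it are removed while hidden -->
<div ng-if="isVisible">{{ heavyExpression() }}</div>

<!-- ng-show: the element is merely hidden via CSS; its watchers keep running -->
<div ng-show="isVisible">{{ heavyExpression() }}</div>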

Plain Old CSS

In one of our views we were using ng-if to display a placeholder in case the image was missing. In order to get rid of this directive (and an unnecessary watcher), we used CSS to absolutely position the image on top of the placeholder when it is present. If no image exists, the placeholder is visible. This way we achieved the same behavior without Angular ever having to watch and compute an additional expression.

Pagination and Infinite Scroll

The solutions mentioned above are all technical; as a last resort, you might want to consider a non-technical approach that involves changing your application’s user interface. If repeating over complex directives or very long lists slows your application down, maybe it is time to speed it up by reducing the number of elements being looped over. The most straightforward way of doing this is to paginate your results and only display e.g. 30 elements at once. On request, the user gets the next 30 elements.

If you do not want to compromise your app’s user experience, a smoother solution might be to use infinite scroll. As the user scrolls, new elements are being loaded and appended to the view. There is a ready-made Angular directive called ngInfiniteScroll that allows you to do that.

Conclusion

Unfortunately, Angular was not built for performance out of the box, but over time the Angular team added several features that allow developers to significantly speed up their applications. There is no easy way of achieving faster rendering times, so you have to try different techniques and see what works best for you. At Babbel we were able to cut the rendering time in half by reducing our watchers by about 50%, utilizing several of the techniques mentioned above. We still have a long way to go; our most complex view still has about 4,000 watchers, and the initial digest cycle takes about 1,200ms, which results in a noticeable lag.

One final word of advice: make sure you don’t compromise too much on code quality just to get a tiny performance boost. At some point, we started re-implementing components in pure HTML that were formerly encapsulated in a custom directive. This resulted in bloated templates that were much harder to read and maintain, while only improving the performance slightly. Try to find a trade-off between code quality and performance that makes you and your users happy. Happy performance boosting!


Launch an AWS EMR cluster with Pyspark and Jupyter Notebook inside a VPC


When your data becomes massive and data analysts are eager to construct complex models, it might be a good time to boost processing power by using clusters in the cloud … and let their geek flag fly. For this, we use AWS Elastic Map Reduce (EMR), which lets you easily create clusters with Spark installed. Spark is a distributed processing framework for executing calculations in parallel. Our data analysts undertake analyses and machine learning tasks using Python 3 (with libraries such as pandas, scikit-learn, etc.) in Jupyter notebooks. To enable our data analysts to create clusters on demand without completely changing their programming routines, we chose Jupyter Notebook with PySpark (the Spark Python API) on top of EMR. We mostly followed the example of Tom Zeng in the AWS Big Data Blog post. For security reasons, we run the Spark cluster inside a private subnet of a VPC, and to connect to the cluster we use a bastion host with SSH tunnelling, so all traffic between browser and cluster is encrypted.


Configuration

Network and Bastion Host

The configuration for our setup includes a virtual private cloud (VPC) with a public subnet and a private subnet. The cluster will run inside the private subnet and the bastion will be inside the public subnet. The bastion host needs to meet the following conditions:

  • have an Elastic IP so it can be reached through the internet
  • have a security group (SG) that accepts traffic on port 22 from all IPs
  • be deployed inside a public (DMZ) subnet of the VPC
  • Linux OS

The cluster needs to respect the following conditions:

  • be deployed inside a private subnet of the VPC
  • have an additional master security group (AdditionalMasterSecurityGroups) that accepts ALL traffic from the bastion

More information about security groups and bastion hosts inside a VPC can be found here.

Bastion Host

To connect to the bastion we use SSH key-based authentication. We put the public keys of our data analysts inside /home/ubuntu/.ssh/authorized_keys on the bastion host. Each analyst then adds a ~/.ssh/config file like the one below to their local machine:

Host bastion
    HostName elastic.public.ip.compute.amazonaws.com
    Port 22
    User ubuntu
    IdentityFile ~/.ssh/dataAnalystPrivateKey.pem

If the public keys are deployed correctly they will be able to SSH into the bastion by simply running: ssh bastion

Create-Cluster Command

To launch a cluster from the command line, the AWS CLI needs to be installed. The command is then aws emr create-cluster followed by parameter options. The example command below creates a cluster named ‘Jupyter on EMR inside VPC’ with EMR version 5.2.1 and Hadoop, Hive, Spark and Ganglia (an interesting tool for monitoring your cluster) installed.

aws emr create-cluster --release-label emr-5.2.1 \
--name 'Jupyter on EMR inside VPC' \
--applications Name=Hadoop Name=Hive Name=Spark Name=Ganglia \
--ec2-attributes \
    KeyName=yourKeyName,InstanceProfile=EMR_EC2_DefaultRole,SubnetId=yourPrivateSubnetIdInsideVpc,AdditionalMasterSecurityGroups=yourSG \
--service-role EMR_DefaultRole \
--instance-groups \
    InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m4.xlarge \
    InstanceGroupType=CORE,InstanceCount=2,BidPrice=0.1,InstanceType=m4.xlarge \
--region yourRegion \
--log-uri s3://yourBucketForLogs \
--bootstrap-actions \
  Name='Install Jupyter',Path="s3://yourBootstrapScriptOnS3/bootstrap.sh"

With --instance-groups, you define the count and type of the machines you want. For the workers we use spot instances with a bid price to save money: you pay less for these unused EC2 instances, but if demand pushes the price beyond your bid, you might lose them. The last option, --bootstrap-actions, lists the location of the bootstrap script. Bootstrap actions run on your cluster machines before the cluster is ready for operation; they let you install and set up additional software or change configurations for the applications you installed with --applications.

Bootstrap action

You can use the bootstrap action from this Gist as a reference. In the bootstrap script we undertake the following steps:

  1. Install conda and, with conda, install other needed libraries such as hdfs3, findspark, NumPy and UltraJSON on all instances. The first lines set up the user password for Jupyter and the S3 path where your notebooks should live. You can also pass these as parameters in the AWS command.

    # arguments can be set with create cluster
    JUPYTER_PASSWORD=${1:-"SomePassWord"}
    NOTEBOOK_DIR=${2:-"s3://YourBucketForNotebookCheckpoints/yourNotebooksFolder/"}
    
    # mount home to /mnt
    if [ ! -d /mnt/home ]; then
      	  sudo mv /home/ /mnt/
      	  sudo ln -s /mnt/home /home
    fi
    
    # Install conda
    wget https://repo.continuum.io/miniconda/Miniconda3-4.2.12-Linux-x86_64.sh -O /home/hadoop/miniconda.sh\
    	&& /bin/bash ~/miniconda.sh -b -p $HOME/conda
    echo '\nexport PATH=$HOME/conda/bin:$PATH' >> $HOME/.bashrc && source $HOME/.bashrc
    conda config --set always_yes yes --set changeps1 no
    
    # Install additional libraries for all instances with conda
    conda install conda=4.2.13
    
    conda config -f --add channels conda-forge
    conda config -f --add channels defaults
    
    conda install hdfs3 findspark ujson jsonschema toolz boto3 py4j numpy pandas==0.19.2
    
    # cleanup
    rm ~/miniconda.sh
    
    echo bootstrap_conda.sh completed. PATH now: $PATH
    
    # setup python 3.5 in the master and workers
    export PYSPARK_PYTHON="/home/hadoop/conda/bin/python3.5"

    This sets PYSPARK_PYTHON to use Python 3.5 on the master and on the workers.

  2. We want the notebooks to be saved on S3. Therefore, we install s3fs-fuse on the master node and mount an S3 bucket into the file system. This prevents the data analysts from losing their notebooks after shutting down the cluster.
    # install dependencies for s3fs-fuse to access and store notebooks
    sudo yum install -y git
    sudo yum install -y libcurl libcurl-devel graphviz
    sudo yum install -y cyrus-sasl cyrus-sasl-devel readline readline-devel gnuplot
    
    # extract BUCKET and FOLDER to mount from NOTEBOOK_DIR
    NOTEBOOK_DIR="${NOTEBOOK_DIR%/}/"
    BUCKET=$(python -c "print('$NOTEBOOK_DIR'.split('//')[1].split('/')[0])")
    FOLDER=$(python -c "print('/'.join('$NOTEBOOK_DIR'.split('//')[1].split('/')[1:-1]))")
    
    # install s3fs
    cd /mnt
    git clone https://github.com/s3fs-fuse/s3fs-fuse.git
    cd s3fs-fuse/
    ls -alrt
    ./autogen.sh
    ./configure
    make
    sudo make install
    sudo su -c 'echo user_allow_other >> /etc/fuse.conf'
    mkdir -p /mnt/s3fs-cache
    mkdir -p /mnt/$BUCKET
    /usr/local/bin/s3fs -o allow_other -o iam_role=auto -o umask=0 -o url=https://s3.amazonaws.com  -o no_check_certificate -o enable_noobj_cache -o use_cache=/mnt/s3fs-cache $BUCKET /mnt/$BUCKET
  3. On the master node, install Jupyter with conda and configure it. Here we also install scikit-learn and some visualisation libraries.

    # Install Jupyter Note book on master and libraries
    conda install jupyter
    conda install matplotlib plotly bokeh
    conda install --channel scikit-learn-contrib scikit-learn==0.18
    
    # jupyter configs
    mkdir -p ~/.jupyter
    touch ~/.jupyter/jupyter_notebook_config.py
    HASHED_PASSWORD=$(python -c "from notebook.auth import passwd; print(passwd('$JUPYTER_PASSWORD'))")
    echo "c.NotebookApp.password = u'$HASHED_PASSWORD'">> ~/.jupyter/jupyter_notebook_config.py
    echo "c.NotebookApp.open_browser = False">> ~/.jupyter/jupyter_notebook_config.py
    echo "c.NotebookApp.ip = '*'">> ~/.jupyter/jupyter_notebook_config.py
    echo "c.NotebookApp.notebook_dir = '/mnt/$BUCKET/$FOLDER'">> ~/.jupyter/jupyter_notebook_config.py
    echo "c.ContentsManager.checkpoints_kwargs = {'root_dir': '.checkpoints'}">> ~/.jupyter/jupyter_notebook_config.py

    The default port for the notebooks is 8888.

  4. Create the Jupyter PySpark daemon with Upstart on master and start it.

    cd ~
    sudo cat << EOF > /home/hadoop/jupyter.conf
    description "Jupyter"
    author      "babbel-data-eng"
    
    start on runlevel [2345]
    stop on runlevel [016]
    
    respawn
    respawn limit 0 10
    
    chdir /mnt/$BUCKET/$FOLDER
    
    script
      		sudo su - hadoop > /var/log/jupyter.log 2>&1 << BASH_SCRIPT
        export PYSPARK_DRIVER_PYTHON="/home/hadoop/conda/bin/jupyter"
        export PYSPARK_DRIVER_PYTHON_OPTS="notebook --log-level=INFO"
        export PYSPARK_PYTHON=/home/hadoop/conda/bin/python3.5
        export JAVA_HOME="/etc/alternatives/jre"
        pyspark
      	   BASH_SCRIPT
    
    end script
    EOF
    sudo mv /home/hadoop/jupyter.conf /etc/init/
    sudo chown root:root /etc/init/jupyter.conf
    
    # be sure that jupyter daemon is registered in initctl
    sudo initctl reload-configuration
    
    # start jupyter daemon
    sudo initctl start jupyter
    

If everything runs correctly, your EMR cluster will be in a waiting (cluster ready) status, and the Jupyter Notebook will listen on port 8888 on the master node.

Connect to the Cluster

To connect to Jupyter, we use a web proxy to redirect all traffic through the bastion host. First, we set up an SSH tunnel to the bastion host using ssh -ND 8157 bastion. This command opens port 8157 on your local computer. Next, we set up the web proxy. We are using SwitchyOmega in Chrome, available in the Web Store. Configure the proxy as shown below:

Setup Switchy Omega

After activating the proxy as shown below you should be able to reach the master node of your cluster from your browser.

Activate Proxy

To find the IP of the master node go to AWS Console → EMR → Cluster list and retrieve the Master’s public DNS as shown below:

EMR Console

Following the example, you can reach your Jupyter Notebook at the master node's address on port 8888 (something like http://<master-public-dns>:8888) in the proxy-enabled browser.

Pyspark on Jupyter

If you can reach the URL, you will be prompted to enter a password, which we set by default to ‘jupyter’.

To create a new notebook, choose New → Python 3 as shown below:

New Notebook

Every notebook is a PySpark app where the Spark context (sc) as well as the sqlContext are already initialized, something you would usually do first when creating Spark applications. So we can directly start to play around with PySpark DataFrames:

Pyspark Example 1

Or easily read data from S3 and work with it:

Pyspark Example 2
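
If you are curious what such a session looks like in code, here is a minimal sketch of the kind of thing shown in the screenshots (the bucket name and file path are placeholders, and it assumes Spark 2.x on the cluster):

# sc and sqlContext are already initialized by the PySpark driver,
# so we can build a small DataFrame right away.
df = sqlContext.createDataFrame(
    [("en", 42), ("de", 23), ("es", 17)],
    ["locale", "count"],
)
df.show()

# Reading a CSV file straight from S3 works the same way (placeholder path).
uploads = sqlContext.read.csv("s3://my-bucket/uploads.csv", header=True)
uploads.printSchema()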

Examples are available in this gist.

Happy playing with Jupyter and Spark!

7th Hackday Review


“Wouldn’t it be better, make you stronger, to have your soul in more pieces, I mean, for instance, isn’t seven the most powerfully magical number, wouldn’t seven — ?” (Harry Potter and the Half-Blood Prince, 2005)

At Babbel we work with words. But sometimes numbers are just as important. This year, the number seven was very important, because our seventh Babbel Hackday took place! According to some religions, seven signifies perfection and wholeness, and according to Harry Potter it’s said to be the most powerfully magical number. Thus, hopes were high for participants at our latest Hackday, the 7th!

The seventh Babbel Hackday took place on Friday, April 7th, 2017 (isn’t that wonderful?), at Betahaus Berlin. Every six months we organize a Hackday, mainly for our engineers but not exclusively, so a lot of coworkers from other departments participate as well. We had a colleague working remotely from Italy, a team that thought it was a costume party, and some new joiners whose very first day at Babbel was a Hackday. New this time was the introduction of a second winning category. Previously, we designated one team for ‘Best Hack’, but since Babbel is a learning company (inside and out), this year we came up with a second category: ‘Make you speak the language like you’ve always wanted to’, which rewards the best learning hack.


7th Babbel Hackday

There were 108 Babbelonians fighting for victory. It’s always interesting to see how people can spend hours chatting, drinking unlimited tea, coffee and Club Mate, and taking outdoor smoke breaks, thereby getting a free work-out (we were on the fourth floor of Betahaus’ huge building - that’s a lot of steps), before realizing that the clock is ticking and they actually only have an hour left to finish their project.

At 7PM the hacking was over and everybody had to put their work down. At that point, all other (non-participating) Babbelonians were welcome to attend the presentations and vote. In the ten hours of hacking, participants came up with a record number of 21 projects (that’s 3 x 7, just saying); some of them were presented as prototypes, some as demos. All of them awesome.


7th Babbel Hackday

Everybody attending could vote for ‘Best Hack’ and ‘Best Learning Hack’. Since there were two categories, we expected to have two different winners - but the outcome was quite a surprise. Apparently we have a real genius working at Babbel: both categories were won by the same person! The first prize winner went home with a brand new GoPro and an Amazon Echo, sponsored by Amazon. Second prize winners for Best Hack went home with a GoPro, and the second prize for the ‘Best Learning Hack’ was an extensive collection of movie-themed cookbooks.

After everybody celebrated the win and congratulated the winners (or quietly wiped away their tears), we left Betahaus for the traditional Berliner Kneipe ‘Schmitz Katze’ to make a toast to the winners and to end the 7th Babbel Hackday.

Later this year we’ll be hosting Babbel Hackday number eight. According to some spiritual sources, it’s the number signifying infinity and everything good in the world.

Check out our open positions to join our next Hackday!

View the video on YouTube.

Innovation in Language Learning @ Babbel


Although handcrafted content from our language experts lies at the heart and soul of our offering, Babbel is constantly looking at ways to enhance the experience of our users through cutting-edge technology. For this reason, we are actively involved in the international scientific community, so that we can stay up to date with the latest technological developments. We also conduct our own research on how to apply computational linguistics methodologies to language learning applications. Our computational linguistics team recently presented original research at the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA12) in Copenhagen, Denmark.


Robyn Loughnane presenting at BEA12

The BEA12 workshop focused on ways to use natural language processing (NLP) technology in educational applications, including error detection and correction, automated essay scoring, annotation and databases, and dialogue systems. This year only 9 papers out of 62 submissions were selected as oral presentations and we were lucky enough to be one of them. The BEA12 workshop was part of the larger Empirical Methods in Natural Language Processing (EMNLP) conference, organized by the Association for Computational Linguistics (ACL).

Our paper describes a prototype created by Babbel’s computational linguistics team that harnesses the power of linked data for language-learning applications, proposing new methods for creating learning content and analyzing existing learning content. Linked data and the Semantic Web are ideas that Tim Berners-Lee, the inventor of the World Wide Web, has been championing since the turn of the century, and for which there is now a suite of technology available. The Semantic Web is the idea of turning the current internet, a web of documents, into a web of data, creating the Giant Global Graph, or Web 3.0. Although we don’t have a Semantic Web yet, the principles and technologies of the Semantic Web can be used for other purposes, like creating a knowledge base around Babbel’s learning content that is filled with linguistic annotation and other linguistic resources.

As a part of this project, we translated the open-source SPECIALIST lexicon into RDF. This code has also been made open source and is available on GitHub.

For the hardcore nerds who want to find out more, the paper is available via the ACL’s website.

What is dependency injection after all?


Are you starting your developer career and have come across tools like Dagger, Guice and RoboGuice, only to end up a bit confused? Is it hard to grasp the concepts of dependency injection and inversion of control? Then this might help you out. This post tries to explain what dependency injection is and why it is so important in software development.


When talking with friends and family starting their software careers, I often hear the words dependency injection accompanied by Dagger, Guice or any other tool aiming to help out with injecting your application’s dependencies. Unfortunately, barely any of them actually grasp why it’s needed and why we need such tools to accomplish it. Some are even surprised when I tell them you don’t really need these tools to have dependency injection and that they’re just helpers. Really good helpers, but still just helpers.

Personally, I believe the tools are great and help you develop faster and deliver better quality code. However, it’s important to understand the concepts behind them; that often makes the tools themselves easier to understand.

I’d like to provide at least one step-by-step explanation of what I learned and how I see dependency injection helping out in software development.

Starting with functions

Usually one explains dependency injection using object-oriented paradigms. I’d like to first try to explain it with functional programming, because I think it’s simpler. Dependencies are everywhere in software. Hard dependencies, or tight coupling as we mostly know them, make it harder to extend, test and scale the code, while loose dependencies (or loose coupling) tend to be easier to deal with. So to start with, we need to understand what a dependency really is.

In functional programming, functions are first-class citizens. This means they can be passed as arguments, returned by other functions, and assigned to variables, among other things. Yes, you’ve guessed it: in functional programming, dependencies are functions.

Let’s look at the map example. I’ll be using Elixir because it’s the most purely functional programming language that I’m comfortable with. I’ll do my best to explain the syntax. Elixir already has this function in the standard library, but for the purpose of this post I’ll just implement it myself. We have a list of integers and at some point in our application we need to square these integers:

def map_to_squares([]) do
  []
end

def map_to_squares([head | tail]) do
  [head * head | map_to_squares(tail)]
end

Using pattern matching, Elixir decides which function clause to call. When the list is not empty it will call the second clause - map_to_squares([head | tail]) - and recursively build another list until the tail is eventually empty. At that point it will call the first clause - map_to_squares([]).

Calling map_to_squares([1, 2, 3, 4]) yields the list [1, 4, 9, 16].

Can we call this function with another list and obtain its squared elements? Absolutely!

Can we call this function with a list and apply another operation to the list’s elements? Say we want to return a list of 1s and 0s depending on whether each number is odd or even, respectively? No!

Why can’t we do this? Because the function can only square items. The squaring operation is coupled with its implementation; you can see it in head * head. In other words, map_to_squares is tightly coupled with the squaring function.

How can we make this work for any function? We need to invert the control. Simply put - we no longer say what function we’re going to apply to the list, but we let the calling code specify this function:

def map([], _) do
  []
end

def map([head | tail], fun) do
  [fun.(head) | map(tail, fun)]
end

So now we have a function map that receives another function. This function can be anything, as long as it takes one parameter, which will be an element of the list.

Calling map([1, 2, 3, 4], fn x -> x * x end) will still yield [1, 4, 9, 16], but now we can also call it like so:

map([1, 2, 3, 4], fn x -> if Integer.is_even(x) do 0 else 1 end end)

and get the list [1, 0, 1, 0]. We no longer care what function we need to apply to the list members as long as it respects the rules already defined.

We’ve successfully given control to whoever calls this function. The dependency on the mapping function is still there, but now it’s no longer tightly coupled. This makes it much more flexible and reusable. We don’t have to write several functions that apply an operation to each element in the list. We can use just this one and inject the operation as a function.

Notice also that we have decoupled the logic of the mapping function from the logic that iterates through the list. This is really good for testing. We can now make sure that our function iterates all elements in the list and calls the given mapping function for each element.

It’s worth noting that in functional programming there are other ways of addressing these dependencies which might be better than the one explained here, e.g. partial application.
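
Elixir has no built-in partial application, but a closure achieves the same effect. A minimal sketch, reusing the map function from above:

# Capture the squaring function once, and reuse the resulting
# single-argument function wherever squaring a list is needed.
square = fn x -> x * x end
map_squares = fn list -> map(list, square) end

map_squares.([1, 2, 3, 4])  # => [1, 4, 9, 16]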

In the OO world

Things are not that different in the Object-Oriented (OO) world. If you are no stranger to this paradigm, you know that your objects often have a lot of dependencies. These dependencies are other objects. I’ll be using Kotlin for the next examples, mainly because I’m an Android developer. I know one can use a lot more functional constructs in Kotlin than in other OO languages, but for the sake of the argument I’ll try to stay away from them.

Let’s imagine the following scenario. We’re building an application for some transportation company which has several transport mediums and drivers. Each driver can drive one or more transports - bus, plane, car, etc. One could start thinking of having the following:

class Bus {
    fun drive() = println("Driving on the road")
}

class Driver {
    lateinit var bus: Bus
    fun drive() = bus.drive()
}

As you can see, we have a class Bus that can be driven and a class Driver that, for now, can only drive a bus. There are quite a few issues with this approach, but let’s start with the Driver class’s dependency on Bus.

With this approach the driver can only drive buses. However, we were told that drivers can drive more than one transport medium. So the first thing we need to do is generalize the Bus class. Let’s build an interface for a generic transport.

interface Transport {
    fun drive()
}

class Bus : Transport {
    override fun drive() = println("Driving on the road")
}

This step is very important because it enables us to make the Driver class depend on a generic transport and not only on a bus, allowing the driver to effectively drive more than one transport. If you recall the functional programming example, this is similar in that the mapping function specifies a signature, but not how it should behave. Likewise, we should depend on interfaces and not on behavior. As a rule of thumb, always depend on abstractions rather than concrete implementations.

We can now write our Driver class as follows:

class Driver {
    lateinit var transport: Transport
    fun drive() = transport.drive()
}

Now the dependency on Bus is no longer present. The Driver class depends on an interface that describes the contract for transports. We can now have a single driver driving multiple transports:

interface Transport {
    fun drive()
}

class Bus : Transport {
    override fun drive() = println("Driving on the road")
}

class Airplane : Transport {
    override fun drive() = println("Flying over the road")
}

class Driver {
    lateinit var transport: Transport
    fun drive() = transport.drive()
}

fun main(args: Array<String>) {
    val driver = Driver()
    val plane = Airplane()
    val bus = Bus()

    driver.apply {
        transport = plane
        drive()
    }

    driver.apply {
        transport = bus
        drive()
    }
}

As you can see, the driver can now drive both a bus and a plane, as well as anything else complying with the Transport interface. Notice also how we’ve created another class - Airplane - that can be used with the Driver class, and yet we didn’t have to touch Driver itself.

We’ve given control to the code using Driver. This is why we say dependency injection is a form of inversion of control.

What we’ve seen here is a form of dependency injection where the dependency is injected using a setter. This was done to keep the example simple. Other forms of injection include field injection (which in Kotlin looks pretty much the same) and constructor injection.

I personally prefer the approach of using constructor injection simply because it becomes impossible to call the methods without initializing all dependencies. As an example:

class Driver(private val transport: Transport) {
    fun drive() = transport.drive()
}

Here the dependency must be passed when instances of Driver are created, and therefore it’s impossible to forget about it before invoking the method Driver.drive(). I also take advantage of Kotlin’s type system to avoid nulls here. However, this would not work for our application, because it requires one driver to drive multiple transport mediums.

One last thing that is very important to notice: we can now test the Driver class without worrying about which transport medium it’s using under the hood. In fact, because we inverted the control to the calling code, during tests we can call this class with a mock object and avoid creating a real transport, making it way easier to test.

So why are tools like Dagger and Guice needed?

As we’ve seen, dependency injection is a concept that essentially lets the calling code control what’s being used. In the given example the dependencies are quite simple and trivial. So much so that we manually created all the objects and injected every dependency ourselves.

However, in a real-world scenario things are not this simple. Usually your dependencies have dependencies, which have dependencies of their own, and so on. Usually all of these dependencies also have a certain lifetime that needs to be managed. Things get complicated pretty fast and we end up with a graph of dependencies that is too hard to create and maintain by hand.

Here’s where tools like Dagger and Guice help us out. They make it a lot easier to manage these dependencies. They create the dependency graph by themselves and ensure that when you request a given object, all its dependencies are fulfilled. This removes a lot of boilerplate code and boosts productivity.
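
To make this concrete, here is a minimal sketch of how Dagger 2 could wire up the Transport example from above. The module and component names are made up for illustration, and the DaggerAppComponent implementation would be generated by Dagger’s annotation processor:

import dagger.Component
import dagger.Module
import dagger.Provides
import javax.inject.Inject

// Dagger instantiates Driver and fulfills its Transport dependency.
class Driver @Inject constructor(private val transport: Transport) {
    fun drive() = transport.drive()
}

@Module
class TransportModule {
    // Tells Dagger how to provide a Transport whenever one is requested.
    @Provides
    fun provideTransport(): Transport = Bus()
}

@Component(modules = [TransportModule::class])
interface AppComponent {
    fun driver(): Driver
}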

Wrap up

In this post I’ve tried to give at least one reason why dependency injection is important in software development. We’ve started with a functional approach and moved to the Object Oriented world.

We now know that there are a lot of tools to help us with building and maintaining our dependency graph. Dependencies are present everywhere and they should be taken into account. Hard-coded dependencies make it almost impossible to extend a class’s behavior without touching it. Harnessing these tools helps us in the short and long term.

Boost your engineering career with us! Join Babbel Neos!


Hi! I’m Gábor. I want to tell you about an exciting new training program I’m running at Babbel, Berlin for aspiring software engineers. It’s called Babbel Neos. But first, a bit about how I ended up in this position.

I learnt web development during high school for fun. I got my first full-time job during university, but I never finished my studies. I always found it easy to find a job that was meaningful to me and helped me develop professionally. I not only had the chance to progress in my career as a software engineer, but also changed my role a couple of times and explored new fields. I transitioned from web development to QA engineering to engineering management to product management. I even had the chance to work in San Francisco, in the Bay Area tech scene.


Technology changes fast. For professionals who work in technology, continuous learning is especially important. Fresh graduates or career changers find it difficult to gain relevant experience without having the opportunity to work on real-life projects. At the same time, engineering teams often do not have the bandwidth to provide the required mentorship for juniors to help them integrate successfully into teams. What will break this vicious circle?

I now recognize the privilege I have had throughout my career. I would like to help create opportunities for people who would like to work in tech, but have difficulties finding a job that matches their skill set. Very recently I decided to take on this challenge at Babbel.

Babbel is a learning company — inside and out. Company values are best represented in action. Therefore, I’m happy to announce that we are launching a new engineering mentorship program called Babbel Neos. It is a 6-month paid training program in Berlin for those who have invested in educating themselves in software engineering, yet have not been able to find a job in the market. Although we cannot guarantee a software engineering job offer for all of the trainees who complete the program, we created this program with the intention to do so.

Applications are closed.

Why Babbel's Developers Make A Difference


Babbel is all about empowering people to learn languages — to better communicate and have conversations all over the globe. We are passionate about our purpose, and we want to find developers who care about this too.

Read more about why Babbel’s developers make a difference.

Naming Colour Variants


One of the difficult things in the field of software development is giving variables names that are as meaningful as possible. A good variable name is not only important for clean code but also facilitates communication across functions such as product management, design and development.


Style guides and SASS variables

In modern frontend development, style guides are a common way of organizing the specification and implementation work of user interfaces. They give designers, product owners and devs a shared vocabulary to communicate with. With the introduction of SASS and broad support for CSS variables, it is very easy to change a single variable to reflect a design change across the product.

When the project and style guide grow in size and complexity, we often stumble over the variable names, especially colour names.

Colour variables are nice but…

Let’s take a typical scenario of colour variants and how we name them.

$gray-darker:            mix($gray, $black, 30%);
$gray-dark:              #4c4c45;
$gray:                   #727267;
$gray-light:             mix($gray, $white, 50%);
$gray-lighter:           mix($gray, $white, 30%);

Looks all good and well. But the moment we are asked to add one more shade of gray, the inherent problem with this naming convention emerges.

Finally, we end up with something like this:

$gray-lightest:          mix($gray, $white, 45%);
$gray-lightester:        mix($gray, $white, 50%); // contribution to dictionary
$gray-light-medium:      mix($gray, $white, 30%); // LoL

Soon we would end up with something really really messy.

Name that colour

As we know, this problem is not unique to us, so we reached out to the internet gods. It turns out there are websites which will give names to colours! Why not?!

Here is what we got while trying to use one of those sites:

$bright-gray:   #394149;     // gray
$shuttle-gray:  #565d64;     // light(15%)
$rolling-stone: #70767c;     // light(28%)
$shark:         #1f2428;     // dark(44%)

This is a good step forward in the direction of facilitating communication. However, if we take a close look at the example above, these are all shades of gray, and the assigned names such as shark and rolling-stone don’t reflect this relationship between the colour variants.

Numbers everywhere

Let’s try another approach: use numbers and believe in mathematics. For each variant of the colour we could append an incrementing number (just like IDs in programming):

$space-gray   : #394149;
$space-gray-1 : #70767c;
$space-gray-2 : #565d64;
$space-gray-3 : #356879;
$space-gray-4 : #1f2428;

All good and well; the problem is solved 100% from a developer’s perspective. But we couldn’t persuade our designers to use these variable names, because they make it hard to communicate with each other. So we are back to square one.

Finally!

As we thought more about it, we realized the root cause of the problem: the variable names do not reflect the shades of the colour. So instead of adding different names or numbers for colour variants, we could encode the shade in the variable name itself.

$space-gray:     #394149;  // gray
$space-gray-w28: #70767c;  // light(28)
$space-gray-w15: #565d64;  // light(15)
$space-gray-b44: #1f2428;  // dark(44)

The designers can easily communicate and visualise this, and the devs can implement it without the fear of ending up with messy variable names once we have to add one more colour variant in between.

So next time the designer wants 30% lighter space-gray we have the variable name ready: $space-gray-w30.
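
To keep the names and the actual colour mixes in sync, the variants can even be derived with small helper functions (a sketch; the tint and shade helpers are our own, not built into SASS):

// Lighten by mixing with white; the percentage matches the "w" suffix.
@function tint($color, $percentage) {
  @return mix(white, $color, $percentage);
}

// Darken by mixing with black; the percentage matches the "b" suffix.
@function shade($color, $percentage) {
  @return mix(black, $color, $percentage);
}

$space-gray:     #394149;
$space-gray-w30: tint($space-gray, 30%);
$space-gray-b44: shade($space-gray, 44%);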


I've trusted you! You promised no null pointer exceptions!


So you’ve just switched to Kotlin and thought it would be great to have all your API entities written as data classes with proper nullability rules. You’ve set up your Gson objects, you prepare to deserialize your API response and, surprise! Your non-nullable fields are actually null…

This post explains what’s happening and why this is still an issue even if you specify default values for your fields.


Kotlin gave us, in my personal opinion, one of the most powerful features one can have in a language: the ability to specify whether a type is nullable or not.

By making null part of the type system we now can benefit from compile checks. In short, code that was dangerous before becomes harmless now. Take the following example:

void displayName(@NonNull User user) {
    someTextView.setText(user.getName());
}

void someOtherMethod() {
    displayName(null);
}

This code compiles in Java, but when run it will throw a NullPointerException. Granted, the example is a bit silly, since hardly anyone would do this, but the point is that we merely get a hint from the IDE saying that user is marked as @NonNull. The compiler doesn’t enforce this, and therefore the code compiles fine.

Take the same code in Kotlin

fun displayName(user: User) {
    someTextView.setText(user.name)
}

fun someOtherMethod() {
    displayName(null)
}

This time the code doesn’t even compile. The reason is that user is of type User, meaning it cannot be null. Since this information is now part of the type, the compiler actually knows it and, seeing that you’re passing null, forces you to deal with the problem. You can either change the argument passed to displayName, making it not null, or you can change the type of the argument:

fun displayName(user: User?) {
    someTextView.setText(user?.name)
}

fun someOtherMethod() {
    displayName(null)
}

Now that the type is nullable we can pass in null and since the method TextView.setText can deal with nulls, we’re all set.

This is really good, because the Java approach will crash in the users’ hands, while the Kotlin approach will not even build, forcing us to deal with the problem right away.

Great! So wouldn’t it make sense to make all our types null safe? Personally, I think so and we did this here at Babbel. However, once it came to the API entities we had some surprises.

The step by step approach

Once I learned about null being part of the type system, I quickly tried to take advantage of it. The next examples use Kotlin 1.2.10 and Gson 2.7. Naively, I thought the following code would break:

data class User(val email: String, val firstName: String)

fun main(args: Array<String>) {
    val json = "{}"
    val gson = Gson()
    println(gson.fromJson(json, User::class.java).email)
}

The code reads an empty JSON into a data class that has 2 fields that cannot be null. Yet once you run it the output is:

null

That’s quite strange. Let’s try and explicitly break it:

fun main(args: Array<String>) {
    val json = """{
        "email": null
    }"""
    val gson = Gson()
    println(gson.fromJson(json, User::class.java).email)
}

Now we set the email field in the JSON to null, run the code again, and it still prints null. In fact, changing the print statement to println(gson.fromJson(json, User::class.java).email == null) results in the IDE telling you that this condition is always false, yet it prints true.

Of course, trying to use the email field results in a NullPointerException. Now we begin to feel a bit betrayed. We were promised no more NullPointerExceptions; moreover, we were promised that once a type is non-nullable, either the compiler will complain or, at runtime, we will get an IllegalStateException with a detailed message about which fields are null and should not be.

So what’s happening here? My next attempt was to add a default value for one of the fields. I thought maybe, if you specify a default value, it will be used whenever the JSON doesn’t have the field explicitly set.

data class User(val email: String, val firstName: String = "<EMPTY>")

fun main(args: Array<String>) {
    val json = """{
    }"""
    val gson = Gson()
    println(gson.fromJson(json, User::class.java))
}

And sure enough the code above prints:

User(email=null, firstName=null)

Everything is null again. I thought maybe there’s actually some consistency to this. Maybe it doesn’t matter what the data class specifies: Gson takes the JSON as the source of truth, meaning that if the fields are not present in the JSON, then they should also not be present in the data class. In that case, even if we assign default values to everything, it should still all come out as nulls. Let’s test that with the following code:

data class User(val email: String = "<EMPTY>", val firstName: String = "<EMPTY>")

fun main(args: Array<String>) {
    val json = """{
    }"""
    val gson = Gson()
    println(gson.fromJson(json, User::class.java))
}

Which prints:

User(email=<EMPTY>, firstName=<EMPTY>)

Wait… what? So now all fields have the default value?

I dug a bit into the source code of Gson to try and find out what’s happening. The answer is not as trivial as I expected, but not very hard to grasp either. It’s a mix of how Kotlin generates constructors for data classes and how Gson uses them and its default deserializers.

Gson and Kotlin classes

Let’s jump into the code in Gson that causes the problematic behavior. You can find the whole class here, but I’ll paste the relevant part:

public <T> ObjectConstructor<T> get(TypeToken<T> typeToken) {
  final Type type = typeToken.getType();
  final Class<? super T> rawType = typeToken.getRawType();

  final InstanceCreator<T> typeCreator = (InstanceCreator<T>) instanceCreators.get(type);
  if (typeCreator != null) {
    return new ObjectConstructor<T>() {
      @Override public T construct() {
        return typeCreator.createInstance(type);
      }
    };
  }

  final InstanceCreator<T> rawTypeCreator = (InstanceCreator<T>) instanceCreators.get(rawType);
  if (rawTypeCreator != null) {
    return new ObjectConstructor<T>() {
      @Override public T construct() {
        return rawTypeCreator.createInstance(type);
      }
    };
  }

  ObjectConstructor<T> defaultConstructor = newDefaultConstructor(rawType);
  if (defaultConstructor != null) {
    return defaultConstructor;
  }

  ObjectConstructor<T> defaultImplementation = newDefaultImplementationConstructor(type, rawType);
  if (defaultImplementation != null) {
    return defaultImplementation;
  }

  return newUnsafeAllocator(type, rawType);
}

I’ve suppressed most of the comments and annotations for clarity

Let’s break it into chunks.

final InstanceCreator<T> typeCreator = (InstanceCreator<T>) instanceCreators.get(type);
if (typeCreator != null) {
  return new ObjectConstructor<T>() {
    @Override public T construct() {
      return typeCreator.createInstance(type);
    }
  };
}

final InstanceCreator<T> rawTypeCreator = (InstanceCreator<T>) instanceCreators.get(rawType);
if (rawTypeCreator != null) {
  return new ObjectConstructor<T>() {
    @Override public T construct() {
      return rawTypeCreator.createInstance(type);
    }
  };
}

The first thing Gson tries to do is check whether you’ve registered any instance creators. If you’re not familiar with these, don’t worry, because they’re not essential to this blog post. However, think of them as classes that you register with your gson object and that specify how an object should be built. We didn’t do this, so the if statement will fail and the code will carry on to the next one. The second if is quite similar, but only for the raw type - the type without generics info. This also fails the check and the code carries on to:

ObjectConstructor<T> defaultConstructor = newDefaultConstructor(rawType);
if (defaultConstructor != null) {
  return defaultConstructor;
}

If your class has a defaultConstructor - meaning a no-args constructor - then Gson will use it to build your objects.

ObjectConstructor<T> defaultImplementation = newDefaultImplementationConstructor(type, rawType);
if (defaultImplementation != null) {
  return defaultImplementation;
}

Last but not least, the final if creates a “default” implementation. The last two if statements are the ones that concern us, and they are where I’m going to focus for the rest of the blog post. It turns out that if you decompile Kotlin data classes to Java, you’ll notice that sometimes you have a no-args constructor and sometimes you don’t.

When all your fields have default values Kotlin will generate a no-args constructor for you. When there’s at least one field that has no default value, then the no-args constructor is omitted.

The null checks happen in the constructors - in the initialization.

It’s fairly easy to understand how the object is created by Gson when using the no-args constructor. It’s a matter of calling it using reflection and we’re done. The code inside the constructor will initialize the fields to the correct default values.

The tricky part comes when trying to understand how Gson can create instances of classes that have no no-args constructor while completely ignoring the initialization process. The fact is, Gson uses Unsafe.

Believe it or not, Unsafe is unsafe. It enables bypassing object initialization, effectively avoiding all the null checks that are generated in the constructor.

Here’s a streamlined example in Java that shows how Unsafe can bypass initialization.

import java.lang.reflect.Field;
import sun.misc.Unsafe;

class User {
    private final String email;

    public User() {
        this.email = "<EMPTY>";
    }

    public String getEmail() {
        return email;
    }
}

public class Main {
    public static void main(String[] args) throws Exception {
        // Use simple reflection to get the no-args constructor and instantiate the class.
        // Prints <EMPTY>
        System.out.println(User.class.newInstance().getEmail());

        // Use the Unsafe class to bypass initialization and create instances of a class.
        Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
        Field f = unsafeClass.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);
        User user = (User) unsafe.allocateInstance(User.class);

        // Prints null
        System.out.println(user.getEmail());
    }
}

Unfortunately, you need to do some dirty reflection work to even be able to use the Unsafe instance

As you see allocateInstance allows you to bypass the initialization process.

Since Kotlin does the null checks in the initialization process, allocateInstance effectively bypasses them.

It’s only as a last resort that Gson uses the Unsafe approach to create your objects.

How do we avoid this?

At the time of writing this document, Google has a proposal to add a method disableUnsafe to the GsonBuilder which will enforce the usage of no-args constructors or instance creators. Although this is not enough to play well with null-safety, it’s a really good start. For now, we’re left without this feature and have to take our own measures.

The straightforward answer is to always make sure the classes you’re serializing and deserializing have no-args constructors or instance creators registered. The more interesting part is having this play well with Kotlin’s null-safety. Personally, I haven’t found a way to avoid having my default fields set to null without custom type adapters.
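
As an aside, registering an instance creator is simple (a sketch; note that it only steers Gson away from Unsafe, it does not stop reflection from writing null into the fields afterwards):

import com.google.gson.GsonBuilder
import com.google.gson.InstanceCreator

// Gson calls this factory instead of resorting to Unsafe.
val userCreator = InstanceCreator<User> { User() }

val gson = GsonBuilder()
    .registerTypeAdapter(User::class.java, userCreator)
    .create()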

For example, if we have the class:

data class User(val email: String = "<EMPTY>", val firstName: String = "<EMPTY>")

Parsing the JSON:

{"email":null}

will result in a user object with the field email set to null, even though there’s a no-args constructor. What happens here is that Gson uses the no-args constructor to build the user object. The default values are assigned and the null checks pass, but then Gson uses reflection to set the email field to null. Here there’s no null check.

A solution with custom type adapters could be the following:

import com.google.gson.GsonBuilder
import com.google.gson.TypeAdapter
import com.google.gson.stream.JsonReader
import com.google.gson.stream.JsonToken
import com.google.gson.stream.JsonWriter

data class User(val email: String = "<EMPTY>", val firstName: String = "<EMPTY>")

class UserTypeAdapter : TypeAdapter<User>() {
    override fun read(reader: JsonReader?): User {
        var email: String? = null
        var firstName: String? = null
        reader?.beginObject()
        while (reader?.hasNext() == true) {
            val name = reader.nextName()
            if (reader.peek() == JsonToken.NULL) {
                reader.nextNull()
                continue
            }
            when (name) {
                "email" -> email = reader.nextString()
                "firstName" -> firstName = reader.nextString()
            }
        }
        reader?.endObject()
        return when {
            email == null && firstName != null -> User(firstName = firstName)
            email != null && firstName == null -> User(email = email)
            email != null && firstName != null -> User(email, firstName)
            else -> User()
        }
    }

    override fun write(out: JsonWriter?, value: User?) {
        out?.apply {
            beginObject()
            value?.let {
                name("email").value(it.email)
                name("firstName").value(it.firstName)
            }
            endObject()
        }
    }
}

fun main(args: Array<String>) {
    val json = """{
        "email": null,
        "firstName": "My first name"
    }"""
    val gson = GsonBuilder()
        .registerTypeAdapter(User::class.java, UserTypeAdapter())
        .create()
    println(gson.fromJson(json, User::class.java))
    println()
}

There are quite a few downsides to this approach. The first and most obvious one is that we would have to write a type adapter for each class in our models. Personally, I find the code not that easy to follow because of the JsonReader API. This approach also requires the models to have default values.

One can also throw an exception in the event of an illegal null value instead of constructing the user object. Of course, if you changed the type from String to String?, you’d have to update your type adapter too. Not very elegant.

All things considered, we decided to keep all our API models with nullable fields. This at least forces us to use them with care before putting them into the users’ hands.

Summary

In this post, we saw how easy it is to break null-safety in Kotlin using Gson. Essentially, a field marked as non-nullable can still be null if the parsed JSON contains that field with a null value, or if we built our models without no-args constructors or instance creators. This leads to a false safety net where we think our models cannot contain nulls when in fact they can.

We can leverage type adapters to prevent this situation, but that approach can easily lead to a lot of custom deserialization code that is hard to follow. In big projects, this is quite undesirable.

Perhaps the safest approach is to keep everything nullable in the models and deal with this on each access to the fields.

React Amsterdam Conference Highlights


Davide Ramaglietta and Alexander Gudulin from the R2D2 team of the Learning Engineering Area attended the biggest React conference in the world, with 1200+ attendees. Here are their impressions of the event and opinions on what they heard:

We had the feeling that the hottest topics in the React community nowadays are state management, the switch from REST architectures in favour of GraphQL, and the potential of React 16 and beyond.


setState Machine

For both of us, the most interesting and intriguing talk was given by Michele Bertoli from Facebook (follow him on Twitter), who spoke about using state machines and statecharts to manage application state, in particular React component state. The concept was already discussed in 1987 by David Harel in his paper “Statecharts: A visual formalism for complex systems” and in 1998 by Ian Horrocks in his book “Constructing the User Interface with Statecharts”. Most of us probably heard about state machines during our university studies, so we should all be familiar with them. In case you want to refresh your knowledge: Finite State Machine

While most of the modern approaches (see Redux or MobX) focus on the actions that a user can make and how they change the single object representing the state of our application, he suggests focusing on the states of our app. Hence, conceptualize the application as a set of states and transitions from one state to another. To help you with this task, he recommends the use of statecharts.

He also suggests the xstate library to represent “Functional, Stateless JS Finite State Machines and Statecharts”, as well as his own react-automata library, a state machine abstraction for React that provides declarative state management and automatic test generation.

If you are interested in going into more detail on the topic, here is the link to the slides of the talk.

Structure Your App’s Story With Sagas and Selectors

When you build an app with React and it scales, it’s hard to say where the business logic should live. We can split the logic into three types:

  • Data manipulation
  • Conditionals
  • Async flow

Data manipulation

Some data comes from user input or as the result of an API call. Where should we put the function for filtering a list of something?

We need to use selectors for that. A selector is a pure function that takes the state object as its argument.

Selectors can be used in the mapStateToProps function, and the derived data is passed directly to the React component.
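
For example, a selector and its use in mapStateToProps might look like this (a sketch; the state shape and names are made up):

// A pure function of the state: no side effects, easy to test.
const getVisibleCourses = (state) =>
  state.courses.filter((course) => course.locale === state.activeLocale);

const mapStateToProps = (state) => ({
  courses: getVisibleCourses(state),
});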

Conditionals

We can use a selector for the conditions and then dispatch the action. The benefits of this approach are:

  • Action creators have access to the state object
  • They allow you to return whatever you want

But maybe it’s better to place the conditions inside the reducer?

Async flow

When an app makes calls to remote services all the time, redux-thunk doesn’t work anymore. Rebecca says redux-saga is a better way. The mental model is that a saga is like a separate thread in the app. redux-saga is a Redux middleware, so you can control it with normal Redux actions.
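
A tiny sketch of that mental model (api.fetchUser is a placeholder for a remote call):

import { call, put, takeEvery } from 'redux-saga/effects';

// Runs like a background thread: for every FETCH_USER action,
// call the remote service and dispatch a follow-up action.
function* fetchUser(action) {
  const user = yield call(api.fetchUser, action.userId);
  yield put({ type: 'FETCH_USER_SUCCEEDED', user });
}

export function* rootSaga() {
  yield takeEvery('FETCH_USER', fetchUser);
}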

In conclusion: don’t include extra tools and complexity if you don’t need them. We could not agree more.

Watch the full video

Others

Among the topics mentioned above, there were some other interesting presentations. It was particularly impressive to see what amazing stuff Shirley Wu can do combining D3.js (a library used for data visualization) and React. Her live coding session was thrilling! Here is an example of the results.

Speaking of exciting live coding sessions, we can’t fail to mention the one and only Ken Wheeler (follow him on Twitter). He literally had the whole conference jumping to the rhythm of 90s disco music while showing a 3D visualization of a music mixer, built by combining React with the AudioContext interface. The full talk is here.

Other engaging talks came from Tracy Lee, speaking about reactive programming with RxJS; “GraphQL at scale with AWS” by Richard Threlkeld, who presented the realtime and offline features of AWS AppSync; and lastly Manjula Dube, who showed all the new possibilities that React 16 and its new core algorithm have to offer.

Open Source Awards

There was also time for some awards! Congratulations to Apollo GraphQL, which was named the breakthrough of the year, and to Storybook, which won the award for the most impactful contribution to the community. As we also use this tool in our learning teams, we feel obliged to forward some appreciation as well. Good job!

Conclusions

As always after attending a conference, the feelings are varied and contrasting. Could it have been better? Did we learn something? What did we hear that is worth sharing with our team and colleagues? Did we get enough gadgets? Did we meet some cool people? Did we grow our network of developer fellows? Each of us has their own answers to these questions. We would like to keep seeing more of these conferences out there, to thank Babbel and the conference organizers, and to remember a precious tip that came from Sara Vieira and her horror story: close all your illegal activities on your laptop before going on stage and showing it to the whole crowd out there!!

Link to all talks

If you need monitoring, just shout!


AWS provides a lot of low-level monitoring for Lambda functions out-of-the-box: invocations, duration, errors, throttles — you name it. But if you want to monitor aspects of your business domain in CloudWatch, you have to do this yourself.

Let’s explore how you can send metrics from Lambda to CloudWatch Logs.


Singer shouts into microphone, eyes closed

Posting metrics directly from Lambda

Let’s say you’re running a service that allows users to upload files directly to S3. You have a Lambda function in place to process those uploads (e.g. for resizing or copying uploaded files). To do that, you set up a bucket notification, so your Lambda function is invoked with this event on every upload (incomplete payload for brevity):

{"Records":[{"eventTime":"1970-01-01T00:00:00.000Z","s3":{"object":{"eTag":"0123456789abcdef0123456789abcdef","key":"HappyFace.jpg","size":1024},"bucket":{"arn":"arn:aws:s3:::mybucket","name":"sourcebucket",},},"eventName":"ObjectCreated:Put",}]}

Notice the file size as part of the input (Records[0].s3.object.size)? Let’s assume you’d like to monitor that for each upload to see if file uploads are growing on average over time.

Assuming you opted for Node.js when developing your Lambda function, you might come up with this code:

const AWS = require('aws-sdk');
const cloudwatch = new AWS.CloudWatch();

exports.handler = async (event) => {
  const params = {
    MetricData: [
      {
        MetricName: 'upload_size',
        StorageResolution: 60,
        Timestamp: event.Records[0].eventTime,
        Unit: 'Bytes',
        Value: event.Records[0].s3.object.size
      }
    ],
    Namespace: 'uploader'
  };
  try {
    await cloudwatch.putMetricData(params).promise();
  } catch (err) {
    console.log(err, err.stack);
  }
};

This is pretty straightforward: construct the service interface object, assemble the metric payload, and send it to the service.

This solution is fine if your Lambda function is small and posts very few metrics to CloudWatch. But what if your Lambda function is already quite big, or you want to avoid using monitoring-specific parameters like Unit and StorageResolution in the code?

Separating business logic from monitoring

If you want to keep business logic and monitoring nicely separated, consider logging your monitoring to stdout and filtering the logs into CloudWatch metrics using Metric Filters.

That way, monitoring is not yet another piece of code that relies on a network connection where you have to think about control flow or even retries. It’s essentially a single call to console.log that outputs JSON and gets picked up by your infrastructure and turned into metrics.

Revisiting the Lambda function

Here’s the new version of the above function:

exports.handler = async (event) => {
  const record = event.Records[0];
  console.log(JSON.stringify({
    created_at: record.eventTime,
    key: record.s3.object.key,
    metric_name: 'upload_size',
    name: 'uploader:file_uploaded',
    size: record.s3.object.size
  }));
};

Note that we can remove the dependency on aws-sdk, and we can skip the error handling. We also got rid of all CloudWatch-specific parameters.

You’re now shouting out your monitoring needs, hoping somebody hears you.

Turning this JSON output into a CloudWatch metric

What’s missing is the glue between the JSON output and CloudWatch. Effectively, we want AWS to filter the logs for a specific pattern and extract the metrics we need.

The syntax for those patterns is quite powerful:

You can also use conditional operators and wildcards to create exact matches.

Filter and Pattern Syntax

If you’re using Terraform, the following configuration will create a Metric Filter that turns the JSON output into the same CloudWatch metric that the original Lambda function created (cf. aws_cloudwatch_log_metric_filter resource documentation):

resource "aws_cloudwatch_log_metric_filter""lambda-uploader-file-size" {
  name = "lambda-uploader-file-size"

  log_group_name = "lambda-notify-uploaded"
  pattern        = "{ $.name = \"uploader:file_uploaded\" && $.metric_name = \"upload_size\" }"

  metric_transformation {
    namespace = "uploader"
    name      = "upload_size"
    value     = "$.size"
  }
}

The given pattern instructs the filter to:

  1. look for JSON
  2. …with the name attribute being “uploader:file_uploaded”
  3. …and the metric_name attribute being “upload_size”

This first part just describes how we want to find what we’re looking for. Once we find it, we describe how we want this to be transformed into a metric. In this case, the size attribute will be taken as the value for CloudWatch Metrics.

Beyond Terraform

If you’re not using Terraform, you can also create the same Metric Filter using the AWS CLI, as sketched below. If you prefer using the AWS Console instead, keep reading after the snippet.
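
The CLI equivalent might look roughly like this (a sketch mirroring the Terraform values above):

aws logs put-metric-filter \
  --log-group-name "lambda-notify-uploaded" \
  --filter-name "lambda-uploader-file-size" \
  --filter-pattern '{ $.name = "uploader:file_uploaded" && $.metric_name = "upload_size" }' \
  --metric-transformations metricName=upload_size,metricNamespace=uploader,metricValue='$.size'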

You can access Metric Filters from your Lambda function’s Log Group:

Screenshot of AWS Console

And from there, you can add Metric Filters and connect them to CloudWatch Alarms:

Screenshot of AWS Console

Wrap-up

You can simplify the control flow in your Lambda function by synchronously outputting your monitoring data as JSON. From there, you can let your AWS infrastructure take care of transforming this JSON into a metric.

This approach might help to make your Lambda functions easier to understand: Your code is not cluttered with monitoring-specific details. It also makes them easier to change (for instance, when the interface between your code and your monitoring is JSON instead of network interaction, it is easier to rewrite in another language).

Please be aware that using Metric Filters is an indirection, and it comes at a price: this solution adds architectural complexity. You end up with more moving parts and have to carefully consider whether it’s worth it.

Last but not least: be sure to explore CloudWatch Metric Filters further; they are a powerful tool.

[Photo by Vidar Nordli-Mathisen on Unsplash]

Why and how to choose reference user stories


For many agile software engineering teams, it is common practice to estimate the complexity of the user stories that they will be working on. Estimation is not an exact science and so they choose t-shirt sizes, story points, or other non-time based scales to account for the uncertainty that is inherent to software development. Over time, a team of individual software developers gradually comes to an implicitly shared understanding of what degree of complexity each value on their chosen scale represents.

This implicitly shared understanding can easily be challenged, though, when a new joiner to the engineering team or colleagues from different disciplines (e.g. product management, product design, …) pose the supposedly simple question: “What do 5 story points mean in your team?” For the remainder of this blog post, we will use story points when describing complexity. However, what’s being said can be transferred to other complexity scales, too. Translating the implicit agreement into words is, in fact, quite challenging. Most likely, every developer on the team will phrase her or his answer differently, especially in a cross-functional team.


One approach that we at Babbel take to make the shared understanding explicit is to choose reference user stories. While we acknowledge that every new piece of software will at least in part be “terra incognita” for the engineering team and that no user story is like another, it nevertheless helps to refer back to an agreed-on reference when estimating new stories. By discussing similarities and differences between a new user story and the reference stories, the engineering team engages in relative estimation. It is less likely to fall into the trap of indirect, absolute estimation.

User stories with assigned complexity levels

One of our teams recently faced the necessity of making the implicit explicit. It was a rather young team of three software engineers, who had been working together for less than three months. When two new joiners arrived, we accepted the challenge of choosing appropriate reference user stories. We decided to play the Team Estimation Game, originally developed by Steve Bockman, with slight modifications to pick suitable stories. As this worked quite successfully for us, we are sharing the modified instructions here on Babbel Bytes with you. The game (not counting its preparation) lasted a little less than 1.5 hours.

Preparation

  • Look back at the work your team has done over the last 4-6 weeks. Don’t go too far back, because it should still be relatively fresh in memory for everyone.
  • Select 20 - 25 estimated and completed user stories that, on a quick glance, appear to be good candidates. When in doubt, select more stories rather than fewer, as not all of them will make it into the final selection.
  • Try to make this preselection well balanced. For example, choose a wide range of complexity based on the original estimates. Also, try to create a mixture of both frontend-heavy and backend-heavy stories.
  • Write the stories on index cards (one story per card) but do not include the original estimates.
  • Clear a table and make enough room for all team members to stand or sit around it.
  • Place a sticky note labeled “smallest” on the far left side of the table. Place another sticky labeled “biggest” on the far right side of the table.
  • Place all user story cards face down in a pile somewhere on the table.
  • Keep one set of planning poker cards ready for the second part of the game. Our set is labeled with the Fibonacci numbers from 1 to 21.

First Part

Ask your team members not to consider the estimates they originally put on these user stories before implementation. Rather, they shall act with the knowledge they have now. That is, ask them to consider the actual complexity they experienced when implementing the stories.

Players take turns:

  • One team member starts the game by picking the first user story from the pile, reading it out loud, and placing it in the middle of the table.
  • The next team member pulls the second story from the pile, reads it out loud, and places it relative to the first. They place it to the left of the first if they deem the second story smaller than or equal in complexity to the first. They place it to the right if they deem it bigger.
  • From now on, on each team member’s turn, she/he can either
    • pull another story from the pile and place it on the table relative to the ones laid out already,
    • change the position of one story on the table by moving it, or
    • pass.

On every turn, the players are asked to explain their placement of a story. The first part of the game ends when there are no stories left on the pile and all team members pass in the same round. In other words, the game ends when they have reached a consensus about the stories’ levels of complexity in relation to one another.

User stories ordered by their complexity

Second Part

Provide the players with the planning poker cards.

Players take turns again:

  • The player to start the second part of the game is asked to look at the smallest user story on the very left side of the table and assess whether stories of this complexity are likely to be the smallest the engineering team will encounter while working on the project. If so, the player places the smallest numbered poker card, the “1”, above the smallest story. If not, she or he can choose to place a higher numbered card, instead.
  • The second player searches for a story above which to place the next highest poker card. For example, if the previous player placed the “1”, the second player is asked to place the “2”, where she or he thinks the stories start to be twice as complex.
  • The game continues by players placing the steadily increasing planning poker cards wherever they feel a complexity break occurs. A complexity break occurs when the player considers a story to be notably more complex than the story below the last poker card to the left. Please note that with Fibonacci numbered poker cards, the gaps grow with each card to account for the increasing uncertainty in estimation. For example, the difference in complexity between a 5 point story and an 8 point story is supposed to be much smaller than the difference between a 13 point story and a 21 point story.
  • Instead of placing a new poker card, players may use their turn to move a story card or a poker card (i.e., change the assignment of complexity) or they may decide to pass.
  • It might happen that a poker card cannot be placed above a story because no user story of that size exists in the preselected set of stories. Leave a gap for this poker card and continue with the next highest.

The second part of the game ends when all poker cards have been placed and all players decide to pass in one round. They have reached a consensus about the assignment of complexity values to their user stories.

User stories with assigned complexity levels

Third Part

The objective of the third and final part of the game is to select your reference user stories.

To achieve this, discard any stories that “don’t feel right”. As you have paid close attention to the discussions your team had during the first two parts, identifying them should be rather easy. You probably listened to heated debates about some particular stories. You experienced cards moving back and forth and back again. Statements like the following are indicators that a story would not serve well as a reference:

  • “This story is somewhere between a ‘5’ and an ‘8’.”
  • “This story is definitely bigger than an ‘8’, but ‘13’ sounds too much.”

Remember, the stories that you keep in your final selection are meant to be references for the respective degree of complexity they represent. From now on, when your team estimates new stories, they will refer back to these stories. The reference stories will also help new joiners to your team during their onboarding and they will facilitate discussions with other colleagues.

That’s why you want to boil down the superset of preselected stories to a smaller set of stories that you have a firm agreement from the entire team on.

In case your final selection is very small due to this elimination, we suggest simply repeating the game at a later time with a fresh batch of stories. You can always add stories to your set of references.

If, however, you happen to have multiple suitable candidates per complexity level, we recommend choosing more than one user story.

Final selection of reference user stories

In our team, we have documented the reference stories that we picked in a place that is easily accessible not only for us engineers but also for our colleagues, with whom we work in close collaboration. We pull the document up in every planning poker session and occasionally in discussions with product management.

From the first planning poker session onwards, we experienced an increase in confidence when estimating new stories for two reasons: First, we rely less on gut feelings because we can refer back to previously completed stories in our technical discussions to point out similarities or differences. Second and maybe more importantly, we no longer rely on an implicitly shared understanding of complexity. By making this agreement explicit, we were able to remove the nagging doubt about whether we truly share the same understanding.

Hackday memories


A perfect blue sky. Warm sunshine. People wearing T-shirts and sunglasses. And yet, the orange and red leaves covering Berlin’s streets announce the cold season arriving soon. As we watch the leaves fall down, we start reflecting back on the warm summer day we all retreated out of the office to hack our hearts away working on brand new, exciting ideas at Hackday 2018. On June 7th, 110 Babbelonians (from Engineering, Product, Didactics, Marketing, and HR) took a break from their daily work and came up with a total of 18 projects entered into the final competition.


Impressions from the 8th hackday - indoors

This year, leadership from each participating department came together before the 8th edition of our hackday to brainstorm how to make the day even better. Using all the feedback from the previous seven hackdays, they decided on having six prize categories:

  • Biggest impact on our learners
  • Biggest innovation
  • Best fail
  • Biggest business impact
  • Panorama (other)
  • Most popular project (voted for by the audience)

The winners of each category won not only a fabulous prize but also an awesome handmade trophy!

HackDay trophies for each category

The sunny day was filled with lots of hard work, Club Mate, and great food. The venue was on a small island in the heart of Berlin with great views of the canal. It had a nice green garden for us to spread out in. Ideas were shared, collaborated on, and brought to life. With a great variety of projects bubbling with potential, we saw the Babbel values come forth like never before.

Impressions from the 8th hackday - outdoors

Hackday is the event we look forward to every year, since it is a great chance to collaborate, innovate, and work together in new exciting ways. To see more, watch the Hackday 8 video and consider joining our team to hack with us next year!

Spelling correction with Levenshtein distance


How similar are the words ‘hear’ and ‘here’?

There are many answers to that question. First and foremost, ‘hear’ and ‘here’ are indistinguishable in sound. That is, a fluent English speaker will pronounce ‘here’ just as they pronounce ‘hear’. Semantically, however, ‘here’ and ‘hear’ are different or, rather, their definitions are distinct. ‘Here’ and ‘hear’ are also different in spelling.

Spelling is a particularly important issue for our users here at Babbel. It’s common for language learners to make spelling mistakes in a new language. It’s also common for users to make typos while using a mobile or web application. As we’re constantly looking for ways to improve our language learners’ experience, these issues were addressed in a recent project by using a popular algorithm called Levenshtein distance.


Levenshtein distance

There’s no need to reinvent the wheel this time. Many years of computer science research have resulted in a vast library of algorithms to solve problems like spelling correction.

In this post, I’ll explore the Levenshtein distance algorithm. It was discovered and documented by a scientist named Vladimir Levenshtein in 1965.

At its core, the purpose of Levenshtein distance is to quantify the differences between 2 strings.

Edit Operations

The Levenshtein distance algorithm defines 3 distinct edit operations that are needed to transform 1 string into another. They are as follows.

  1. Insertion
  2. Deletion
  3. Substitution

Each operation has an edit distance of 1. For example, I will transform the word ‘competers’ to ‘computer’ in operational steps.

  1. ‘competers’ to ‘competer’ (delete the ‘s’)
  2. ‘competer’ to ‘computer’ (substitute the first ‘e’ with ‘u’)

Above, there are 2 operations needed to convert ‘competers’ into ‘computer’. Therefore the Levenshtein distance between ‘competers’ and ‘computer’ is 2.

Formula

For programmatic reasons, it’s necessary to define the mathematical specifications. The Levenshtein distance Wikipedia article includes the following mathematical formula.

Levenshtein Formula
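Since the formula image may not be visible here, the standard recursive definition from the Wikipedia article can also be written out in LaTeX. The three arguments of the min correspond to deletion, insertion, and substitution:

    \[
    \operatorname{lev}_{a,b}(i,j) =
    \begin{cases}
      \max(i,j) & \text{if } \min(i,j) = 0, \\
      \min
      \begin{cases}
        \operatorname{lev}_{a,b}(i-1,j) + 1 \\
        \operatorname{lev}_{a,b}(i,j-1) + 1 \\
        \operatorname{lev}_{a,b}(i-1,j-1) + 1_{(a_i \neq b_j)}
      \end{cases} & \text{otherwise,}
    \end{cases}
    \]

where \(1_{(a_i \neq b_j)}\) is 0 when the characters match and 1 when they differ.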

It can look intimidating, so I’ll walk through it step-by-step with an example.

Example

As stated before, the words ‘here’ and ‘hear’ sound the same in English, but are spelled differently. I would like to know how differently they are spelled or, rather, the edit distance in a numeric value.

First, I’ll create a grid with the word ‘hear’ on the horizontal axis (x coordinates) and ‘here’ on the vertical axis (y coordinates). I’ll also fill in the first corresponding row and column with an increasing count starting from 0 to represent the distance from the start of the word.

Levenshtein Grid, Step 1

Then, I’ll approach the first position (1, 1). Because the first character ‘h’ of ‘here’ matches the first character ‘h’ of ‘hear’, I’ll look for the value in the diagonally inferior coordinate (x - 1, y - 1). Its value is 0, so I’ll copy 0 from (0, 0) to (1, 1).

Levenshtein Grid, Step 2

Then, I’ll approach another coordinate either horizontally or vertically adjacent to (1, 1). I’ll choose the box above, (1, 2). The character ‘h’ from ‘hear’ does not match the character ‘e’ from ‘here’. So I have to choose the minimum value from a neighboring box ((0, 1), (0, 2), or (1, 1)) and add 1. The lowest value is in the box below, (1, 1), and its value is 0. The sum of 0 + 1 is 1, so I’ll insert the value of 1 into (1, 2).

Levenshtein Grid, Step 3

If I continue to do this for all of the boxes, I end up with the following grid.

Levenshtein Grid, Step 4
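For readers without the image, here is the completed grid rendered as text (reconstructed by running the algorithm; in this rendering the characters of ‘here’ run top to bottom, so the final distance sits in the bottom-right cell):

         ''  h  e  a  r
     ''   0  1  2  3  4
     h    1  0  1  2  3
     e    2  1  0  1  2
     r    3  2  1  1  1
     e    4  3  2  2  2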

The top right box (4, 4) becomes the final edit distance between the strings. Therefore, the edit distance between ‘here’ and ‘hear’ is… 2!

It’s also possible to arrive at this conclusion by noticing that there are 2 edits necessary to turn ‘here’ into ‘hear’: I would have to substitute the 2 characters ‘re’ with ‘ar’. However, the grid example is necessary to introduce the logic of the algorithm.

Implementation in Ruby

The best way, in my opinion, to understand an algorithm is to implement it in a programming language you know well. Here at Babbel, we use Ruby in many of our backend applications. I’ll use Ruby for the implementation, but it is also possible to translate the code below into any imperative programming language.

Matrices

Above, I used a grid to explain the algorithm step-by-step. It’s essential to do the same for a programmatic solution as well. Grids are referred to as matrices in Ruby (and generally in mathematics). The Matrix class in Ruby is basically a 2-dimensional array with added mathematical functionality. I can access this class by requiring it.

    require 'matrix'

Values in Ruby’s Matrix class are immutable. For the purpose of this algorithm, I need to mutate values. To do this, I’ll inherit from the standard class and expose a normally private method.

    class MutableMatrix < Matrix
      def []=(a, b, value)
        @rows[a][b] = value
      end
    end

Algorithm

Now that I have a mutable matrix, I can use it to populate the boxes like above. Matrices define vertical positions increasingly from top to bottom, so position (0, 0) will refer to the top left position (or box).

    def mutable_matrix(first_string, second_string)
      MutableMatrix.build(first_string.length + 1, second_string.length + 1) do |row, col|
        if row == 0
          col
        elsif col == 0
          row
        else
          0
        end
      end
    end

    # mutable_matrix('aaa', 'bbbb')
    # => MutableMatrix[[0, 1, 2, 3, 4],
    #                  [1, 0, 0, 0, 0],
    #                  [2, 0, 0, 0, 0],
    #                  [3, 0, 0, 0, 0]]

Because of the deviation from normal algebraic order, I’ll refer to the positions using a and b, not x and y. Given that I now have a mutable matrix, I will build the first conditional. In the example of ‘here’ and ‘hear’ above, I first checked to see if the first characters were equal.

    # (a, b) as a position
    first_string[a - 1] == second_string[b - 1]

If this conditional returns true, it’s necessary to find the value in the diagonally inferior position.

    matrix[a - 1, b - 1]

If the characters do not match, I need to find the minimum distance to the neighboring positions.

    [
      matrix[a, b - 1],     # Insertion:    Look directly above the current position
      matrix[a - 1, b],     # Deletion:     Look to the left of the current position
      matrix[a - 1, b - 1]  # Substitution: Look diagonally lower from the current position
    ].min + 1               # Conclusion:   Return the minimum value from these 3 options and add 1

Now, I’ll combine these 2 cases in a method.

    def cost(matrix:, first_string:, second_string:, a:, b:)
      if first_string[a - 1] == second_string[b - 1]
        matrix[a - 1, b - 1]
      else
        [matrix[a, b - 1], matrix[a - 1, b], matrix[a - 1, b - 1]].min + 1
      end
    end

The edit cost is defined. Next, I need to fill in every position in the matrix. This requires… looping! It’s fine to loop over either string first.

    (1..second_string.length).each do |b|
      (1..first_string.length).each do |a|
        matrix[a, b] = cost(
          matrix:        matrix,
          first_string:  first_string,
          second_string: second_string,
          a:             a,
          b:             b
        )
      end
    end

Finally, I need to grab the last value from the matrix to determine the overall edit distance between 2 strings.

    matrix[first_string.length, second_string.length]

Here is the composition of all the prior logic in a nice Ruby class. There are multiple ways to implement it; I’ve chosen large methods with named arguments.

    require 'matrix'

    class MutableMatrix < Matrix
      def []=(a, b, value)
        @rows[a][b] = value
      end
    end

    class Levenshtein
      def distance(first_string, second_string)
        first_s_length = first_string.length
        second_s_length = second_string.length

        matrix = mutable_matrix(
          first_s_length:  first_s_length,
          second_s_length: second_s_length
        )

        (1..second_s_length).each do |b|
          (1..first_s_length).each do |a|
            matrix[a, b] = cost(
              matrix:        matrix,
              first_string:  first_string,
              second_string: second_string,
              a:             a,
              b:             b
            )
          end
        end

        matrix[first_s_length, second_s_length]
      end

      private

      def cost(matrix:, first_string:, second_string:, a:, b:)
        if first_string[a - 1] == second_string[b - 1]
          matrix[a - 1, b - 1]
        else
          [matrix[a, b - 1], matrix[a - 1, b], matrix[a - 1, b - 1]].min + 1
        end
      end

      def mutable_matrix(first_s_length:, second_s_length:)
        MutableMatrix.build(first_s_length + 1, second_s_length + 1) do |row, col|
          if row == 0
            col
          elsif col == 0
            row
          else
            0
          end
        end
      end
    end

Finally, running this implementation in Ruby is as simple as the following.

    levenshtein = Levenshtein.new
    levenshtein.distance('competers', 'computer') # => 2
    levenshtein.distance('hear', 'here')          # => 2

Shortcomings

Like most algorithms, Levenshtein distance has some faults. For one, there’s no collectively accepted rule for handling accent marks, capital letters, or punctuation. Another clear shortcoming is character swapping.

As an example, the word ‘receive’ in English is easy to misspell. Often ‘receive’ is misspelled as ‘recieve’, with the ‘i’ and ‘e’ being switched. Using the Levenshtein distance algorithm, this mistake would have an edit distance of 2 (for 2 substitutions). The word ‘receipt’ is also just 2 substitutional edits away from ‘receive’. This is not ideal because it’s more likely for someone to spell ‘receive’ as ‘recieve’, while the word ‘receipt’ is very different semantically. Seeing this as a major issue, another scientist named Frederick Damerau proposed an improvement. He added another edit to the Levenshtein distance algorithm, called transpositional edits, to account for character swapping. This improvement differentiates Levenshtein distance from Damerau-Levenshtein distance. A sketch of this extension follows below.
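For illustration only, here is a minimal sketch (not from the implementation above) of how the cost method could be extended with the transposition case; this corresponds to the “optimal string alignment” variant of Damerau-Levenshtein:

    # A minimal sketch of extending the cost method above with Damerau's
    # transposition edit (optimal string alignment variant); hypothetical,
    # not part of the class shown earlier.
    def cost_with_transposition(matrix:, first_string:, second_string:, a:, b:)
      base =
        if first_string[a - 1] == second_string[b - 1]
          matrix[a - 1, b - 1]
        else
          [matrix[a, b - 1], matrix[a - 1, b], matrix[a - 1, b - 1]].min + 1
        end

      # Transposition: swapping two adjacent characters ('ie' <-> 'ei') costs 1 edit
      if a > 1 && b > 1 &&
         first_string[a - 1] == second_string[b - 2] &&
         first_string[a - 2] == second_string[b - 1]
        [base, matrix[a - 2, b - 2] + 1].min
      else
        base
      end
    end

With this change, the distance between ‘recieve’ and ‘receive’ drops from 2 to 1, while ‘receipt’ stays 2 edits away.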

Modern language learning applications need logic to help correct common learner mistakes with grammar and verb conjugation. It’s not quite possible to accomplish this with basic edit distance. However, the field of natural language processing offers more refined models and algorithms to help with these problems. Still, Levenshtein distance is extremely useful in its simplicity and performance.

Juggling multiple build stages and test environments with TravisCI


At Babbel, we employ continuous integration to detect issues early in the development process. Our CI not only automates the build and runs different test suites but also handles some tasks for documentation purposes. With so many moving parts, the setup is not trivial. We (one of the Web teams) share our TravisCI setup here on Babbel Bytes in the hope that some of you find it helpful for your own applications.

Our application and its unit tests are written in JavaScript. Additionally, we have Selenium-based UI tests written in Ruby. We use TravisCI to run all our tests and build the application for deployment. After a deployment, we also need to build a so-called Storybook and upload it to AWS S3. Storybook is a UI development environment. Finally, we check for vulnerabilities using a service called Snyk.


artoo-juggler

This means there are five different job types (from now on called stages).

  1. Run unit tests
  2. Run UI tests
  3. Build application for deployment
  4. Run Snyk
  5. Build and upload the storybook

We want these stages to be executed sequentially, and whenever a stage fails, we want the pipeline to stop immediately. This saves us time and TravisCI computing resources.

For running the UI tests, we are not using an external service provider (browser/mobile farm), but instead run the tests on TravisCI itself, and the test suite’s total runtime is quite long. For this reason, we cannot run the UI tests on every push/PR/merge, as this would delay our deployments. We decided to be bold: merge code first, run the UI tests afterwards, and roll back or deploy a fix in case an issue comes up. Before we deploy to production, however, we still do manual testing in our staging environment for all code changes that require it.

Also - when running sequentially - the UI tests’ runtime is too long for a single TravisCI job. The job would time out before the UI tests have finished. So we decided to parallelize them. Instead of writing our own parallelization script, we found a nice solution offered by TravisCI itself, the TravisCI build stages. As TravisCI puts it:

“Build stages is a way to group jobs, and run jobs in each stage in parallel, but run one stage after another sequentially.”

This is exactly what we wanted. With this solution, we could run each of our five stages sequentially, and the UI tests (and potentially other jobs) in parallel.

And there is more! By enabling the TravisCI Conditions, it becomes possible to run the stages depending on GitHub information. We are using branch, type, and commit_message to decide when to run which stage(s).

To keep the TravisCI runtime as short as possible, we decided to neither run the UI tests, nor build the storybook, nor use Snyk in case we

  • open a PR, or
  • push to any branch that is not called master or integration

We do, however, run the unit tests and build the application.

When we push to master, we run all stages except the UI tests. As mentioned before, we run the UI tests after a merge because the UI test suite’s runtime is too long at the moment. Once we have improved the tests’ speed, we will re-evaluate our setup.

So when do we run the UI tests? For now,

  • When we push to a branch called integration
  • As part of a daily TravisCI cron job
  • When we specify run_tests in the last commit message of a push to any branch

To tell TravisCI that one wants to use stages, they are quite simply listed (in order of execution!) in the stages section of the travis.yml. Each stage has a name and an optional condition. Here is our stages configuration:

    stages:
      - name: Unit Tests
        if: type != cron AND branch != integration
      - name: Ui tests
        if: type = cron OR branch = integration OR commit_message =~ run_tests
      - name: Deployment
        if: type != cron AND branch != integration
      - name: Snyk
        if: branch = master AND type = push
      - name: StorybookS3
        if: branch = master AND type = push

Our standard TravisCI environment (the env section of the travis.yml) is configured to run the UI tests, split up using the matrix option. The environment variables configured in env - matrix are passed to the script section that starts the UI tests.

    script: ./run_ui_tests $TEST_PART
    env:
      matrix:
        - TEST_PART=1
        - TEST_PART=2
        - TEST_PART=3

run_ui_tests is a simple bash file that handles the setup for the UI tests:

    #!/bin/bash
    TESTS_PART_1=("test1.feature" "test2.feature" "test3.feature")
    TESTS_PART_2=("test4.feature" "test5.feature" "test6.feature")
    # .. etc
    TESTS_TO_EXECUTE=TESTS_PART_$1[@]
    for test_name in "${!TESTS_TO_EXECUTE}"
    do
      ./ui_test_run_script "$test_name"
    done

This is how we parallelize the UI tests - all TravisCI env - matrix jobs are run in parallel, although there can be a limit on how many parallel TravisCI instances are allowed per repository. In fact, our project is one of the larger ones at Babbel, so in order not to block other teams, we gave ourselves a limit. This can be configured under https://travis-ci.com/YOUR_COMPANY/YOUR_REPO/settings:

travis-limit-build-jobs

We are also auto canceling jobs in case we push to a branch on which a job is currently running:

travis-auto-cancel-jobs

Now that we have env set but don’t want to run the UI tests in the other stages, we need to overwrite it, either with different environment variables or with none. This is done in the jobs - include section of the travis.yml.

    jobs:
      include:
        - stage: StorybookS3
          env: # OVERWRITE WITH EMPTY
          install: install_script_for_storybook
          script: bundle exec s3_website push
        - stage: Unit tests
          env: TARGET=target_environment
          install: install_script_for_unit_tests
          script: run_script_for_unit_tests
          after_success: upload_coverage
        # - stage: some other stage ..

Here are the stages related parts of our .travis.yml, from top to bottom:

    install:
      - # install whatever is necessary to run the UI tests
    script: ./run_ui_tests $TEST_PART
    env:
      matrix:
        - TEST_PART=1
        - TEST_PART=2
        - [...]

    # the following line is needed to enable the TravisCI build conditions
    conditions: v1

    stages:
      - name: Unit Tests
        if: type != cron AND branch != integration
      - name: Ui Tests
        if: type = cron OR branch = integration OR commit_message =~ run_tests
      - name: Deploy
        if: type != cron AND branch != integration
      - name: Snyk
        if: branch = master AND type = push
      - name: StorybookS3
        if: branch = master AND type = push

    jobs:
      include:
        - stage: Unit Tests
          env: TARGET=target_environment
          install: ./install_unit_tests_dependencies
          script: ./run_unit_tests
          after_success: ./upload_coverage
        - stage: StorybookS3
          env: # no environment variables - overwrite with empty
          install: ./install_storybook_dependencies
          script: ./build_storybook
          after_success: ./upload_storybook
        - stage: Deploy
          name: Deploy to staging
          env: ENVIRONMENT=staging
          install: ./install_deployment_dependencies
          script: ./deploy
        - stage: Deploy
          name: Deploy to production
          env: ENVIRONMENT=production
          install: ./install_deployment_dependencies
          script: ./deploy
        - stage: Snyk
          env: # no environment variables - overwrite with empty
          install: ./install_snyk_dependencies
          script: ./run_snyk

A full run on TravisCI for the master branch, containing all build stages, looks like this:

A full TravisCI run with all build stages

Alternative approach for parallelizing

We realize that there is another way of parallelizing the UI tests; it can be done using only stages without env - matrix:

    jobs:
      include:
        - stage: UI tests
          name: part1
          install: ./install_ui_test_dependencies
          script: ./run_ui_tests 1
        - stage: UI tests
          name: part2
          install: ./install_ui_test_dependencies
          script: ./run_ui_tests 2

We did, however, feel that this solution made the travis.yml file longer and harder to read. TravisCI build stages are still a beta feature; matrix will probably be enabled for them sooner or later, and then we will most likely change our approach.

To sum up, in this article you have been introduced to the challenges of setting up continuous integration for a medium-sized project. We showed you how to group jobs in stages, run tasks within one stage in parallel, and run stages conditionally. We hope you find our ideas useful and are looking forward to hearing your thoughts!

The author would like to thank Annalise Nurme for providing the drawing of the juggler.


How We Work


When I started at Babbel earlier this year, I was struck by how many great ideas the engineering department here had and how they saw their technology evolving. In fact, in my first week, a compelling document was shared with me outlining the engineering strategy. When we attempted to put the strategy into action, we encountered many problems:

  • Prioritizing tasks related to the engineering strategy proved difficult because we had no way to judge their importance
  • Tasks were often started but not finished because there was a lack of time and urgency

When we retrospected on what was done – and not done – we realized something that should have been obvious to us: the strategy wasn’t really a strategy at all. A strategy enables you to make decisions. This was rather a list of tasks.

There was no overarching vision of the future guiding us. We had no sense of mission.


Vision & Mission

We started by reflecting on our values. What do we want to be as an organization? Where do we want to position ourselves? What makes an engineering organization work well and how do we want to foster that? After weeks of introspection, brainstorming, and several rounds of feedback, we finalized our vision as follows:

Anyone anywhere at anytime can use our products to learn languages

By anyone, we make accessibility a priority. By anywhere, we give the mobile app priority as well as ensure our products work just as well in America as they do in Asia. By anytime, we ensure that our systems are running as optimally as possible at all times, are performant, and that we deliver frequently.

As our mission to enable this vision, we outlined 5 key topics:

Living customer centricity

We make customers our first priority – be they external users of our products or internal stakeholders. We will be responsive to their needs and aim to provide a good customer experience.

Releasing frequently

We prefer delivering incremental value over successive iterations rather than big bang approaches. We aim to have short lead times and to release anytime.

Fostering autonomous teams

Our teams have end-to-end responsibility over what they do. They are empowered to make decisions that affect them and have the accountability that goes with it.

Building scalable products

We build systems that minimize dependencies while maximizing extensibility. We aim to create an ecosystem of services that allows us to grow easily.

Having a great place to work

We have a fun, energetic, creative environment where people are respected and that has an honest and approachable leadership team. We aim to provide engineers opportunities to grow and develop in their career.

Principles

Now that we had a view of the world we wanted to create and a view of how to get there, what then? How do we meaningfully execute on these ideas? How do we go from thought to action?

We realized we needed to change who we are.

We decided to start with our engineering teams since the engineering team is the fundamental organizational unit for our department. We developed a set of principles for organizing our teams.

Teams shall:

  • Be stable, long-lived, and autonomous
  • Have end-to-end responsibility
  • Be cross-functional consisting of:
    • 1 engineering manager
    • 1 product manager
    • A mix of core team members (no more than 6) and complementary team members
  • Have 2-3x redundancy for each critical role (e.g. frontend engineers on a user-facing product)

By core team members, we mean team members that are fully dedicated to the team and report to the engineering manager. There should be no more than 6 core team members per team in order to maintain focus and cohesion. By complementary team members, we mean team members that are partially dedicated to the team for a specific period of time and support its efforts. They may report to different line managers. For example, a designer, business analyst, or didactics expert can be a complementary team member.

Team composition

Ownership

Inherent in our concept of autonomous teams having end-to-end responsibility is the idea of ownership. We felt strongly that teams should “own” their work and drive it accordingly. But what do we mean by “own”? To deconstruct this idea more formally, we introduced responsibility and authority. In order for ownership to work, teams need to have not just a clear understanding of their accountabilities but also a clear understanding of how they are empowered to deliver on those accountabilities.

For us, ownership means the following:

You have the RESPONSIBILITY to:

  • Turn product requirements into working software
  • Deliver high quality (correctness, reliability, integrity, security, performance)
  • Operate, monitor, and maintain your product(s)
  • Contribute to the department and company (especially when you are a dependency)

You have the AUTHORITY to:

  • Choose and improve your way of working
  • Make informed decisions about tools & technology
  • Clean up technical debt
  • Challenge when something is unclear, unfeasible, or disadvantageous

Feature Teams

Traditionally, when one thinks of engineering teams, one thinks of component teams. These are teams that own a software component – be it an application or service – that they are responsible for updating, deploying, and maintaining. Features requiring cross-component updates often entail the need for strong alignment between these component teams. When you have multiple such features, alignment creates costly overhead. Furthermore, component teams tend to feel disconnected from the overall feature because they lack end-to-end impact. They feel like cogs in the wheel.

To activate team autonomy and end-to-end responsibility, we introduced the concept of feature teams. A feature team is a long-lived, cross-functional team that designs, implements, and operates end-to-end customer features. In practice, feature teams are able to work across multiple components allowing them to take the feature from beginning to end. Rather than feeling like cogs in the wheel, they have end-to-end responsibility and are invested in the successful delivery of the feature. In their day-to-day work, they experience the value they are creating for the customer first hand.

To operationalize the feature team concept, we introduced Innersourcing. Innersourcing essentially applies the practices of open-source software development to an organization. It requires that the ownership is clearly defined in a maintainers and a contributors file, that appropriate tests are created to ensure the stability of the product, and that proper documentation exists to guide any team to run and test the codebase. Once these are in place, a team wishing to make a change in a code repository owned by another team need only clone it and create a pull request. The owning team can then review the pull request, request changes, and ultimately merge and deploy the change.

Innersourcing at work

With component teams, you have to align the teams in such a way that they deliver the right thing at the right time. With feature teams, you can use Innersourcing to allow teams to deliver value on their own. Understandably, due to limitations in skill set and overall knowledge of the system architecture, Innersourcing does not entirely preclude the need for alignment in some cases. Also, code ownership does not simply go away with feature teams but rather becomes redefined. Documentation, testing, and a stable release process come to the fore. Thus, in the end, the feature team has two functions: one operational, the other strategic.

Service Teams

As we developed this concept, a major problem arose. The concept of end-to-end responsibility works in different ways for different teams. For a product team, it is quite clear: end-to-end responsibility easily applies to features. Nevertheless, there are teams that do not follow this model, for example the infrastructure and payment teams, which own business critical components or components upon which there are many low-level dependencies. The scope and impact of these teams would make sharing operational tasks difficult because different access rights and security concerns come into play.

Feature and Service Teams

We realized there is a case for service teams.

We therefore made a conscious decision to divide the engineering organization into product and platform teams. While product teams work on implementing features, platform teams are service teams that own non-customer-facing, integratable, and business critical components that serve multiple teams. Among these are our cloud infrastructure hosted on AWS, our payment systems, and our user account management systems. This makes it possible to have governance and auditability around critical aspects of our ecosystem while allowing the rest to innovate rapidly.

Back to Strategy

We have recently introduced these concepts and are now aligning our teams along these principles. These kinds of changes rarely work in a day; it takes time to realize them. With our vision & mission to guide us, we will also start planning our 2019 engineering strategy. We have a lot of great ideas and not enough time to do them all, but now we can effectively prioritize them and make decisions about what to do – and what not to do.

The evolution of our staging environment


Today, we will be talking about the staging environment at Babbel and how we recently improved it. As a reader of our tech blog, there’s a good chance that you are already familiar with the concept of a staging environment. I will nevertheless start with a brief definition so that we establish a common understanding before going into the details of how to secure a staging environment. Bear with me.


A brief introduction to staging environments

A staging environment is one or more servers or services onto which you deploy your software before you roll it out to your production environment. The staging environment should resemble your production environment as closely as possible. The purpose of deploying to staging is to improve the robustness of your releases by doing pre-release testing in this environment. Its place in the delivery process is shown in the following illustration:

Software delivery process, including a staging environment

The first step is the usual one, where you, as a developer, work on a feature. Once it’s finished, in most common setups you send the feature to a build server (like TravisCI or Jenkins), which runs your automated tests and linter checks and, in some cases, produces a build artifact, e.g. a compiled binary or a Docker image. Then, in order to ensure it’s working properly, you deploy it (either automatically or manually, depending on how far along you are on your way to Continuous Delivery) to the staging environment. There, manual testing happens, done either by QA or by yourself, depending on your company’s processes. Only after this step has passed do you feel free to deliver your feature to your customers.

You can read more about the staging environment concept on Wikipedia.

Staging environment protection

There are several reasons why you want to protect staging from external access: you don’t want to expose half-baked features (this is why you have staging in the first place), and duplicate content may hurt your brand. There are different ways to approach it, starting from the easiest one, like basic auth, up to more comprehensive ones, like the one we use at Babbel.

Many years ago, we had a setup where the staging environment was protected just by HTTP basic auth. You know, the one that asks you for a username and password in a standard browser dialog.

HTTP Basic Auth as an example of a staging server protection

In the ApplicationController of our Ruby on Rails application, we would just need to insert one line:

    http_basic_authenticate_with name: "babbel", password: "secret"

It’s a wonderful solution at first, but it stops being satisfactory at some point. For instance, you might want to give different internal users different passwords. Or you’re operating a microservice/serverless architecture that just doesn’t have a single entry point anymore.

Staging environment at Babbel before

For a long time at Babbel, we were using a setup where the staging environment was simply inaccessible from outside our AWS Virtual Private Cloud, VPC (for simplicity, let’s define it as a “cloud corporate network”). There were a couple of security additions to that, but this was the main idea.

However, we came across an issue regarding mobile testing. For that, we needed our staging environment to be publicly accessible on the network level but still somehow locked. We wanted to make it accessible to a mobile device farm, as we don’t have all the required device/OS combinations on our side and virtualized solutions are not always suitable. We had one more restriction: this device farm cannot use VPN tunnels to access our private VPC. So, we had to come up with a better approach that opens our staging environment only to those who should gain access.

Let’s start thinking of how it can be done. Extremely simplified, our architecture looks like this:

Staging environment architecture at Babbel before the changes

Basically, we have two VPCs, one is accessible from the outside, which is our production one, and another is accessible only from the VPC itself (or from anywhere if you’re connected via VPN tunnel), which is our staging. Inside each subnet, we have a load balancer (AWS ELB) and a set of EC2 machines attached to it. Obviously, since Babbel has a big and multilayered infrastructure, we have not just one VPC and more than two subnets, and not one load balancer, but quite a few, because we have way more than one service. There are also API Gateways, Lambda functions, and other tooling. However, they don’t matter much for this story.

Staging environment at Babbel after

This is our new workflow for staging environment.

Staging environment authentication process at Babbel after the changes

  1. A request comes to the CDN. The CDN checks whether the request contains a signed cookie (explained below). If it does, skip to step 5.
  2. Otherwise, the user has to go to a service which we call the “Passport service”.
  3. At this service, the user can log in using GSuite credentials.*
  4. In case of success, the user is redirected back to the CDN…
  5. … and the request can now go to the backend service or S3.

* The GSuite part is not ready yet; currently, we’re using a shared secret authentication for this part.

This is how the infrastructure for this approach looks:

Staging environment architecture at Babbel after the changes

Ooooh, looks insecure, doesn’t it? Both our production and staging load balancers are publicly accessible, as are both CDNs. As our initial task was to protect them, you are probably wondering how this is achieved. The trick is the signed cookie mechanism of AWS CloudFront. The purpose of the signed cookie is basically to make the CDN accessible only to requests that carry a cookie signed by some particular trusted service. Just as a side note: in order to add this signed cookie mechanism, we needed to change neither our applications nor even the load balancers. Please keep in mind that this is not something that can be implemented with any CDN, but a specific feature of CloudFront.

The user first has to log in at the “Passport service”. It uses SAML authentication based on GSuite single sign-on, which means that we can control on the GSuite organisation level who should have access to the staging environment. The “Passport service” issues a signed cookie (actually, it is three cookies), allowing the client to access the CDN. If the signed cookie is valid, the CDN will pass the request on to the downstream service, e.g. the load balancer. A minimal sketch of issuing such cookies follows below.
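For illustration only, here is a minimal sketch of how a service could issue such cookies with the AWS SDK for Ruby; the domain, key pair ID, and key path are placeholders, and the exact signer options should be double-checked against the SDK documentation:

    require 'aws-sdk-cloudfront'

    # Placeholder values - not our actual configuration
    signer = Aws::CloudFront::CookieSigner.new(
      key_pair_id: 'YOUR_CLOUDFRONT_KEY_PAIR_ID',
      private_key_path: '/path/to/cloudfront_private_key.pem'
    )

    # Returns a hash with the three signed-cookie values
    # to be set on the response to the client
    cookies = signer.signed_cookie(
      'https://staging.example.com/index.html', # the protected resource
      expires: Time.now + 3600                  # valid for one hour
    )

Granting access to a whole path tree (rather than a single resource) would require a custom policy instead of the expiry shorthand shown here.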

However, there is one caveat. In order for the CDN to work, we have to make our ELB (load balancer) publicly accessible and give it a DNS record. This means that even if we protect our CDN, the ELB is still open, and knowing its DNS name, random people can access it.

Fortunately, this is easily solvable by another mechanism: the CDN itself can add a custom header to the request. That’s what we leveraged. When a request goes through the CDN, it acquires this new header, and our backend server checks it. This is not optimal, because it requires this non-app-specific logic to be implemented in the app. However, it is just a temporary step until we switch to the relatively new load balancers from AWS, ALB, which support the signed cookie check by themselves. A minimal sketch of such a check follows below.
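As an illustration only, a minimal sketch of such a header check in a Rails controller, assuming a hypothetical header name and secret (not our actual implementation):

    class ApplicationController < ActionController::Base
      before_action :verify_cdn_secret

      private

      # Hypothetical check: the CDN adds the X-CDN-Secret header to every
      # request it forwards; anything arriving without it never came
      # through the CDN, so we reject it.
      def verify_cdn_secret
        head :forbidden unless request.headers['X-CDN-Secret'] == ENV['CDN_SHARED_SECRET']
      end
    end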

So, if you access our CDN without the signed cookie, you will not get through. If you then try to access the ELB itself, avoiding the CDN, you will get blocked by the shared secret check. Yay, our staging is public and at the same time secured!

Thank you for reading. We would appreciate hearing about the way your company approaches its staging environment architecture.

AWS Lambda and APIGateway as an AJAX-compatible API-endpoint with custom routing


AWS Lambda is a powerful tool to build serverless applications, especially when backed by APIGateway and Swagger. Lambda executes your code without the need for you to manage a dedicated server. APIGateway provides a front-end for your Lambda to be easily accessed from the Internet via endpoints that can be configured with the Swagger framework. In this article we’ll take a look at one specific example of an AJAX endpoint that uses custom path parameters, something typically problematic to implement because of Swagger limitations.


10000 Gates in Kyoto

Task

Build an HTTP proxy endpoint reachable from a browser using an AJAX-call with an address like path/to/endpoint/{route+}, where route is a URL path (this means that it can contain forward slashes - e.g. path/to/endpoint/foo/bar).

Framework

AWS Lambda - provides the backend for the endpoint. It contains code with business logic, processes user input, and returns a JSON response.

Amazon APIGateway - provides the connection layer to the Lambda from the Internet over HTTP, because you can’t call Lambda functions directly without AWS credentials.

Swagger - describes the APIGateway resource configuration.

Plan

Let’s assume that you have a Lambda function already. If not, then create a new one with the following code:

    exports.handler = (request, _context, callback) => {
      const body = {
        url: 'prefix/' + request.pathParameters.route
      };
      const response = {
        statusCode: 200,
        body: JSON.stringify(body)
      };
      callback(null, response);
    };

This function takes a request and returns a JSON with the modified URL. In this case, the function just attaches prefix to the requested path. This is good enough for the shown example but in production you would likely have something more meaningful such as traffic-splitting, domain-switching, or other routing functionality.

In order to connect the Lambda to the Internet, we need to define a Swagger configuration for the APIGateway. Note that the route parameter can contain any string, as is the case when you build a request proxy and your route parameter contains the requested URL. Unfortunately, doing this with a basic Swagger configuration is not possible because the parameters cannot contain forward slashes. In APIGateway this is only possible with a special extension for Swagger called x-amazon-apigateway-any-method. But herein lies the problem…

Problem

The ANY method will pass all requests to the Lambda function. It’s a good-enough solution if you call an endpoint only from a backend or mobile app, but if you call it from a browser, the browser will fire a pre-flight request with the OPTIONS method. The response should contain an empty body and CORS headers, which would allow us to perform the AJAX request. In this case, however, the request will end up in the Lambda function and the main code will be executed, returning JSON as a result and no CORS headers. The browser will then reject it and throw an error.

You can hack-fix it on the side of the Lambda function by checking the request method and returning a mock for OPTIONS, but in that case, why use tools as powerful as APIGateway and Swagger in the first place?

Solution

Actually, all you need to do is define x-amazon-apigateway-any-method with a default behaviour and override the necessary methods with the actual configuration.

Here is an example:

    swagger: '2.0'
    info:
      title: HTTP-proxy API
      description: Example HTTP-proxy API
      version: '1.0.0'
    schemes:
      - https
    produces:
      - application/json
    paths:
      /path/to/endpoint/{route+}:
        x-amazon-apigateway-any-method:
          produces:
            - application/json
          consumes:
            - application/json
          x-amazon-apigateway-integration:
            type: mock
            passthroughBehavior: when_no_templates
            responses:
              default:
                statusCode: "403"
                responseParameters:
                  method.response.header.Access-Control-Allow-Origin: "'*'"
                responseTemplates:
                  application/json: __passthrough__
            requestTemplates:
              application/json: "{\"statusCode\": 403}"
          responses:
            403:
              headers:
                Access-Control-Allow-Origin:
                  type: string
              description: 403 response
        get:
          produces:
            - application/json
          consumes:
            - application/json
          summary: This is a test Lambda HTTP-endpoint
          parameters: &parameters
            - name: route
              in: path
              type: string
              required: true
          x-amazon-apigateway-integration:
            type: aws_proxy
            httpMethod: POST
            uri: %HERE_GOES_YOUR_LAMBDA_ARN%
            credentials: %HERE_GOES_YOUR_LAMBDA_INVOCATION_ROLE%
          responses:
            200:
              headers:
                Access-Control-Allow-Origin:
                  type: string
              description: Returns the experiment configuration and the destination for a specific target
        options:
          produces:
            - application/json
          consumes:
            - application/json
          summary: OPTIONS method defined for AJAX-calls
          parameters:
            - name: route
              in: path
              type: string
              required: true
          responses:
            200:
              description: 200 response
              headers:
                Access-Control-Allow-Origin:
                  type: string
                Access-Control-Allow-Methods:
                  type: string
                Access-Control-Allow-Headers:
                  type: string
          x-amazon-apigateway-integration:
            passthroughBehavior: when_no_templates
            responses:
              default:
                statusCode: "200"
                responseParameters:
                  method.response.header.Access-Control-Allow-Methods: "'GET,POST,PUT,PATCH,DELETE,OPTIONS'"
                  method.response.header.Access-Control-Allow-Headers: "'Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token'"
                  method.response.header.Access-Control-Allow-Origin: "'*'"
                responseTemplates:
                  application/json: __passthrough__
            requestTemplates:
              application/json: "{\"statusCode\": 200}"
            type: mock

Here we define an x-amazon-apigateway-any-method block that returns a 403 status code by default. Afterwards, we define a GET method, overriding the default behaviour with a call to the Lambda function. Finally, we define an OPTIONS method that returns the Access-Control-Allow-* headers necessary for AJAX calls. All we need to do now is return the Access-Control-Allow-Origin header together with the Lambda response. Let’s modify our code:

    exports.handler = (request, _context, callback) => {
      const body = {
        url: 'prefix/' + request.pathParameters.route
      };
      const response = {
        statusCode: 200,
        headers: {
          'Access-Control-Allow-Origin': '*'
        },
        body: JSON.stringify(body)
      };
      callback(null, response);
    };

Now our problem is fixed. We’ve defined an AJAX-compatible API endpoint using APIGateway tooling and can call our Lambda function from the Internet.

Photo by Jeremy Goldberg on Unsplash

From documentation to empowerment


This is the second in a series of blog posts entitled “How We Work” in which we share with you how we work in Engineering @ Babbel.

Without history, there is no future.

To make technical decisions effectively, we need to share a common understanding of the context in which we work. Thus, we need to know “how did we get here?” and then “where do we want to go?”.


Documentation

To understand “how did we get here?”, we need documentation.

Software documentation can have a positive impact on the software development process. It can affect the onboarding experience of new engineers, the distribution of knowledge among the engineering community, and the transparency around architectural evolution as well as the decisions around it.

From the Agile Manifesto we know that working software is considered more valuable than comprehensive documentation. Agile documentation takes the form of user stories. Still, there are aspects of the software development process that go beyond the scope of user stories and need to be documented differently. This documentation is usually neglected, and there are multiple reasons for this:

  1. The value of this documentation is underestimated.
  2. Lack of awareness about what and when to document.
  3. Significant effort is required to address the previous points.

Yet, good source code is often the best documentation, right?

Yes, but not everything is reflected in the source code. Some of the most valuable pieces of information that tend to get lost during the development process are the reasons to implement something in a certain way and which decisions were made along the way.

Not everything needs to be documented. At Babbel, we concentrate on documenting technical decisions that have a significant impact on the way we work, those that shape the trajectory we aim to follow. We document decisions that have major or long-term intentions. This kind of documentation gives us the ability to understand our current situation and provides context from which to grow. It supports us in figuring out “where do we want to go?”.

Using a standard documentation format helps us answer the right questions during the decision-making process. We use a well-known pattern called the Architectural Decision Record (ADR) [1][2]. We designed a concise and simple format around this pattern and agreed to use a common template which implements it (a sketch follows the list below). The records are stored as Markdown files and contain the following information:

  • Header: The signature of the document. Contains data about the creation, the review, and the status of the document.
  • Preface: Contains information regarding the context of this document describing background information, business and technical requirements and constraints, and references to other related documents.
  • Decision: All relevant information regarding the evaluation phase of all the different alternatives and the final recommendation, including benefits, potential risks, and consequences.
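As an illustration, here is a minimal sketch of what such a Markdown file could look like; the exact field names and the example decision are hypothetical, not a verbatim copy of our template:

    # ADR: Adopt message broker X for event delivery

    ## Header
    - Created: 2018-11-01 by Jane Doe
    - Reviewed: 2018-11-08 by the Backend ADG
    - Status: accepted (proposed | accepted | deprecated | superseded)

    ## Preface
    Background information, business and technical requirements and
    constraints, and references to other related documents.

    ## Decision
    The alternatives that were evaluated and the final recommendation,
    including benefits, potential risks, and consequences.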

We agreed to persist the ADRs in two forms. Decisions that may affect multiple projects or potentially have a wide impact radius are documented in a dedicated repository. This global repository is accessible to all Babbel engineers, just like any other repository in our organization. Decisions that are local to a particular project, where the impact radius is limited, are documented in that project’s repository and thereby live in close proximity to the relevant source code.

Outdated documentation is often worse than no documentation, and we know it’s hard to keep documentation up-to-date. By documenting technical decisions using the same practices we use for software development, we reduce maintenance overhead and we are able to review this documentation the same way we do for source code.

Reviewing documentation as code creates an opportunity to have open reviews for significant decisions before we implement them. By making reviews open, we gain all of the advantages of InnerSourcing. To make the review process more efficient, we defined different groups of engineers responsible for reviewing proposals in different domains. We call these groups Architectural Domain Groups (ADG). They are expected to give feedback and validate the proposal. ADGs are composed of members from different teams, which enables cross-team collaboration throughout the engineering department.

Architectural Domain Groups

ADGs are also responsible for preventing invalid and obsolete documentation as well as inefficient documentation practices. They concentrate the maintenance overhead of the documentation in a small group of engineers. An ADG is responsible for:

  1. Transparency around documentation topics and changes (e.g. update the documentation agreement, lower entry barriers for engineers, make people aware of documentation changes).
  2. Maintenance of general or domain-specific architectural decisions (e.g. create, review and update/deprecate).
  3. Evaluation of proposals and assessment of solutions around a new decision record (e.g., collisions to prior decisions, incomplete arguments, unclear assertions, impact radius).
  4. ADR life cycle: call for periodic reviews and keep the documentation up-to-date (e.g. review/update/deprecation).

We created a handful of ADGs for the main domains we deal with, such as Backend, Mobile, Frontend, and Infrastructure. Also, there is an ADG that takes care of the process and the other ADGs. We called it Architectural Knowledge Management (AKM).

A generic use case

How teams and ADG should cooperate is described in this workflow chart:

Architectural Decision Flow

Let’s say a team is responsible for delivering a solution to a given problem. To implement this solution, the team has made a decision that will change the architectural direction of a component/service. The decision-making process will then be something like the following:

  1. The team will write an ADR to describe their decision based on the ADR template and create a PR.
  2. The team will inform an ADG that this ADR exists by adding them as approver to the PR.
  3. The ADG will assess and validate the decision and involve other teams if necessary.
  4. The ADG will give feedback about the decision described in the PR, looking to find an agreement in case the decision is questioned.
    1. The team may take the feedback and modify the decision as necessary.
    2. The team may reject the feedback and deliver arguments to back up their decision. This must be documented.
  5. The team will modify the ADR to reflect the agreement on the final decision.
  6. The team owns the final decision and can implement their solution to the problem.

It’s possible for a team to reject feedback coming from the ADGs. However, this must be documented, as the team will own the final decision and all the ramifications that this may cause in the future. Other engineers that want to implement solutions based on this decision should be supported by this team.

Empowerment

This practice empowers engineers to make decisions and implement them, as they are not made by a central entity. We believe this practice motivates engineers to take more initiative and develop more ownership over their work. Even though any decision must be reviewed and validated by an ADG, the last word belongs to the engineers that are presenting a proposal. They are responsible and will be held accountable for the solutions they implement.

With this approach we want to address some of the topics listed in Nehal’s blog post How we work, specifically those points regarding “ownership”:

  • Make informed decisions about tools & technology.
  • Clean up technical debt.
  • Challenge when something is unclear, unfeasible, or disadvantageous.
  • Turn product requirements into working software.
  • Deliver high quality.
  • Contribute to the department and company.

We aim to improve collaboration by creating a channel for cross-team cooperation and by making it a standard step of the development process. Documentation is not a popular topic, and the notion of not having a central entity of responsibility was challenging for some. Considerable communication was necessary: we gave multiple presentations and went through many open discussions.

Currently, the general acceptance is good and we have a handful of global ADRs and many more in individual repositories. We feel we will reach our goal once this practice is embraced as part of the development process by everyone in the department. We are very optimistic about this, because the whole process is driven by our engineering community.


This is part two in a series of blog posts outlining how we work in Engineering @ Babbel. Each post provides you with a different glimpse into our way of working. There’s of course more to it than what can be described in just four posts, but we hope that they will at least give you a first impression.

A Couple of Takeaways from the (European Women in) Tech Conference


On November 28-29, a fellow engineer from the Payment team, Karen, and I attended the European Women in Tech Conference in Amsterdam. It was my first tech conference and it was definitely the first event I have ever attended where the only males to be seen were cameramen and members of staff 🤭 On a more serious note, we spent the two days chatting with professionals in the field (both tech women and women-in-tech) and listening to inspiring talks from leading women of tech companies, such as Amazon, Facebook, Xing, Arm, just to name a few.


Karen and Lina at EWIT Conference

A couple of themes that came up in most of the talks I attended were:

  • Diversity and inclusion. The conference started with a panel, led by Lesley Slaton Brown, Chief Diversity Officer at HP Inc. She dedicated her work to creating an inclusive tech environment. According to Brown, although the situation is getting better, still only 16-20% of tech professionals are women. Yet, the goal is not to create even more separation by implying that women need a special treatment or protection. Instead, as another panelist Luciana Broggi (Enterprise Solutions GM, EMEA at HP Inc.) pointed out, hopefully soon diversity (and women in tech) will become a theme that no one talks about because it will be so normal that it will not require any attention anymore.

  • The career path is rarely straightforward. More often it’s a zigzag street. Another subject discussed in the panel was the “traditional” career path. All presenters agreed that hardly anyone climbs “the career ladder” anymore. Rather, a successful career path goes up in zigzags. You get better at something, then you move to a new-ish territory, you struggle a bit, you learn it, and then you repeat the process. Olivia Schofield (founder of Spectacular Speaking and Vocal Women) added to the point, saying that one must have an agile mindset (“Staying ready, not getting ready”, as she put it) and be able to adapt and change one’s direction as the need arises. Luciana Broggi also admitted that when hiring, the most important things she considers in a potential candidate are a growth mindset and the aptitude to learn.

  • You don’t have to have a title to lead. Not surprisingly, another prevailing topic was leadership. All presenters repeated again and again that you don’t need a title to propose a change. Sometimes it is tempting to feel like we should stick to our official role and we hesitate to point out anything that is out of scope of our work. But the attendees of the EWIT conference were encouraged not to wait for a promotion in order to speak up, suggest change, or talk openly about their achievements. That way, they would better serve their team and company, even if “lead” or “manager” does not appear in their official job title (yet).

  • Be yourself, do your thing, not the right thing. A question that came up in many talks was this: “we work in a world dominated by men, does it mean we should start acting like men?” And the answer from all presenters was a clear no! Olivia Schofield, Mennat Mokhtar (IT Area Lead at ING), Kirsi Maansaari (Director, Product Management at Arm) and all the other presenters stressed the need to remain feminine, authentic, and true to ourselves. We (and by we I mean every one of us, regardless of gender) often see things from a different point of view; we have different communication styles and ways of approaching challenges, and embracing that, instead of trying to fit in, brings the most innovative ideas.

Generally, the conference had a very supportive and positive vibe. Its goal was to celebrate women in tech, however few they may still be, and to empower and inspire them. Of course, it’s not all victories and celebration. In my opinion, the most interesting and insightful discussions took place off-stage. The women I talked to told me things that were not addressed during the official presentations: the discrimination and sexism that do exist (even if they can be forgotten after two days of interacting with female leaders of the industry), the struggles of working full time while raising children and managing a home, the inability to move on in one’s career because of not being taken seriously, over-scrupulous code reviews from male colleagues, mansplaining and so on. The contrast between the stories of wins and of challenges from fellow women gave me a lot of inspiration and material for reflection. The not-so-happy accounts are just as important to share and, hopefully, in the future more raw testimonies will come from the presenters themselves.

It is obvious that most of the themes discussed above are as relevant to men as they are to women. Even though the conference was presented as a “women” event, anyone could have benefited from the presentations. I strongly support Luciana Broggi’s idea that we should focus less on the separation between male and female engineers, as this trait differentiates us much less than we sometimes think. In the end, all of us, men and women, should spend our energy on becoming better engineers, leaders, teammates, trainees and mentors. So my hope is that next year’s blog post will be titled “A Couple of Takeaways from the Open-Minded People in Tech Conference”.

Signing OkHttp requests with the AWS V4 signing algorithm


A lot of companies nowadays depend on services provided by Amazon, and Babbel is no exception. Calling these services over HTTP usually requires the requests to be signed. Although Amazon provides a vast number of libraries that handle this for you, sometimes you need to use your own. Maybe the level of customization you’re seeking is not possible with the provided libraries. Maybe you want to add a feature that uses other 3rd-party libraries that are incompatible with the ones provided by Amazon. Or maybe you simply want to avoid adding an entire library just to use a very small part of it. In this blog post, I’ll introduce an open source library we’ve built here at Babbel that signs OkHttp requests.


Some years back, we decided to start using Amazon API Gateway to serve our frontend apps. In the Android app, we started by using the Amazon Android SDK. At the time, the API Gateway SDK was quite new and some of the functionality we needed was missing. We needed to set our own user agent, but this was overridden by the SDK. We needed to intercept some requests and manipulate them, but the SDK design didn’t make this easy. Perhaps all of this is possible with the Amazon libraries by now (in fact, Babbel contributed a PR to prevent the User Agent from being overridden), but we have long since settled on another approach: quite early in our integration of the Amazon API Gateway, we dropped the API Gateway SDK and went with Retrofit.


However, not all dependencies on the Amazon SDK were removed. We kept a dependency on the core SDK so we could use the signer written by Amazon.

This was not the ideal scenario. If you know Retrofit, you know it depends on OkHttp. This means that every request will be an OkHttp request. By contrast, the Amazon signer uses requests from the Amazon SDK, which are different from the ones used in OkHttp. Here’s the current signer signature for reference (comments stripped for simplicity):

package com.amazonaws.auth;

import com.amazonaws.Request;

public interface Signer {
    public void sign(Request<?> request, AWSCredentials credentials);
}

This led to an in-between stage where we copied the OkHttp request into an Amazon request so we could pass it to the Amazon signer. Granted, this glue code was not the most efficient, but it worked and spared us from maintaining the signing code ourselves.

Unfortunately, when we recently tried to add support for Android Pie, we discovered that the signer is incompatible with this Android version. The code tried to get a logger through the Apache Commons Logging facility, which is incompatible with Android Pie. At the time of writing this post, Amazon has fixed the issue, as you can see here. However, while we were adding Android Pie support to our app, no fix was available yet.

We were faced with 2 choices:

  1. Fix it ourselves and submit a PR to Amazon for feedback and possible merging
  2. Implement the signing ourselves completely

Since we had already acknowledged that the glue code copying the OkHttp request into an Amazon request wasn’t ideal, and taking into account that the signing algorithm is well known, we decided to build our own library - okhttp-aws-signer. Moreover, given that Amazon provides tests (along with a lot of other resources), implementing the signing algorithm becomes quite simple.

How does it work?

The idea is to create a signer object with the region of your service and the name of the service you want to call.

val signer = OkHttpAwsV4Signer("eu-west-1", "execute-api")

From there one can use the signer to sign the OkHttp requests.

signer.sign(request, accessKeyId, secretAccessKey)

It’s important to note that this particular implementation of the algorithm requires the host and x-amz-date headers to be present in the request, so it is best to guarantee they are set. Moreover, the date must be formatted with the pattern yyyyMMdd'T'HHmmss'Z'.

val newRequest = request.newBuilder()
    .addHeader("host", request.url().host())
    .addHeader("x-amz-date", SimpleDateFormat("yyyyMMdd'T'HHmmss'Z'", Locale.US).format(Date()))
    .build()

signer.sign(newRequest, accessKeyId, secretAccessKey)

How can we integrate this with Retrofit?

In the end, having the signer is already a great step towards removing the only dependency we had on Amazon. However, we still need to integrate it with Retrofit. We chose to build an interceptor that signs all requests. Here’s how it looks:

class AwsSigningInterceptor(private val signer: OkHttpAwsV4Signer) : Interceptor {

    private val dateFormat: ThreadLocal<SimpleDateFormat>

    init {
        dateFormat = object : ThreadLocal<SimpleDateFormat>() {
            override fun initialValue(): SimpleDateFormat {
                val localFormat = SimpleDateFormat("yyyyMMdd'T'HHmmss'Z'", Locale.US)
                localFormat.timeZone = TimeZone.getTimeZone("UTC")
                return localFormat
            }
        }
    }

    override fun intercept(chain: Interceptor.Chain): Response = chain.run {
        val request = request()
        val newRequest = request.newBuilder()
            .addHeader("x-amz-date", dateFormat.get().format(Date()))
            .addHeader("host", request.url().host())
            .build()
        val signed = signer.sign(newRequest, "<accessKeyId>", "<secretAccessKey>")
        proceed(signed)
    }
}

The interceptor above simply creates a new request with the needed headers and signs it as described in the section above. Because it can be called from multiple threads, we secure the SimpleDateFormat with a ThreadLocal object. This is a well-known issue with SimpleDateFormat in Java that you can read about here.

After the request is signed, we can simply proceed with it through the chain of interceptors. It’s worth noting that here we are hard-coding the credentials for the signing; however, these can be retrieved dynamically. If a request doesn’t need signing, we can pass null for these credentials and the signer will return the request unchanged.

When initializing the Retrofit instance, we can set it up with this interceptor:

val signingInterceptor = AwsSigningInterceptor(OkHttpAwsV4Signer("<region>", "<service>"))

val retrofit = Retrofit.Builder()
    .client(
        OkHttpClient.Builder()
            .addInterceptor(signingInterceptor)
            .build()
    )
    // ...
    .build()

Limitations

As you might have already noticed, the signer is created for a specific region and service. Right now, the only way to sign requests going to different services or regions would be to have multiple signer instances. This can be fixed by making the sign method accept both the region and the service when signing requests.
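
For illustration, a more flexible interface could look roughly like the following - a hypothetical sketch, not the library’s current API:

import okhttp3.Request

// Hypothetical sketch of a more flexible signer: region and service are
// supplied per call instead of being fixed at construction time.
// This is NOT the current API of okhttp-aws-signer.
interface PerRequestSigner {
    fun sign(
        request: Request,
        accessKeyId: String?,
        secretAccessKey: String?,
        region: String,
        service: String
    ): Request
}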

The signer only supports version 4 of the algorithm and requires certain information to be present in the headers. Also, it always uses the same hashing algorithm and isn’t flexible enough to change it on demand. This is not a limitation of the signing algorithm specified by Amazon; it’s a limitation of our code, as the hashing methods are hardcoded in the source. Eventually, they could be made configurable and the signing process would still work.

Summary

This post introduced a library for signing OkHttp requests with the Amazon signing algorithm. We felt the need to implement this ourselves so we wouldn’t depend on the entire Amazon SDK when we only needed the signing part.

Even though the library has some limitations, it is being used in our main Android application and has proven to suit our needs. By making it open source, we hope to help others and to improve the library by removing the mentioned limitations as needed.

Every contribution is welcome.

Babbel Neos: the story of eight new junior engineers at Babbel


Babbel Neos was born in May 2018: a Berlin-based, salaried engineering training program for applicants from unconventional, non-computer-science backgrounds. Our target? The aspiring developer who might otherwise find it difficult to get a job. Through the program we cultivated a mentoring culture and sought to increase diversity within our engineering teams. Six months later, at the program’s conclusion, we hired all eight trainees as junior software engineers at Babbel.



Finding our way

There’s still a monoculture in the tech space of privileged young men with computer science backgrounds. It’s very difficult for historically marginalized groups and career-changers to secure a job in this field. Companies often focus on hiring experienced engineers, while early-career developers are overlooked. More and more companies have been getting a grasp on the value of diversity, yet they hesitate to take bold steps forward. There are solutions and examples to follow, though.

At Babbel, a core part of our identity is that we are a learning company. We strongly believe that diversity makes us stronger. These values are essential to our culture. They served both as inspiration and guide rails when we designed the Babbel Neos program.

We set three goals for ourselves to accomplish:

  • Attract and cultivate junior talent
  • Increase diversity within the engineering team
  • Support engineers in strengthening their mentoring skills

Hiring engineers is quite difficult in Berlin. There is a market trend of mainly hiring senior engineers. At the same time, juniors foster a learning environment. They create opportunities for senior engineers to improve their mentoring skills. In a balanced organization, creating development opportunities for engineers is as important as attracting new talent.

Diversity isn’t merely about gender diversity. Instead of promoting the same voices, we were keen to look for people with a diverse set of cultural and professional backgrounds.

“And just because this is someone’s first development job, doesn’t mean that they’re ‘junior’. Career switchers bring so many transferable skills to our industry, and often it’s the communication or soft skills that can be most difficult to learn on the job.” Mercedes Bernard: Empowering Early Career Devs

Our values, the harvest

The spark for Babbel Neos didn’t come straight from the engineering team. It was ignited within the People & Culture circle, a committee formed by employees working in HR, Product, Engineering, and other areas throughout Babbel. It was this collaboration that not only kickstarted the project, but also served as an example of what we wanted it to be: a union of different perspectives brought together to trigger fresh ideas. We were also inspired by two other companies that had similar initiatives — Prezi Jump and SoundCloud’s DeveloperBridge. We’re grateful for their openness in sharing their lessons learned with us.

We also reached out to coding bootcamps and tech communities in Berlin to establish a hiring funnel that represented the audience we wanted to attract. The interest we received in less than a month was amazing: almost 200 applications, and there was a steady stream of positive feedback about offering such a training program. This was our confirmation — there’s a strong need for this. The program even got nominated for the HR Excellence Awards, Germany.

It’s a learning journey for all

Meanwhile, we recruited mentors internally, too. Hiring more juniors without providing sufficient mentorship is a recipe for failure.

There’s a misconception that software engineering is solely about writing code, and therefore growing as an engineer only happens through getting more technically skilled. Engineers working together and sharing their learning processes are able to progress more quickly. When onboarding new members, mentoring juniors is crucial for a team’s resilience. And in the midst of this, becoming a mentor is a chance to revisit concepts that you might have taken for granted and to cultivate the next generation of senior engineers. Mentoring is about building a relationship. It cannot be merely taught; it requires practice and mutual support.

Throughout the six months of the training program, each trainee had the training lead and an additional, dedicated mentor available to them. The mentors themselves also met as a group, coming together on a regular basis to exchange knowledge and best practices. Senior engineers also offered workshops to share their technical expertise with the trainees. The first three months of the program were a learning period: the trainees studied online courses, did pair programming, and worked on project assignments. During the second phase, they picked teams and joined them to learn more about each team’s tech stack and practices. This second phase also created an opportunity for the teams to improve their onboarding processes. Throughout the whole engineering department, our training program created a learning momentum that became evident, day by day, in each team.

Building a product for the world

“Not everything is a tech problem.” —Thomas Holl, Babbel CTO

Babbel’s mission is to help people make connections by learning languages. To build a product for the world, we need the world to create it. The lack of technical skills is not the main bottleneck anymore; technical skills can be acquired comparatively easily. But a product designed from a narrow set of social and professional experiences will fall short of the needs of a diverse customer base. We need people who understand and empathize with those needs and can therefore design and develop a language learning tool that helps people speak a new language with confidence.

And in order to get there, we are looking forward to organizing the next generation of Babbel Neos in 2019! Are you interested in leading the training program? Join us for this journey!


Testing global event listeners within a React component


Although many developers working with React consider it an anti-pattern, occasionally there is the need to listen and react to an event globally within a React component. Like every good developer out there, you probably want to provide a unit test for this functionality. However, while implementing the test, you might run into trouble… This article tries to help you on your way through.


Listening

The React component

In my particular case, I have a React component which renders an entire page of my app. I want to listen for the user hitting the Enter key and do something after this event happens.

Therefore, on componentDidMount I register the event listener to the global window object and on componentWillUnmount I unregister it. You can see part of the code of my component here:

componentDidMount() {
  window.addEventListener('keyup', this.handleKeyUp);
}

componentWillUnmount() {
  window.removeEventListener('keyup', this.handleKeyUp);
}

handleKeyUp(event) {
  if (event.key === 'Enter') {
    this.handleEnterKey();
  }
}

It works perfectly as expected! When the user hits the Enter key, the component listens to it and executes some operations.

The Test

Now, I want to provide a unit test for this functionality.

My testing stack includes Enzyme. Hence, I think of using the simulate() feature to mimic the key press. However, the event handler I registered previously in my component is not being triggered.

After some research on the Web, the community comes to my rescue and I find the solution! I’ll try to summarise and explain it here.

The Problem & the Solution

Let’s start by making really clear what the goal is that you want to achieve. I’ll do this by quoting from the linked GitHub issue above:

Your goal here is effectively to make sure the event is bound, and that when it’s fired something happens in your component.

Let’s also quote a statement from one of Enzyme’s maintainers. It highlights a really important aspect of the testing utility that will help us understand the nature of the issue.

Enzyme is meant to test React components and attaching an event listener to the document with addEventListener means the event is not being handled by React’s synthetic event system. Our simulate method for mount is a thin wrapper around ReactTestUtils.Simulate, which only deals with React’s synthetic event system.

Now you see why, in this case, using Enzyme’s simulate() method has no effect at the React component level.

To have any effect, keeping in mind the goal of our test, what you need to simulate is the mechanism of window.addEventListener. That is, you need to create a binding between an event name and a callback function.

const map = {};
window.addEventListener = jest.fn((event, cb) => {
  map[event] = cb;
});

Now, when you mount the React component and componentDidMount is executed, the binding is created through the map object defined above.

const component = mount(<MyComponent {...props} />);

At this point, instead of using Enzyme’s simulate() method to fire the event, you can simulate the typing of the Enter key by executing the following line of code:

// simulate event
map.keyup({ key: 'Enter' })

Now, you can assert that when this event is triggered, the callback bound to this event through window.addEventListener is executed:

expect(component.handleEnterKey).toHaveBeenCalled()

and voilà!

Test is green!
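
Putting it all together, a minimal sketch of the complete test could look like the following. It assumes Jest and Enzyme and a hypothetical MyComponent, and it spies on handleEnterKey before mounting so the assertion can observe the call:

import React from 'react';
import { mount } from 'enzyme';
import MyComponent from './MyComponent'; // hypothetical component under test

it('reacts to the Enter key pressed anywhere on the window', () => {
  // Capture the event-name -> callback bindings made via window.addEventListener
  const map = {};
  window.addEventListener = jest.fn((event, cb) => {
    map[event] = cb;
  });

  // Spy on the handler before mounting, so the handler registered in
  // componentDidMount resolves to the spied-upon prototype method
  const spy = jest.spyOn(MyComponent.prototype, 'handleEnterKey');

  mount(<MyComponent />);

  // Simulate the global keyup event
  map.keyup({ key: 'Enter' });

  expect(spy).toHaveBeenCalled();
});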

I hope this small article will be helpful for anyone who ends up in the same situation, and that it helps you understand the why and how of the adopted approach!

Thanks to timoxley for raising the issue, to awaery for following up, and to blainekasten for providing a great solution!

The significance of giving back to open-source


Companies should encourage their engineers to contribute back to open-source and allocate time for it. This not only improves the products they contribute to, but also enhances the engineers’ sense of self-worth, which in turn has a big positive impact on their well-being and on the company itself.

“In learning you will teach, and in teaching you will learn.” Phil Collins


You might be wondering what teaching has to do with contributing to an open-source project. In my opinion, it is just a small part of one of the many learning processes that every company should embrace, because there is no better way to improve at and learn more about a certain topic than to try to pass the knowledge on to eager learners.

Almost every blog post features a fine example. For this post, I will explain the process of adding a small yet valuable feature to HashiCorp’s terraform-provider-aws plugin.

But first, let me do a small intro.

Having a multilingual and multinational product with many users on various platforms and a multitude of browsers makes matters very complex, and sometimes the problems our customers experience can be solved simply by clearing their web browser’s cookies. We all know that doing so can be a tedious job, both for our colleagues in Customer Service and for our clients, especially given that there are many web browsers out there and not all of our customers are tech-savvy.

For the reasons stated above, we decided to implement a serverless microservice for clearing cookies, clear-cookies.babbel.com (Babbel users be warned: pressing the button will clear out your Babbel cookies!), running on Lambda@Edge. With just one press of a button, all the cookies related to our product on the customer’s machine will be deleted. After reading the cookies from the client, a POST call, which is CSRF-protected, will effectively clear all the cookies that have been generated by us. As some of you might already know, there is no such thing as deleting a cookie. Instead, the cookies that should be deleted are emptied of all values and their expiry date is set into the past, which effectively “removes” the cookie.
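
As an illustration, here is a minimal sketch of how such an expiring Set-Cookie header could be emitted from a Lambda@Edge viewer-response handler - the cookie and domain names are hypothetical, not our actual implementation:

'use strict';

exports.handler = (event, context, callback) => {
  const response = event.Records[0].cf.response;

  // Overwrite the cookie with an empty value and an expiry date in the
  // past, which effectively "removes" it from the browser.
  response.headers['set-cookie'] = [{
    key: 'Set-Cookie',
    value: 'session=; Expires=Thu, 01 Jan 1970 00:00:00 GMT; Path=/; Domain=.example.com',
  }];

  callback(null, response);
};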

missing_feature_in_terraform_1.png

Now let us describe the process that led to the pull request against terraform-provider-aws and how it was done.

The complexity of our systems is rather enormous. To help with the workload that comes with our platform’s intricacy, we tend to rely on the serverless application services provided by AWS, all under the umbrella of Terraform as our preferred solution for Infrastructure as Code.


At Babbel we heavily utilize many open-source technologies, which by itself is a topic out of scope for this post. For clarity, we will mainly focus on HashiCorp’s terraform-provider-aws. Lately, many new features have been added to AWS which are not immediately present in the infrastructure configuration tools. Under normal circumstances those features are added exceptionally fast to said tools (another fine example of the power of the community), but some are either unpopular, fall under the radar or are indistinct; consequently, they do not get added in a timely manner, if at all. One of those was the include_body option for Lambda@Edge, which exposes the body of the request/response to the Lambda function. After seeing that it was missing and greatly needed, we decided that it was time to roll up our sleeves and get down to business.
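
For context, once the option exists in the provider, enabling it is a single attribute on the Lambda association inside the CloudFront distribution; a minimal sketch with hypothetical resource names:

resource "aws_cloudfront_distribution" "example" {
  # ... origins, viewer certificate, etc. omitted ...

  default_cache_behavior {
    # ... other cache behavior settings omitted ...

    lambda_function_association {
      event_type   = "viewer-request"
      lambda_arn   = "${aws_lambda_function.example.qualified_arn}"
      include_body = true
    }
  }
}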

missing_feature_in_terraform

What we usually do in cases like this at Babbel is apply the method of “Eat Your Own Dog Food”, meaning that we maintain our own fork of terraform-provider-aws containing the features that are either missing or will never be added but are required by us. We build it, test it and use it (in production). When we are satisfied with the quality, we create a pull request to upstream on GitHub and wait for approval. Once the feature is approved and merged, we update our fork (again) so that we do not stray too far apart, for simplicity’s and maintainability’s sake.

For an appropriate and complete pull request against an official Terraform plugin, the maintainer, HashiCorp, uses a software testing method called acceptance testing, in which the proposed changes are tested with apply, refresh, and destroy cycles. For this particular case we had to manually increase the timeout of the acceptance tests, because the Terraform cycles were adding and removing CloudFront resources, which at times can be a little slower than usual. Needless to say, the tests took two and a half hours to complete.

make testacc TESTARGS='-run=TestAccAWSCloudFrontDistribution'
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./... -v -run=TestAccAWSCloudFrontDistribution -timeout 1000m
?   	github.com/terraform-providers/terraform-provider-aws	[no test files]
=== RUN   TestAccAWSCloudFrontDistribution_importBasic
--- PASS: TestAccAWSCloudFrontDistribution_importBasic (828.70s)
=== RUN   TestAccAWSCloudFrontDistribution_S3Origin
--- PASS: TestAccAWSCloudFrontDistribution_S3Origin (805.78s)
=== RUN   TestAccAWSCloudFrontDistribution_S3OriginWithTags
--- PASS: TestAccAWSCloudFrontDistribution_S3OriginWithTags (1028.92s)
=== RUN   TestAccAWSCloudFrontDistribution_customOrigin
--- PASS: TestAccAWSCloudFrontDistribution_customOrigin (830.72s)
=== RUN   TestAccAWSCloudFrontDistribution_multiOrigin
--- PASS: TestAccAWSCloudFrontDistribution_multiOrigin (1021.48s)
=== RUN   TestAccAWSCloudFrontDistribution_orderedCacheBehavior
--- PASS: TestAccAWSCloudFrontDistribution_orderedCacheBehavior (964.56s)
=== RUN   TestAccAWSCloudFrontDistribution_Origin_EmptyDomainName
--- PASS: TestAccAWSCloudFrontDistribution_Origin_EmptyDomainName (2.29s)
=== RUN   TestAccAWSCloudFrontDistribution_Origin_EmptyOriginID
--- PASS: TestAccAWSCloudFrontDistribution_Origin_EmptyOriginID (2.09s)
=== RUN   TestAccAWSCloudFrontDistribution_noOptionalItemsConfig
--- PASS: TestAccAWSCloudFrontDistribution_noOptionalItemsConfig (832.29s)
=== RUN   TestAccAWSCloudFrontDistribution_HTTP11Config
--- PASS: TestAccAWSCloudFrontDistribution_HTTP11Config (748.69s)
=== RUN   TestAccAWSCloudFrontDistribution_IsIPV6EnabledConfig
--- PASS: TestAccAWSCloudFrontDistribution_IsIPV6EnabledConfig (804.62s)
=== RUN   TestAccAWSCloudFrontDistribution_noCustomErrorResponseConfig
--- PASS: TestAccAWSCloudFrontDistribution_noCustomErrorResponseConfig (919.45s)
PASS
ok  	github.com/terraform-providers/terraform-provider-aws/aws	8789.627s
...

The pull request itself was trivial, so I don’t think it deserves special mention.

So, to wrap it up, I consider this a “win-win” situation because:

  • We get to use the latest features of AWS (or any other IaaS that we might be using for that matter) without any waiting.
  • We can give a feature that has been tested on production back to the community.
  • It encourages learning and promotes well-being among our engineers.

If you enjoyed this blog post, I’d be very grateful if you’d share it, because sharing is caring! Do not forget to leave your thoughts and ideas below!

Starting a career as a Software Engineer


One year ago, I changed my career from Recruiter to Junior Software Engineer. The path was long and challenging, but it was worth it! I am happy to share my experience by answering the most popular questions I get.


Why I decided to change my career

Career change

There is no short answer, just a long one :) I had been working as a Tech Recruiter since 2012. I loved what I was doing because I was helping people find the workplace that suited them best, and I was working in the tech field, which had always attracted me. It was impressive to me that a person could create a whole world with just a brain, a laptop and the Internet at their disposal. Wow!

“Risk something or forever sit with your dreams.” —Herb Brooks

After some time as an HR professional, I was eagerly searching for how to grow in the field but nothing quite sparked my curiosity. I decided to learn what I’d always wanted to learn but had never quite been brave enough to start - engineering.

How I started

This is the hardest part. In the beginning, I was afraid of even opening a book and starting to read about development because I was not sure that I was smart enough. I felt that it was too late to make such a drastic career change… In reality, all obstacles are just in our heads. We can do and change everything!

There were two options: join a bootcamp or learn by myself and find an internship afterward. I chose the second option because I was working full-time and couldn’t take a break for three months to join a bootcamp. (Theoretically, I could have, but I wanted to be really sure I liked software development before committing.)

I set a deadline and gave myself one year to learn how to code in order to find my first internship. Initially, I was more drawn to frontend development and knew that it was fast-growing in the industry, so it felt like a good place to start.

I explored a lot of different resources about how to start and what to learn but the most helpful was a talk with my friend who is an experienced Engineer. He helped me create a learning plan with quarterly objectives. Having a detailed plan was crucial for my learning success.

The resources I used

It was a lot of trial and error but I typically followed free written tutorials and YouTube channels. When I started a new topic, I would initially watch a few videos in order to understand the topic in general and afterward I would read tutorials or articles in order to clarify all the technical details. When I found an explanation difficult, I would simply move on to the next resource.

Once I had some understanding of HTML and stylesheets, I started building small components like buttons before moving on to bigger, one-page components. Afterward, I started using some JavaScript in those components.

Also, weekly pairing sessions with my friend had a crucial impact on my progress because I could ask all my questions in person.

How I joined Babbel’s Engineering Team

Babbel Team

After one year of learning how to code in my spare time, I resigned from my position as a Tech Recruiter at Babbel in order to start searching for an internship in Engineering. By chance, my manager in Recruitment advised me to explore opportunities inside Babbel even though there were no open positions for Interns or Juniors at the time. I met with one of the Engineering Managers, Gaetano Contaldi, and after a discussion ending with a big surprise on his face, he agreed to take a look at what I knew and how I was progressing.

Gaetano and I set some learning goals for me. After observing my progress for a few months, he agreed to have me join his team. I was very excited about this! I started on a part-time basis initially. After three months, I signed on as a full-time Junior Software Engineer.

I am super happy that the management at Babbel was open to such a change and believed in me.

How expectations are different from reality

The reality was totally different from my expectations. Not in a good or bad way. It was just different.

During meetings, despite team members talking in English, all the tech jargon felt like another language to me! It took a lot of time and effort from the team, my manager and myself for us to be able to speak the same language.

Development is not just about writing code (Gaetano always says this and now I understand him). You need to be able to understand how the project works, the architecture, the logic between components/modules, how external services work etc. When the team has a story to deliver, they begin by discussing their approach, agreeing upon architectural decisions and carrying out a further investigation if needed before finalizing how to proceed.

It was not obvious to me that such work needed to be carried out, because when you are learning how to code you typically focus on smaller exercises where the task is already well-defined.

How the team helps

Babbel Team

They are patient. That’s already a big thing :)

My team members (developers, manager, product managers) are amazing! They are always happy to clarify topics. First, they do it in a simple way with drawings and examples from real life and then afterward with the technical details. It helps a lot!

As I mentioned above, pairing sessions also work well. As well as helping me learn about the structure of our projects, new concepts and how best to approach tasks, pairing also helped me realize that even experienced engineers do not always know how to solve a problem at first glance.

One more important thing that my team members realize is that it’s difficult to always understand and learn everything fast and that sometimes the brain needs a break :)

A word on working with Juniors

These are the most important points:

  • Invest time and explain the problem/task and why it should be solved in a specific way. It’s much better than just explaining how to solve the problem, because the Junior can then take this knowledge and apply it to the next situation which might be similar.
  • Give hints and not the solution right away.
  • Give advice on the direction of learning and discuss the learnings.

“Find a group of people who challenge and inspire you, spend a lot of time with them, and it will change your life.” —Amy Poehler

I am excited to be a part of Babbel’s Engineering Team. I am inspired by passionate developers who learn, share knowledge, accept diversity and build a great product!

Resources that are useful for me

Tutorials

W3Schools Online Web Tutorials

The Modern Javascript Tutorial

Eloquent JavaScript

YouTube channels

Fun Fun Function

The Net Ninja

Techsith Tutorials

Terraform provider for Code Climate is open-sourced


Here is a short release notice. At Babbel, we’ve been using Code Climate successfully for a while, but we were unhappy about the lack of a Terraform provider for it.

Finally, we’ve decided to build our own provider and now it’s open-sourced.


Why Code Climate?

The answer will be short: We care about the quality of the code we’re producing, because a clean code base is easier to maintain and evolve. This, in turn, means we can ship features faster which directly impacts our users. Code Climate helps us to automatically detect code smells, missing test coverage, and so on, right during the build process on our CI stage.

How we lived before

For us, configuring Code Climate is a pretty common task. We already have a few hundred repositories, and we add new ones regularly. For most, if not all, of them we configure Code Climate checks. Unfortunately, this process was still not automated.

If you have adopted Infrastructure as Code, you must be familiar with the following feeling: most of your configuration for infrastructure and 3rd-party services is codified, but you end up with edge cases that either cannot be codified or that you just didn’t have time for. Dealing with those remaining services makes you feel insecure, as manual configuration is more error-prone than Infrastructure as Code: it is missing common best practices like pull request reviews and versioning for every change. So you start thinking about how to improve the situation.

If there is a Terraform provider for this, you have no doubts and can simply go and use it. However, in a case like ours, you have a choice, and the low-hanging fruit here is a simple bash script. It can be used because the external data source type in Terraform allows you to use any executable, e.g. a bash script, to bring some data into the Terraform state. Getting the data, without being able to complete the whole cycle of REST actions, was good enough for us at the moment, so we wrote a script for ourselves and integrated it with the existing Terraform code.

The bash script:

#!/bin/sh
set -e

curl \
  -H "Accept: application/vnd.api+json" \
  -H "Authorization: Token token=$1" \
  https://api.codeclimate.com/v1/repos?github_slug=$2 |
  jq '.data | .[0] | .attributes | .test_reporter_id | { "test_reporter_id": . }'

The way to integrate it:

data "external""codeclimate-babbel-gem" {
  program = [
    "bin/cc_test_reporter_id.sh",
    "link-to-a-secret-from-aws_secretsmanager_secret_version",
    "${github_repository.babbel-gem.full_name}",
  ]
}

It gets a repository from Code Climate and extracts the test_reporter_id from it. Bash scripts are great for many purposes; however, in this case, we would rather have a more “native” solution.

The current setup

So, this is what we’ve done:

Start of the dialog with Code Climate in Twitter

We’ve created a Terraform plugin for Code Climate. It was not our first provider; you can also take a look at the one we built for Rollbar.

Dialog with Code Climate in Twitter

So far, we have implemented just the data source (i.e. reading from Code Climate), but not a fully functioning resource. You can find the reasons in the Limitations section below.

Currently, the definition of a Code Climate repository looks pretty much like the usual Terraform way:

data "codeclimate_repository""babbel-gem" {
  repository_slug = "${github_repository.babbel-gem.full_name}"
}

And this is better because we don’t need to maintain dependencies outside of Terraform, like jq for parsing the Code Climate response. If you have worked with Terraform and its HashiCorp Configuration Language (HCL) before, those three lines of code should feel familiar.
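
The extracted ID can then be referenced like any other Terraform attribute - for example, exposed as an output (assuming the data source exports test_reporter_id, just as our bash script did):

output "babbel-gem-test-reporter-id" {
  value = "${data.codeclimate_repository.babbel-gem.test_reporter_id}"
}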

Limitations

However, we would like to not just describe the existing resources, but also to create new ones using Terraform, as you’d always do. Having everything described in code is great, but being able to manage these resources in code is even better. So we tried to add this functionality to the provider.

Unfortunately, it turned out that the Code Climate API is missing some features that are crucial if we want to automate our use case:

  1. The repositories#create endpoint creates side effects on the GitHub side (a webhook and an SSH deploy key) which, as a result, won’t be managed by Terraform. In case of a deletion of the repository on the Code Climate side, we would still have to manually delete the GitHub part, which is pretty error-prone.
  2. The Code Climate API doesn’t provide a way to manage users’ permissions. This means that after creating a repository in Code Climate using Terraform, we would still have to manually grant the Engineering team access to the respective repository - again a manual step.

These two facts make further development of the provider not that appealing for us. Yet, once these issues are addressed, we might consider resuming development.

Conclusion

We would still be glad if Code Climate took over the development, because we’re affiliated neither with them nor with Terraform. However, it’s a great feeling when you’ve built something that you can share. Also, it was important for us to get rid of yet another bash script with a dependency outside of Terraform (like jq), and that goal is achieved. If a future provider is based on our foundation, we’re happy to help with getting on board with the architecture, though the provider is pretty small anyway.

If you feel that you want to contribute - you know, it’s open, so feel free!

Thank you for reading.

How we established monitoring for our product health


In the past, we introduced monitoring and alerting that tracks the technical status of our platform and services. However, we sometimes experience issues that this monitoring cannot detect. These could be team-independent or cross-team issues, or even issues that do not have a technical cause. For example, two alerts in different teams might not appear urgent on their own, but combined they make an impact. To close this gap, we have started working on Product Health monitoring.


Even more interesting: Have our latest product improvements succeeded? Are they as well accepted by our users as we believed they would be when we started implementing them? An example could be the latest improvement to the “Review Manager”, a single page application. Is it working in all the different browser versions and is it improving the learning experience of our users?

That’s why we started working on Product Health Monitoring, with which we want to focus on the non-technical aspects of our products. Initially, we set up some simple use cases like “Number of Review Sessions”. We didn’t want to just display numbers; we were aiming for easy-to-digest, dynamic graphs - nothing breeds more disinterest in the teams than graphs that move slowly or not at all. They were to be shown on huge screens and attract passers-by, who would also ask questions if something looked weird to them - such as a dropping curve or huge spikes. The time-based windows are defined differently per team: for some teams it is sufficient to look at the graph once a week, others want to observe it daily or even more often. All of the graphs are drawn in almost real time and allow us to visually detect anomalies or invariances within minutes. This enables us as Technical Product Managers to get back to the teams, correlate specific events with releases, and evaluate observed anomalies. Are they intended? Are they just short-term (as with data migrations)? Should a feature be rolled back to allow for a thorough analysis of the data? In some cases, we also discovered incidents or outages of third-party systems even before they notified us (if at all).

Let’s get back to our example of “Number of Review Sessions”: in the graph shown below it all looks good; the current numbers of the Review Manager (blue) are above last week’s numbers (gray). The delta (green) also indicates higher usage. So the most recent feature improvement and release seems to work fine and is also accepted by our users.

Product Health Monitoring example graph with comparison

From a technical point of view, we have decided to use Kibana and mainly its visualisation feature “Timelion”. This feature allows us to not only draw a visually appealing graph based on Elasticsearch queries, but also to combine different queries in one graph! So we can make use of various data sets in one graph and compare them. In some cases it is as simple as comparing the current data with the same data from a week ago. This allows us to detect invariances where maybe something is broken, because - depending on the time of day and seasonality, of course - we assume that last week’s users were as active as this week’s. For example, we would expect users to review their vocabulary this week as often as they did last week.

We came up with a number of visualisation conventions that make it easy not just for us but for anyone to read the graphs. Even stakeholders who are not involved in the nitty-gritty details can look at the graphs and more or less immediately understand them without reading documentation or labels.

The code snippet below shows an example which we use for comparing the data from today and a week ago. You may ask why we calculate a moving average (mvavg) for the week-old data. That’s because for the baseline it is not necessary to have outliers (up or down) visible - we’re just interested in the baseline. Additionally, it makes the graphs easier to read.

.es('<your kibana query>').color("#1a75ff").lines(width=2).label("This week"),
.es('<your kibana query>', offset=-1w).lines(fill=0.5,width=0.5).color(gray).mvavg(window=10).label("Last Week"),
.es('<your kibana query>').subtract(.es('<your kibana query>', offset=-1w)).bars(width=1).color(green).mvavg(window=10).label("Delta")

As the next step, we are also going to enable alerting, to not only detect possible issues visually but also inform the teams or stakeholders immediately. At the time we set this up, there was no alerting on AWS Kibana - but now there is 😉 As written above, we plan to include these alerts in our established alerting process using PagerDuty instead of following a new process.

Femgineering at Babbel: Of the Women, For the Women and By the Women

Change that only comes from the top can ultimately only deliver so much. What happens when women in tech drive the process, advocating for themselves? What results does that yield? Babbel is a window into that approach.

Case in point: the Femgineering Community of Practice (CoP) at Babbel. Its mission, as a women-centric community, is to support and enhance the role and reach of women in tech at Babbel. In 2017, a few women in the Engineering department put their heads together to deal with situations they had encountered and to do something constructive about them. Little did they imagine that this first set of meetings would evolve into its present-day form. Thus the Femgineering CoP was born.



Members of the Femgineering CoP

One of the first tangible results from these meetings was the initial impetus for the Team Agreement (seen below). It offers guiding principles for the Engineering department on how we work and interact with each other, affecting a considerably broader audience than just the founding members.


Engineering team agreement

Team Agreement

Key areas of focus

Over time, the Femgineering CoP evolved into its current incarnation, with the aim of engaging in open dialogue, sharing ideas and innovations and providing a supportive community for women in tech at Babbel.

Meeting once a month over healthy lunches, we have members from across the entire Engineering department with almost every team represented in the group, including sixteen different nationalities. In the mix are Directors, Engineering Managers (EM), Team Coordinators and members from the development and testing domains.

After identifying potential opportunities for improvement, as a group, we decided to focus our efforts on the following key areas:

  • Skill-sharing and Collaboration
  • Visibility, Awareness and Support
  • Hiring and Promoting female engineers

How do we know what we would like to accomplish and whether we are progressing in the right direction? Each pillar has concrete goals that we strive towards and regularly take stock of within the CoP. A core team focuses on each of these goals.

The Femgineering initiatives, a few of which we dive into, also directly contribute towards one of Babbel’s Engineering Strategy missions, being a ‘Great Place To Work’.

In synergy we trust

Babbel is a learning company inside and out. This is reflected in the Femgineering forum as well. Learning from each other by sharing skills, knowledge and experiences is a way we demonstrate our commitment to each other’s growth. Toward that end, a part of our regular meet-ups is focused on knowledge-sharing. Topics that have garnered interest so far range from mastering the art of ‘learning how to learn’ - a critical skill in the ever-changing landscape of the tech industry, to an introduction to natural language processing, machine learning and Amazon Web Services (AWS).

With a view to enhancing presentation skills, the Visibility, Awareness and Support pillar of the CoP recently organized a two-part Effective Storytelling workshop for the Femgineering group. In addition to receiving one-on-one feedback on how to deliver more engaging presentations, we also learned about the importance of narrative hooks and connecting emotionally with our audience. Being able to access dedicated training on such topics is one more way Femgineering is attempting to redress gender dynamics in the workplace. Other initiatives under this pillar aim to raise the visibility of female engineers outside of the company, be it through public speaking engagements, contributing to our tech blog or representing Babbel at conferences such as Tech Open Air, the AWS Summit or the European Women in Tech Conference.



Femgineering members representing Babbel at the AWS Summit in Berlin

Sowing the seeds for a more diverse workplace

In the last quarter, members of the hiring pillar have also been working closely with representatives from the Human Resources department (recently renamed to People & Operations) to formalize a number of initiatives designed to reduce the likelihood of hidden bias occurring during the recruitment process.

One such change relates to reviewing the type of language used in job descriptions. Although we’ve never been looking for ‘jedi-rockstar-ninja coders’ who ‘work hard and play harder’, as a company whose very raison d’être is to facilitate the acquisition of language skills, we understand that the language one chooses to employ is often unknowingly a reflection of one’s values, attitudes and beliefs.

Research shows that the inclusion of certain, typically masculine-coded words, such as ‘fearless’, ‘hungry’ or ‘competitive’ in job descriptions can significantly reduce the number of female applicants. As a result, in May 2019, we began using a gender bias decoder tool to identify and modify words that are more likely to resonate with male candidates. Our open job descriptions now employ a more gender-neutral tone and emphasize the value we place upon fostering an open, collaborative and inclusive environment. Initial results following this change have been encouraging with 40% of the applicants for one of our Senior Engineer positions being female, representing a marked increase versus the previous month.

Femgineering is also changing the way in which we plan for interviews at Babbel. We strive for interview panels to contain at least one female panelist in a bid to both normalize the presence and expertise of our female engineers and reaffirm the importance we place upon building diverse teams. In parallel, to make sure that our recruitment process doesn’t uphold systemic discrimination, all engineers are invited to attend in-house interview training, where topics such as identifying and mitigating against the types of biases that we may inadvertently hold, are addressed.

Reaping the rewards

As a result of some of these initiatives, we have already seen an upward trend in the number of women in senior engineering positions and engineering leadership roles at Babbel in the past year. Two of our three Engineering Directors are women! This is especially encouraging as having women in leadership positions is key to maintaining momentum and enabling similar changes to percolate from the top of the pyramid as well.

Supporting the Femgineering initiatives at an organizational level, we also have a series of benefits in place to encourage a healthy work-life balance. For example, many employees regularly work from home, and we offer part-time roles to better support those with other commitments, alongside partnerships with local co-working spaces offering childcare until our own in-house childcare facility is ready. As employees, we are trusted to manage our time as we see fit and, all things considered, it is a win-win situation.


Femgineering is a great example of how well grassroot level changes can work in harmony with organizational support and we’re excited about how it’s evolving. Extending beyond a desire to simply increase the number of female engineers in the workforce, we’re committed to building an environment in which female engineers are not only included but are also supported and encouraged to reach their full potential. If joining us on our journey sounds appealing to you, come talk to us: we’re hiring :).


Are you working on a similar initiative? We’d be very happy to hear from you and exchange learnings.

Different ways to manage feature toggles on iOS


There are times during app development when you want to temporarily disable some functionality or enable it only under some conditions. For example, you work on a new feature that is not done yet, and it should not be accessible by everyone. Or you have a particular setting that enables extra functionality that is useful only during testing.


At Babbel, we have many different options that we enable only under some conditions. We use different strategies based on the use case and depending on who should have control over the setting. This blog post will cover how we implement these strategies for our iOS apps.

Using conditional compilation

Conditional compilation

In our project, we have a few of these conditions. The most widespread is DEBUG, which we use when running our app in the Debug configuration. We use it to enable extra logging and to switch to our staging environment.

#if DEBUG
// do something only if the condition is active
#else
// otherwise do something else
#endif

The other place where we use this approach is to build different flavours of our app. At Babbel, we have an app for each language that we teach. Conditional compilation allows us to have different behaviour based on the app we are building. There are also possibilities to exclude or include files based on the configuration. If you want to dig deeper into this topic, I would recommend Dave DeLong’s blog series on conditional compilation.

Unfortunately, conditional compilation makes testing slightly harder, because some code will not be present. You can work around this problem by creating a parameter on your class with a default value. The default value is determined by conditional compilation, but in your tests you will be able to provide any value you want. It is a bit more inconvenient than the following approaches, though. The most significant benefit of conditional compilation is that the ignored part of the code will not be part of the app, which reduces its size and hides things you don’t want others to discover.
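
A minimal sketch of that workaround, with hypothetical names:

final class Logger {
    let isVerbose: Bool

    // The default is resolved at compile time, but tests can inject any value.
    init(isVerbose: Bool = Logger.defaultVerbosity) {
        self.isVerbose = isVerbose
    }

    private static var defaultVerbosity: Bool {
        #if DEBUG
        return true
        #else
        return false
        #endif
    }
}

In a unit test, Logger(isVerbose: false) then bypasses the compile-time default entirely.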

Using arguments and environment variables in Xcode schemes

Arguments and environment variables in Xcode

These have a huge benefit: if you apply them to a scheme that is not shared, you will not need to version them, and every developer can have different options. Another great thing about arguments is that they overwrite values in NSUserDefaults. Let’s say you have a flag in your user defaults that controls the visibility of a feature. Instead of modifying your code or hunting for the plist in your app sandbox, you provide an argument -YourKeyInUserDefaults newValue, and your app will start with this value.

You can access these under ProcessInfo. There are two properties available: processInfo.environment and processInfo.arguments.

These are also very helpful for UI tests, because they can be supplied to XCUIApplication, so you can fine-tune your app during the tests.
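
A minimal sketch of supplying such an argument from a UI test - the key name is hypothetical:

import XCTest

final class MySettingUITests: XCTestCase {
    func testFeatureIsVisibleWhenSettingEnabled() {
        let app = XCUIApplication()
        // Overrides the "MySettingEnabled" value in NSUserDefaults
        // for this launch only.
        app.launchArguments += ["-MySettingEnabled", "YES"]
        app.launch()
        // ... assert that the feature's UI is visible ...
    }
}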

Using Info.plist file

Every bundle (and thus every app) usually contains an Info.plist file that can also be a great place to store configuration variables. The benefit of this approach over the preceding ones is that the variables are persisted. At Babbel, we have a build phase with a script that reads the operating system’s environment variables, and if the named variables are present, we write their values into the Info.plist file. This approach is helpful on our CI, because we can trigger a new build with our desired configuration without changing anything in our project. The benefit over conditional compilation is that we can quickly inspect the Info.plist file and see all the provided options, and it is easier to manage than a single variable (SWIFT_ACTIVE_COMPILATION_CONDITIONS).

Here is an example of our script in the build phase. It is necessary to convert the file back to binary because PlistBuddy converts it to XML.

PLIST_FILE="$BUILT_PRODUCTS_DIR/$INFOPLIST_PATH"

add_entry() {
    if [ "$3" ]; then
        /usr/libexec/PlistBuddy -c "Add :$1 $2 $3" "$PLIST_FILE"
    fi
}

add_entry "YourDesiredKey" "bool" "$YOUR_DESIRED_KEY"

plutil -convert binary1 "$PLIST_FILE"

You can access these values easily through Bundle.main.infoDictionary.

Remote Config

Last but not least is Firebase Remote Config. Remote config is an excellent solution that we use for all our product-related feature toggles. Thanks to it, we can change parameters on the fly, even for apps that are live in production, and customise them based on different conditions. For example, we can enable features only in specific regions, for particular versions or platforms. We can also smoothly run A/B tests without much hassle. It is very convenient. The downside is that we fetch values on app start, and if the app fails to do so, it will miss the desired behaviour. The integration is straightforward and very well documented.
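
A minimal sketch of that fetch on app start, assuming the Firebase iOS SDK and a hypothetical key:

import FirebaseRemoteConfig

let remoteConfig = RemoteConfig.remoteConfig()
remoteConfig.fetchAndActivate { status, error in
    if status == .error {
        // The app falls back to default or previously cached values.
        print("Remote Config fetch failed: \(error?.localizedDescription ?? "unknown error")")
    }
    let isMySettingEnabled = remoteConfig["MySettingEnabled"].boolValue
}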

Combining the solutions

With so many options, it might be hard to decide which one to use when. The best option is to combine them into a single class that can be injected into your other classes and enables easy testability. The benefit of using all three approaches is flexibility: a developer can change a setting easily in their scheme or trigger a new build on CI with the desired option, or our product manager can switch it in the remote config. Here is an example to give you an idea:

class Config {
    private let processInfo: ProcessInfo
    private let bundle: Bundle
    private let remoteConfig: RemoteConfig

    init(processInfo: ProcessInfo = .processInfo,
         bundle: Bundle = .main,
         remoteConfig: RemoteConfig = .remoteConfig()) {
        self.processInfo = processInfo
        self.bundle = bundle
        self.remoteConfig = remoteConfig
    }

    var isMySettingEnabled: Bool {
        return processInfo.arguments.contains("-MySettingEnabled")
            || bundle.infoDictionary?["MySettingEnabled"] as? Bool == true
            || remoteConfig["MySettingEnabled"].boolValue
    }
}

What do you think? What is your preferred approach? Do you use any other? Let us know by leaving a comment below 👇!


On the Hunt for the Fastest CI Service


A speed- and usability-focused comparison of the Travis CI, CircleCI, and Semaphore CI Continuous Integration services.


Imagine that you’re a developer making changes to a shared codebase. Following git etiquette, you create a new branch for the changes to live in and make your changes there. But, instead of merging those changes immediately, you keep your code in your local branch until you have finished a larger set of related changes. Unfortunately, when you go to merge your branch into the master branch, another developer has already merged into the master branch, creating merge conflicts. Now, you are spending hours making sure you don’t break anything while resolving the conflicts: welcome to integration hell.

For a long time, integration hell was a routine part of every developer’s life. Then, in 1991, Grady Booch proposed Continuous Integration (CI): the practice of integrating code into a shared repository frequently so as to minimize the pain of dealing with conflicting changes. As an added bonus, continuously integrated development allows us to verify each integration with an automated build and automated tests, and when using either Continuous Deployment or Continuous Delivery, the changes can be deployed to production as soon as they are integrated.

CI platforms exist to automate these bonuses. A CI platform can test, build, and deploy the code without any user intervention (though many still prefer to do that last step manually). When using a good CI platform, developers can deploy bug fixes quickly and introduce new features seamlessly. When using a bad CI platform, developers may spend hours waiting for their changes to be processed, only to receive an error in a test that should have been thrown immediately.

Here at Babbel, we have used Travis as our CI platform for years. With changing climates (see #travisAlums on Twitter), we decided to research other options. Below, I’ve ranked Travis against two other CI platforms: CircleCI - a popular alternative used in a number of large projects including the open source Vue.js (available on GitHub) - and Semaphore - the self-labeled “fastest CI/CD platform”; we will soon see if the latter claim is true.

Obviously, speed will be a key metric in our comparison. However, we will also note the usability of each platform. To measure build speed, I configured all three services on both an internal Babbel codebase and Vue.js. Please note that both of these codebases are JavaScript-based; you may find different results using different languages. We executed each app’s production build process thirty times on each codebase using each CI service and recorded the self-reported times. Note that the data is self-reported, and while it appears that Circle and Semaphore report the “start” time as the time a commit is received, Travis reports when the task exits its initial queue, which seems to be roughly thirty seconds after the other two start.


CI Services on Babbel Magazine

Measurement of Build Speed of CI Services (see Appendix for raw data)


CI Services on Vue

Measurement of Build Speed of CI Services (see Appendix for raw data)


In terms of speed, Travis ranked third in our tests. Despite not counting the time a build spends in its queue, Travis’ reported build times lagged behind both Circle and Semaphore: even Travis’ fastest average build took roughly 60 percent more time than the next-slowest service’s average on the same codebase. In fact, there is only one data point in which Travis outperforms any other service.

The difference in speed between Circle and Semaphore was less glaring: Circle took only 13 and 43 percent more time than Semaphore on our code and on the Vue codebase, respectively. A two-sample t-test, which assesses whether two numerical distributions differ (calculated with evanmiller.org), confirms Semaphore as the faster alternative in both cases. For the statistics nerds: using 99% confidence intervals and the null hypothesis of equal distributions, we obtained p values of 0.0015 and 0.0001 on our code and on Vue, respectively.
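If you would rather not rely on an online calculator, the underlying statistic is straightforward to compute yourself. Here is a minimal sketch of Welch’s two-sample t-test in Swift, using made-up durations in seconds rather than our measured data; the calculator linked above may use a slightly different variant, and the p value still requires evaluating the Student’s t distribution at the resulting statistic.

import Foundation

// Welch's two-sample t-test: returns the t statistic and the
// Welch-Satterthwaite degrees of freedom for two samples.
func welchTTest(_ a: [Double], _ b: [Double]) -> (t: Double, df: Double) {
    func mean(_ xs: [Double]) -> Double { xs.reduce(0, +) / Double(xs.count) }
    func sampleVariance(_ xs: [Double]) -> Double {
        let m = mean(xs)
        return xs.map { ($0 - m) * ($0 - m) }.reduce(0, +) / Double(xs.count - 1)
    }
    let v1 = sampleVariance(a) / Double(a.count)
    let v2 = sampleVariance(b) / Double(b.count)
    let t = (mean(a) - mean(b)) / (v1 + v2).squareRoot()
    let df = (v1 + v2) * (v1 + v2)
        / (v1 * v1 / Double(a.count - 1) + v2 * v2 / Double(b.count - 1))
    return (t, df)
}

// Hypothetical build times in seconds, purely for illustration.
let circle = [91.0, 94.0, 100.0, 81.0, 92.0]
let semaphore = [85.0, 81.0, 78.0, 80.0, 76.0]
let result = welchTTest(circle, semaphore)
print("t = \(result.t), df = \(result.df)")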

When considering usability, Travis lacked features found in both of the other services. With both Circle and Semaphore, starting a new project is as simple as saving an autogenerated file into your GitHub repository, but getting started with Travis involves wrapping your head around the specifics of the configuration files and weeding through the unintuitive documentation. Similarly, both Circle and Semaphore support parallelization naturally through their workflows and pipelines, but enabling the same functionality with Travis means using their build matrices, in which you manually configure a number of virtual machines.


Screenshot from CircleCI UI

Insights from Circle's Dashboard

Usability is clearly improved for Circle and Semaphore. They both have incredibly simple tools to create an initial configuration file, and both offer a fully functional CLI for those so inclined. Semaphore offers a fairly minimalist dashboard, but still supports core functionality (though before their recent redesign, functionality as simple as deleting projects wasn’t possible from the dashboard). Circle, in contrast, has a beautiful and intuitive online dashboard as well as a similarly powerful CLI, and provides a variety of views as well as insights into build statistics, shown above. Additionally, Circle has by far the most readable and thorough documentation. They even have a whole section of documentation dedicated to migrating from Travis, which provides better explanations of Travis’ commands than their own documentation.


Overall Comparison

                                          | Travis                   | Circle                               | Semaphore
Performance
Median Speed on Babbel Magazine (h:mm:ss) | 0:02:33                  | 0:01:31                              | 0:01:19
Median Speed on Vue (h:mm:ss)             | 0:03:55                  | 0:02:02                              | 0:01:29
Executes in
MacOS VM                                  | Yes                      | Yes                                  | Yes
Linux VM                                  | Yes - Ubuntu only        | Yes - any from Amazon Machine Images | Yes - Ubuntu only
Windows VM                                | Yes - very early support | Yes                                  | No
Docker Containers                         | No                       | Yes                                  | No
Deploying with Docker
Docker Commands                           | Yes                      | Yes                                  | Yes
Docker Compose                            | Yes                      | Yes - only on VMs                    | Yes
Layer Caching                             | Yes                      | Yes - on all but Windows             | Yes
Terraform Support
Official Terraform Provider               | No                       | No                                   | No


Based on the results of my research, Semaphore appears to be the fastest choice and Circle the most usable, so the decision between them comes down to personal preference: would you sacrifice a small increase in build time for a better developer experience? Personally, I use Circle for my own projects; I find the dashboard intuitive and I enjoy looking at Circle’s “insights”. But between the easy-to-use Circle and the speedy Semaphore, it’s up to you.


Appendix:


Raw Build Speed Data (h:mm:ss)

                | Travis  | Circle  | Semaphore
Babbel Magazine
Trial 0         | 0:02:29 | 0:01:37 | 0:01:25
Trial 1         | 0:02:32 | 0:01:34 | 0:01:21
Trial 2         | 0:02:33 | 0:01:40 | 0:01:18
Trial 3         | 0:02:39 | 0:01:42 | 0:01:20
Trial 4         | 0:02:33 | 0:01:43 | 0:01:15
Trial 5         | 0:02:34 | 0:01:36 | 0:01:25
Trial 6         | 0:02:34 | 0:01:52 | 0:01:30
Trial 7         | 0:02:30 | 0:01:21 | 0:01:25
Trial 8         | 0:02:33 | 0:01:38 | 0:01:29
Trial 9         | 0:02:34 | 0:01:36 | 0:01:15
Trial 10        | 0:02:33 | 0:01:21 | 0:01:56
Trial 11        | 0:02:32 | 0:01:19 | 0:01:10
Trial 12        | 0:02:31 | 0:01:21 | 0:01:14
Trial 13        | 0:02:33 | 0:01:32 | 0:01:12
Trial 14        | 0:02:34 | 0:01:20 | 0:01:16
Trial 15        | 0:02:36 | 0:01:42 | 0:01:15
Trial 16        | 0:02:31 | 0:01:19 | 0:01:08
Trial 17        | 0:02:34 | 0:01:26 | 0:01:20
Trial 18        | 0:02:35 | 0:01:31 | 0:01:15
Trial 19        | 0:02:30 | 0:01:24 | 0:01:09
Trial 20        | 0:02:30 | 0:01:22 | 0:01:10
Trial 21        | 0:02:36 | 0:01:30 | 0:01:24
Trial 22        | 0:02:35 | 0:01:21 | 0:01:31
Trial 23        | 0:02:43 | 0:01:51 | 0:01:29
Trial 24        | 0:02:32 | 0:01:20 | 0:01:16
Trial 25        | 0:02:36 | 0:01:29 | 0:01:25
Trial 26        | 0:02:37 | 0:01:57 | 0:02:12
Trial 27        | 0:02:32 | 0:01:29 | 0:01:19
Trial 28        | 0:02:29 | 0:01:26 | 0:01:09
Trial 29        | 0:02:34 | 0:01:44 | 0:01:12
Arithmetic Mean | 0:02:33 | 0:01:32 | 0:01:22
Geometric Mean  | 0:02:33 | 0:01:32 | 0:01:21
Median          | 0:02:33 | 0:01:31 | 0:01:19
Vue
Trial 0         | 0:03:51 | 0:01:47 | 0:01:27
Trial 1         | 0:03:57 | 0:02:13 | 0:01:17
Trial 2         | 0:03:55 | 0:01:56 | 0:01:19
Trial 3         | 0:04:01 | 0:02:09 | 0:01:20
Trial 4         | 0:04:12 | 0:01:57 | 0:01:32
Trial 5         | 0:04:02 | 0:01:44 | 0:01:27
Trial 6         | 0:03:57 | 0:01:58 | 0:01:24
Trial 7         | 0:04:11 | 0:02:10 | 0:01:32
Trial 8         | 0:03:55 | 0:01:52 | 0:01:21
Trial 9         | 0:04:02 | 0:01:58 | 0:01:31
Trial 10        | 0:03:48 | 0:01:56 | 0:01:34
Trial 11        | 0:03:44 | 0:05:28 | 0:01:31
Trial 12        | 0:03:50 | 0:01:45 | 0:01:20
Trial 13        | 0:04:11 | 0:02:12 | 0:01:23
Trial 14        | 0:03:54 | 0:01:49 | 0:01:31
Trial 15        | 0:03:54 | 0:02:07 | 0:01:35
Trial 16        | 0:04:06 | 0:02:11 | 0:01:54
Trial 17        | 0:04:00 | 0:02:00 | 0:01:18
Trial 18        | 0:03:47 | 0:02:01 | 0:01:18
Trial 19        | 0:04:09 | 0:02:05 | 0:01:21
Trial 20        | 0:03:48 | 0:02:18 | 0:01:31
Trial 21        | 0:03:55 | 0:02:15 | 0:01:24
Trial 22        | 0:03:49 | 0:02:04 | 0:01:35
Trial 23        | 0:03:52 | 0:01:51 | 0:01:32
Trial 24        | 0:03:49 | 0:02:09 | 0:01:26
Trial 25        | 0:03:55 | 0:01:47 | 0:01:40
Trial 26        | 0:03:48 | 0:02:45 | 0:02:19
Trial 27        | 0:03:59 | 0:01:54 | 0:01:18
Trial 28        | 0:04:02 | 0:02:08 | 0:01:36
Trial 29        | 0:04:01 | 0:02:02 | 0:01:35
Arithmetic Mean | 0:03:57 | 0:02:09 | 0:01:30
Geometric Mean  | 0:03:57 | 0:02:06 | 0:01:29
Median          | 0:03:55 | 0:02:02 | 0:01:29

How to Fight Retrospective Fatigue


It’s the end of a sprint. You’re sitting in a barren meeting room. Small stacks of colorful, still-empty sticky notes and black markers lie on the desk that occupies most of the room. The moderator opens the Sprint Retrospective by asking their never-changing questions in the usual calm, routine manner: “What went well this past sprint? What didn’t?” One team member joins the session seven minutes late; another doesn’t show up at all because of a conflicting, supposedly more important meeting. As the session continues, you notice colleagues secretly glancing at their watches or yawning. No comment sparks a real discussion. No one cracks a joke. The meeting finishes ahead of time. The team has filled a mere dozen or so sticky notes. One colleague in the back has neither said nor written a single word the entire time. Just before leaving the room, the team half-heartedly commits to a few actions for further improvement. You expect these actions to be forgotten by tomorrow, when the busyness of the new sprint kicks in.

Does this sound familiar to you? Then, you and your team might be suffering from Retrospective Fatigue. In this post, we are going to show you how you can fight it!


“What went well this past sprint? What didn’t?” are popular questions for running Sprint Retrospectives. In our team, we discarded them more than a year ago and never looked back. While our Sprint Retrospectives were never as dull as the introduction above suggests (that was just clickbait to get you to read this article 😈), we found ways to make them tremendously more engaging and effective.

What helped us was introducing variety and creativity to the sessions. Over the past 1.5 years, in our Sprint Retrospectives, we went sailing, visited a circus, traveled to Tanzania, fought zombies, and embarked on many other adventures.

Variety’s the very spice of life, That gives it all its flavour. (William Cowper)

At the same time, we established a structure that ensured actionable outcomes, no matter how crazy or creative the theme of the session was:

  1. Set the stage
  2. Gather data
  3. Generate insights
  4. Decide what to do
  5. Close the retro

Note: We borrowed this structure from Agile Retrospectives: Making Good Teams Great.

Below, we share three examples of our team’s past Sprint Retrospectives. We close the blog post with some remarks on facilitation techniques, tooling, and useful resources. It’s a dense read, packed with inspiration for your next sprint retrospectives! If you are impatient, you may skip ahead to example 1 “A Visit to the Circus”, example 2 “Voyage on the High Seas”, example 3 “Dramatic Failures”, or the closing remarks on facilitation techniques, tooling, and resources.

Example 1: A Visit to the Circus

Setting the Stage

This was the first retrospective in which we relied on the participants’ imaginative powers: we imagined our diverse and interdisciplinary team as a group of artists performing together in a circus.

The facilitator then asked every participant to choose which artist or act best described their work over the past cycle and to give a brief verbal explanation:

  • Knife Thrower - “I worked with precision.”
  • Juggler - “I was multi-tasking a lot.”
  • Clown - “I created a good atmosphere.”
  • Lion Tamer - “I dealt with threats.”
  • Tightrope Walker or Equilibrist - “My work was well-balanced.”
  • Spectator - “I made no contribution.”

A flip chart with six circus artists and the team's votes

Note: Inspiration for this was the ESVP exercise of the Retromat.

Gathering Data

Participants were asked to write the internet reviews, including a star rating, that spectators attending our show would have given our performance. The review cards were collected, shuffled, and “dealt out” again. Each participant shared the review they were dealt by reading it out loud.

A review of the "circus performance"

A review of the "circus performance"

A second review of the "circus performance"

A second review of the "circus performance"

Note: Inspiration for this was the Amazon review exercise of the Retromat. Pictures and names of the spectators on the empty review cards were created with random user generators.

Generating Insights

Each participant was asked to propose 1-3 actions that they thought would raise the review they had been dealt to the best possible rating of 5 stars.

Deciding What to Do

The participants were asked to collaboratively cluster the proposed actions and dot-vote on the clusters. They then refined the three most-voted clusters into actions and committed to implementing them by the next Retrospective.

Closing the Retro

To close the Retrospective, the facilitator gathered feedback from the participants on the session itself. They were asked to throw knives (metaphorically speaking, of course 😇) at a flip chart with two targets. The placement of the “knives” indicated whether the discussed topics were important to the participants and whether they felt they could speak openly during the session.

Two feedback targets with "knife marks"

Note: We found this feedback exercise, Retro Dart, on the Retromat.

Example 2: Voyage on the High Seas

Setting the Stage

In preparation for this session, the facilitator drew a map with a dashed red line representing the team’s journey. As the destination of this journey, the facilitator chose our team’s mission statement.

The participants were asked to place a mark anywhere on the map (yes, even off course), indicating how far along they felt we were on our journey towards implementing our team’s mission.

A map of the team's journey with the participants' markers

Gathering Data

The moderator then presented another drawing. Participants were asked to write statements that fit into one of five categories and place the sticky notes on the drawing.

  • wind = speeds us up
  • sun = appreciation
  • island = opportunities
  • anchor = slows us down
  • rocks = risks

A drawing of a sea voyage

Note: We found this exercise in the Spotify Retro Kit.

Generating Insights

The participants formed five groups, one for each category. The groups were asked to propose at least one action that the team should take to address one or more statements in the respective category.

Deciding What to Do

The participants dot-voted on the proposed actions and committed to the top three most voted ones.

Closing the Retro

We adapted the closing exercise from the previous retro. This time, though, the facilitator handed out sheets with the targets printed on them, which allowed the participants to provide their feedback anonymously. We still use this feedback sheet today.

Example 3: Dramatic Failures

Note: The Liberating Structure TRIZ served as inspiration for this Retrospective.

Setting the Stage

The moderator gave the group three pictures:

  1. a model of the Swedish warship “Vasa” from the 17th century
  2. a photograph of the German airship LZ 129 Hindenburg from 1936
  3. a painting of the British passenger liner Titanic from 1912

The participants were asked to identify the vehicles and find a connection between them. The Vasa foundered after sailing for roughly 20 minutes on its maiden voyage. 14 months after its first flight, the Hindenburg was destroyed by fire while attempting to land. The Titanic collided with an iceberg and sank, also on its maiden voyage.

The connection was apparent to everyone: All three were high profile projects that people had high expectations towards - and they all failed quite dramatically.

For this Sprint Retrospective, the facilitator invited the participants to embrace failure. More specifically, we were encouraged to embrace the failure of the project our team was working on at the time.

Gathering Data

The participants were asked what actions they could deliberately take to make sure their project would fail. They formed pairs and brainstormed possible actions, writing one action statement per sticky note.

Next, the participants were asked to identify behaviors that the team had displayed in past cycles and that resembled the listed actions in any shape or form. Again, the participants formed pairs and wrote sticky notes.

Generating Insights

The participants were then asked to propose steps that the team could take to stop these disadvantageous behaviors. They noted down their proposals on sticky notes.

Deciding What to Do

The participants were asked to collaboratively cluster the proposed steps and then dot-vote on them. They refined the top 3 most voted clusters into actions and committed to implementing them.

Closing the Retro

The participants filled out the anonymous feedback sheet introduced in earlier Retrospectives.

Facilitation techniques, tooling, and resources

For running your Retrospectives, you can employ many different facilitation techniques. Just as the variety of themes helped keep our Retrospective sessions engaging, so did a variety of facilitation techniques. Here are some examples:

  • For quick decision-making, we usually use dot-voting. When dot-voting, you need to watch out for its potential caveats, though!
  • We borrowed from Improv Theatre and encouraged participants to build on the comments of others by forcing them to begin every single statement with “Yes, and …”.
  • We asked participants to draw pictures of the past cycle.
  • We asked participants to create mood timelines, depicting how they experienced the past cycle.
  • We borrowed from feedback trainings and asked participants to categorize their statements according to the suits of playing cards (hearts = positive, but general and vague feedback; diamonds = positive, specific feedback; clubs = negative, but general and vague feedback; spades = negative and specific feedback). We then supported them in transforming their hearts into diamonds and their clubs into spades.
  • We ran silent retrospectives, in which all discussions were held in written form. This can help to get quieter folks to contribute equally (if all participants have comparable writing skills in the corporate language).

Honestly, you do not need much tooling to run effective Retrospectives. Flip charts, sticky notes, and thick markers are usually sufficient. If you do want to go beyond that, we have found the following tools helpful:

  • Mentimeter for conducting a silent Retrospective
  • The Scrum Check List by Henrik Kniberg can serve as a guide to review your implementation of Scrum
  • The Zombie Scrum Symptoms Checker by the Liberators helps to evaluate whether your implementation of Scrum is healthy and effective and provides you with actionable feedback

As you have seen in the notes scattered throughout this blog post, we draw inspiration from many places when preparing a Sprint Retrospective; the Retromat, the Spotify Retro Kit, and the book Agile Retrospectives: Making Good Teams Great are the resources we refer to most often.

Last but not least, we found that the room in which we conduct the Sprint Retrospective can influence the engagement of the participants. Traditional meeting rooms with a large desk as their centrepiece, comfortable chairs, and tech that unnecessarily occupies space (such as conference phones, big screens, and monitor cables with adapters) can be an impediment. The room pictured below gives participants space to move around freely, whiteboards to draw on, and a standing desk with bar chairs at the side of the room to sit at when needed.

A meeting room that fosters collaboration

Share your thoughts with us

If you made it through this long blog post, we hope you found some inspiration for your next Sprint Retrospective. If you put it to use, we would love to hear from you. Please share your experiences in the comment section below!

One note of caution, though: All the creative themes, the engaging facilitation techniques, and the best tooling are to no avail if the team does not follow up and implement the actions that were agreed on during a Sprint Retrospective. We quickly realised that the implementation afterwards can be even harder than conducting an effective Retrospective session. We certainly got better at this but still struggle from time to time. Do you have recommendations for us? Share your tips below!

Movin’ on up: A femgineer’s journey at Babbel


In this first installment of our series on Femgineers at Babbel, we talk with Pooja Salpekar about transitioning from ground-level engineer to Engineering Manager, and about what she’s learned along the way.


Who are you, where are you from? What’s your title, in what team do you work at Babbel?

Pooja Salpekar

Hello! I am Pooja. I moved to Berlin from India and joined Babbel three years ago. Currently, I am working as an Engineering Manager for the Computational Linguistics team at Babbel.

Tell us a little bit about your work at Babbel. What do you do?

As an Engineering Manager at Babbel, I wear multiple hats: Technical Coach, Delivery Lead, People Developer, and Culture & Community Builder. In practice, this means that on some days I work closely with my team, discussing and brainstorming broad architecture or weighing the pros and cons of different technologies and solutions, and on other days I work to identify and fill cultural and collaboration gaps in the team.

When did you start at Babbel?

I joined Babbel in December 2016. As much as I was dreading the cold, I was very excited about working here. I started as a full-stack engineer but was mostly doing the heavy lifting on the backend and infrastructure side of the stack. In my first few months, my team and I worked on one of Babbel’s most critical projects, which redefined our content creation and delivery.

My contribution to this project and my growth in software knowledge didn’t go unnoticed, and I was promoted to Senior Software Engineer. Having worked closely with stakeholders on feature lifecycles and delivery management, I wanted to steer my career towards a role that would keep me close to people and community while drawing on my strengths from my engineering background. My peers and managers encouraged me to take on this new role, and they are constantly supporting me in becoming better every day.

Recently you became a manager. What has changed for you with your new role?

With the new role came a new set of challenges and lessons. As a software engineer, I was used to the deterministic nature of code. That predictability does not carry over to working with teams. Getting comfortable with the team, running 1:1s with teammates, and managing the change were a few of the initial challenges I faced on this journey. Having mentors and peers helped me not to stress about things. But what really worked for me was accepting that I was clueless. That mindset shift was needed to understand that this is not a promotion; it’s a change of role.

Since this role change was internal, it helped a lot that I was comfortable in the company and had my network of people supporting me. Still, some of the changes came as a surprise, though in a good way. I was glad to explore what it means to be an Engineering Manager at Babbel. Detaching myself from the code seemed very difficult, but the team and setup I work in allow me to have both high-level and deep-dive implementation conversations.

Understanding the dynamics in the team and helping people with their development path was one of the biggest learning opportunities for me in this change of roles.

How did you get to Babbel?

I did my engineering degree in computer science in India. After graduation, I worked as a software consultant with ThoughtWorks, a company that is widely known for its strong software development practices. Before starting with Babbel, I joined an early stage startup as their first engineer hire, which helped me understand closely how businesses and product development work.

What do you love most about your job?

The opportunity to understand and work with people from different backgrounds, and to build teams of diverse individuals who nonetheless work as one cohesive unit.

Can you tell us something about a project you particularly enjoy working on?

My team has been working on building a personalised recommendation engine to help learners work through lessons with more guidance. The team is a mix of computational linguists and full-stack engineers. Designing and deep-diving into different models with the team, and tying them to feature delivery in small iterations, A/B experiments, and performance optimisations, has been an exciting journey.

What’s your role in Femgineering at Babbel?

Femgineering at Babbel is an initiative to support and enhance the role and reach of women in tech roles at the company. I am driving the ‘Hiring and Promotions’ pillar of this initiative. As part of this, we identify gender gaps in engineering and take measures to close them. We work closely with the Recruiting team to design initiatives that help us reduce the likelihood of hidden bias during the recruitment process.

More general: What do you think are the biggest obstacles women have to face in the industry?

Alongside the barriers in the tech industry that are prevalent and a residue of unconscious bias, studies show that women in tech experience imposter syndrome at work, struggling to find the confidence to speak up or be heard. I faced similar challenges too, but fortunately I had strong female role models whom I looked up to and learned from. In one of the Femgineering meetings at Babbel, my manager said that we are strong in ways we usually don’t see. I couldn’t agree more.

Babbel Neos: The first year of a junior developer


Gábor Török, the former Program Lead for our engineering mentoring program – named Babbel Neos – follows up with the trainees to see where they are, how the program helped them, and where they see themselves going in the future.


The First Babbel Neos

The First Babbel Neos: Top row (left to right): Lina, Karen, Masha, and Rufael; bottom row (left to right): Serena, Hari, and Ana; not pictured: Ewa.

One year ago Babbel hired 8 junior developers who completed a 6-month in-house software engineering training program called Babbel Neos. To date, 7 out of 8 are still working at the company in various teams from data infrastructure to payments to product development. Creating an opportunity for early-career developers wasn’t new at Babbel, but allocating specific bandwidth to invite 8 applicants to a dedicated training program was a new initiative at the time.

I was responsible for managing the program and giving first-hand guidance to the trainees and mentors involved. Although I’m not employed by Babbel anymore, I was eager to know how the first Neos cohort was doing and how they saw the first year of their new career in engineering in retrospect. This article was born out of my passion and curiosity about this group’s journey.

I was looking for answers to the following questions:

  • What attracted them to software engineering
  • How their picture of what it means to be a software engineer changed
  • What helped them succeed in joining a new industry
  • How their different motivations might have influenced their journeys

I interviewed 5 of the 8 former Neos.

The story of Karen (from Spain)

I wanna be someone who knows the code inside out.

Karen is a career changer. She had a senior position in Customer Success at a technology company. She felt that the only way to progress in her career was to take on management roles, but she wished she could stay an individual contributor and still grow. She even considered going back to university to do a Masters to challenge herself. Out of curiosity, she attended a tech meetup, where she discovered a one-day coding workshop that she decided to join. She was impressed by how much you could achieve with computer literacy. She encountered people with many different cultural and professional backgrounds, and that was attractive to her. She took the challenge and signed up for a 9-week paid developer bootcamp. From that moment, it was clear that software development was going to be her next adventure.

Many get inspired by the power of technology, yet struggle to take their first steps. While there has never been such a wealth of resources available for self-education, understanding what to focus on and receiving feedback on one’s progress is what makes the difference. Before joining the bootcamp, Karen signed up for a bunch of online courses but got easily sidetracked. The bootcamp’s strict curriculum and dedicated mentorship helped her kickstart her career in tech. Eventually, it enabled her to secure a trainee position in Babbel Neos.

Why would someone want to become a software engineer? Engineers work in isolation at their computers, are often stressed as sprints come to an end, have long working hours, and don’t have much of a social life anyway. At least, that is the stereotypical picture many people outside the industry have. Becoming a junior developer proved it wrong. Karen was astonished by how involved software developers at Babbel were in product development and how often they sat together to talk about code. It was far more than lonely typing. She closely collaborates with the engineers in her team and receives support from them. For certain domains, other departments like Marketing seek her help directly. Explaining to her mother what she does daily is still difficult, though. “As someone working in Payments, we take care of critical systems that you don’t see but that fuel the website experience. I tell my mother that we are the ones making sure the money gets to the right place when she buys something online. Honestly, I think she still wonders what I’m doing here,” she laughs.

Changing roles wasn’t the only challenge for Karen. Her life changed overnight. She went from being a senior professional to being a student striving to become a junior developer. And everyone around her was more senior, which can feel intimidating. Karen pointed out that having a dedicated mentor throughout the Neos training program - and eventually joining her mentor’s team - gave her the support she needed. It helped her find her footing at the company. She had the chance to take part in real-life projects. She always had someone accessible to ask questions. Last but not least, she made new friends. By the time the training was over and the company decided to hire her, the whole team was familiar with her background and skills. She had her desk and felt welcome.

Karen decided to change her career because she wanted to challenge herself and find a position where she could keep growing professionally. In retrospect, she is completely satisfied with her decision. She sees herself in the future as a very knowledgeable developer. “My goal is clear. I want to be like Pooja. Someone who knows the code inside out and can mentor others.” And she’s fully on it. For almost a year now, Karen has also been volunteering at the ReDI School, a Berlin-based coding program, to mentor career changers. While one might think you need to be a senior to effectively mentor others, Karen’s recent experience of becoming an engineer enables her to speak the language of her mentees. While this might sound like an unusual concept, it’s a key pedagogical approach at Le Wagon, the coding bootcamp from which Karen graduated: teachers at Le Wagon are graduates of previous batches.

Finally, I was curious about Karen’s experience of being a woman in tech. “Honestly, I never encountered a situation at Babbel where I felt treated differently, in either direction, because I am a woman. However, people sometimes use male-default vocabulary. That is something that needs to change. During conference visits, I felt this was more of an issue. Some are still surprised when a woman gives a talk, or even wonder what we are doing at tech conferences. At the same time, I receive job offers because companies want to have more women on board. I take that as a positive sign.”

The story of Rufael (from Sweden)

I wanted to be able to build my ideas.

During his Interaction Design studies, Rufael had the chance to try coding at university. He enjoyed being able to build his design ideas. For a few years, he tried different ways to get a grasp on programming, but it turned out to be quite difficult. There was so much to learn and it wasn’t clear what to focus on in order to get a job. “I wished for some guidance. I didn’t feel I was making much progress. I almost gave up on it when I bumped into the Neos training program. I guess I wasn’t aware of coding bootcamps at that time.”

Rufael had worked with programmers as a designer before, so the industry wasn’t completely new to him. Yet the fact that Babbel was calling for software engineer trainees sounded a bit scary to him. In his mind, engineers were highly educated people who made no mistakes; how could a junior live up to that expectation? While his experience of working with developers had been quite smooth, he wasn’t sure what to expect. Encountering Babbel’s engineering culture first-hand answered his questions. There was plenty of room to make mistakes. He found the fact that he could “improve by doing” appealing; his mistakes were chances to learn.

Joining the Neos program gave him the guidance he had been lacking. It was a safe space for trial and error, yet with a clear direction and feedback on his progress. It eventually opened up an opportunity to join Babbel as a junior developer - exactly what he had wished for. “I liked that the company was very clear from the beginning that they were willing to potentially hire everyone from the group.” It was in both parties’ interest to do their best.

Rufael currently works on Babbel’s online learning experience. His team’s goal is to win customers back to the product. Although he’s still a junior developer, he’s already been able to contribute to the product. He finds this a special opportunity, and it feels rewarding. Having dedicated mentors and pairing up with senior engineers enabled him to catch up quickly with the necessary knowledge, he says. He regularly receives feedback from his peers and manager, which helps him identify areas for further development. “I learned how to work with others, how to ask the right questions, and how to figure things out myself. I gained much more than technical expertise.” He was primarily focused on frontend technologies, but lately he has started to learn more about backend services and infrastructure. He now has a much deeper understanding of general programming principles, and he likes the logical challenges of the backend stack.

“I remember for some trainees it was difficult that the Neos training program didn’t have a very strict structure. For me, it was a great opportunity to learn how to be more independent and responsible for my career.” When I asked him if he had a specific learning goal besides his everyday tasks, he said his role in his current team already required learning many new things. It would distract him if the learning goal wasn’t directly connected to what he was working on.

Finally, he mentions that he’s still in contact with most of the former Neos trainees. He finds these connections very valuable; when he arrived at the company, it helped to have people in the same boat with whom he could exchange experiences.

The story of Masha (from Russia)

Being an engineer is very social.

Masha was looking for a job in Germany. She didn’t speak German at the time and realized that, without fluent German, she couldn’t secure the kind of job she wanted. She discovered that many people like her had changed careers and become web developers at international companies abroad. It was no secret either that there was high demand for software engineers in the market. This gave her enough inspiration to give it a try.

She enrolled in an online Udacity course, and a programmer friend helped her take the first steps. In the meantime, she learned that the German state sponsors paid courses for job seekers. She evaluated various options and decided to sign up for a 12-week coding bootcamp in Berlin. It had a steep learning curve. While it helped her find her feet in the world of web development, she wasn’t sure she had gained enough knowledge to apply for a job. When she heard about the Neos training program, she thought it was the right next step for her. It gave her a chance to further develop the technical skills she believed she was lacking. At the same time, as she had never worked at a tech company before, the training program also provided a proper introduction to the industry. She came to understand how product and engineering organizations work, what the everyday rituals of agile teams are, and what is generally expected of software developers.

Before joining Neos, Masha’s view of being an engineer was the following: people work on complex challenges, the job is sometimes boring, it requires lots of concentration, and it isn’t social at all. After about a year in the job, her view has changed. It’s complex and it does need concentration. Yet it’s very social as well. “I didn’t think I would collaborate so much with my colleagues. We do lots of pair programming and we have regular discussions about the projects and tasks with the whole team.”

Masha has a dedicated mentor in the team, and she finds this crucial. “As a junior, I’m not able to solve all the tasks alone. My mentor helps me understand the tasks, and we pick stories from the backlog together. I’m not very self-confident. Having a check-in with my mentor every day is reassuring for me.” While having high expectations of yourself can boost your growth, being surrounded by many seniors carries a risk: you start comparing yourself to others, and that’s not healthy. It makes you feel you aren’t achieving much and pushes you to work even harder. That can lead to burnout early on, a common syndrome in the industry that has been discussed openly in recent years. Masha’s mentor provides timely feedback on her progress so that she sees what to improve. This also creates space to celebrate her achievements. “The bootcamp and then the traineeship were very intense and stressful. I pushed myself too much,” she added. “I needed to understand my limits and find a sustainable pace for growth.”

Finally, I also asked her about her experience as a woman in tech. “I didn’t face any issues at Babbel. I’m glad that the company is aware of gender issues and takes action to resolve them. For instance, our Femgineering forum is an important initiative to support female engineers at the company.”

The story of Hari (from India)

I wanted to become a big data engineer.

Hari has more of the traditional profile that one would expect of a software engineer. His elder brother studied computer science and currently works as a business analyst. Hari was always interested in what his brother was working on and decided to pursue a similar career. He studied civil engineering in India and then moved to Germany to do a Master’s in Computational Sciences. He wrote his first program in Matlab; it inspired him and reassured him that he wanted to become a programmer. “Programming seemed so powerful,” he says. He got to know Python and signed up for a Udacity course. He wanted to show what he was capable of, so he started to learn the basics of website development as well.

“In Germany, I was working alongside my Master’s studies to finance myself. I wished I could find a full-time junior developer job, but I only encountered internships.” He didn’t hide how excited he was when he stumbled upon the call for applications for the Neos training program. “It was exactly what I wished for! I’m very grateful for this opportunity.” Joining a 6-month paid training program enabled Hari to dedicate his attention to coding without financial worries. Furthermore, receiving dedicated guidance and having the chance to work in a professional context gave him a big career boost.

Since his brother worked in software technology, Hari had a pretty realistic picture of what it means to be a developer. “Flexible working hours and being frustrated by hunting down software bugs,” he laughs. From early on, it was clear that he wanted to become a software developer and work with data. Today Hari is part of Babbel’s data engineering team. He has recently completed his Master’s thesis about a project that improved Babbel’s data infrastructure. “I want to become a big data engineer. I spend one hour at the end of each day learning something new.” His next career goal is to earn an Amazon big data specialist certification that the company offered to sponsor.

The story of Ana (from Peru)

I want to be a professor at university.

Ana’s story with software engineering dates back to when she was part of an NGO empowering women in tech in Peru. She saw how technology can have a positive impact on people’s lives. She wanted to be involved and become a role model for others. The organization helped her get started with programming. She attended online courses and local meetups wherever she traveled, wanting to educate herself and make connections with people in the industry. To give herself a chance to secure a job in software engineering, she decided to sign up for a 12-week coding bootcamp in Germany, which she was eligible to do on her visa. During this time, she heard about Babbel Neos and eventually got selected for the training program.

“Before becoming a software developer, I thought engineers had a nerdy life, were extremely intelligent, and were men. I never met a woman engineer before,” she says of her prior view of the tech scene. “You can still find the stereotype but the industry is changing and I’m an example of that. I see many young people with diverse cultural and professional backgrounds becoming developers.”

Ana started her career on Babbel’s backend team and later switched to another company. She’s grateful for her journey at Babbel: she learned a lot about team dynamics and studied computer science concepts in which she had no prior education. Her experience at Babbel also taught her what type of support she was looking for, and she was able to articulate this in her next job interviews.

Although the switch to another company was challenging, she’s glad she made the change. When Ana joined her new company, her manager made it clear that it was the whole team’s responsibility to set her up for success. “It’s a journey for both the mentor and the mentee. The whole team needs to invest in it. I’m a blocker for others many times, and that’s okay,” she says firmly. Her team baked processes into their everyday work to make sure Ana was never left out; everyone is invested in her development.

Completing the Neos training program and holding a junior developer job at Babbel opened many doors for Ana. She found her next job much more easily, and she has been invited to public events to talk about her experience both as a career changer and as a woman in tech. She’s also a volunteer mentor at the ReDI School, along with Karen. When I asked about her plans, it turned out she had applied to the BSc Computer Science course at the University of London and is starting her studies this October. “I want to be a professor at university,” she explains.


While the market demand for software engineers is high, companies tend to overlook the group of highly motivated professionals who are willing to enter the industry if given the required support. Universities and coding bootcamps alone can’t bridge the gap. Babbel invested resources in providing dedicated training for 8 people, and in creating an opportunity for existing engineers to level up their mentoring skills.

“Schools and coding bootcamps in Berlin are graduating many early-career engineers monthly. With the Neos program, Babbel was able to select from this pool providing a fast way for them to be productive in teams,” concludes Nehal Shah, Director of Engineering at Babbel. “Given the competition for talent in Berlin, Babbel was able to create a pipeline of engineering talent to scale out our teams. Furthermore, the company gave opportunities to traditionally under-represented groups like women and minorities and created more diversity in our engineering department that is essential for developing a language learning product for a diverse customer base.”

Babbel Neos was recognized by Fast Company’s World Changing Ideas Awards 2019 in the education category.

The making of Replicator


A story of an internal tool – Replicator. It involves AWS Lambda, one of the hardest things in Computer Science, and Star Trek.


Background

At Babbel, we use AWS Lambda a lot. Lambda supports many runtime environments, and one of them is Ruby. When our team was about to create its first Lambda function written in Ruby, I thought it would be awesome to structure the project so that it looks and feels like a Ruby project. Lambda itself does not insist on any project structure: you only specify which method to call and where it is located; the rest is up to you. Ruby does not have strict rules either, but there are conventions that make sense to follow. These conventions are supported (sometimes even encouraged) by tools and accepted by the community. They make Ruby-based projects feel like home… or at least like a place where you know how to navigate and find things :)

After a few hours of experimentation, a project structure was born that met our expectations. It was a bunch of files and directories, but everything had its meaning and place. The result looked good enough to share with other engineers at Babbel. We decided to take one more step and create a tool that would help bootstrap projects based on this structure, so people would not have to repeat our journey and could focus on the problem they need to solve.

At its core, Replicator is a tool that helps engineers and saves them time. You run one command and everything is ready to go:

> replicator init --function-name hello_world

...

The project has been successfully bootstrapped. Run following command to install all necessary dependencies:

script/setup

Check README.md file for other useful instructions.
---

Engage! 🖖

At this stage, you might wonder what is in the box. Right? Bootstrapped projects include:

  • Core modules and classes of the Lambda function according to common Ruby styles and conventions
  • Acceptance and unit tests for generated code
  • Scripts to automate routine processes like testing, linting, preparation for deployment, etc
  • Nice README and contribution guides
  • Everything necessary to work with the Lambda function on the local machine

The Lambda function is ready to be packaged and deployed.

What makes Replicator possible?

If I have seen further it is by standing on the shoulders of Giants.

Isaac Newton

Replicator relies on the following amazing projects:

All of them helped to build Replicator itself and everything it generates. Kudos to everyone who was and is involved in these projects!

Why Replicator?

Every good project starts with an idea. After the idea comes a name. The name is important because it defines style, sets the voice and direction. Have you ever tried to come up with a good name? That is right! It is difficult.

There are only two hard things in Computer Science: cache invalidation and naming things.

Phil Karlton

As the tool was supposed to bootstrap projects from a predefined skeleton, names like builder, generator, etc. were considered. However, all of them were plain, flavorless, and boring. Seeking inspiration, I looked to Star Trek as a potential source and remembered the replicator.

In Star Trek a replicator is a machine that can create things.

Wikipedia

That was it! “…a machine that can create things.”

Our tool was also going to be about creating things. Like the replicators from Star Trek, it should reproduce the same structure every time you ask. The name fit perfectly and brought along some style and a familiar background.

Soon you will discover how the name influenced the tool even more than originally expected ;)

The final touch

When the replicator init command was ready, it seemed like the project was done. However, I could not shake the feeling that something was missing. Something like a cherry on top of a cake. Something that would distinguish Replicator from other tools and add individuality.

As the name had already set the main theme of the project, it was just a matter of asking the right questions to find inspiration. In Star Trek, replicators were used by many different characters, but one person in particular engaged my creativity. One of the greatest characters and a true leader - Jean-Luc Picard, captain of the starship USS Enterprise (NCC-1701-D). His famous phrase “Tea. Earl Grey. Hot.” was exactly what I was looking for :) From that point, it was obvious that Replicator was missing a tea command.

Every time you run replicator tea, it prints a random quote from Captain Picard and ASCII art of a cup of tea.

> replicator tea

   ((((
   ))))
 _ .---.     There is a way out of every box, a solution to every puzzle;
( |`---'|    it is just a matter of finding it.
 \|     |
 : .___, :   -- Jean-Luc Picard
  `-----'

Only inspiring and positive quotes made it to the final list. I was very excited about the results and spent a few more minutes adding one final touch: the --cold option.

> replicator tea --cold

 _ .---.
( |`---'|    Things are only impossible until they are not.
 \|     |
 : .___, :   -- Jean-Luc Picard
  `-----'

I truly believe that software engineering is a creative activity. It is never boring, and even tools that are supposed to move files from one place to another can have style and personality, and make people smile :)

Engage! 🖖

A Look Back at Hack Day #9


Have a look inside one of Babbel’s most exciting initiatives, Hack Day, with the official aftermovie and an interview with Mo Mourad, one of the members of the winning project.


Babbel is a learning company, both inside and out. We have challenged ourselves on this front in many ways, but one of our favorite learning exercises is our annual Hack Day, which we hold for our Product and Engineering departments. Over the course of 8 hours, over 100 Software Developers, Designers, Language Experts, and Product Managers team up to build a new feature or product for Babbel’s users.

With drinks and snacks fueling the teams at our off-site location, the dedication to each project was felt in the room throughout the day. Once the countdown ended and the hacking came to a stop, each team had to demonstrate their project to our CPO, CTO and fellow colleagues, who then proceeded to vote for their favorite one.

We caught up with Mo Mourad, a Senior Product Manager on the winning team, to get an inside look at the 9th edition of Hack Day.

Congratulations on winning Hack Day! Can you tell us about your Hack Day project?

At Babbel, we always strive to help our learners have real-life conversations and be courageous enough to actually use the language they are learning. This is what inspired us to build an Alexa skill as a companion native speaker with whom learners can talk and practice the language they are learning with Babbel. We completed an MVP (Minimum Viable Product) that allows learners to review the vocabulary they have learned. We focused on review because it is a crucial part of learning a new language: it helps learners retain the knowledge and keep it fresh in their minds.

Hackday winner team

What did you enjoy most about Hack Day?

What was most amazing about Hack Day was the innovation that came out of it: shooting for the moon, thinking big, and breaking all the boundaries while still aligning with the product objectives and company values. I also enjoyed working with some of the smartest and most brilliant minds at Babbel. Plus, the whole day had an amazing vibe, driven by everyone’s passion to build a great product for our users!

What was the biggest challenge for you and your team while doing this project?

The biggest challenge for us was building a new way of learning a language just by using voice. To achieve this, we needed to build a VUI (Voice User Interface) that is responsive, empathetic, and has a positive personality that encourages users to keep going even when they hit bumps in the road.

In the evening part of Hack Day, every team presents the project they have been working on all day. When we walked on stage to present our demo, the room was packed, and everyone was expecting to hear a response from a boring robotic voice. My colleague Fred from Engineering started talking to Alexa in Spanish. First, he gave her some right answers, and then he gave her a wrong one. She replied with a funny “oops”; he tried again and answered correctly, and this time he got praise from her for getting it right! Everyone in the room laughed, and every time Fred made a mistake, people cracked up at her comments. That was the moment I knew the audience loved our demo, because it felt just like a real language learning buddy!

Hackday winner demo

What have you learned from completing this project?

That it’s all about execution. We wanted to end Hack Day with a working product, and we did! We planned it thoroughly, anticipated the potential risks, set a goal, brought together a diverse set of skills and knowledge, and, thankfully, managed to achieve it. Even though we used several technologies and different programming languages, that was never a barrier for us.

Were there any other projects that inspired you or stood out to you?

I loved all the learner-centric ideas that were based on actual user requests, such as Babbel Notebook: a personal notebook for curating vocabulary, phrases, and other learning resources from within Babbel as well as the wider web; learner-curated learning resources using a Babbel-designed Chrome browser extension. Another great idea was Babbel Snap: just snap a picture with your phone, and Babbel will recognize the items in the image, translate them, and add them to your vocab list!

Becoming a Femgineer at Babbel and relocating to Germany


This interview is part of our blog series on Femgineers (female Engineers) at Babbel. In this installment, we talk to Pallavi Subramanya who started her journey with us just a few months ago, moving from India to Germany for her position.


What’s your name, where do you come from and what is your title at Babbel?

My name is Pallavi and I am from Bengaluru, India. I joined Babbel as a Backend Engineer in June 2019.

Pallavi Subramanya

How did you get to Babbel?

I studied Computing Science at university and was always ambitious about going into Software Engineering. After starting my career in 2013 with Rails Girls Summer of Code, a global fellowship program for women and non-binary coders, I knew that working in an inclusive environment was important to me. The fellowship gave me an official introduction to web development and served as a stepping stone to landing my first job at a start-up in Bengaluru, Bang The Table.

I moved on to Decision Resources Group, a healthcare research and data company, which gave me experience and insights into a breadth of business types. Shortly after, I was blessed with a baby - and was lucky enough to take some time off work to care for my child whilst planning my next career move - which is where Babbel comes in. What really appealed to me about the organization was that it is parent-friendly and offered a new challenge in my field of expertise. The flexible working hours, the family room for kids, and the child care solutions have made my life so much easier - and the job isn’t bad either! The relocation process also didn’t stand in the way of my move to Germany. The company welcomes expats from all across the world to its HQ here in Berlin, and I am proud to be working with 750 people from over 50 different nationalities.

Tell us a little bit about your work at Babbel. What do you do exactly?

As a Backend Engineer at Babbel, I am responsible for coding and improving the server applications and databases that, when combined with front-end code, help to create a functional, seamless experience for the end user. I also research industry trends, use new techniques to create and improve back-end processes and code, and work with other experts to design a better program overall.

To work at Babbel, you took a big step by moving to Berlin. How did the relocation take place?

Of course, relocating from India to Berlin seemed like a big challenge. Naturally, I had many questions: What’s next? How do I go about sorting my visa, travel, accommodation? Luckily, Babbel had answers to all of these questions and a relocation package was also included as part of my job offer. Babbel provided me with all the necessary information for the visa process and having the relocation funds upfront helped me to arrange short-term accommodation, as well as travel to Berlin. On the day after we arrived in the city, Babbel’s People and Organization team helped me with the somewhat complicated bureaucracy for newcomers in the city. They had booked an appointment at the local registration office, helped me translate necessary documents and even assisted with speaking to the German office clerks. They made everything super easy! I was registered on the same day and sorted everything from my Tax ID to health insurance. I started my job at Babbel with a warm welcome, learning about the company and its many different departments on my Welcome Day where I was treated to cool Babbel swag!

What do you like most about working in Berlin?

Living and working in Berlin has taught me the importance of work-life balance. People are productive and focused on their work during working hours - but once the day is over, it’s really over.

You are part of the Femgineers at Babbel. What do you think are the biggest obstacles facing women in the industry?

There is still a lack of role models in the industry, particularly when it comes to female Engineers in senior roles. Creating a more equal workplace means attracting more women to the industry, which is why we need more females in top positions to offer guidance and support to young talent.

For female Engineers with children, it can be natural to feel like you’re struggling alone, especially if you’re the only mother in the room. What has eased this feeling is Babbel’s flexible working hours and help with child care. Whilst increasing importance is being placed on such offerings in more and more organizations, the industry still needs to step up, notice the gap and take parenthood more fully into account.


Juniors' Guide to Pull Requests


Guidelines on confidently reviewing senior team members’ code without feeling like a total newb.


You sit down at your desk in the morning, and the first thing you see is a pull request from a senior that has been sitting there, waiting for you to review it.

“No reviewers approved yet. Why am I even added as a reviewer? I don’t know what I’m doing. She (who created said pull request) obviously does. She has been here so long and practically wrote the entire codebase. I’ll just wait until somebody else approves it first.”

We've all been there, right? The daunting feeling of 'this isn't my place to comment'. What if the pull request gets merged, there's a bug that affects hundreds or even thousands (millions?) of users, and my name is on the list of approvers?

As a junior I definitely experienced that internal conflict, but at some point in my career, I realised there is always room to contribute - even one little thing. If you don't feel comfortable commenting on or approving a pull request, you can still learn from it. In my experience, I would start by checking out the branch, changing something, seeing what changed, debugging, getting a value I didn't expect… hmmm… a bug?

“Did I just find my first bug that’s not in my own code?”

Nope, perhaps I just didn't understand what the code was meant to do. But why didn't I understand what the code was meant to do? Going deeper, I would come across a variable that is named, well, something incomprehensible. Ok, so at this point you should hopefully see where I'm heading with this, without me giving the play-by-play of my train of thought when looking at a (usually more senior) teammate's code.

The thought of compiling a list of things to look at in pull requests was inspired by a previous company I worked at, where it was aimed more at new joiners who didn't have context on the codebase yet. As a more junior or inexperienced team member, there are still items you can view and confidently comment on, even if you feel too intimidated to actually approve a pull request. These are things that don't necessarily require you to fully comprehend the surrounding context of the entire feature. Use them simply as a starting point - a comfortable platform from which you can initiate comments, questions, or suggestions.

Some of the points below are self-explanatory, but they will hopefully give you the confidence to actually start looking at the code in order to learn and, in time, contribute. Note that the examples below are overly simplified for effect.

RM: Readability & Maintainability

F: Functionality

D: Design

Things to look out for

  1. Spelling mistakes or typos (RM)

  2. Agreed upon naming conventions (whether you’re following a style guide or just based on a team agreement) (RM)

  3. If the naming of functions and variables is descriptive and makes sense (nothing like dom for dayOfMonth; see the sketch after this list) (RM)

  4. Possible runtime errors (.filter on undefined in JavaScript for example) (F)

  5. Commented code that should be removed (RM)

  6. Adhering to style guides in terms of file/folder structure and naming convention (RM)

  7. Tests are included (RM)

  8. Do the names of the tests make sense (is it clear what is being tested)? (RM)

    The test below clearly has a mistake in its name: we're not testing whether Berlin is in the database, but checking for Vienna.

    test('city database has Berlin', () => {
      expect(isCity('Vienna')).toBeTruthy();
    });

  9. Formatting - in case there are no linting rules to auto-format / break builds (RM)

  10. DRY (Don’t Repeat Yourself) (D)

    Sometimes it's easy enough to spot this when you see methods with similar-looking code. See if you can figure out whether the code can be moved into a helper method, for example, or simply ask if the developer can explain the reasoning behind the duplication (very often the answer is 'Oh, I forgot I did the same thing in the other place', or 'Yeah, I was getting to that and then… well, I didn't').

  11. Single responsibility (D)

    This may not always be so easy to spot as a junior, but it could, at least, prompt you to ask a question. A very simple example: if a function is aptly named getUsernameById, you could assume that you will be passing a user ID and getting a username in return. Does this function also update a different record with the user's username? If so, big question mark (see the sketch after this list).

  12. The code is in the correct place in the file structure (D)

    If it's related to an Order, is it in the OrderService?

  13. Exception error messages are understandable (RM)

    Does the message actually describe the error, rather than just stating that an error occurred?

  14. If && and || are used correctly according to desired logic (F)

  15. Check that the relevant documentation has been updated/added (RM)

  16. Take one block of code and dig into it until you fully (like, fully) understand what it is meant to do. If you don't, add a comment to the pull request and ask the creator of the pull request to explain.

  17. Ask questions! Be curious and ask the developer to explain why they did what they did, out of pure curiosity and to learn about certain design decisions. This may not affect the outcome of the pull request, but having to explain their intention and reasoning could prompt the developer to consider doing it in a different way. Don't be shy to ask the creator of the pull request to walk you through it offline.

  18. Check if the flow of the code makes sense. If you, even as a junior, can’t figure out what is going on, it may be a hint that the code has room for improvement.
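
To make a couple of these points concrete, below is a contrived Ruby sketch (every name in it is invented for illustration) of the kind of code you could confidently comment on: the naming smell from point 3 and the single-responsibility smell from point 11.

require "date"

# Point 3: what does "dom" mean? Suggesting "day_of_month" in a review
# comment is an easy, valuable contribution.
dom = Date.today.day

# Point 11: the name promises a read-only lookup, but the method also
# mutates state - a surprise side effect worth a big question mark.
def username_by_id(id, users)
  user = users.fetch(id)
  user[:last_looked_up_at] = Time.now # hidden write behind a "get"
  user[:username]
end

puts dom
puts username_by_id(1, { 1 => { username: "fred" } })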

What seniors can do to help from the start

  1. Ensure that you include a description of the changes you made. It is very useful to have an agreed upon template to include in pull requests, for example: 1. What did you change? 2. Why did you change it? 3. Notes and references.

  2. Encourage the junior to debug and see if they can follow the flow, even if not providing feedback on the pull request.

  3. A bit from the other side of things: when juniors create pull requests, be sure to leave comments questioning certain things, and make sure the juniors understand why they did what they did - not just that "it works". Encourage them to also learn from these comments.

Sources

How to encourage junior developers to participate in code review

What to look for in a code review

How to do more with fewer servers


March, 2020, won’t be easily forgotten. As schools, offices, restaurants and more began to close all around the world due to the COVID-19 pandemic, it seemed like “normal” life was coming to a halt.

While we at Babbel were adjusting our daily office routines for makeshift desks in our living/bedrooms at home, we suddenly began to see a drastic increase in traffic on both our web and mobile applications.

This led to the exhaustion of our servers' resource pool for auto scaling, which meant we had to make a decision: do we increase the maximum number of server instances that can be allocated, or do we optimize the current setup? We chose the latter.

Find out how tuning the auto scaling alarms and switching to Puma threads resulted in an application that is 2 × faster and runs on only one third of the servers.

Traffic almost doubling since beginning of March


Overview

Throughout this post, we will provide some background on the application, its infrastructure, and its auto scaling rules, followed by the optimization plan with all its phases and steps. At the end, we will give an overview of the improvements across 3 key performance indicators: resource utilization, budget, and application performance.


Background

This post is about one of our core applications, which handles account and session management using Ruby on Rails.

Originally, we were running it on AWS OpsWorks with a fixed number of servers. Some time ago, we migrated it to run within Docker on Amazon Elastic Container Service (or Amazon ECS for short) with AWS Fargate instances. This enabled us to use auto scaling, whose configuration we hadn't changed since.

Application

Each instance:

  • had 1 vCPU and 2GB of memory
  • was running a Puma application server in clustered mode with 6 processes and disabled threaded mode - 1 thread per process (since the service was never prepared for thread-safety)

All of them were running within one ECS Service behind an AWS Application Load Balancer (or ALB for short), in the default round-robin fashion.

Auto scaling

AWS auto scaling is composed of 3 elements:

  1. Target defines what to scale and the min/max boundaries of the size being adjusted
  2. Policy defines by how much the size of the target should change, and how often it can be triggered
  3. CloudWatch Alarm acts as a trigger for a policy whenever a defined threshold on a given metric is reached

We had the target configured to run a minimum of 15 and a maximum of 32 instances for the given application, with the following mix of policies and alarms (see the sketch after the list):

  1. Slow scale up: change size by +1
    • alarm on maximum service CPU being over 60% within two consecutive checks, each check spanning 5 minutes
    • alarm on ALB p99 latency above 1 second within one check across 1 minute
  2. Fast scale up: change size by +3
    • alarm on ALB p99 latency above 3 seconds within one check across 1 minute
  3. Slow scale down
    • alarm on average service CPU being below 15% within two consecutive checks, each check spanning 5 minutes
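
To make these three elements concrete, here is a sketch of the "slow scale up" combination above, written against the AWS SDK for Ruby purely for illustration (our real setup is managed in Terraform); the cluster, service, and alarm names are placeholders:

require "aws-sdk-applicationautoscaling"
require "aws-sdk-cloudwatch"

autoscaling = Aws::ApplicationAutoScaling::Client.new
cloudwatch  = Aws::CloudWatch::Client.new
resource_id = "service/my-cluster/accounts-service" # placeholder

# 1. Target: what to scale, within which boundaries
autoscaling.register_scalable_target(
  service_namespace: "ecs",
  resource_id: resource_id,
  scalable_dimension: "ecs:service:DesiredCount",
  min_capacity: 15,
  max_capacity: 32
)

# 2. Policy: change the desired count by +1 per trigger
policy = autoscaling.put_scaling_policy(
  policy_name: "slow-scale-up",
  service_namespace: "ecs",
  resource_id: resource_id,
  scalable_dimension: "ecs:service:DesiredCount",
  policy_type: "StepScaling",
  step_scaling_policy_configuration: {
    adjustment_type: "ChangeInCapacity",
    step_adjustments: [{ metric_interval_lower_bound: 0, scaling_adjustment: 1 }],
    cooldown: 300
  }
)

# 3. Alarm: maximum service CPU over 60% for two consecutive 5-minute checks
cloudwatch.put_metric_alarm(
  alarm_name: "accounts-service-cpu-high",
  namespace: "AWS/ECS",
  metric_name: "CPUUtilization",
  dimensions: [{ name: "ClusterName", value: "my-cluster" },
               { name: "ServiceName", value: "accounts-service" }],
  statistic: "Maximum",
  period: 300,
  evaluation_periods: 2,
  threshold: 60,
  comparison_operator: "GreaterThanThreshold",
  alarm_actions: [policy.policy_arn]
)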

The combination of the above caused a fast scale up early in the morning and a scale down only in the middle of the night. Since the traffic increase, we had been running the maximum of 32 instances for longer periods each day.

Because of this, we didn't have any more room to breathe in case traffic kept on growing, which prompted us to revise the above alarms and policies.

Inefficient auto scaling alarms, causing maximum number of instances to run for majority of the day

Budget

The other effect of the increased traffic and constantly running the maximum number of instances was an increased cost of running the application, which almost doubled compared to the baseline from the months before.

Cost of ECS infrastructure has almost doubled as well

Optimizations

What made us think we could squeeze more out of the existing configuration was:

  1. Resource underutilization: average CPU utilization was between 15% and 20%
  2. On average, 75% of the application's time was spent waiting on IO - 90% in the case of the most requested endpoint
  3. Some 3rd-party APIs can be super slow (above 1 second), causing the latency spikes that triggered auto scaling

Plan

To avoid increasing maximum capacity, we decided to revise our auto scaling and application setup. To achieve better utilization and scaling we planned three phases:

  1. Throttle scaling: scale up more slowly and scale down faster, by being more resilient to single latency spikes.
  2. Web server: given the app's IO-heavy nature, test Puma with threads.
  3. Revise auto scaling: try to utilize each instance as much as possible without losing performance.

Charts glossary

All the charts focus on 3 key areas:

  • total number of instances running, indicating how auto scaling is working
  • average CPU utilization across the application
  • p95 latency on the ALB, multiplied by 10 for visibility on the chart, to show whether there's a degradation in performance

Phase 1: Throttle scaling

First attempt

At that moment, a single spike in latency could trigger booting up 3 new instances at once. To avoid this, we increased the number of consecutive checks that had to be above the latency threshold from 1 to 3, for either of the scale up policies.

For scale down, we increased the CPU threshold from 15% to 20% and reduced a single check's duration from 5 minutes down to 2 minutes. This should help us scale down sooner.

The results were more or less as expected:

  • we didn’t scale up to full 32 instances at the beginning of the day
  • we were able to scale down during the first half of the day
  • and we scaled down a little bit sooner after the main traffic went away
  • no visible change in p95 latency

First small improvements in auto scaling, first scale downs during the day

Second attempt

While checking the latency graphs, we noticed that p99 was spiking a lot and was neither stable nor representative. There also wasn't much we could fix, as the vast majority of the spikes were caused by less frequent calls to OAuth providers.

Because of that, we decided to switch our alerts from p99 latency to p95 latency instead. p95 was way more stable, so if it's high, it means the system is slowing down for too many people. With this, we also adjusted the thresholds: from 3 to 2 seconds for the fast scale up policy, and from 1 to 0.7 seconds for the slow scale up policy.

As we did not observe any regression in performance due to the last change, we decided to also:

  • increase the CPU threshold again, from 20% to 25%, for the scale down policy
  • reduce the minimum number of instances from 15 to 8, as during the night we were way below 20% CPU

Results:

  • way better scaling: from 8 to 26 instances, instead of 15 to 32 instances
  • a bit better CPU utilization
  • a bit of latency regression, but nothing significant

The second round of improvements brought more variability throughout the day; the service also no longer runs at the maximum of 32 instances

Third attempt

With the last two improvements, we had a bit more room to think about the CPU patterns we had been seeing so far:

  • quite low average utilization
  • constantly spiking maximum peaks

CPU utilization showed that some instances were constantly doing more work than others, which were waiting on IO. With the amount of IO operations this application had, it meant that request routing with round robin on the ALB wasn't optimal. This could lead to situations where instances still processing long-running IO requests get even more requests assigned, causing high latency spikes.

It turns out that on November 25th, 2019, AWS released a new routing algorithm for the ALB, called "least outstanding requests":

With this algorithm, as the new request comes in, the load balancer will send it to the target with the least number of outstanding requests. Targets processing long-standing requests or having lower processing capabilities aren't burdened with more requests and the load is evenly spread across targets. This also helps the new targets to effectively take load off of overloaded targets.
(AWS / What's New)

We were planning on using it earlier, but it wasn't available in the AWS Terraform provider until March 6th, 2020. Just in time.
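
Enabling it comes down to a single target group attribute. As a sketch with the AWS SDK for Ruby (we did it through Terraform; the target group ARN below is a placeholder):

require "aws-sdk-elasticloadbalancingv2"

elbv2 = Aws::ElasticLoadBalancingV2::Client.new

# Switch the target group from the default round_robin algorithm
# to least outstanding requests
elbv2.modify_target_group_attributes(
  target_group_arn: "arn:aws:elasticloadbalancing:eu-west-1:123456789012:targetgroup/accounts/abc123",
  attributes: [
    { key: "load_balancing.algorithm.type", value: "least_outstanding_requests" }
  ]
)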

After enabling the new algorithm, the results looked good in terms of latency, but worse in terms of auto scaling. As it turned out, the timing of our deployment was unfortunate: we deployed during one of the attacks we were handling at that time.

This made us rethink the alarm for the slow scale up policy, which was based on maximum CPU. We changed it to average CPU being higher than 40%, instead of maximum CPU above 50%, which was now being hit even more often.

The last thing we adjusted was the minimum number of instances: we increased it from 8 to 10, as auto scaling was behaving unstably.

This yielded good results:

  • p95 got a little bit better than before employing the new ALB algorithm
  • even fewer instances running to support the traffic
  • better CPU utilization - though it still meant that we were scaling on latency spikes instead of CPU utilization

The last point was a perfect indication that we should move on to web server configuration.

Maximum CPU spikes caused auto scaling to run at the maximum number of instances again; after switching to average CPU, we got better CPU utilization and stable auto scaling

Phase 2: Web server

Background

When we switched to ECS, we also switched from Apache with Passenger to Puma. We weren't running with multiple threads: due to the age of the application, we weren't sure if it was thread-safe.

So now we had 3 options:

  1. Increase number of processes in Puma cluster
  2. Switch from processes to threads with Puma
  3. Use a combination of both

We decided to give Puma's threaded mode a try first, for 2 reasons:

  1. We were using an instance with a single CPU, so it didn't make much sense to increase the process count
  2. We wanted to test the app for thread safety anyway 😊

Action

To make sure the accounts application was thread-safe, we (see the sketch after this list):

  1. Went over the Rack middlewares in use, to check that there was nothing suspicious
  2. Switched from using a Redis connection directly to using a connection pool instead
  3. Configured Puma to run only in threaded mode, using the default maximum of 16 threads instead of 6 processes
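
Here is a minimal sketch of steps 2 and 3; the file paths and Redis URL are illustrative, and the pool size mirrors Puma's default maximum of 16 threads:

# config/puma.rb - threaded mode only: a single process, up to 16 threads
workers 0
threads 0, 16

# config/initializers/redis.rb - wrap the previously shared Redis client
# in a pool so concurrent threads don't compete for a single connection
require "connection_pool"
require "redis"

REDIS_POOL = ConnectionPool.new(size: 16, timeout: 5) do
  Redis.new(url: ENV.fetch("REDIS_URL", "redis://localhost:6379"))
end

# Usage elsewhere in the app:
# REDIS_POOL.with { |redis| redis.get("session:abc") }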

With integration tests doing well in the staging environment, and the application not showing any unexpected errors for a few days, the decision was made to do a test run in production.

The results were more than satisfying:

  • huge drop in latency across the board: p99, p95, p90, p75
  • a bit fewer instances supporting the same traffic
  • no real change in average CPU utilization
  • stabilized maximum CPU, with fewer spikes

After this, we also tested a combination of 3 processes in cluster mode with 8 threads each, but without any improvement over plain threaded mode - which led to the conclusion that we could safely move on to the last phase.

Decreased number of instances required to run the app during load, more importantly - huge drop in latency

The accounts application is exposed as a gem to other services. The majority of services already leverage Datadog Application Performance Monitoring with distributed tracing to know how the various layers behave. This allowed us to see the impact on the gem side alone, and thus on all the clients utilizing it.

Latency on the gem dropped across the board

Phase 3: Revised auto scaling

As the application was able to handle way more traffic, we decided to utilize the CPU better, mainly by increasing the auto scaling alarm thresholds:

  • for scale up, we increased the average CPU threshold from 40% to 55%
  • for scale down, we also increased the average CPU threshold, from 25% to 40%

This yielded the nice result of running with a maximum of 8 instances and a minimum of 5. The only downside was a higher p95 latency, which is why we decreased the CPU thresholds by 5%. The minimum number of instances was reduced down to 2 - so if traffic comes back to the February baseline, even fewer instances will be used to support it.

The adjusted rules made the application utilize the CPU better and run on fewer instances

Summary

The initial work on optimizations brought some required headroom for scaling and better cost efficiency.

The switch to threaded mode improved and stabilized performance, which contributed to the final cost savings - mainly due to the nature of the application, which spends most of its time on IO operations.

Auto Scaling & Resource Utilization ✔️

At the beginning, we were utilizing from 15 up to 32 instances; now we run from 4 up to 11 instances, which is 3 × fewer.

Comparison of instances running for a week, compared with a month before and difference between them

This means we were basically overprovisioning and underutilizing resources. Now it looks better, with CPU utilization on average higher by 25%.

Comparison of CPU utilization for a week, compared with a month before and difference between them

Budget ✔️

At the moment, with traffic still being higher than at the beginning of March, the ECS cost is:

  • 4 × lower compared to the cost from before the optimizations
  • 2.4 × lower compared to the cost from before the traffic increase

Daily ECS cost from February to May

Performance ✔️

With performance, we didn't want it to worsen. This went two ways:

For the application's clients it looks way better, as we reduced the latency on the ALB:

p95 is on average faster by 250ms (259ms instead of 513ms).
Comparison of weekly p95 latency on the Application Load Balancer with a month before, and the difference between them

p99 is on average faster by 400ms (416ms instead of 823ms).
Comparison of weekly p99 latency on the Application Load Balancer with a month before, and the difference between them

Within the application itself, latency worsened by around 32%, or ~60ms.
Comparison of weekly p95 latency in the application alone with a month before, and the difference between them

The performance gained on the ALB compensates for the regression at the application level, as in the end the application is 2 × faster.

There's still a possibility that it could be improved further by trying Puma's threaded mode together with clustered mode, for example by using 3 processes with 8/16 threads each on an instance with 2 CPUs.

Tech Stack of Babbel


The Tech Radar is a tool to inspire and support engineering teams at Babbel to pick the best technologies for new projects. It provides a platform to share knowledge and experience in technologies, to reflect on technology decisions, and to continuously evolve our technology landscape.

We also use the Tech Radar to inform prospective employees about the technologies we use at Babbel. After frequently getting the "Which technologies do you use?" question during interviews, we decided to take our tech stack public. Therefore, we forked Zalando's MIT-licensed TechRadar and modified it for Babbel.


Another use case of the Tech Radar is to inform new employees who are joining Babbel in a couple of months, because of notice periods, relocation and so on. This way, our new joiners have a chance to look at the technologies we use, and they might choose to prepare for their new position at Babbel.

Babbel's Tech Radar can be found in the Engineering and Data Engineering categories of jobs.babbel.com.

TechRadar

And it can directly be accessed from https://jobs.babbel.com/en/tech-radar/.

Adoption Levels

The TechRadar includes 4 different adoption levels: Adopt, Trial, Assess, and Hold.

  • ADOPT: Technologies with a usage culture in our Babbel production environment, low risk, and recommended to be widely used.

  • TRIAL: Technologies that we have seen work with success in project work to solve a real problem; first serious usage experience that confirms benefits and can uncover limitations.

  • ASSESS: Technologies that are promising and have clear potential value-add for us; technologies worth investing some research and prototyping efforts to assess impact.

  • HOLD: Technologies not recommended to be used for new projects. Technologies that we think are not (yet) worth (further) investment. HOLD technologies should not be used for new projects, but usually can be continued for existing projects.

Categories

Tech Radar represents our tech stack under 4 different categories:

  1. Languages
  2. Frameworks
  3. Infrastructure
  4. Data Management

1. Languages

The Languages category includes the programming languages that have been evaluated by Babbel. Each language mentioned in the radar has a specific use case. For example, we use Swift to develop the Babbel iOS application and Kotlin for our Android app. Python is a popular choice among data engineers and teams working on machine learning and recommendation systems. Ruby is one of the most dominant languages in Babbel's back-end services, and most of our content-delivery parts are written in Ruby as well. JavaScript, on the other hand, is used for both front-end and back-end applications, depending on the project type and the respective team's expertise.

We used CoffeeScript in the past, but with the evolution of modern JavaScript, we decided to stop using it and migrate our services to ES6 or newer versions instead. Java at Babbel had a similar story and moved to hold in favor of Kotlin.

Programming languages such as Golang and TypeScript are still in the trial stage; however, they are actively being used in production. The move from trial to adopt usually takes some time, since we want to observe the relevant system parts (for example, their performance) for a period of time.

2. Frameworks

In the Frameworks category, the most eye-catching entry is probably the only trial item: React Native. For some time, we have been experimenting with React Native to evaluate possible use cases within the Babbel mobile apps. The decision-making process is still ongoing, but we already have some features written in React Native in production.

ReactJS is the most powerful framework we use for building our front-end, and Ruby on Rails is actively in use, mostly for back-end purposes.

3. Infrastructure

As can be seen from the TechRadar, most of our infrastructure is hosted on AWS. We try to benefit from cloud services as much as possible. We have also heavily adopted the Infrastructure as Code (IaC) approach with Terraform. You can check out the babbel organization on GitHub for the open-source Terraform providers developed by our team members.

4. Data Management

Babbel's data is stored in multiple places, depending on the data type and the project's needs. For caching purposes, we use Memcached and Redis. Relational data is stored in MySQL database instances provided by AWS RDS. DynamoDB also has various usage scenarios within Babbel.

Closing Words

In this post, I tried to give a broad overview of the technology choices at Babbel. However, Babbel is a living organization, and the choices mentioned here are open to changes. For the most accurate picture, please don’t forget to check our TechRadar.

This was our experience with TechRadar. Feel free to leave a comment if you would like to share your experience with similar tools and technologies, and let us know if this post or the TechRadar helped you during your job application!

Challenge your initial ideas


Recently I was working on a small command-line tool. The tool was supposed to take some user input and then render this information into a file using a predefined template. Let’s skip the user input handling and focus on the template rendering because it reveals an interesting discovery.


First solution: Take a file path as an argument and render the template into that file

While thinking about the implementation, I discovered the following (“classic”) questions:

  • Shall the tool create the file if it doesn't exist?
  • Shall the tool overwrite the file when it exists? Shall it ask for permission before overwriting? Would it make sense to add a --force flag?
  • How to inform the user that the given file path isn't writable?
  • How to test all of the cases above?

That sounded like a lot of questions to answer and many cases to handle. I wanted this tool to be awesome 😊, so I decided to ask myself another question, instead:

“Is it really necessary to go this way?”

Simplified solution: Render the output into the standard output stream

While still satisfying the requirements, the solution would give the following benefits:

  • Automatically dismiss all questions from the first solution, as no file handling would need to be implemented - and none of it would need to be tested
  • The user would have the option to preview the rendered information without needing to create a file
  • The user would be able to redirect the output to any file and have full control over it (destination, permissions, etc.)
  • The tool would play well with other CLI tools
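
As a minimal Ruby sketch of this idea (the template and argument handling are invented for illustration), the tool simply writes to $stdout and leaves all file handling to the shell:

#!/usr/bin/env ruby
# render.rb - render user input into a predefined template, print to stdout
require "erb"

name = ARGV.fetch(0, "world")
$stdout.puts ERB.new(DATA.read).result_with_hash(name: name)

__END__
Hello, <%= name %>!

The user stays in full control of the destination: ruby render.rb Alice previews the output, while ruby render.rb Alice > greeting.txt redirects it to a file with whatever path and permissions they choose.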

This post isn't suggesting that you send the output of every command-line tool to the standard output stream; rather, it's encouraging you to challenge your initial ideas. Sometimes it's possible to achieve more by actually doing less 🙂

Happy coding! 🖖

Integrating React Native with Babbel's native mobile apps


Back in mid-2018, my team discontinued a project to focus on our core product: the Babbel applications on Web and mobile. This change might sound easy, but we wanted to support all Babbel users, most of whom were using our mobile apps - and we were a web-only team. ReactJS was our frontend stack, and we had zero experience with native mobile development and its main languages (Kotlin and Swift).


We started an investigation phase to find the best approach for us to contribute to the mobile apps. With our lack of experience in native, we mainly looked for a technology that could integrate new features into an existing native app, while others can still be built and maintained natively.

The most promising candidate for us was React Native, a popular open-source framework introduced by Facebook in 2015 to enable efficient cross-platform mobile development. We also considered learning native mobile languages and other options in the market like Flutter or Webviews. In order to make a transparent decision, we created an Architectural Decision Record to collect pros and cons of every option and feedback from peers, especially colleagues who were native developers. One of the reasons we rejected Flutter was that learning new technologies would have stopped us from producing any work for an extended period of time. In addition, we believed any integration with Webview would have been cumbersome and might eventually come at the price of performance. In the end, we chose React Native for performance and productivity reasons.

React Native’s approach, “learn once, write anywhere” fitted our needs. Becoming a team with cross-platform end-to-end ownership, while bringing our web stack knowledge to mobile platforms, was a very promising outlook.

Other companies’ experiences

During the research phase, we carefully read the article written by Airbnb about their experience with React Native.

Airbnb mentioned multiple downsides in the article that didn't apply to Babbel's setup or to React Native's latest status.

React Native has evolved and is now much more mature than in 2018. This includes improvements like the addition of Hermes (the new JavaScript engine), 64-bit support, and a new JavaScriptCore. Performance has also improved drastically.

After considering other companies' key successes and failures, like Udacity's, and after multiple discussions with our teams, we created a proof of concept. It helped us to verify our assumptions, assess the complexity, and get some buy-in from the native developers, who were still skeptical of the technology. After these tasks were completed, we were ready to go.

Looking for the least intrusive solution

Our first premise for contributing to two established mobile repositories was to be as unintrusive as possible. These two native monorepositories were shared with several native mobile teams, and our intention was to minimize the impact on their workflows. The official way to integrate React Native components into a native application starts by introducing Node to the mobile project. An even bigger downside was the need to merge our Android and iOS repositories and move the entire native projects into the respective folder (android or ios).

As we were not sure about the end of this adventure, we preferred to avoid any change in our process or our Continuous Integration flows.

Despite an apparent lack of information on the Internet, we eventually found another alternative: Electrode Native. This tool, developed by Walmart Labs, packages the React Native application into Containers which are versioned Android Archive (AAR) libraries for Android or frameworks for iOS.

Like any other native library, the React Native app could now be injected as a dependency into the native projects.

Dependencies.gradle file on Android:

dependencies {
  implementation 'com.github.lessonnine:react-native-app:1.0.0'
}

Podfile on iOS:

pod 'LearnerTools',
    :git => 'git@github.com:lessonnine/react-native-app.git',
    :tag => '1.0.0'

You might have deduced from the injection example that we use a single artifact for all our React Native features. While Electrode provides the concept of Mini-app, a way to encapsulate multiple apps, we decided to simplify the approach by using a single mini-app in order to share logic between them easily. For this, we provide a prop to the React Native instance which is used by the React Router to show the right component.

Below, you can see our current React Native features: the "Can-do Placement Test" (left) and the learning activity (right). Both inflate the React Native view as a Fragment in an Android Activity, and as a UIViewController in iOS. This approach allows us to integrate components that occupy the entire screen (left) or act as a widget alongside other native components (right). For a better understanding, the React Native components are framed within a green area.

Two screenshots displaying concrete examples of embedded Mini-apps in the Babbel app

Setup and learning curve

The general idea is that working on a standalone React Native app prevents you from having to deal with native concepts like lifecycles, threads… however, this is not the case when the app is hybrid (Native plus React Native).

Our first contact with the native side was the presentation layer. For instance, on our Android app, we used fragments or activities to handle the UI. As React Native had to be rendered on one of these elements, we needed to learn how and when they were instantiated. In order to start a React Native component, we need to gather data from the app state before passing it to the component. As we were following the Clean Architecture we needed to work with the domain layers and its repositories. For these cases, we needed some base native language knowledge.

Independent of the technology we used, our components were part of a broader app, and they needed to communicate with the native side for essential things such as navigation. For this purpose, we introduced the bridge. This was a layer implemented on both sides, which allowed communication between native and React Native via events or requests.

We used the bridge for a couple of scenarios, like navigating to a native screen after pressing a button in React Native, reporting an error so it can be tracked by the native Crashlytics implementation, or storing data in the native app state.

Dependencies locked by Electrode

Our React Native story has become quite coupled with Electrode Native, as we have a strong dependency on it to build our artifacts.

Electrode Native made it possible for us to inject our React Native code into native in an easy way. In the beginning, it kept us from having to touch much native code at a time when we were still not familiar with it. While it made the process quite smooth, allowing us to proceed with the integration, it did come with a few disadvantages, too.

It was an extra layer with its own way to declare dependencies, which unfortunately blocked us from updating important dependencies like react-native in the past. Although it did not happen often, we can highlight here one of the dependency indirections we faced.

At Babbel, we use Lottie to create animations and share them between all our platforms. This library was not supported by Electrode Native, so we had to open a pull request to fix it. In addition, we could not use the same Lottie version as our native apps. At the time, Electrode Native was blocking us from updating react-native to version 0.60, and this version was required by Lottie. Hence, we were forced to downgrade our native Lottie dependency and change some of our animations across the entire app as they were not working anymore with an older version.

Nevertheless, we would still recommend Electrode Native for a non-invasive integration. The documentation was great and the development team were super responsive to our questions and pull requests.

Offline support and moving business logic to native

The first feature we implemented in React Native, the can-do placement test, did not need much communication with the native side. We only needed an attribute to be sent to a backend service. Developing the next mobile feature sounded easy for us. At this point, we did not foresee the mobile constraints and the challenges ahead.

One of the main differences between our mobile and web apps is offline support. Babbel offers a rich learning experience, allowing users to learn a language wherever and whenever they are. This is a challenge for our mobile platforms, as the user should get the best experience possible when mobile data is poor or not available.

For this to happen, the app needs to store content in advance, cache API calls in order to synchronize with the backend as soon as the user is online again, and apply safety nets to cover all edge cases.

Initially, our tracking events were fired from React Native, making them fragile on a poor connection. React Native's code execution was tied to its view being displayed; if the view was detached, no retry or fallback strategy was possible. Fortunately for us, all of these functionalities were already part of our native apps, so we refactored our code to communicate with the native implementation through the bridge.

We later applied this solution to API calls, too. We did not want to reinvent the wheel around authentication or token refreshing, so we let the native app handle network requests by communicating the endpoint and payload to the native side and waiting comfortably for the response.

Hiring new engineers is hard

After our first feature was implemented, the team wanted to increase its capacity in order to deliver faster and keep collaborating with the rest of the native teams. As we saw, due to the big dependency we had on the native side, an experienced React Native developer might not be the right fit for us. We looked for a unicorn for a while: a native developer with some experience in JS and an interest in working with React Native. This was unfortunately pretty rare, as React Native was still pretty new in the market and not many native developers were looking to switch stacks. After considering our current team's strengths, we decided to give more weight to native knowledge, as this was the area where we were especially struggling. Eventually, we found a suitable candidate: a native developer with an interest in JavaScript and React Native.

Candidate requirements

As part of our hiring process, our candidates had to complete a coding challenge to assess the expected requirements as well as possible. We adapted our original native coding challenge to best reflect the setup the candidate would later be working in. The challenge focused on creating the UI in React Native with a bit of logic, while the API calls and main business logic were located in native. In order to reduce the coding time for our candidates, a basic bridge was already implemented.

And finally, where do we stand today?

You might be wondering what the outcome of this is. Are we happy? Do we regret it?

A year and a half after incorporating React Native into our stack, these are the results:

The good:

  • As a Web team, we were able to bring value to our mobile apps. We shipped a few features that otherwise might still not exist today, as the mobile teams had other priorities. Additionally, the can-do placement test was one of the most impactful features in 2019.
  • The vast majority of the React Native code is shared between iOS and Android.
  • Better developer experience compared to regular native development with instant page reloads.
  • Opens up to a mixed stack development, giving the autonomy to each team to choose whatever tech stack best suits their needs.
  • More possibilities to build a feature on mobile and web at the same time, reusing bigger portions of code.

The bad:

  • As we moved a lot of business logic to native, we ended up doing more native work than expected.
  • Lots of context and 3 repositories to work on made us slower.
  • Any global UI change (a button redesign, for instance) now affects 3 repositories instead of only 2.
  • We had a long setup phase due to our hybrid approach and lack of native knowledge.

The ugly:

  • It’s a risk that our team is the only one at Babbel having knowledge about React Native. Once the ownership of a feature changes or people move on to new jobs, things might get rewritten.
  • When a feature has a heavy dependency on logic from native, React Native might only be used for the UI layer. Otherwise, most of it can be created in React Native.
  • The testing process is slower than native. We need to create the artifact, inject it into the native app and create the final artifact for our QA team. The process can be automated although it requires some work to achieve that.

Although there are a lot of pain points, we remain positive, as we have overcome the main infrastructure work. While we might implement features that require significant native work fully natively, we still see the potential for fully React Native features. Standalone features like learning tools or simple games will benefit from it and allow us to move faster.

Babbel's hybrid lesson player


Babbel for iOS has evolved into a hybrid application. Hybrid applications combine native and web application parts; the latter allow us to reuse code for specific parts across platforms. If you're using an iPad for learning with Babbel, you have already been in touch with those: for a few months now, all trainers for iPad have been implemented with web technology, mounted in a WKWebView. It's the same code that is executed when using Babbel in a desktop browser.

This article gives an idea of why and how we introduced web technology in our mobile apps, starting with the iPad, with the goal of bringing it to iPhones as well as Android devices.


Introduction

The composition of learning content, didactical concept, and user experience is one of the many domains inside Babbel's tech department. Being a learning company inside and out, each of these disciplines evolves continuously. To let our learners benefit from this in the most efficient way means reducing uncertainties when it comes to implementation strategies. As we cater our service to multiple platforms, this premise moved the focus onto reviewing our trainers. Unless learners are provided with the same UX on each platform, it is difficult to ensure an equal learning experience with our service.

Maintaining trainer implementations across different environments and platforms has benefits and trade-offs. Native implementations of trainers ensure better-performing animations and user interaction, whereas a web-based implementation erases almost any maintenance overhead, thanks to a shared codebase and a standardized runtime environment. With HTML5, web technologies provide us with a good standard for user interactions and device APIs.

These aspects aside, as Babbel has been growing for more than ten years now, the engineering department has also built up a legacy of its own. Speaking from my experience, the more you identify with the mission of a project, the more you become attached to the work you have put into it: the code you write, the concepts you draft, the time you have spent on it.

It is not easy to let go, but we decided to do so.

Why - Origin / previous stack

Babbel iOS

Like all early iOS applications, the Babbel app's source was written in Objective-C. With growing demands on engineering and new engineers joining Babbel after the introduction of Swift in 2014, Swift's share of the source has grown to a point where only a relatively small set of trainers remains as an Objective-C legacy.

Babbel Web

For learners that prefer using a desktop browser for learning, trainers were implemented with a proprietary framework in JavaScript (ES5) called b3. We started moving to React in 2017, replacing the b3 framework step-by-step. As of today, similarly to iOS, only a small part of the Babbel web application is still implemented with b3.

Motivation

To prepare our app for SplitView, we either needed to refactor the native trainers or port the web implementation of the trainers to the iOS application. Both options would put a halt to the development of new trainers: a task with no immediate value to our learners, but one that would enhance sustainability thanks to the benefits of web standards.

On the other hand, the web implementations of the trainers did not completely cover the features of their mobile equivalents, a state which also made it difficult to predict the learning experience.

Ultimately, by moving to a common code base for trainers, we expected to reduce the overhead of trainer maintenance and to iterate faster on trainer development.

How

Tech scope

The Babbel web application uses ReactJS with Redux. Except for the b3 legacy, its codebase is written in ES6. Webpack is configured to use Babel loaders and style loaders for building bundles, which are either deployed via Amazon S3 buckets or bundled with the iOS application.
To port the Lesson Player's code base to iOS, the data providers, authentication, user input, and layout demanded refactoring. HTTP-based interfaces are bridged via resource handlers on the iOS side, covering data and authentication.

Fail early - Evaluating WKWebView

In order to consolidate our strategy and assess the potential risks of this endeavor, a task force was formed, which we dubbed "Will it blend?". They were given two months to evaluate, in an iterative fashion, the following uncertainties:

  • regression in user experience
  • hardware demands in terms of CPU and memory
  • reliable speech recognition
  • integration test compatibility / required refactoring
  • offline support and file management
  • release process

Those evaluations yielded very useful information and gave a good idea of what we would have to focus on:

  • Handling user input – especially the impact of the software keyboard being presented/animated required adjustments to the web components.

  • Network availability – due to differences in storage size and accessibility, native applications can be expected to allow custom caching mechanisms and preloading, which allows for a different application flow as well. For the Lesson Player integration, it was necessary to emphasize these differences in behavior. Instead of writing to the Lesson API after a lesson is completed, those requests are now delegated to the native application.

Lesson Player: web application flow
Web Sequence

Lesson Player: ios application flow
iOS Sequence

Current state

With our recent iOS release, we reached a big milestone by replacing the native writing review with one based on a web view. Before, web technology was only used for trainers when learning on an iPad. We're looking forward to complementing our experience with similarly successful contributions to the Android platform soon.


Trainers

Learning content is provided in lessons that contain several exercises. A Trainer is a module that contains the logic for how a lesson segment is presented and how user input is processed.

Poor Acceptance Criteria Can Ruin Any Great User Story - A Drawing Activity for Your Team


In agile software development, user stories are everywhere. These informal, natural-language descriptions of the requirements of a software system put the focus on the end-users and emphasize the value that we aim to create for them. Their intention is to facilitate conversations between the development team and its stakeholders.

The strengths of user stories - being user-centric, informal, and brief - happen to also be their weakness. Due to their nature, user stories cannot serve to reach a formal agreement between stakeholders and development team. They remain open for interpretation and may lack details that are necessary for implementation. What’s more, when they follow one of the popular templates, they usually miss the non-functional requirements (such as maximum response time).

This is where acceptance criteria come in. Stakeholders and development team agree on the boundaries of a user story and when a deliverable will be considered to meet the desired requirements. But there’s a caveat: Poorly chosen acceptance criteria can ruin even the greatest user story!


Common pitfalls that I have observed (and, admittedly, have fallen into myself) are acceptance criteria that

  1. are overly and unnecessarily specific,
  2. prescribe solutions, or
  3. go beyond the intended scope of the user story.

They might limit the team’s creativity and flexibility to find the best solution to generate the desired value for the end-users. Or they might allow or even require work that does not add value.

Take this user story as an example:

As a learner, I want to receive a cheerful confirmation after completing a lesson, so that I stay motivated to continue learning.
 
Acceptance Criteria:
- When a user completes a lesson, the URL query string lessonComplete is set to true.
- On the learning path, if and only if URL query string lessonComplete is true, the completion card is displayed.

In this example, the acceptance criteria dictate the implementation of the desired behavior via URL query strings. While this solution might be fine, it prevents the developer from exploring alternative solutions such as using the browser’s local storage or using API calls. Other solutions could potentially be better suited in the particular situation.

Instead of imposing a particular solution, the acceptance criteria should describe the desired behavior:

As a learner, I want to receive a cheerful confirmation after completing a lesson, so that I stay motivated to continue learning.
 
Acceptance Criteria:
- After completing a lesson and returning to the learning path, the completion card is displayed.
- When the user refreshes the browser, the card is still displayed.
- When the user navigates away from the learning path and returns again, the card is no longer displayed.

I have encountered these "user story smells" in almost every team I have worked with. But rather than giving the team a boring lecture (think "Harry Potter and the Poorly Chosen Acceptance Criteria"), I prefer to run a short team activity that allows them to discover these risks themselves. Plus, it's always fun to put away the keyboard and do a little doodling instead!

Drawing exercise

Material

  • 3 flip charts
  • 3 empty sheets of paper per participant
  • colored pencils (but no red) for all participants
  • a timer

Preparation

On each flip chart, write the following user story:

As [INSERT YOUR TEAM NAME], we want to live in a house, so that we can spend more time together.

Add no additional information on the first flip chart.

On the second flip chart, add these acceptance criteria:

- big enough that everyone gets an individual room and we share at least two common rooms
- rooms should be filled with natural light
- door on the ground floor, which we can open but intruders can’t

On the third and last flip chart, add these acceptance criteria:

- 3 floors, brick and mortar
- blue painted walls, red door
- four equal-sized windows on floors 1 and 2,
- two windows on ground floor (one to each side)
- door lock with 4-digit number pad
- garage for 1 car and at least 4 bicycles next to the house

Instructions

Round #1

Ask your team to imagine it’s the beginning of a sprint. They have just pulled in a user story and are now asked to implement it by drawing on the first sheet of paper.

The sprint lasts 2min.

Show them the first flip chart and start your timer.

Ask them to stop as soon as the time runs out.

Round #2

Ask your team to imagine they are redoing the same sprint. This time, though, you show them the second flip chart. The sprint again lasts 2min.

Round #3

For the third sprint, you show your team the last flip chart and set the timer again to 2min.

Sooner or later, they will realize they are missing red pens to meet the second acceptance criterion: the door is supposed to be red.

Let them struggle a bit. When they feel blocked, tell them you need to talk to the Data Analysts first.

Wait some more, then apologize, and claim: “Historical data showed that the color of the door had no statistical significance on the Net Promoter Score of the inhabitants. Hence, you may choose any color to paint your doors.” Try not to smirk.

Demo & Discussion

Pretend it’s the sprint review. Let each team member present their work from all three rounds.

Some drawings from all three rounds.

Ask them how they felt in each round. Lost, guided, confined? I expect you’ll arrive at these or similar insights:

The varying degree of specification of acceptance criteria influences the solution.

What’s next?

Now that your team has learnt to identify common “smells” in user stories, a reasonable follow-up activity is a backlog grooming: Distribute upcoming user stories among the team members. Ask them to check the respective story for these smells. As a team, improve the affected user stories.

It will take some time and the implementation of a couple of refined user stories before you find the sweet spot. What degree of specification provides guidance without restriction and flexibility without ambiguity? Eventually, your improved user stories will lead you to better solutions that serve your users!