
Are you sure the bus line is still listed? Problems with gathering data by scraping a web page with Selenium WebDriver. Use a RESTful API endpoint instead!

You are an automation developer on a team building a new public transportation web application in Java. The app's customer base is located in Southeastern Massachusetts.

The MBTA (Massachusetts Bay Transportation Authority) oversees the subway lines and buses of the Greater Boston area and the surrounding suburbs. The #230 bus line stretches across four cities, connecting a Commuter Rail Station in Brockton, MA (my hometown), a bus terminal for the Brockton Area Transit's BAT bus, and a major subway stop in Braintree, MA.

For this project, you will need to get data from the MBTA for this particular bus line: bus schedules, bus stops, maybe even bus locations. But this raises the question:

  • How do you make sure that the #230 bus line is still listed?

Attempt #1: Manual Testing: Use the MBTA's Web Site

The easiest way to check whether the #230 bus is still listed by the MBTA is a visual check of the MBTA's website.

1) Go to http://www.mbta.com/
2) Select the "Bus" icon

From the MBTA website, http://www.mbta.com/

Test #1: Assert that the "230" entry is in the dropdown for "bus".

3) Choose "230" in the dropdown provided.

4) Wait until you are redirected to the schedule for the 230 bus line at http://mbta.com/schedules_and_maps/bus/routes/?route=230.

Test #2: Assert that the schedule appears.

Schedules and maps from MBTA
We can visually confirm that:

  • Yes, the 230 bus line is listed on the front page
  • Yes, the bus schedule appears... after a bit.

... But what if we want this to be part of an automated test and just want a yes or no answer to the question "Is the Bus Line Listed Correctly?"

Problems with Gathering Data from the UI

Let's say that we use Selenium WebDriver for our test. We write an automated script to:

1) Go to http://www.mbta.com/
2) Select the "Bus" icon
3) Choose "230" in the dropdown provided
4) Wait until you are redirected to the schedule for the 230 bus line

And we wait... and wait ... and wait...

... and wait ...

... and you may be automatically redirected to the bus schedule, or you may not.

Automate this with Selenium WebDriver and Java, and, mark my words, you are going to spend the next month or so trying to get the synchronization correct so the test doesn't time out.

This test will be deemed a "flaky test". Woe to anyone attempting to use this test in their test library.
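To see why the synchronization is so fiddly, here is a minimal sketch, in plain Java, of the kind of polling loop such a test ends up relying on. The names `PollingWait` and `waitUntil` are made up for illustration. In a real Selenium test you would more likely reach for Selenium's own `WebDriverWait`, and the condition would ask the browser whether the schedule page has loaded instead of checking a clock.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a condition until it holds or a timeout expires. */
final class PollingWait {

    /**
     * Returns true as soon as the condition holds. Returns false if the
     * timeout elapses first, or if the waiting thread is interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out: this is where a flaky test fails
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "the schedule page has loaded": becomes true after 300 ms.
        boolean loaded = waitUntil(
                () -> System.currentTimeMillis() - start >= 300,
                Duration.ofSeconds(2),
                Duration.ofMillis(50));
        System.out.println("Schedule appeared: " + loaded);
    }
}
```

Pick the timeout too short and the test fails on a slow day. Pick it too long and every genuine failure costs you that much waiting.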

A better test would skip the web site entirely and get this data from a web service instead.

A Better Way to Gather Data: RESTful API Endpoints

Luckily for us, like many sites nowadays, the MBTA offers an MBTA Developer Portal. Developers of third-party applications can read live production data to see not just the schedules of the buses and trains that the MBTA operates, but also their positions in real time.

http://realtime.mbta.com/Portal/

The MBTA wants to share its data with the public. It provides:
  • A public key you can use to test out the system (but make sure to subscribe for your own key if you are doing more than monkeying around)
  • Documentation on how to set up information queries (See the PDF for the MBTA-Realtime API Quickstart Guide dated 5/2016)


What is the Public API Key?

"Below is the current open Development Api Key. It may change at any time. This key is open for all developers to use in development and testing. DO NOT go into production using this key! Register for an account and request a key of your own on realtime.mbta.com. There is no cost. 

"As of 5/11/13 the open development API key is:  wX9NwuHnZU2ToO7GmGR9uw"


What Information Can You Look Up?

According to the QuickStart documentation, some of the items you can look up are:

  • Stopsbylocation: Which stops are nearest a given GPS location (latitude and longitude)?
  • Routesbystop: Which routes (Red, Green, or Orange Line, or a bus line) serve a given stop?
  • Predictionsbystop: When will the next train or bus be arriving?
  • Alertheaders: Are there any service alerts?

You can see even more information in the MBTA-Realtime API Documentation (V 2.1.3) dated January 4, 2017. Among the queries it lists is one for routes, such as bus routes.

If we want to manually interact with the API to check if the #230 bus line is still active, we can:

1) Go to http://realtime.mbta.com/developer/api/v2/routes?api_key=wX9NwuHnZU2ToO7GmGR9uw&format=json to view all routes.

2) Search the page for "230", and we can see:

{
"route_id" : "230 ",
"route_name":"230"
}

... And, yes, we can see that the 230 bus line exists!
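The same check can give us the yes or no answer we wanted for an automated test. Here is a minimal sketch using only the HTTP client built into Java 11 and later. The class `RouteListedCheck` and its method names are hypothetical, the URL and key are the ones from step 1, and the regular expression is only a stand-in for a real JSON parser. It tolerates stray spaces around the value, like the one in the sample above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Pattern;

/** Hypothetical check: is a given route_id present in the routes response? */
final class RouteListedCheck {

    // URL from step 1 above, using the open development key.
    static final String ROUTES_URL =
            "http://realtime.mbta.com/developer/api/v2/routes"
            + "?api_key=wX9NwuHnZU2ToO7GmGR9uw&format=json";

    /**
     * True if the JSON text contains a "route_id" whose value equals routeId,
     * ignoring spaces inside the quotes. "2301" or "1230" do not match "230".
     */
    static boolean isRouteListed(String json, String routeId) {
        Pattern pattern = Pattern.compile(
                "\"route_id\"\\s*:\\s*\"\\s*" + Pattern.quote(routeId) + "\\s*\"");
        return pattern.matcher(json).find();
    }

    /** Fetches the response body as text, failing on any non-200 status. */
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        String json = fetch(ROUTES_URL);
        System.out.println("Route 230 listed: " + isRouteListed(json, "230"));
    }
}
```

There is no browser, no dropdown, and no redirect to wait for: one request, one answer.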

Coming Up in the Next Article...

We have covered a bit about REST APIs before. Last year we covered:

  • How to use Apache HTTP Components, such as HTTP Get, with Java testing the Stripe API
  • An Introduction to REST APIs: How manual testers can automate using Postman, JSON, and JavaScript

In the next article, we will cover how to use Java, TestNG and REST Assured to interact with the MBTA's API.



Until then, Happy Testing!

-T.J. Maher
Twitter | LinkedIn | GitHub

// Sr. QA Engineer, Software Engineer in Test, Software Tester since 1996.
// Contributing Writer for TechBeacon.
// "Looking to move away from manual QA? Follow Adventures in Automation on Facebook!"


This post first appeared on Adventures In Automation.
