A Windows SSO (for Java on client and server)

A couple of months ago I worked on a single sign-on (SSO) for a Windows client and server written in Java. The scenario was the following:

  • A client made with Java running on Windows
  • A server made with Java running on Windows
  • Both were logged in to the same domain (an Active Directory LDAP)

The question was how the server could get the identity of the client (the name of the Windows account) and, of course, how it could trust this information. If the client just sent a name (e.g. the result of Java’s System.getProperty("user.name")), it could send anything.

The solution to this dilemma (whether to trust what the client sends you) is to use a so-called trusted third party. A trusted third party is an instance that both client and server know and trust. The client authenticates itself to this party and the server can verify requests against it. In the scenario above, the domain of the company (an Active Directory LDAP) is the trusted third party. Each client identifies itself against this domain when it logs in to Windows: its Windows username and password are checked by the domain/LDAP. On the other side, the server also has access to the domain controller and can verify information sent by the client.

The nice thing about this is that the Windows domain is already configured on nearly every machine in a company. Most companies with more than a handful of people will have a Windows domain to log in to. Therefore, an SSO based on the Windows domain will work right out of the box in most cases, and we don’t need any configuration in our Java code since it is already configured in Windows.

Java Native Access (JNA)

To use Windows and the domain controller for authentication, we can use native Windows APIs. To call those APIs in Java, we can use the Java Native Access (JNA) library, which you can find on GitHub at https://github.com/twall/jna and on Maven central:

For example, to get all user groups of the current user, you would do:

Waffle

On top of JNA, there is a library called Waffle which encapsulates all the functionality you need to implement user authentication. You can find it on GitHub at https://github.com/dblock/waffle and also on Maven central:

You can use Waffle to create a token on the client, send it to the server (e.g. over HTTP or whatever) and validate that token on the server. At the end of this process (create, send and validate) you will know on the server who the client is, for sure!

Here is an example of how to identify a client on the server. Note that this piece of code is executed completely on one machine. However, you could easily split it into two parts, one on the client and one on the server. The only thing you would need to do is exchange the byte[] tokens between client and server. I commented the appropriate lines of code.

(By the way, I asked this question myself on Stack Overflow some time ago.)

The only thing that is a little bit complicated with this solution is that you need to do a small handshake between client and server. The client sends a token to the server, which responds with another token, which the client needs to answer again to get the final “you are authenticated” token from the server. To do this, you need to hold some state on the server for the duration of the handshake. Since the handshake is done in a second or two, I just used a size-limited cache from Google’s Guava library to hold maybe 100 client contexts on the server.

The exchanged tokens are validated against the underlying Windows system and its domain.
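The bookkeeping for this handshake can be sketched independently of Waffle. The following is a minimal Python illustration, not the Java code from the project: HandshakeCache and handle_token are hypothetical names, the tokens are dummy byte strings, and a plain size-limited OrderedDict stands in for Guava's bounded cache. Real token validation would be delegated to Windows through Waffle.

```python
from collections import OrderedDict


class HandshakeCache:
    """Size-limited store for per-client handshake state (oldest entry is evicted first)."""

    def __init__(self, max_entries=100):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, client_id):
        return self._entries.get(client_id)

    def put(self, client_id, state):
        self._entries[client_id] = state
        self._entries.move_to_end(client_id)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # drop the oldest context

    def remove(self, client_id):
        self._entries.pop(client_id, None)


def handle_token(cache, client_id, token):
    """Process one client token and return (reply, done).

    Placeholder logic: the first token opens a context and is answered with a
    challenge, the second token completes the handshake. A real server would
    hand the token and the stored context to the security provider instead.
    """
    state = cache.get(client_id)
    if state is None:
        cache.put(client_id, {"round": 1})
        return b"challenge", False
    cache.remove(client_id)  # handshake finished, state is no longer needed
    return b"authenticated", True


cache = HandshakeCache(max_entries=100)
first = handle_token(cache, "client-1", b"token-1")
second = handle_token(cache, "client-1", b"token-2")
```

Because the state is only needed for a second or two, evicting the oldest entries is a cheap way to keep abandoned handshakes from piling up on the server.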

Best regards,
Thomas

Who is using OSGi?

Who is using OSGi? If you take a look at http://www.osgi.org/About/Members you will see more than a hundred members of the OSGi Alliance, including a lot of big players like IBM or Oracle. But let’s do a little investigation of our own with Google Trends.

Google Trends

Google Trends is a service where you can search for a particular keyword and get a timeline. The timeline shows you when the keyword was searched for and how many requests were made. That is a great way to estimate how popular a certain technology is at a given time. Google will also show you where the search requests were made, and that is where we start.

OSGi in Google Trends

If we search for “OSGi” on Google Trends we will see a chart just like the one shown below. As we can see, OSGi is somewhat past its peak and interest in the technology has been decreasing for a couple of years.

But even more interesting is the map which shows where the search requests were made. As we can see, most requests came from China. Germany is in fourth place.

If we take a closer look at Germany, we see that most requests come from the south of the country.

But we can get even more specific and view the exact cities. It is a little bit hard to see in the chart, but you can click on the link at the bottom to see the full report. You will see Walldorf, Karlsruhe and Stuttgart at the top. So what? Well, in Walldorf there is one big player who is not on the member list of the OSGi Alliance: SAP.

We can do the very same with the USA and we will end up in California and Redwood City, where companies like Oracle and Informatica are located.

Best regards,
Thomas

Visualizing KML files in Google Maps

The question is easy: How can I visualize the track of a KML file on Google Maps via JavaScript?

Last evening, I spent about four hours looking for a solution to this (pretty trivial) question. In the end, the code was simple and very short, but it was hard to find good and clear resources about the topic. Google's documentation of their Maps API is very good, but in my opinion it lacks some simple examples to start with. Therefore, here is a super simple example to copy and paste so you can get started right away.

Example

Run it

You can just copy and paste the example. You only have to do two things:

  1. Get a KML file, call it test.kml and put it beside your HTML file in a folder.
  2. Start a web-server in that folder (see below).

In order to run the example, you have to start a web-server. Otherwise you can’t load the KML file, since it is not available through the browser (due to cross-origin requests and all that stuff). A very easy way to do that is with Python. Just open a command line in the folder where your HTML document is and start a server like this:
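One way to do this, sketched here under the assumption that Python 3.7 or newer is installed, is the standard library's http.server module. The helper names make_server and serve are made up for illustration; port 8080 matches the URL http://localhost:8080/ used in this post.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory=".", port=8080):
    """Create an HTTP server that serves the files in `directory` on localhost."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("localhost", port), handler)


def serve(directory=".", port=8080):
    """Serve `directory` until the process is interrupted with Ctrl+C."""
    server = make_server(directory, port)
    print("Serving %s on http://localhost:%d/" % (directory, port))
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()
```

Calling serve() from the folder that contains your HTML file starts the server. On the command line, python -m http.server 8080 has the same effect.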

Then go to http://localhost:8080/ and see your map.

Explanation

The script only does a few simple things. First of all, it initializes a map object using the Google Maps API. This object represents the actual map drawn in the div with the id canvas. Then the script creates a parser object from the geoxml3 library. This library offers a very convenient way to display KML files on Google Maps. However, the support for polylines (tracks on the map) is pretty new, so you have to use the poly branch of the library. Otherwise you won’t see any lines, just your starting point. The library can also parse KML as a pure string. Check their wiki for more information.

Finally



Best regards,
Thomas Uhrig

Writing an online scraper on Google App Engine (Python)

Sometimes you need to collect data, for visualization, data mining, research or whatever you want. But collecting data takes time, especially when the data should be collected over a long period.
Typically you would use a dedicated machine (e.g. a server) to do this, rather than using your own laptop or PC to crawl the internet for weeks. But setting up a server can be complicated and time-consuming, nothing you would do for a small private project.

A good and free alternative is the Google App Engine (GAE). GAE is a web-hosting service from Google which offers a platform to upload Java and Python applications. It comes with its own user authentication system and its own database. If you already have a Google account, you can upload up to ten applications for free. However, the free version has some limitations, e.g. you only have a 1 GB database with a maximum of 50,000 write operations per day.

One big advantage of GAE is the possibility to create cron-jobs. A cron-job is a task that is executed at fixed points in time, e.g. every 10 minutes. Exactly what you need to build a scraper!
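To get a feeling for what the free quota means for such a job, a quick back-of-the-envelope calculation helps. This sketch only combines the two figures mentioned above (the daily write limit and a run every 10 minutes); the variable names are illustrative and the actual quotas may differ.

```python
# Rough write budget per cron run under the free quota described above.
WRITES_PER_DAY = 50000   # free-tier write limit quoted in this post
INTERVAL_MINUTES = 10    # cron-job interval used in this post

runs_per_day = 24 * 60 // INTERVAL_MINUTES        # 144 runs per day
writes_per_run = WRITES_PER_DAY // runs_per_day   # 347 writes per run

print(runs_per_day, writes_per_run)  # prints "144 347"
```

So a scraper that runs every 10 minutes can store a few hundred new records per run before it hits the daily limit.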

But let’s do it step by step:

1. Registration

First of all, you need a Google account and you must be registered with GAE. After your registration, you can create a new application (go to https://appengine.google.com and click on Create Application).

[Screenshot: GAE registration]

Choose the name for your application wisely: you can’t change it later on!

2. Install Python, GAE SDK and Google Eclipse plugin

To start programming for GAE, you need to set up a few simple things. Since we want to develop an application in Python, Python (v. 2.7) must be installed on your computer. You also need to install the GAE SDK for Python. Optionally, you can install the Google plugin for Eclipse together with PyDev, which I would recommend because it makes life much easier.

3. Create your application

Now you can start developing your application! Open Eclipse and create a new PyDev Google App Engine Project. To make a GAE application, we need at least two files: a main Python script and app.yaml (a configuration file). Since we want to create a cron-job, too, we need a third file (cron.yaml) to define this job. For reading an RSS feed we also use a third-party library called feedparser.py. Just download the ZIP file and unpack the file feedparser.py to your project folder (this is OK for the beginning). A very simple scrawler could look like this:

Scrawler.py
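As a rough sketch of the scraping step itself: the following is a stand-alone Python 3 illustration rather than the App Engine handler. It parses an RSS document with the standard library's xml.etree.ElementTree in place of feedparser so that it runs without extra downloads, and the function name extract_items and the sample feed are made up. On App Engine, the same logic would run inside the request handler that the cron-job calls.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real scraper would download this text instead.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def extract_items(rss_text):
    """Return a list of (title, link) tuples for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        items.append((title, link))
    return items


for title, link in extract_items(SAMPLE_RSS):
    print(title, link)
```

Each run would then store the extracted entries in the database, so that the collection grows with every call of the cron-job.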

app.yaml

Note: The application name must be the same as the one you registered with Google in the first step!

cron.yaml

Done! Your project should look like this now (including feedparser.py):

[Screenshot: project structure]

4. Test it on your own machine

Before we deploy the application to GAE, we want to test it locally to see if it really works. To do so, we have to create a new run configuration in Eclipse. Click on the small arrow next to the green run button and choose “Run configurations…”. Then, create a new “Google App Engine” configuration and fill in the following parameters (see the pictures):

Name:
GAE (you can choose anything as name)

Project:
TechRepScrawler (your project in your Eclipse workspace)

Main Module:
C:\Program Files (x86)\Google\google_appengine\dev_appserver.py (dev_appserver.py in your GAE installation folder)

Program Arguments:
--admin_port=9000 "C:\Users\Thomas\workspace_python\TechRepScrawler"

[Screenshots: run configuration, parts 1 and 2]

After starting GAE locally on your computer using the run configuration, just open your browser and go to http://localhost:8080/ to see the running application. You can also open the admin view at http://localhost:9000/ to see, for example, some information about your data.

5. Deploy your application to GAE

The next (and last!) step is to deploy the application to GAE. Using the Google Eclipse plugin, this is as easy as it can be. Just right-click on your project, go to PyDev: Google App Engine and click upload. Now your app will be uploaded to GAE! The first time, you will be asked for your credentials, that’s all.

[Screenshots: uploading the app and the deployed app]

Now your app is online and available to everyone! The cron-job will refresh it every 10 minutes (which just means it will visit your site like any other user would). Here’s how it should look:

[Screenshot: the running application]

Best regards,
Thomas Uhrig