Monday, August 29, 2011

MongoDB - Scalable Solution for Persistence of Media Files

MongoDB is a non-relational, schemaless database that stores records as BSON documents (a binary representation of JSON). Its main strengths are scalability, easy-to-integrate APIs (available for many web application development languages) and simple command-line usage on various operating systems. Records are persisted in collections (similar in concept to tables in a relational database), and writes are non-transactional, which makes database operations quite fast compared to relational databases. The trade-off is that MongoDB should not be used where a lost record is unacceptable, as in banking or e-commerce systems.


Some Use Cases of MongoDB

The schemaless and non-transactional nature of MongoDB makes it well suited to the following scenarios:

  • Archiving: schema changes in a relational database over a long period of time make it difficult to archive old data in that same database. A schemaless store can hold records of different shapes side by side.
  • Logging: insertion is fast in MongoDB because writes are non-transactional. However, the same property can cause some inserts to be missed over a large number of insertions.
  • Real Time Analytics: MongoDB can be used to track the real-time performance metrics (page views, unique visits, etc.) of a given website.
  • User Information Persistence for Identity Provider Systems: user information such as registration details, ratings, session data and profiles can be stored in MongoDB for identity provider or SSO systems.
 
Using MongoDB for File Storage

GridFS (Grid File Store) is a MongoDB specification for storing a large file by splitting it into smaller chunks of data (256 KB each by default). The file is saved using two collections:

  • files: holds the meta-information such as object id, file length, chunk size and upload date.
  • chunks: holds the file data, one document per chunk, each referring back to the id of its file.
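
To make the chunking idea concrete, here is a minimal sketch in plain Java that splits a byte array the same way, using the 256 KB chunk size mentioned above. It does not use the MongoDB driver at all; the class name GridFsChunkSketch and the method split are made up for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: mimics how file data is divided into fixed-size chunks.
// The class and method names are hypothetical, not part of the MongoDB driver.
class GridFsChunkSketch {

    // Chunk size quoted in the article (256 KB).
    static final int CHUNK_SIZE = 256 * 1024;

    // Splits data into consecutive pieces of at most chunkSize bytes.
    static List<byte[]> split(byte[] data, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int end = Math.min(data.length, offset + chunkSize);
            chunks.add(Arrays.copyOfRange(data, offset, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] file = new byte[600 * 1024]; // a 600 KB file
        List<byte[]> chunks = split(file, CHUNK_SIZE);
        // 600 KB = 256 KB + 256 KB + 88 KB, so three chunks are produced.
        System.out.println(chunks.size() + " chunks, last one holds "
                + chunks.get(chunks.size() - 1).length + " bytes");
    }
}
```

In GridFS itself each such piece becomes one document in the chunks collection, while a single document in the files collection describes the file as a whole.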

Insertion API using GridFS

 
Inserting a record requires its meta-information, passed as key-value pairs in a Map. The other required parameters are the file data as a byte array and the collection name, which is used to establish the connection with the server.
 
import java.util.Map;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.gridfs.GridFS;
import com.mongodb.gridfs.GridFSInputFile;

public String insertContent(byte[] fileData, String myCollection, Map<String, String> metainfo)
            throws MongoInsertionException, RepositoryConnectionException {

        mylogger.debug("Inserting record in Mongo Database.");
        DBCollection dbCollection = getMongoDBConnection(myCollection);
        DB db = dbCollection.getDB();
        // The collection name doubles as the GridFS bucket name.
        GridFS myFS = new GridFS(db, myCollection);

        GridFSInputFile gridFileInput = myFS.createFile(fileData);

        // Attach every meta-information entry to the file document.
        for (Map.Entry<String, String> entry : metainfo.entrySet()) {
            gridFileInput.put(entry.getKey(), entry.getValue());
        }

        gridFileInput.save();

        // Check the id before calling toString() so a missing id cannot cause a NullPointerException.
        Object id = gridFileInput.getId();
        if (id == null) throw new MongoInsertionException();

        mylogger.debug("RECORD ADDED SUCCESSFULLY IN MONGO!!!");
        return id.toString();
}


Deletion of Record in GridFS

A record is deleted by passing its object id to the GridFS instance.

import org.bson.types.ObjectId;

private void deleteRecord(String id, String collectionName)
                                             throws RepositoryConnectionException{

    DBCollection dbCollection=getMongoDBConnection(collectionName);
    DB db = dbCollection.getDB();
    GridFS myFS = new GridFS(db,collectionName);
    // Removes the file document together with all of its chunks.
    myFS.remove(new ObjectId(id));
}



Updating a Record in MongoDB using GridFS

GridFS does not support updating file data in place. To update the data of a file, the logic can be:
  • Insert a record with the new file data.
  • In the meta-information of the newly added record (child), add a reference to the old record.
  • In the meta-information of the old record (parent), add a reference to the newly added record.
import java.util.HashMap;
import java.util.Map;
import com.mongodb.gridfs.GridFSDBFile;

public boolean updateRecord(String id, String collectionName, byte[] updatedValue)
    throws MongoInsertionException, RepositoryConnectionException{

        DBCollection dbCollection=getMongoDBConnection(collectionName);
        DB db = dbCollection.getDB();
        GridFS myFS = new GridFS(db,collectionName);

        /*look up the old record (parent) first; there is nothing to update if it is missing*/
        GridFSDBFile gridFSDBFile = myFS.find(new ObjectId(id));
        if(gridFSDBFile==null) return false;

        /*insert the new file data (child) with a reference to the parent*/
        Map<String, String> metainfo=new HashMap<String, String>();
        metainfo.put("parentid", id);
        String newid=insertContent(updatedValue, collectionName, metainfo);

        /*if an earlier updated version exists, delete it*/
        String currentUpdatedRecord=(String)gridFSDBFile.get("updatedversion");

        if(currentUpdatedRecord!=null && !"".equals(currentUpdatedRecord)){
            deleteRecord(currentUpdatedRecord, collectionName);
        }

        /*update meta information of old id*/
        gridFSDBFile.put("updatedversion", newid);
        mylogger.debug("The meta information updatedversion updated to value:"+newid);
        gridFSDBFile.save();
        return true;
    }



Searching of Record in GridFS

With the update logic above, a record is looked up through the child information saved in its parent record: if the parent points to an updated version, the data of that version is returned.

import java.io.InputStream;
import org.apache.commons.io.IOUtils;

 public static byte[] searchRecord(String id, String collectionName)
                                          throws RepositoryConnectionException{

    try{
        mylogger.debug("id="+id+" collectionName="+collectionName);
        DBCollection dbCollection=getMongoDBConnection(collectionName);
        DB db = dbCollection.getDB();
        GridFS myFS = new GridFS(db,collectionName);
        GridFSDBFile gridFSDBFile = myFS.find(new ObjectId(id));
        if(gridFSDBFile==null) return null;

        // Follow the pointer to the updated version, if one exists.
        String newid=(String)gridFSDBFile.get("updatedversion");
        if(newid!=null && !"".equals(newid))
            gridFSDBFile = myFS.find(new ObjectId(newid));

        // try-with-resources closes the stream even if reading fails.
        try(InputStream in = gridFSDBFile.getInputStream()){
            return IOUtils.toByteArray(in);
        }

    }catch(RepositoryConnectionException e){
        // Connection problems are part of the method signature, so pass them on.
        throw e;
    } catch (Exception e) {
        // Covers a malformed id (IllegalArgumentException) as well as read failures.
        mylogger.error("UNABLE TO SEARCH RECORD!!!",e);
        return null;
    }
}
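
The parent/child bookkeeping used by the update and search methods can be tried out without a MongoDB server. The following is only a sketch: it swaps GridFS for an in-memory HashMap, and the names VersionPointerSketch, Record, insert, update and search are hypothetical stand-ins for the methods shown above. The meta-information keys parentid and updatedversion are the ones used in the article.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for the GridFS bucket, used only to illustrate the
// parent/child version pointers. All names here are hypothetical.
class VersionPointerSketch {

    static final class Record {
        final byte[] data;
        final Map<String, String> meta = new HashMap<>();
        Record(byte[] data) { this.data = data; }
    }

    private final Map<String, Record> store = new HashMap<>();
    private int nextId = 1;

    // Stores the data with its meta-information and returns the generated id.
    String insert(byte[] data, Map<String, String> meta) {
        String id = String.valueOf(nextId++);
        Record record = new Record(data);
        record.meta.putAll(meta);
        store.put(id, record);
        return id;
    }

    // Adds a child holding the new data and points the parent at it.
    boolean update(String parentId, byte[] newData) {
        Record parent = store.get(parentId);
        if (parent == null) return false;
        Map<String, String> meta = new HashMap<>();
        meta.put("parentid", parentId);
        String childId = insert(newData, meta);
        // put() returns the previous pointer, whose record is no longer needed.
        String previousChild = parent.meta.put("updatedversion", childId);
        if (previousChild != null) store.remove(previousChild);
        return true;
    }

    // Returns the latest data reachable from the given parent id.
    byte[] search(String id) {
        Record record = store.get(id);
        if (record == null) return null;
        String childId = record.meta.get("updatedversion");
        return childId != null ? store.get(childId).data : record.data;
    }

    int size() { return store.size(); }

    public static void main(String[] args) {
        VersionPointerSketch repo = new VersionPointerSketch();
        String id = repo.insert("v1".getBytes(StandardCharsets.UTF_8), new HashMap<>());
        repo.update(id, "v2".getBytes(StandardCharsets.UTF_8));
        repo.update(id, "v3".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(repo.search(id), StandardCharsets.UTF_8));
        System.out.println(repo.size() + " records kept");
    }
}
```

Callers always keep using the id of the original record. Only the parent and its latest child are retained, so storage does not grow with every update.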


Monday, June 27, 2011

Security Assertion Markup Language (SAML)


Single sign-on (SSO) is a method of access control that enables a user to access multiple independent software systems that share a common user base. The user is authenticated and authorized in a single action, and can then access those systems without entering login information again.

Security Assertion Markup Language (SAML) is an XML-based open standard, produced by the OASIS Security Services Technical Committee, that provides one way to implement SSO. Authentication and authorization data is exchanged between security domains, that is, between an identity provider and a service provider. The SAML standard defines a set of rules and syntax for this data exchange, and leaves room for custom data to be transmitted to the external service provider.

 

SAML Structure

A SAML transaction involves three roles:

  • Asserting Party: also called the identity provider, the system that provides the user information.
  • Relying Party: also called the service provider, the system that trusts the asserting party and uses the provided user data to grant the end user access to the application.
  • Subject: the user whose information is involved in the transaction.

The information passed from the identity provider to the service provider is carried in a SAML assertion. An assertion is an XML document that contains statements about the subject in the form of attributes and conditions. It can also contain authorization-related information that defines which application functionality a user can access.
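
As an illustration of that structure, the sketch below parses a hand-written SAML 2.0 assertion with the XML parser that ships with the JDK and reads out the subject, one attribute and the validity condition. The issuer, user name, attribute and timestamps are invented sample values, and the class and helper names are hypothetical. A real service provider must also verify the XML signature of the assertion, which is left out here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative only: class name, helper names and sample values are hypothetical.
class AssertionSketch {

    // Namespace of SAML 2.0 assertion elements.
    static final String SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion";

    static final String SAMPLE =
        "<saml:Assertion xmlns:saml=\"" + SAML_NS + "\" ID=\"_a1\" Version=\"2.0\""
      + " IssueInstant=\"2011-06-27T10:00:00Z\">"
      + "<saml:Issuer>https://idp.example.com</saml:Issuer>"
      + "<saml:Subject><saml:NameID>alice@example.com</saml:NameID></saml:Subject>"
      + "<saml:Conditions NotBefore=\"2011-06-27T10:00:00Z\""
      + " NotOnOrAfter=\"2011-06-27T10:05:00Z\"/>"
      + "<saml:AttributeStatement><saml:Attribute Name=\"role\">"
      + "<saml:AttributeValue>editor</saml:AttributeValue>"
      + "</saml:Attribute></saml:AttributeStatement>"
      + "</saml:Assertion>";

    // Parses the XML with namespace support; failures surface as unchecked exceptions.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations, a common hardening step for XML from other parties.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a parsable assertion", e);
        }
    }

    // Returns the first assertion element with the given local name, or null.
    static Element first(Document doc, String localName) {
        NodeList nodes = doc.getElementsByTagNameNS(SAML_NS, localName);
        return nodes.getLength() == 0 ? null : (Element) nodes.item(0);
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("Subject:     " + first(doc, "NameID").getTextContent());
        System.out.println("Attribute:   " + first(doc, "Attribute").getAttribute("Name")
                + " = " + first(doc, "AttributeValue").getTextContent());
        System.out.println("Valid until: " + first(doc, "Conditions").getAttribute("NotOnOrAfter"));
    }
}
```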

 

SAML protocols

The SAML standard defines a set of request and response protocols for communicating assertions between the service provider and the identity provider. Some of these protocols are:

• Authentication Request Protocol – defines how a service provider requests assertions that contain authentication statements.

• Single Logout Protocol – defines how a single logout ends the user's sessions at all service providers.

• Artifact Resolution Protocol – defines how an artifact value that is passed initially is later exchanged for the actual request/response message between the identity provider and the service provider.

• Name Identifier Management Protocol – defines how to add, change or delete the value of the name identifier for the service provider.

 

SAML Bindings

SAML bindings define the mapping between the SAML protocols and the network protocols that are used for communication of SAML assertions between the identity provider and service provider.

Some example bindings used are:

• HTTP Redirect Binding – uses HTTP redirect messages.
• HTTP POST Binding – defines how SAML messages are transmitted base64-encoded inside an HTML form.
• HTTP Artifact Binding – defines how an artifact is transported using HTTP.
• SOAP HTTP Binding – uses SOAP 1.1 messaging over HTTP.
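
The difference between the first two bindings shows up mostly in how the XML message is encoded before it travels through the browser. A rough sketch using only JDK classes: for HTTP POST the message is base64-encoded and placed in a hidden form field (SAMLRequest or SAMLResponse), while for HTTP Redirect it is compressed with raw DEFLATE, base64-encoded and then URL-encoded so that it fits in a query string. The class and method names are hypothetical, and signing is not covered.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

// Illustrative only: class and method names are hypothetical.
class BindingEncodingSketch {

    // HTTP POST binding: base64 of the XML, sent as a hidden form field value.
    static String encodeForPost(String xml) {
        return Base64.getEncoder().encodeToString(xml.getBytes(StandardCharsets.UTF_8));
    }

    // HTTP Redirect binding: raw DEFLATE, then base64, then URL-encoding.
    static String encodeForRedirect(String xml) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true); // true = no zlib header
            try (DeflaterOutputStream out = new DeflaterOutputStream(buffer, deflater)) {
                out.write(xml.getBytes(StandardCharsets.UTF_8));
            }
            deflater.end();
            String base64 = Base64.getEncoder().encodeToString(buffer.toByteArray());
            return URLEncoder.encode(base64, "UTF-8");
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses encodeForRedirect, as the receiving party would.
    static String decodeFromRedirect(String queryValue) {
        try {
            byte[] compressed = Base64.getDecoder().decode(URLDecoder.decode(queryValue, "UTF-8"));
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            Inflater inflater = new Inflater(true); // true = expect raw DEFLATE data
            try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed), inflater)) {
                byte[] block = new byte[1024];
                int read;
                while ((read = in.read(block)) != -1) {
                    buffer.write(block, 0, read);
                }
            }
            inflater.end();
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String message = "<samlp:AuthnRequest ID=\"_1\" Version=\"2.0\"/>"; // shortened sample
        System.out.println("POST field value:     " + encodeForPost(message));
        System.out.println("Redirect query value: " + encodeForRedirect(message));
    }
}
```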

 

SAML Profiles

SAML profiles are the business use cases that dictate how assertions, protocols and bindings work together to provide SSO. Some example profiles are:

• Web Browser SSO Profile – uses the Authentication Request Protocol, and any of the following bindings: HTTP Redirect, HTTP POST and HTTP Artifact.
• Single Logout Profile – uses the Single Logout Protocol to log the user out of all services.
• Artifact Resolution Profile – uses the Artifact Resolution Protocol over a SOAP HTTP binding.
• Name Identifier Management Profile – uses the Name Identifier Management Protocol and can be used with HTTP Redirect, HTTP POST, HTTP Artifact or SOAP.

 

SSO using SAML

The most popular business use case for SAML federation is the Web Browser SSO Profile, used in conjunction with the HTTP POST binding and the Authentication Request Protocol. A user requests, via a user agent (generally a web browser), a web resource protected by a SAML service provider. To learn the identity of the requesting user, the service provider issues an authentication request to a SAML identity provider through the user agent.
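
The authentication request issued by the service provider is itself a small XML message. Below is a minimal sketch of how one could be assembled, assuming SAML 2.0 and inputs that are trusted and free of XML special characters (no escaping is done). All URLs are placeholders, and the class and method names are hypothetical. In practice a SAML library would build and sign this message.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.UUID;

// Illustrative only: builds an unsigned SAML 2.0 AuthnRequest as a string.
// Class name, method name and all URLs are hypothetical.
class AuthnRequestSketch {

    static String build(String spEntityId, String acsUrl, String idpSsoUrl, Instant now) {
        // XML IDs may not start with a digit, hence the underscore prefix.
        String id = "_" + UUID.randomUUID();
        return "<samlp:AuthnRequest"
             + " xmlns:samlp=\"urn:oasis:names:tc:SAML:2.0:protocol\""
             + " xmlns:saml=\"urn:oasis:names:tc:SAML:2.0:assertion\""
             + " ID=\"" + id + "\""
             + " Version=\"2.0\""
             + " IssueInstant=\"" + now.truncatedTo(ChronoUnit.SECONDS) + "\""
             + " Destination=\"" + idpSsoUrl + "\""
             // Asks the identity provider to answer over the HTTP POST binding.
             + " ProtocolBinding=\"urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST\""
             + " AssertionConsumerServiceURL=\"" + acsUrl + "\">"
             + "<saml:Issuer>" + spEntityId + "</saml:Issuer>"
             + "</samlp:AuthnRequest>";
    }

    public static void main(String[] args) {
        System.out.println(build(
                "https://sp.example.com/metadata",
                "https://sp.example.com/acs",
                "https://idp.example.com/sso",
                Instant.now()));
    }
}
```

The identity provider answers with a response that carries the assertion, delivered through the browser to the AssertionConsumerServiceURL named in the request.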

Single Log out using SAML