
2nd Team Challenge Submission: Source Navigator for CVS by Amit & Kevin

We received our second LEARN SOMETHING NEW TEAM CHALLENGE submission. Thank you for participating, Amit & Kevin.

The team will post updates in the comments section of this post so please check back regularly to hear how this Source Navigator for CVS project is coming along.

Here are the details so far.

PROJECT DETAILS
Current Situation or Problem: CVS is a widely used version control system that open source projects and private enterprises alike rely on to share and manage code. Large, long-running projects often need to refactor, review, or rework code, or to enhance functionality or technology. What is lacking at the moment is a quick and user-friendly way to find usages of the code one intends to modify across the multiple modules that may make up an enterprise product. Currently, a developer making a change needs a local, up-to-date copy of the source of every module in the project in order to search through it with one of the popular IDEs (such as IntelliJ). Remembering to bring this code up to date before searching for affected areas while making a crucial change to central code is irksome and error-prone, not to mention the space that large check-outs waste on a developer’s machine. Our application intends to solve this problem by searching for such usages “on the server” itself! A developer using our application would therefore not need to check out every module locally for risk analysis, only the one he intends to change.

Purpose and Details of the Project: The idea behind this project is to locate the usages of any symbol (class, method, field, etc.) in the source code present on the CVS server. The scenario that made us wish for such a utility was a case where we wanted to find the usages of java.util.Calendar in a huge code base (around 30-35 modules in the repository) to make sure that no two threads use the same calendar instance, since the Calendar class is not thread safe. The software could search for the usages of this class in the source committed to the version control server. Other features of the software would be: 1. To identify the client modules calling a method exposed by an API, or to find the classes/interfaces which implement/extend a particular interface. 2. To identify the subclasses of a particular class, or to find the implementations of a particular interface. All of these searches could also be run against various branches, which could be very helpful. Note that this project is only for Java projects using CVS as the version control system.
 

Roles of Members on the Team: Kevin Dmello – Functional conceptualization and scope analysis – Software Development – Code quality reviews – Testing – User feedback compilation, enhancement request management.  Amit Shah – Technical Research – Software Development – UI Design – Testing – Release management
 

Learning Objectives: Objectives include 1.) Mastering CVS management and best practices 2.) Deliver quick results – focus on performance and concurrency offerings to achieve quick source code searches. 3.) Learn basics of desktop application development with Swing 4.) User friendly reporting – provide a search result report which is easy to browse through and containing relevant information. How we will measure the successful attainment of these objectives- 1. We plan to roll out the initial application version in our workplace and get user feedback. 2. The tool will be used while searching through a vast code base which is familiar to us and thus we will get a first hand report of accuracy, speed and convenience of use.

RESOURCE 1

Book Title: Essential CVS
Author: Jennifer Vesperman
Publisher: O’Reilly Media, Inc

 

RESOURCE 2

Book Title: Search Patterns
Author: Peter Morville; Jeffery
Publisher: O’Reilly Media, Inc.

Due Date: The projected completion date would be August 16th 2010.
Comments: The resources mentioned above are not the full list of books. We will be exploring Safari Books Online further once we actively start working on the project.

Kevin and Amit provided the following updated resources. These are resources they actually used while working on the project.

 

  Book Title: Essential CVS, 2nd Edition
By: Jennifer Vesperman
Publisher: O’Reilly Media, Inc.
 
  Book Title: Java™ I/O, 2nd Edition
By: Elliotte Rusty Harold
Publisher: O’Reilly Media, Inc.
  Book Title: Java NIO
By: Ron Hitchens
Publisher: O’Reilly Media, Inc.
 
  Book Title: Maven: The Definitive Guide, 1st Edition
By: Sonatype Company
Publisher: O’Reilly Media, Inc.
 
   Book Title: XPath: Navigating XML with XPath 1.0 and 2.0 Kick Start
By: Steven Holzner
Publisher: Sams

About Safari Books Online

Safari Books Online is an online learning library that provides access to thousands of technical, engineering, business, and digital media books and training videos. Get the latest information on topics like Windows 8, Android Development, iOS Development, Cloud Computing, HTML5, and so much more – sometimes even before the book is published or on bookshelves. Learn something new today with a free subscription to Safari Books Online.

41 Responses to 2nd Team Challenge Submission: Source Navigator for CVS by Amit & Kevin

  1. Amit says:

    An update on the project status
    - We have started figuring out the scope & the functionality the utility will support in the initial version. The first version would support finding the usages of a method given the class name & method signature.
    - The utility would be a distributed application with a server component and a client component. The client component would consist of a GUI to accept the user input, and the server component, located on the CVS server, would do the requested search. The server component would read the configuration files of the project management and build tool used by the code base. Maven 2 is the build tool used in our organization.

    Next Steps
    - Understand Maven
    - Finalize the technology to be used for client-server communication.

  2. Amit says:

    While exploring different client-server communication technologies we ruled out JMS and RMI, since they would involve installing a JMS broker or an RMI server on the CVS machine, which could mean extra overhead on the server.
    We are now studying NIO (non-blocking I/O) to find out whether it suits our application's needs. The Java I/O (Second Edition) and Java NIO books are good starting points and are helping us out.

  3. Kevin says:

    We have taken a look at NIO and now we plan to familiarise ourselves with the Apache MINA NIO framework.

  4. Amit says:

    Maven – The Definitive Guide book helped to understand maven & pom fundamentals. Maven provides an api to read the xml programmatically.

  5. Kevin says:

    While Amit is mastering the finer details of Maven 2, I got busy reading Essential CVS 2nd edition by Jennifer Vesperman.

    The “Installing CVS” section gave me a good idea of what we need to start off with :

    “CVS is client/server software that runs on Unix and Linux platforms, including Mac
    OS X. The CVSNT program is a CVS-like server that runs on Windows, and there
    are CVS clients for Windows, Mac (including pre-OS X Macintosh), Linux, and
    Unix. When you install CVS on a Unix/Linux server, you automatically get both
    server and client software. To access CVS across the network from any Unix/Linux
    machine, install CVS on the machine in question. The server and (command-line) client software are one and the same.
    CVS is available from http://cvs.nongnu.org. It is also available as an installation package with many GNU/Linux distributions, including Debian, Red Hat, and SUSE.
    A Windows-compatible CVS server is available at http://www.cvsnt.org. This server is
    not identical to the Unix server, but the differences are clearly listed in the CVS NT
    FAQ, and an installation guide is available on its web site.”

    At this point, we’ll mostly use CVSNT as we’re working on a Windows platform. Will post another update once I am able to get the server up!

  7. Amit says:

    As mentioned in my earlier post, the Source Navigator application will take inputs from the build management software used by the code base. As we use Maven as the build manager in our organization, the initial support will be for Maven. Maven works on the convention over configuration concept. The paragraph below, from the “Maven : The Definitive Guide 1st Edition” book by Sonatype Company, explains the advantages of using Maven over Ant.
    “Maven incorporates the concept by providing sensible default behaviors for projects. Without customization, source code is assumed to be in ${basedir}/src/main/java and resources are assumed to be in ${basedir}/src/main/resources. Tests are assumed to be in ${basedir}/src/test, and a project is assumed to produce a JAR (Java ARchive) file. Maven assumes that you want to compile byte code to ${basedir}/target/classes and then create a distributable JAR file in ${basedir}/target. Although this might seem trivial, consider the fact that most Ant-based builds have to define the locations of these directories in every subproject.”

    Maven core does not itself take care of compiling source, packaging bytecode, running JUnit tests, etc. All of this is done by Maven plugins. In addition to providing information about plugins, remote repositories and artifacts, the Maven object model provides dependency management.

    When a search request is received by the Source Navigator server component, it will take advantage of Maven’s dependency management to find the modules in which the search needs to be done. For example, if a search is requested to find the classes and interfaces which implement or extend an interface, the server component will find the modules which depend on the module in which the interface is defined. The search will be restricted to only these modules. This is a simple, neat trick that will increase the responsiveness of the application and reduce the number of files eligible for the search (especially when the code base involves around 20-30 modules)!!
    The next step involves understanding the pom (project object model) file through which dependency management is handled by maven.
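
    The module-filtering idea described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the module names are made up, the dependency map would in practice be built from each module's pom.xml, and only direct dependencies are considered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DependentModuleFinder {

    // Returns the modules that declare a direct dependency on targetModule,
    // in the iteration order of the input map.
    static List<String> findDependents(Map<String, List<String>> dependenciesByModule,
                                       String targetModule) {
        List<String> dependents = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : dependenciesByModule.entrySet()) {
            if (entry.getValue().contains(targetModule)) {
                dependents.add(entry.getKey());
            }
        }
        return dependents;
    }

    public static void main(String[] args) {
        // Hypothetical module names, for illustration only.
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("core-api", Collections.<String>emptyList());
        deps.put("billing", Arrays.asList("core-api", "junit"));
        deps.put("reporting", Arrays.asList("billing"));
        deps.put("web-ui", Arrays.asList("core-api", "reporting"));

        // Only these modules would need to be scanned for implementations
        // of an interface defined in core-api.
        System.out.println(findDependents(deps, "core-api"));
    }
}
```

    A search for implementations of an interface defined in one module would then scan only the modules returned by findDependents, rather than the whole repository.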

    • Amit says:

      An update on maven
      Maven’s dependency management allows continuous independent development by different project teams. This dependency management is done through pom.xml. Maven knows about a project via the pom.xml. Maven provides a set of unique identifiers (co-ordinates) which help to uniquely identify a project, a dependency or a plugin in POM. An example of how a dependency is defined in the project is as below.

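      <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
      </dependency>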

      Chapter 9 – The Project Object Model in the “Maven : The Definitive Guide 1st Edition” book by Sonatype Company provides more details about the XML file.
      After some research we found a way to get the dependencies of a module through the Maven API. The three lines of code below do the magic!

      MavenXpp3Reader reader = new MavenXpp3Reader();
      Model model = reader.read(new InputStreamReader(new FileInputStream("path to the pom.xml"), "utf-8"));
      model.getDependencies();

      You just need to have the maven jar (maven-2.2.1-uber.jar) in your classpath.

      Next step would be a research task to find java source file parsers which would help the server component to perform the CVS search.

  8. Kevin says:

    Read through Essential CVS 2nd edition by Jennifer Vesperman (this resource has useful, detailed and comprehensive information about CVS). As I mentioned before, it’s also easy to go through quickly.

    We got some useful administrative tips from the “Creating a Repository” section.

    “To create a repository, create the directory that you want to use as the repository
    root directory on the computer that will act as the CVS server and ensure that the
    repository root directory is owned by the user who will ultimately own the repository.
    Execute the command cvs -d repository_root_directory init, where repository_
    root_directory is the name of your directory: this command sets up that directory as
    a CVS repository. The root directory given must be an absolute path, not a relative
    path. “

    Also noted some important facts from the section on “pserver Access” –
    “The pserver method allows users to connect to the repository with a username and
    password that are stored on the repository server. The main advantage of pserver is
    that it permits anonymous, passwordless, read-only access.

    The repository path format for pserver is:
    :pserver:[[user][:password]@][hostname:[port]]/path”

    This information got us through setting up and successfully configuring a test CVS server (yeah!). We’ve set up the CVSNT suite trial (ver 2.8.01). We’re using TortoiseCVS as the CVS client. We uploaded a few Java files to the “mainline” and the setup works beautifully!
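
    To make the pserver repository path format quoted above concrete, here is a minimal Java sketch that splits such a string into its parts with a regular expression. The class name, field names and the example host are assumptions made for illustration; it follows only the format as quoted and is not how CVS itself parses the string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PserverRoot {
    // Mirrors the quoted format: :pserver:[[user][:password]@][hostname:[port]]/path
    private static final Pattern FORMAT = Pattern.compile(
            "^:pserver:(?:([^:@/]*)(?::([^@/]*))?@)?(?:([^:/@]+):(\\d*))?(/.*)$");

    // Parts that are absent from the input are null; a port that is present
    // but left empty (as in "host:/path") is the empty string.
    final String user;
    final String password;
    final String host;
    final String port;
    final String path;

    private PserverRoot(String user, String password, String host, String port, String path) {
        this.user = user;
        this.password = password;
        this.host = host;
        this.port = port;
        this.path = path;
    }

    static PserverRoot parse(String root) {
        Matcher m = FORMAT.matcher(root);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a pserver repository path: " + root);
        }
        return new PserverRoot(m.group(1), m.group(2), m.group(3), m.group(4), m.group(5));
    }

    public static void main(String[] args) {
        // Hypothetical host and path, for illustration only.
        PserverRoot root = parse(":pserver:anonymous@cvs.example.org:2401/cvsroot");
        System.out.println(root.user + " on " + root.host + " -> " + root.path);
    }
}
```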

  9. Amit says:

    With a little help from Google, we learned that most projects that need to parse source files use a third-party tool. PMD is a static analysis tool that has good reviews and is fairly widely used to parse source files. The next logical step is to evaluate PMD and ensure that it satisfies our use case.

  10. Amit says:

    We have done a basic evaluation of PMD!
    Would like to share a few useful points about what we discovered –

    PMD is a static analysis tool which scans the source code without actually running or compiling the program. PMD does this source scan by using the JavaCC parser generator in conjunction with an Extended Backus-Naur Form (EBNF) grammar and JJTree to generate an Abstract Syntax Tree (AST). For more details on how PMD works, please refer to this link: http://onjava.com/pub/a/onjava/2003/02/12/static_analysis.html

    PMD comes with a number of ready-to-run rules that can be run on the source code to find unused variables, unnecessary object creation, empty catch blocks, and so forth. Generally, a PMD rule is a Visitor that traverses the AST looking for a particular pattern of objects.

    The best part comes in where PMD provides a framework to allow writing external rules. The ability to add external rules will help the source navigator application to parse the java and perform the requested search. Custom rules can be written in two ways
    1. Using XPath
    2. Using Java Classes

    Next Step – understand & implement external rules in PMD, and get familiar with XPath.

  11. Amit says:

    Going through the samples for defining external rules posted on the PMD project site, I learned that writing a rule in XPath (XML Path Language) is simpler than writing it in Java, especially since PMD has an XPath engine plugged in.
    A good introduction, which I found in the overview section of the “XPath : Navigating XML with XPath 1.0 and 2.0 Kick Start” book by Steven Holzner, was “XPath is to XML as SQL is to databases. XML applications need XPath to locate specific data within an XML document …”
    I went through the basics of XPath covered in Chapter 1 and am planning to go deeper in order to implement XPath expressions.

  12. Kevin says:

    Below is the list of tasks we need to wrap up to have a basic working model of the application:
    1. Understanding XPath & finishing a sample application with PMD. This is what Amit is already working on.
    2. Understanding NIO and apache mina framework. Trying out a sample application to simulate a distributed application involving a client and a server. I am planning to start with this one.
    3. Integrating individual pieces involving maven, pmd and apache mina frameworks to have the basic working model of the source navigator application up.

    Looks like a busy weekend ahead :)

  13. Amit says:

    Going through the basics of location steps and paths in chapter 3 of the “XPath : Navigating XML with XPath 1.0 and 2.0 Kick Start” book helped me write the XPath expression for finding the usages of an interface.
    PMD provides a designer tool through which one can test XPath queries. Once the XPath expression is written, one just needs to create a rule and add it to the ruleset.xml. This XML file is then given as input to the PMD.main() method call. I was able to complete my sample application to parse the Java files using PMD!
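
    For readers new to XPath, the sketch below shows the general shape of an "implementors of an interface" query using the XPath engine that ships with the JDK. It is not PMD's API: the XML is a toy stand-in for a syntax tree, and the element and attribute names (class, implements, name) are invented for this example, so a real PMD rule would use PMD's own node names instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ImplementorQuery {
    // Toy tree; the element and attribute names are hypothetical.
    static final String SAMPLE_TREE =
            "<compilationUnit>"
          + "<class name='Worker'><implements name='Runnable'/></class>"
          + "<class name='Plain'/>"
          + "<class name='Outer'><class name='Inner'><implements name='Runnable'/></class></class>"
          + "</compilationUnit>";

    // Returns the names of all classes, nested or not, that directly implement interfaceName.
    static List<String> findImplementors(String treeXml, String interfaceName) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The interface name is bound as the XPath variable $iface, so the
        // expression text itself never has to be edited per search.
        xpath.setXPathVariableResolver(
                variable -> "iface".equals(variable.getLocalPart()) ? interfaceName : null);
        try {
            NodeList names = (NodeList) xpath.evaluate(
                    "//class[implements/@name = $iface]/@name",
                    new InputSource(new StringReader(treeXml)),
                    XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < names.getLength(); i++) {
                result.add(names.item(i).getNodeValue());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findImplementors(SAMPLE_TREE, "Runnable"));
    }
}
```

    The location path reads: select every class element, at any depth, that has an implements child whose name matches, and return that class's name attribute.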

  14. Amit says:

    After figuring out the basics of PMD, we went back to Apache MINA, the NIO socket communication framework.
    Going through the architecture diagrams posted on the Apache MINA site, I became clear on the steps and the classes involved in creating a sample TCP-based server and client. The examples shared there are quite extensive. The next step is to try out a sample.

  15. Amit says:

    I tried a basic distributed application with apache mina framework. It was quite simple. Encoder & Decoder are the critical classes which require special handling. I will now be integrating the various pieces involving the maven dependency finder, PMD java parser & apache mina framework to have a basic working model up.
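
    Why encoders and decoders need special handling can be seen in a framework-independent sketch. The code below uses only java.nio.ByteBuffer, not the MINA API, and the length-prefixed wire format is an assumption chosen for illustration. The point is that TCP may deliver a message in pieces, so a decoder has to buffer incomplete frames until the rest arrives.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LengthPrefixedCodec {
    // Bytes received so far that do not yet form a complete frame.
    private ByteBuffer pending = ByteBuffer.allocate(0);

    // Encodes a message as a 4-byte big-endian length followed by its UTF-8 bytes.
    static ByteBuffer encode(String message) {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer out = ByteBuffer.allocate(4 + body.length);
        out.putInt(body.length).put(body);
        out.flip();
        return out;
    }

    // Feeds one received chunk to the decoder and returns every message
    // completed by it; leftover bytes are kept for the next call.
    // This sketch does not validate the length field (e.g. negative values).
    List<String> decode(ByteBuffer chunk) {
        ByteBuffer merged = ByteBuffer.allocate(pending.remaining() + chunk.remaining());
        merged.put(pending).put(chunk);
        merged.flip();

        List<String> messages = new ArrayList<>();
        while (merged.remaining() >= 4) {
            merged.mark();
            int length = merged.getInt();
            if (merged.remaining() < length) {
                merged.reset(); // incomplete frame: wait for more bytes
                break;
            }
            byte[] body = new byte[length];
            merged.get(body);
            messages.add(new String(body, StandardCharsets.UTF_8));
        }
        pending = merged.slice();
        return messages;
    }

    public static void main(String[] args) {
        LengthPrefixedCodec decoder = new LengthPrefixedCodec();
        System.out.println(decoder.decode(encode("find implementors")));
    }
}
```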

  16. Amit says:

    We integrated maven, PMD and the apache mina frameworks but by doing a couple of runs we realized that PMD will work out in the multi-threaded environment.
    The technical issue, in short, is as follows: to define custom rules in PMD, a ruleset.xml needs to be defined in which the XPath expression is specified. Passing a parameter to the expression (which in our case was the interface name) meant updating the XML file before invoking PMD. In a distributed environment where the server receives multiple client requests at the same time, this approach would be incorrect, since every request would need to modify the same XML file.

    While researching we found Checkstyle, another static analysis tool, which supports defining custom checks (similar to rules in PMD). According to the documentation, Checkstyle parses the Java file into an Abstract Syntax Tree (similar to what PMD does), and it supports passing runtime values to custom checks through System properties.
    A simple POC should confirm whether this tool suits our requirements.

    • Amit says:

      While Basavaraj was going through the blog to get the inner details of our project, he pointed out the above comment to me and was confused about why we shifted to Checkstyle. I realized there was a slight mistake in the above statement.
      Instead of writing “PMD will not work out in the multi-threaded environment”, I had written that it will work out :).

  17. Amit says:

    The POC with Checkstyle worked out. We implemented a custom check for finding the classes which implement a particular interface. The interface name was set as a system property whose key was specified in the configuration file.
    While implementing this custom check, special handling was required to cover scenarios where an inner class implements the specified interface (this was not required with PMD, though). The code needs to walk the Abstract Syntax Tree (AST) to get the class name.

    I will now be integrating Checkstyle with the Maven and Apache MINA frameworks.
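
    The inner-class handling mentioned above comes down to walking from a node up through its parents and collecting the names of the enclosing classes. The sketch below shows that walk on a hypothetical, minimal Node type; it is not Checkstyle's AST class, and the "CLASS" kind label is an assumption made for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class EnclosingClassName {
    // Hypothetical, minimal tree node; a real parser's AST node type would replace this.
    static final class Node {
        final String kind;   // e.g. "CLASS" or "METHOD"
        final String name;
        final Node parent;   // null for the root

        Node(String kind, String name, Node parent) {
            this.kind = kind;
            this.name = name;
            this.parent = parent;
        }
    }

    // Walks from the given node up to the root and joins the names of all
    // class nodes on the way, outermost first.
    static String qualifiedClassName(Node node) {
        Deque<String> names = new ArrayDeque<>();
        for (Node current = node; current != null; current = current.parent) {
            if ("CLASS".equals(current.kind)) {
                names.addFirst(current.name);
            }
        }
        return String.join(".", names);
    }

    public static void main(String[] args) {
        Node outer = new Node("CLASS", "Outer", null);
        Node inner = new Node("CLASS", "Inner", outer);
        System.out.println(qualifiedClassName(inner)); // prints "Outer.Inner"
    }
}
```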

  18. Kevin says:

    We have a working model of the application up ! The current implementation supports finding the implementation classes of an interface.

  19. ec says:

    Great! It’s about time someone came up with a lightweight application to search through mounds of source code. I can well see the advantage in using this on large maintenance projects, if not on new development projects as well.

    I’ll be waiting for the 16th when I can try this out – assuming of course it will be open source.

    • Kevin says:

      Thanks EC! Glad you liked our idea. Yes, we do plan to make this project open source during the next few days. Will update the blog with the details once we have everything in place. We welcome any feedback you may have.

      • Wahaj says:

        Neat idea – I for one know exactly how painful it can be to first check out tons of code only to search for an API usage or to find a common pattern across modules. Of course, this would typically apply to distributed architectures.

        One suggestion: make this a stand-alone tool as opposed to a client-server tool. Being stand-alone has its own benefits, with developers able to use the tool independently without needing any dependency other than the CVS server.

        Good luck!

        - Wahaj

  20. Kevin says:

    We are designing the UI at the moment. Will follow up with an update once we are ready with version 1 of the UI.

  21. Amit says:

    I am working on the packaging part. We will be using Ant to create artifacts for source & binary distributable.

    • Amit says:

      I am ready with the ant build file which would create the source & binary distributable. We have also uploaded our source code here – .

  22. Basavaraj Kalloli says:

    This would be a good utility when you are making an API change and want to know which modules are affected by it. You don't have to load all the modules in an IDE to find the usages.
    It will also be a great utility when you want to find existing implementations of an interface in open source projects like Spring. For example, if I want to find the pointcut implementations in AOP, I don't have to download the source and start the search. That would be overkill.
    It will also be a neat way of finding util methods which you can use for repetitive tasks like parsing strings, and loads of others.
    It would be good if you also implemented the feature where we can find out which interfaces extend a particular interface. It would also be great if you listed the features that will be present in the first release.

    • Kevin says:

      Thanks Basavaraj, for showing interest in our project. We will definitely consider your request for including searches for interfaces extending another interface.

      For our first release on the 16th, we plan to minimally have the following features included –

      1.) Provide a light-weight application which allows multiple users to search through source code of large java based projects quickly.
      2.) Provide a basic reporting module that formats and creates informative search result reports.

      We have a few more ideas in mind to add more functionality and improvements, however owing to the fact that we are both working and only get a few hours a week to contribute to the project, we have set realistic limitations for the challenge.

      However, we do plan to make this project open source and are committed to continue supporting it to add new features in the future.

      If there’s another feature you would like to see in here in the 1st version, let us know, and we will try to surprise you!

  23. Kevin says:

    You can take a look at version 1 of our UI at –

  24. Amit says:

    Summarizing the tasks to be done over a couple of days
    1. Adding the support for searching interfaces which extend a specified interface
    2. Make modifications in the GUI in order to support displaying the output
    3. Creating an HTML file from the java objects received by the client from the server after the search is done
    4. Integrating the GUI with the actual application code

    I would be working on 3 & 4 while Kevin has taken up 1 & 2.
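
    Task 3 above, turning the result objects into an HTML file, could look roughly like the following sketch. The Hit class and its fields are hypothetical, since the post does not describe the actual result objects, and the table layout is just one possible choice.

```java
import java.util.Arrays;
import java.util.List;

class SearchReportHtml {
    // Hypothetical search result; the real objects sent by the server may differ.
    static final class Hit {
        final String module;
        final String file;
        final int line;

        Hit(String module, String file, int line) {
            this.module = module;
            this.file = file;
            this.line = line;
        }
    }

    // Escapes the characters that have special meaning in HTML text and attributes.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Renders the hits as a simple HTML table, one row per hit.
    static String toHtml(String query, List<Hit> hits) {
        StringBuilder html = new StringBuilder();
        html.append("<html><body>\n");
        html.append("<h1>Results for ").append(escape(query)).append("</h1>\n");
        html.append("<table>\n<tr><th>Module</th><th>File</th><th>Line</th></tr>\n");
        for (Hit hit : hits) {
            html.append("<tr><td>").append(escape(hit.module))
                .append("</td><td>").append(escape(hit.file))
                .append("</td><td>").append(hit.line)
                .append("</td></tr>\n");
        }
        html.append("</table>\n</body></html>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        // Made-up module and file names, for illustration only.
        List<Hit> hits = Arrays.asList(
                new Hit("billing", "src/main/java/InvoiceJob.java", 12),
                new Hit("web-ui", "src/main/java/RefreshTask.java", 40));
        System.out.print(toHtml("implementors of java.lang.Runnable", hits));
    }
}
```

    Escaping matters here because search terms such as generic type names contain angle brackets, which would otherwise break the generated page.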

    • Kevin says:

      Update on this – we finished all the above items over the weekend. We even rolled the tool out to selected project groups on Monday. The feedback so far has been good and we’re now in the fine-tuning stage.

  25. Kulbhushan says:

    I would like to add this tool to my toolkit, as it sounds handy. Keeping that in mind, I would request the team to add a search facility for any string search, in this version if possible, or maybe the next. Good luck guys!

    • Amit says:

      Thanks Kulbhushan. Good to know that you found the tool handy.
      String search was already in the feature list we had planned earlier, but due to time constraints we had moved it out of the first version. I think we might have some time to include this feature in the first version after all. We will try our best :)

      • Amit says:

        We have implemented the string search feature and integrated it with our application. It required a couple of changes to the Checkstyle source code.

  26. Amit says:

    I have updated the source code which was uploaded at – with the above code modifications.

  27. Amit says:

    We have started working on some of the release tasks.

  28. Kevin says:

    We have reached the day when we can offer our source navigator project up for evaluation for this Safari challenge. We are pleased with our initial version of the software, though we know we have some more features to add in order to make it a more powerful tool in the hands of developers and code reviewers all over.

    We must mention that taking part in this challenge was enriching in many ways besides the technical part of it that Safari helped us out with by generously giving us access to their rich online resources. Time management was a defining factor for both of us as we both hold full time jobs. We have learned a lot with this effort personally and have become better utilizers and balancers of our time and efforts as part of taking part in this short but demanding challenge.

    All said and done, it’s time now to put forward our deliverables!

    You can find the source code along with detailed java documentation at the following location –

    Please also try out a sample we have compiled for you, this can be found here –

    Refer to our user guide for help with trying out the samples –

    Our release notes will give you an idea of where we are with this project as of today –

    The technical documents will give you a brief introduction of the technical decisions we have made as part of this venture –

    Thanks to all for your feedback and support during this challenge.

    Regards,
    Amit & Kevin.

    • Kevin says:

      For some reason the links I posted in the comment above are not showing up.

      Posting them again below..

      You can find the source code along with detailed java documentation at the following location –

      Please also try out a sample we have compiled for you, this can be found here –

      Refer to our user guide for help with trying out the samples –

      Our release notes will give you an idea of where we are with this project as of today –

      The technical documents will give you a brief introduction of the technical decisions we have made as part of this venture –

  29. Pingback: Safari Books Online Announces “Learn Something New Team Challenge” Winners « Safari Books Online's Official Blog

  30. J Bennie says:

    Hi Guys, congratulations I hope you enjoy your prize.
    all the best Jay and David.

  31. Pingback: Press Release: Competition Helps Programmers Meet Key Performance Objectives and Stand Out to Peers and Supervisors « Safari Books Online's Official Blog