Public Lab Research note


Find closest match spectra from database - GSoC project

by Sreyanth | June 24, 2013 07:46 | 81 views | 4 comments | #8410

Read more: publiclab.org/n/8410


Who I am

Hi everyone. I am Sreyantha Chary Mora (preferred: Sreyanth), a senior undergrad majoring in Computer Engineering at the National Institute of Technology Karnataka, Surathkal. This summer, I am working for the Public Lab as a GSoC student.

What I will be working on

Public Lab's Spectral Workbench gives its users tools to share spectra and work on them. I would like to add an extra yet important piece of functionality: a scalable spectral matching mechanism. With this in place, users will see results whenever the system finds similar spectra in the database, which helps them explore and learn more about the spectra they upload.

Formal details of the project

Title: Find closest match spectra from database

Mentor: Jeffrey Warren

More details

What need will my project fulfill?

My project aims to give users a fast, scalable system that searches the database for spectra similar to the ones they are interested in and working on, and thus helps them in their research or study. It also helps us discover interesting patterns across spectra that are normally believed to be different. On successful completion and integration, my project should make this task easier.

How am I going to do this?

I want to first adapt the Opticks codebase and integrate it into the Spectral Workbench. This may take laborious work, but it is believed to produce good results. The next problem will be to check the scalability of the code, so I would like to do an algorithmic study and try to improve the solution to the spectral matching problem. Also, if time permits, I would like to come up with a similarity index for the top matched spectra.

Technical Details

One thing we can do is compare each spectrum with every other spectrum in the database, calculate RMS values, and decide which of them are similar. This would take a lot of time to run, and it may not give the results we need.
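
As a rough illustration of this brute-force approach, here is a minimal Python sketch. It assumes that "RMS value" means the root-mean-square difference between two intensity lists, and that every spectrum has already been sampled at the same wavelengths. The function names and the toy data are hypothetical and are not part of Spectral Workbench.

```python
import math

def rms_difference(a, b):
    """Root-mean-square difference between two equal-length intensity lists."""
    if len(a) != len(b):
        raise ValueError("spectra must be sampled at the same wavelengths")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def closest_by_rms(query, database):
    """Linear scan: (spectrum_id, rms) pairs, most similar first."""
    scores = [(sid, rms_difference(query, s)) for sid, s in database.items()]
    return sorted(scores, key=lambda pair: pair[1])

# Hypothetical toy data: intensities (%) at the same four wavelengths.
database = {
    "cfl": [10.0, 80.0, 30.0, 60.0],
    "sun": [50.0, 55.0, 60.0, 58.0],
    "cfl_dim": [5.0, 40.0, 15.0, 30.0],  # same shape as "cfl", half as bright
}
query = [11.0, 78.0, 31.0, 61.0]
ranked = closest_by_rms(query, database)
```

One query costs one pass over the database, so comparing every spectrum with every other grows quadratically with the number of spectra. The toy data also hints at why the results may disappoint: "cfl_dim" has the same shape as "cfl" at half the brightness, yet it scores almost as poorly as the unrelated "sun" spectrum.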

So, I will basically implement two well-known algorithms for image-based spectral search. As I am new to this jargon, please forgive me if I say something technically incorrect, and feel free to correct me.

  1. Spectral Angle Mapping (SAM)

  2. Wang-Bovik Quality Index algorithm
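
The two measures listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Opticks implementation: each spectrum is treated as a plain list of intensities sampled at the same wavelengths, and the Wang-Bovik index is computed once over the whole spectrum rather than over sliding windows as in its original image setting. The function names are hypothetical.

```python
import math

def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors (0 = same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)

def quality_index(a, b):
    """Wang-Bovik index Q = 4*cov*mean_a*mean_b / ((var_a+var_b)*(mean_a^2+mean_b^2))."""
    n = len(a)
    if n < 2 or n != len(b):
        raise ValueError("need two equal-length spectra with at least 2 samples")
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / (n - 1)
    var_b = sum((y - mean_b) ** 2 for y in b) / (n - 1)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)
    denom = (var_a + var_b) * (mean_a ** 2 + mean_b ** 2)
    if denom == 0:
        raise ValueError("index is undefined for flat or all-zero spectra")
    return 4 * cov * mean_a * mean_b / denom

# Hypothetical toy data: the second spectrum is the first at half brightness.
bright = [10.0, 80.0, 30.0, 60.0]
dim = [5.0, 40.0, 15.0, 30.0]
angle = spectral_angle(bright, dim)   # close to 0: same shape
q = quality_index(bright, dim)        # 0.64: penalised for the brightness change
```

In this sketch the spectral angle ignores overall brightness and looks only at shape, while the quality index equals 1 only for identical spectra and drops when the mean or contrast differs. Which behaviour is preferable for matching uploaded spectra is an open design question.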

I have started reading the Opticks extension and am trying to come up with a spectral matching module in Python. Scalability seems to be an issue here. I am thinking of implementing these algorithms by the mid-term evaluation (or a week or so later) and integrating the module with the workbench, then working on scalability issues and ranking measures in the remaining time.

Did I get something wrong?

Please mail me (sreyanth@gmail.com) or my mentor Jeffrey (jeff@publiclab.org) directly and let us know where I went wrong.

That's all for now! I will keep you posted on the progress of this project in the weeks to come.


4 Comments

Sreyanth,

Good start at wrapping some descriptive boundaries around a tough project. However, a bit of googling raised some questions I thought I'd pose.

It was not clear (to me, anyway) whether you intended to treat a spectral plot as a graphical image and therefore to use 2D image algorithms. Assuming this was the intent, what 2D analysis techniques would you apply? This question is moot, of course, if this was not the intent at all.

It seems that both Opticks and Wang-Bovik are fundamentally written and tuned for 2D digital imagery, generally photographic. The spectral data is inherently not 2D. Have you considered the tradeoff between those 2D algorithms, which add complexity and require significant effort just to port, and the much simpler extraction of key parameters from plots, with no image processing needed?

From personal experience, I've found it useful to perform some basic experiments on some arbitrary or random sample data, using a familiar environment, to prove the concepts prior to the significant effort required to port code. Is this a possibility? What I didn't see in your notes were references to 'proof of concept'. It is often advantageous to first pick the tasks with the greatest risk (i.e. can a fundamental concept be made to work and come close to meeting basic requirements?) before building the finished product. Maybe you're planning on this already?

Spectral Angle Mapping looks interesting, but its capability and application are in the removal of multiple light sources from spectra of 2D earth imagery. I did not see a description of how those concepts might be applied to the current spectral plot matching/search concept. Are these concepts well formulated but just not yet annotated for the project?

I'm just posing some thoughts up front in the hope that the questions might be helpful, help you achieve your goals, and possibly save you some effort along the way.

Dave



Hi Dave,

Sorry for getting back to you this late.

I agree with you completely. At first I had the idea of using the data as 2D imagery. But after a series of discussions with you, Jeff and some others, I am using the plot data directly: the % and wavelength. You might be aware of this, as we discussed it on the mailing list. I am now posting a new research note covering all the work I have done so far. I request you to critically review it :-)

Sreyanth



No problem; I'm glad you found my observations useful. I look forward to reading your new research note.

Dave



