Projects

🚧 This page is under construction. Check back soon.

Multi-View Geometry and 3D Reconstruction

Industry Research

1. Touch Adaptive Image Segmentation and Object Modeling (NVIDIA Research)

2. Feature Based Tracking and Pose Estimation (Qualcomm Research)

3. Time Series Analysis (TSA) for fMRI Imagery (VulcanTech)

University Research – Active

Affordance based Cognitive Computer Vision

4. k-TR Theory to Visual Perception

5. Recognition by Component Affordances (RBCA)

6. AfNet – The Affordance Network

7. AfNet Online - Web Learning Interface

8. Affordance Guided Top-down Search

9. Bottom-up Semantic Saliency

10. Deep Learning/Neuromorphic Cognitive Object Category Recognition

11. Task based Object Manipulation/Interaction

12. Affordance Sequencing

13. Functional-Topological Mapping and Navigation

14. Symbolic Language for HRI/IRI (AfkTRAANS)

15. Symbolic Cognitive Architecture (A-fACToR)

Kinect Data Processing

16. RGB-D Data Calibration and Refinement

17. Semantic Segmentation

18. RGB-D Object Instance Recognition

19. 3D Object Modeling and Tracking

20. Multi-modal Model Estimation

21. Spatial Structure Analysis using RGB-D Data

22. Part Connectivity Calculus

23. Object Part Extraction for Scalable Object Grasping

24. RGB-D Cross Compression

25. Embedded and Mobile Vision

26. Compressive Sensing

27. Quantum Hough Transform

28. Web Mining for Semantic Scene Understanding

29. Landmark Detection for Robot Navigation

30. Traffic Video Processing and Driver Behavior Modeling

31. Augmented Virtuality based Tele-presence for Mining Robots

32. Optical Character Recognition

33. Financial Data Processing

34. Astronomical Image Processing

35. Cognitive Event Recognition

36. Panoramic Video Tracking

37. Algorithms for Bio-rendering

38. WYHIWYS Device (Psychoacoustics)

39. Constant Divider for DSP Processors

40. Dynamic Topology Design

Defense Research

41. Bio-Inspired Generic Object Recognition (US Army)

42. Obstacle Avoidance (NASA)

43. Video Streamer for Immersive Robot Teleoperation (US Army)

44. Learning for Automated Target Recognition (US Army/US Navy)

45. Structural Image Compression (US Air Force)

46. Pose Estimation and Super-Resolution (US Air Force)

47. Building Footprint Extraction (US Department of Homeland Security)

48. Airborne Networks (US Air Force)

University Projects – Past

49. Research Project Management Analysis

50. OpenCV based Image Analysis

51. Generalized Object Recognition

52. 3D Graphics Library

53. Game Design

54. Speech Recognizer

55. Linear Predictive Coding

56. Room Acoustics & HRTF Analysis

57. Acoustics Simulator

58. Analog Devices DSP programming

59. TI DSP programming

60. Information Theoretic Lossless Compression

61. Scalar and Vector Quantization

62. JPEG and MPEG - Analysis and Enhancement

63. Predictive Lossless Compression

64. Node Localization

65. Wireless Environment Characterization

66. Data Rate Optimization

Industry Research

Sponsors/ Affiliations

Multi-View Geometry and 3D Reconstruction

Photogrammetry and Camera Pose Estimation Suite: Built an algorithm suite for Structure from Motion (SfM – Incremental, Global) based metrically accurate drone camera pose estimation and sparse reconstruction, with a metric measurement error of less than 0.1%; adapting and improving classical methods for Camera calibration (AprilTags etc.), Color, exposure and vignetting correction; Feature detection (AKAZE, BRISK, AGAST etc.); Feature matching (SSE Optimized Incremental, Exhaustive, Cascade Hashing etc.); Geometric verification and prefiltering (RANSAC, PROSAC, ARRSAC, LMedS, MLE RANSAC, SPRT etc.); Relative pose and camera parameters estimation (Perspective-n-Point (PnP), P3P, 5PRP, 4PH, 8PFM, 5PEM, 4PFL, 5PFLRD, 3PRPR, 4PRPR, 2PAPR etc.); Relative pose transformation (Horn, Besl-McKay, Iterative Closest Point (ICP) etc.); Triangulation and Structure estimation (Direct Linear Transform (DLT), MidPoint etc.); Bundle adjustment BA (Multicore - MCBA, Sparse - SBA, SSBA, cvSBA etc.); employing a slew of non-linear least squares optimization techniques (Trust Region: Traditional, Subspace Dogleg, Levenberg Marquardt; Line Search: Steepest Descent, Nonlinear Conjugate Gradient, BFGS, LBFGS etc. and loss functions Robust L1/L2, Nonlinear, Huber, SoftLOne, Cauchy, Arctan, Tukey etc.); Linear least squares methods, Factorization methods and preconditioners (Dense QR, SPQR, Dense/Sparse Normal Cholesky, Dense/Sparse Schur, Jacobi, Iterative Schur, CGNR, LDL, AMD, COLAMD, Gauss-Seidel relaxation etc. and using numeric, analytic, auto derivatives). Developed novel proprietary algorithms to customize the pipeline for UAV camera pose estimation targeting cell phone towers, wind turbines, buildings and other free-standing structures.
Dense Fixed Baseline Stereo Module: Developed algorithms for generating near real-time sparse feature based and dense stereo depth maps and 3D reconstructions for imagery from UAVs. Module also includes conventional stereo algorithms for calibration and densification (BM, SGBM, SGM etc.).
Ultra Hi-Density Multi-View Dense 3D Stereo Reconstruction Suite: Created novel algorithms for generating high density billion point tower 3D cloud reconstructions, with resolution of 12 voxels/ mm and smallest structure fidelity of 1-2mm, and capable of reading labels. Suite also builds over classical MVS approaches (Clustering (CMVS), Patch MVS (PMVS), Shading aware MVS (BRDF)) and other depth propagation techniques (MRF/ CRF, Belief Propagation, Diffusion, Variational methods, Primal dual solvers etc.).
3D Surface Processing Suite: Implemented, adapted, improved classical algorithms for Point cloud filtering (Spherical PTZ scan filtering, Octree/ kdtree/ BVH outlier filtering, scale, confidence analysis filtering), Surface generation (Laplacian heat diffusion, Poisson, Screened Poisson, Smoothed Signed Distance (SSD), Floating Scale (FSSR), Moving Least Squares (MLS), Grid projection, Marching cubes - Hoppe’s, RBF etc.); Mesh generation (Ear clipping, Greedy projection, Image connectivity meshification etc.); Mesh screening and trimming; Mesh texturing (with global, local seam stitching), recoloring from posed images; Cloud and mesh analysis (Distance measurement); Surface geometry fitting (Primitives, Splines); Rendering pre-processing (Scale-space voxel analysis, Hidden point removal etc.) and developed proprietary energy minimization algorithms for surface processing using contextual priors for cell towers, buildings, wind turbine assets.
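Of the classical building blocks named in the pose estimation suite above, midpoint triangulation is small enough to sketch. The snippet below is an illustrative version only, not the suite's proprietary code: it assumes each camera observation has already been converted into a ray (camera centre plus viewing direction) in a common world frame, and the function name `midpoint_triangulate` is hypothetical.

```python
def midpoint_triangulate(c1, d1, c2, d2):
    """Triangulate a 3D point as the midpoint of the shortest segment
    joining two viewing rays c1 + t1*d1 and c2 + t2*d2.

    c1, c2: camera centres; d1, d2: ray directions (need not be unit length).
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    w = [p - q for p, q in zip(c1, c2)]      # vector between camera centres
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b                    # zero when the rays are parallel
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is unobservable")

    t1 = (b * e - c * d) / denom             # parameter of closest point on ray 1
    t2 = (a * e - b * d) / denom             # parameter of closest point on ray 2
    p1 = [ci + t1 * di for ci, di in zip(c1, d1)]
    p2 = [ci + t2 * di for ci, di in zip(c2, d2)]
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))
```

With noise-free rays the two closest points coincide and the true point is returned; with noisy rays the midpoint is a cheap initial estimate that a bundle adjustment stage would normally refine.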

1) Touch Adaptive Image Segmentation and Object Modeling (NVIDIA Research): Formulated algorithms for touch adaptive segmentation and object modeling for Nvidia Tegra/ Android tablet devices using C/CUDA. (2012) Notes: Proprietary

2) Feature Based Tracking and Pose Estimation (Qualcomm Research): Contributed to algorithmic improvements for the tracking and pose estimation routines in Qualcomm Vuforia SDK framework for Android, iOS environments. (2011) Notes: Proprietary

3) Time Series Analysis (TSA) for fMRI Imagery (VulcanTech): Implemented a tool for Time Series Analysis (TSA) of Functional Magnetic Resonance Images (fMRI) using Wavelets as a part of project Vortex (an fMRI analysis software) for VulcanTech Software, PA, US. (2004-5)

Notes:

The project encompasses the development of tools for the time-series analysis of fMRI datasets, as a part of the Vortex project of VulcanTech Software, which seeks to build a diagnostic tool for analysis of fMRI neuro-images. fMRI datasets are obtained by MR scanning the brain over a period of time along the three planes, namely Axial, Coronal and Sagittal, to record changes in neural activity at each voxel (volumetric element) while a predefined external stimulus is being applied. The problem is to detect the specific voxels that are affected by the external stimulus. The imaged response contains convolved responses to successive stimuli, baseline responses to passive or implicit stimuli, and noise due to detection errors during the imaging process. The time-series analysis tool implemented performs the following functions:

1. Purge Noise bands from the Hypothesis based testing.

2. Normalize the baseline activity in the signal analysis.

3. De-convolve the response, as stored in the fMRI dataset to obtain the correlation between stimulus and its response.

4. Present the effect of application of user-specified base, signal and stop bands on the modeling.

5. Perform statistical analysis to validate the accuracy of the modeling.

6. Render output visualizations of these results as Images, Graphs and Multi-frame datasets.

A wavelet domain deconvolution is used to eliminate noise while separating the signal (response to stimulus of interest) and the baseline model data values. Hypothesis testing is used to determine the response model. The Null hypothesis assumes that the model consists of only the baseline information. The Alternate hypothesis assumes the model to be composed of both baseline and signal information. The evaluation involves Inverse Wavelet Transforming (IWT) the baseline-model coefficients and full-model coefficients and estimation of Sum Square Error (SSE) from fitting, F* and Coefficient of Multiple Determination (CMD) statistics.
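The final evaluation step can be illustrated with the standard nested-model test. The sketch below is not the Vortex implementation; it assumes the baseline-only and full-model fits have already been brought back to the time domain (for example by the IWT), and the function name and the parameter counts `p_baseline` and `p_full` are hypothetical inputs supplied by the caller.

```python
def nested_model_stats(y, fit_baseline, fit_full, p_baseline, p_full):
    """Compare a baseline-only fit (null hypothesis) with a baseline+signal
    fit (alternate hypothesis) for one voxel time series.

    Returns (sse_baseline, sse_full, f_star, cmd), where f_star is the
    general linear test statistic and cmd is the coefficient of multiple
    determination (R^2) of the full model.
    """
    n = len(y)
    sse_baseline = sum((a - b) ** 2 for a, b in zip(y, fit_baseline))
    sse_full = sum((a - b) ** 2 for a, b in zip(y, fit_full))

    mean_y = sum(y) / n
    ssto = sum((a - mean_y) ** 2 for a in y)      # total sum of squares

    df_baseline = n - p_baseline                  # error d.o.f., reduced model
    df_full = n - p_full                          # error d.o.f., full model
    f_star = ((sse_baseline - sse_full) / (df_baseline - df_full)) / (sse_full / df_full)
    cmd = 1.0 - sse_full / ssto
    return sse_baseline, sse_full, f_star, cmd
```

A large `f_star` relative to the F distribution with `(df_baseline - df_full, df_full)` degrees of freedom favours the alternate hypothesis, meaning the voxel responds to the stimulus of interest.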

University Research – Active

Affordance based Cognitive Computer Vision

4) k-TR Theory to Visual Perception: Formulated a novel evolutionary psychophysics theory to explain human perception and recognition of objects based on evolutionary cognitive algorithmic processes and repeated learning of correlated local features in the object space. The theory, validated using theoretical analyses, psychophysical priming tests (RSVP), neurobiological, linguistic and computer vision models also explains allied aspects of recognition such as Novelty detection, Equivalence classes, Recognition of articulated and natural objects, Attention, Saliency, Memory and Object identity retrieval, Scale of analysis and Choice of features.

Notes:

Affordances are essentially functional properties of objects – such as ‘contain’-ability. k-TR defines recognition in terms of affordance and local features. k-TR hypothesizes that learning of object models in humans for recognition occurs in a two-step process - a higher cognitive level - k level or affordance features level and lower visual level composed of correlated local features in the object space - the TR or transient level.

Ref:

44. KM. Varadarajan, M. Vincze, 'k-TR Theory for Balance of Nature and Nurture in Robotic Perception', IEEE German Conference on Robotics -ROBOTIK, Munich, Germany (2012).

51. KM. Varadarajan, ‘Anti-Mirror Neuron System Model for Affordance based k-TR Common Coding Theory’, Conference of the International Society of Psychophysics – ISP, (2013-TBS).

52. KM. Varadarajan, ‘Learning Surprise and Saliency Affinities for k-TR Semantic Affordance Aberrations’, International Conference on Philosophy, Artificial Intelligence and Cognitive Science - Turing, Manila, Philippines (2012).

53. KM. Varadarajan, M. Vincze, ‘Attention Mechanisms of the k-TR Model to Visual Perception’, Rovereto Attention Workshop – RAW, Rovereto, Italy (2011).

54. KM. Varadarajan, ‘k-TR: Karmic Tabula Rasa – A Theory of Visual Perception’, Conference of the International Society of Psychophysics - ISP, Herzliya, Israel (2011).

5) Recognition by Component Affordances (RBCA): Created a new theory of object recognition using component affordances. Designed and implemented innovative computer vision algorithms – range processing, object and part semantic segmentation, 3D modeling, affordance ontology mapping, part connectivity calculus, graph matching, view clustering using Non-Negative Matrix Factorization (NNMF), and inference mechanisms – leading to a Visual Cognitive Engine (VCE) supporting RBCA.

Notes:

RBCA defines objects in terms of their constituent parts and their part affordances along with the topogeometrical relationship between the parts and scale information. For example, a chair is defined as two surfaces providing 'support-ability', that is, supports that are staggered and orthogonal to each other and sized for objects that are typically the size of a human forearm. The Visual Perception pipeline uses RGB-D data to recognize such parts and uses them in a sub-graph matching procedure to recognize different objects.

Ref:

26. KM. Varadarajan, M. Vincze, ‘Learning Affordance Co-occurrences from Wearable Camera Data’, IEEE Computer Vision Conference on Pattern Recognition - CVPR Workshop on Human Activity Understanding from 3D Data (2013-TBS).

39. KM. Varadarajan, M. Vincze, ‘Affordance Sequencing for Task Representation’, IEEE International Symposium on Robots and Human Interactive Communication - RO-MAN (2013-TBS).

6) AfNet – The Affordance Network: Developed AfNet, an open affordance computing initiative that builds affordance knowledge ontologies in terms of afbits (affordance bits) defined by the user community. The project is hosted at theaffordances.net and provides 68 base affordance features (25 structural, 10 material, 33 grasp) and over 200 object category definitions in terms of 4000 afbits.

A-fACToR – Affordance based Perception for ACT-R and Af-kTRAANS – Affordance and k-TR Augmented Abugida based Neuro-Symbolic language

Also created A-fACToR, an AfNet based cognitive architecture for robots with nature, nurture delineation and Af-kTRAANS, a cognitive language for inter-robot and human-robot communication.

5) Object Part Extraction for Scalable Object Grasping: Designed Kinect range sensor/ stereo/ monocular cognitive contour grouping based object part detection, grouping and grasp hypothesis generation algorithms targeted at table-top scenes for scalable grasping of unknown objects and manipulation using a robot arm, as part of the GRASP EU project. The core component of the system revolves around a newly postulated theory, Grasping by Components (GBC).

6) Landmark Detection for Robot Navigation: Designed stereo based object recognition and 3D scene reconstruction algorithms targeted at landmarks such as doorways and room boundaries for indoor robot navigation and room functionality hypothesis generation, as part of the Robots@Home IKEA EU project. The core component of the system is a novel Feature Guided Piecewise Depth Diffusion algorithm.

7) Traffic/Driver Behavior Modeling: Developed novel bio-inspired algorithms for traffic event prediction from surveillance video using Markov chain models for the EURASIA Pacific Uninet project.

8) Augmented Virtuality based Telepresence for Mining Robots: Designed an Augmented Virtuality based visualization system for control of mining robots for semi-autonomous operation in critical environments.

9) Bio-Inspired Generic Object Recognition: Researched and designed a novel Geon based framework for generic bio-inspired object recognition using Intrinsic image extraction, Curve detection, Laplacian diffusion, Curve space smoothing, Depth segmentation, Part boundary detection, Relaxation labeling - Gradient ascent, Real Coded Genetic Algorithms (RCGA) Superquadrics fitting and DAGSVM learning, intended for Simultaneous Localization and Mapping (SLAM), Human Robot Interaction (HRI) and Autonomous Robot Navigation (Army SBIR).

10) Cognitive Event Recognition: Designed and implemented a novel ‘Bio-inspired Scene Analysis System for Video Indexing and Retrieval’ using the concept of mirror neurons. It uses an Object - Action - Event recognition pipeline employing mirror neuron networks, object state, event state graphs, hierarchy and ontology.

11) Obstacle Avoidance: Developed and implemented a real-time multi-threaded architecture to support GPS data acquisition, Stereoscopic vision processing, Local path planning and User interface display for Obstacle avoidance as part of a NASA-SBIR on Systems for Autonomous sea vehicles and USVs. Also integrated RS-232 serial port communication based GPS data acquisition for the live system.

12) Video Streamer for Immersive Robot Teleoperation: Developed novel algorithms for content driven video compression and streaming over unreliable error-prone tactical ad-hoc wireless networks targeted at immersive robot teleoperation in dynamic urban scenarios, based on Peano-Cesaro tiling, Structural representation, Tree motion compensation and Multiple Description Coding (MDC)/ Layered Coding (LC) - Army SBIR.

13) Panoramic Video Tracking: Designed and implemented a novel ‘Real-Time Panoramic NTSC Video Human Tracking System based on the Kalman Filter’ on fixed point TMS320C6416 DSP processor, with Pipeline, memory & stack management, Logic and instruction level optimization, Hardware and software trade-offs.
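As a rough illustration of the filter at the heart of such a tracker, the sketch below runs one predict/update cycle of a constant-velocity Kalman filter along a single image axis. It is a floating-point Python approximation under assumed settings, not the fixed-point DSP implementation described above; the function name, the diagonal process noise `q` and the measurement noise `r` are illustrative assumptions.

```python
def kalman_cv_step(state, cov, z, dt=1.0, q=1e-4, r=1.0):
    """One predict + update step of a 1D constant-velocity Kalman filter.

    state: (position, velocity); cov: 2x2 covariance as nested tuples;
    z: measured position (e.g. the centroid of a detected person).
    """
    p, v = state
    (p00, p01), (p10, p11) = cov

    # Predict with F = [[1, dt], [0, 1]] and process noise q on the diagonal.
    p_pred = p + dt * v
    v_pred = v
    a00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
    a01 = p01 + dt * p11
    a10 = p10 + dt * p11
    a11 = p11 + q

    # Update with H = [1, 0]: only the position is observed.
    s = a00 + r                      # innovation variance
    k0, k1 = a00 / s, a10 / s        # Kalman gain
    resid = z - p_pred               # innovation
    new_state = (p_pred + k0 * resid, v_pred + k1 * resid)
    new_cov = (((1 - k0) * a00, (1 - k0) * a01),
               (a10 - k1 * a00, a11 - k1 * a01))
    return new_state, new_cov
```

Calling the function once per frame with the latest detection keeps a smoothed position and velocity estimate; a 2D tracker can run one such filter per axis.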

14) Learning for Automated Target Recognition: Researched and built modules for innovative Incremental learning (IncLeDec), Environmental context adaptation, Concept drift and Knowledge assimilation for Decision trees based Naive Bayes Classifier (NBC) and other ensemble learning schemes operating on target image chips for Army/Navy SBIR on Perpetual learning and knowledge mining for Automated Target Recognition (ATR).

15) Structural Image Compression: Designed and implemented novel modules for Edge detection & novel edge selection, Progressive encoding, Hybrid (wavelet and edgelet adaptive singularity description) encoding, Laplacian single-grid & Pseudo-full multi-grid anisotropic PDE diffusion using Iterative Back Substitution (IBS) linear equation solver and High fidelity robust unique maritime segmentation & optimization scheme for IR image compression, as a part of Air-Force SBIR on ‘JASSM - An Intelligent and Adaptive Class-Based Compression Technology for Weapon Seekers Suitable for Minimum RF Bandwidth’.

16) Pose Estimation and Super-Resolution: Designed and implemented super-resolution schemes to aid weak pose estimation for monocular passive ranging (Air-Force SBIR).

17) Building Footprint Extraction: Built plugins achieving ESRI compatible shape file format conversion for UC’s building footprint extraction GIS tools (DHS SBIR) and benchmark tested UC’s GIS software.

18) OpenCV based Image Analysis: Implemented and analyzed Lambertian surface shading. Implemented and tested Mean shift & K-means segmenters and analyzed the results of the segmentation with respect to its parameters. Analyzed the performance of Birchfield-Tomasi stereo algorithm based on various parameters such as occlusion penalty, match-reward, reliability etc.

19) Generalized Object Recognition: Designed, implemented and analyzed an object recognition tool in OpenCV, to classify between different kinds of objects, using the Willamowski ‘Bag of Features’ method employing a SIFT-PCA-K means clustering-Feature vector histograms-N nearest neighbor/ SVM classifier pipeline. Conducted error and parameter effect analysis on results.

21) 3D Graphics Library: Developed a 3D graphics library using VC++. Components include Z-buffer triangle rendering, Viewpoint transformations, Lighting and shading, Procedural and image texturing, Accumulation-buffer Anti-aliasing, Shadowing etc.

22) Algorithms for Bio-rendering: Developed a couple of novel bio-inspired algorithms for rendering tree growth – Fissured bark geometry and Semi-random cylindrical trunk with astroidal base explosion geometry.

23) Game Design: Designed and implemented intelligent games – ‘Bantumi (Pallanguzhi)’ and ‘Dragon Eclipse’ – in C with OpenGL using Microsoft Developer Studio.

24) WYHIWYS Device: Designed, implemented and tested a novel algorithm for a ‘What You Hear Is What You Speak’ (WYHIWYS) device, incorporating direct and indirect paths, reverberant fields, effects of voiced segments, vocal tract vibrations, attenuation effect of tissues, cancellous bones and the Eustachian tube.

25) Speech Recognizer: Built a simple speech recognizer for distinguishing between numerals using Mel-Frequency Cepstral Coefficients (MFCC) feature extraction, Template description by training and Pattern matching on Dynamic Time Warped (DTW) feature vectors.
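The pattern-matching stage can be sketched with a textbook dynamic time warping distance. This is a minimal illustration, not the project's code: it assumes each utterance is already a sequence of feature vectors (such as MFCC frames) and uses Euclidean distance between frames; `dtw_distance` and `recognize` are hypothetical names.

```python
import math

def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])   # local frame distance
            cost[i][j] = d + min(cost[i - 1][j],        # stretch seq_b
                                 cost[i][j - 1],        # stretch seq_a
                                 cost[i - 1][j - 1])    # advance both
    return cost[n][m]

def recognize(utterance, templates):
    """Return the label of the template with the smallest DTW cost.

    templates: dict mapping a label (e.g. a numeral) to a feature sequence.
    """
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))
```

Because the warping path may repeat frames, an utterance spoken more slowly than its template still aligns with low cost.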

26) Linear Predictive Coding: Developed a tool for Linear Predictive Coding (LPC) analysis by way of computation of autocorrelation matrix and implementation of Levinson Durbin recursive solution.
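For reference, the autocorrelation method and the Levinson-Durbin recursion can be sketched in a few lines. This is a generic textbook version, not the original tool; the sign convention (prediction error filter `a` with `a[0] = 1`) and the function names are choices made for this example.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a signal frame x."""
    n = len(x)
    return [sum(x[t] * x[t + lag] for t in range(n - lag)) for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Toeplitz normal equations for LPC coefficients.

    r: autocorrelation values r[0..order].
    Returns (a, err): a[0] = 1 and the predictor is
    x[n] ~ -(a[1]*x[n-1] + ... + a[order]*x[n-order]); err is the
    final prediction error power.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                       # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]   # update lower-order coefficients
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)                 # error power shrinks at each order
    return a, err
```

The recursion exploits the Toeplitz structure of the autocorrelation matrix, so it needs O(order^2) operations instead of the O(order^3) of a general linear solve.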

27) Room Acoustics & HRTF Analysis: Estimated and analyzed the Impulse response and Reverberation time of an enclosed space. Analyzed the deviant behavior of Head Related Transfer Function (HRTF) with sample selection. Examined role of pinnae in Localization, Front-back and Back-front confusion, Source direction identification using MIT Fred KEMAR data.

28) Acoustics Simulator: Designed a multi-stage IIR Reverberator filter to simulate reverberations, with ability to control variation due to different surfaces, relative phase delay and multi-path effects.
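One common way to build a multi-stage IIR reverberator is the Schroeder structure: parallel feedback comb filters followed by series all-pass filters. The sketch below uses that structure as an assumption; the original design may differ. In this reading, each comb delay stands for a path length and each gain for how reflective a surface is.

```python
def feedback_comb(x, delay, gain):
    """IIR comb filter: y[n] = x[n] + gain * y[n - delay]."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        fb = y[n - delay] if n >= delay else 0.0
        y[n] = x[n] + gain * fb
    return y

def allpass(x, delay, gain):
    """Schroeder all-pass: y[n] = -gain*x[n] + x[n-delay] + gain*y[n-delay]."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        xd = x[n - delay] if n >= delay else 0.0
        yd = y[n - delay] if n >= delay else 0.0
        y[n] = -gain * x[n] + xd + gain * yd
    return y

def reverberate(x, combs, allpasses):
    """Sum parallel comb stages, then pass the result through series all-passes.

    combs, allpasses: lists of (delay_in_samples, gain) pairs, with |gain| < 1
    so that every stage is stable.
    """
    wet = [0.0] * len(x)
    for delay, gain in combs:
        for n, s in enumerate(feedback_comb(x, delay, gain)):
            wet[n] += s
    for delay, gain in allpasses:
        wet = allpass(wet, delay, gain)
    return wet
```

Feeding a unit impulse through `reverberate` yields the simulated impulse response, which is a convenient way to inspect the decay produced by a given set of delays and gains.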

29) Divider for DSP Processors: Developed and implemented a novel Frequency analyzer circuit and a Synchronous constant divider for DSP based real-time image processing (tested using OrCAD).

30) Analog Devices DSP programming: Implemented and tested JPEG & other Image compression algorithms, Speech echo cancellation algorithms, Gabor filter based recognition on ADSP 2192-12 at IIT DSP Learning Program.

31) TI DSP programming: Implemented and tested Code Excited Linear Prediction (CELP) Vocoders, Adaptive noise reduction, FIR, IIR filters and various other Real-time image and speech filtering applications on DSK TMS320VC5416, C6713 processors using Code Composer Studio (CCS) C and Linear assembly programming.

32) Information Theoretic Lossless Compression: Implemented and analyzed Huffman coding (Global statistics and Locally adaptive statistics), Lempel-Ziv coding, Run-length coding, QM Arithmetic coding on different data types.
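As an example of the first of these schemes, a global-statistics Huffman code can be built with a priority queue. This is a generic sketch rather than the project's implementation; the function name and the convention of assigning "0" to the first-popped subtree are arbitrary choices.

```python
import heapq
from collections import Counter

def huffman_code(data):
    """Build a Huffman code table {symbol: bitstring} from global symbol counts."""
    freq = Counter(data)
    if len(freq) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}

    # Heap entries: (weight, tie_breaker, partial code table for that subtree).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)      # two least probable subtrees
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(data, table):
    """Concatenate the code words of each symbol in data."""
    return "".join(table[sym] for sym in data)
```

A locally adaptive variant would rebuild or update the table as the statistics of the input drift, at the cost of keeping encoder and decoder models in step.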

33) Scalar and Vector Quantization: Designed and implemented a 3-bit and a 5-bit Lloyd-Max Scalar quantizer with support for compact codes and analyzed results based on rate-distortion curves and average code word lengths. Designed, implemented and analyzed a Lloyd-Max Vector quantizer using LBG splitting and GLA. Implemented and analyzed a Tree Structured Vector Quantizer (TSVQ) using multiple vector dimensions and codebook sizes.
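The Lloyd-Max design loop for a scalar quantizer alternates between two conditions: decision thresholds sit midway between neighbouring output levels, and each output level is the centroid of the samples in its cell. The sketch below applies this to empirical samples; it is illustrative only, and the function name and initialisation are assumptions. A 3-bit quantizer corresponds to 8 initial levels, a 5-bit one to 32.

```python
def lloyd_max(samples, initial_levels, iterations=50):
    """Design a scalar quantizer from training samples.

    Returns the list of reconstruction levels after iterating the
    nearest-neighbour and centroid conditions until they stop changing.
    """
    levels = sorted(initial_levels)
    for _ in range(iterations):
        # Nearest-neighbour condition: thresholds halfway between levels.
        thresholds = [(a + b) / 2.0 for a, b in zip(levels, levels[1:])]
        cells = [[] for _ in levels]
        for x in samples:
            idx = sum(x > t for t in thresholds)   # index of the cell containing x
            cells[idx].append(x)
        # Centroid condition: each level moves to the mean of its cell.
        new_levels = [sum(c) / len(c) if c else lvl for c, lvl in zip(cells, levels)]
        if new_levels == levels:
            break
        levels = new_levels
    return levels

def quantize(x, levels):
    """Map a sample to its nearest reconstruction level."""
    return min(levels, key=lambda lvl: abs(x - lvl))
```

The generalized Lloyd algorithm (GLA) used for vector quantization follows the same two steps, with vectors in place of scalars and a distortion measure in place of the absolute difference.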

34) JPEG and MPEG - Analysis and Enhancement: Analyzed rate-control in JPEG, developed and implemented a deblocking filter to remove blocking artifacts in reconstructed images, studied and analyzed macro blocks and motion vectors in MPEG-2, compared MPEG-1 and MPEG-2 based on PSNR/frame, bits/frame and motion vectors. Implemented a fast motion vector search algorithm (Diamond search) for MPEG codec.
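The per-frame PSNR used in such codec comparisons follows directly from the mean squared error. The helper below is a small sketch assuming frames are given as flat sequences of 8-bit sample values; the function name is hypothetical.

```python
import math

def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    if len(reference) != len(decoded):
        raise ValueError("frames must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return float("inf")          # identical frames
    return 10.0 * math.log10(peak * peak / mse)
```

Plotting this value frame by frame against the bits spent on each frame gives the kind of rate-quality comparison described above.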

35) Predictive Lossless Compression: Surveyed major domains of modern lossless data compression: Dictionary based approaches (LZ family), Block sorting methods (BWT based), Predictive mapping based methods (PPM, PPM*, PAQ family and its models; model mixing; contexts) and performed comparative analyses of the performances. Designed Evolutionary / Swarm optimized genetic algorithms based schemes for context mixing in PAQ. Surveyed progressive trends in H.264, G.729, DRM, HD-DVD and Blu-ray implementations.

36) Dynamic Topology Design: Designed and implemented ‘A Power Level based Wireless Sensor Network (WSN) Deployment Scheme for Outage Reduction using Irregular Hexagonalization’ by way of modeling the wireless communication channel. The dynamic hex network topology designed takes into account flat fading through Rayleigh, Rician, Log-Normal, Nakagami, Weibull Distributions, reducing the outage probability.
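For the Rayleigh case alone, the outage probability has a simple closed form: the received SNR is exponentially distributed, so the probability of falling below a threshold is 1 - exp(-threshold / mean SNR). The sketch below covers only that case, with SNRs in linear units rather than dB; it is not the deployment scheme itself, and the function names are hypothetical.

```python
import math

def rayleigh_outage(snr_threshold, mean_snr):
    """Outage probability of a link under flat Rayleigh fading.

    snr_threshold, mean_snr: linear (not dB) signal-to-noise ratios.
    """
    return 1.0 - math.exp(-snr_threshold / mean_snr)

def min_mean_snr(snr_threshold, target_outage):
    """Smallest mean SNR that keeps the outage probability at or below target."""
    return -snr_threshold / math.log(1.0 - target_outage)
```

Inverting the formula, as `min_mean_snr` does, indicates how much average link margin a node placement or power level must provide to meet a given outage target.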

37) Airborne Networks: Tested UC’s MDP/MARP protocol suite - Mobility aware routing schemes for airborne networks using QualNet emulator (Air-Force SBIR).

38) Node Localization: Tested and analyzed the performance of various localization techniques in wireless sensor networks, clock skews and error accumulation in TPSN two-way time synchronization.
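The two-way exchange that TPSN relies on reduces to two formulas. Node A sends at T1, node B receives at T2 and replies at T3, and A receives at T4, with T1 and T4 read on A's clock and T2 and T3 on B's clock. The sketch assumes a symmetric propagation delay; the function name is hypothetical.

```python
def two_way_sync(t1, t2, t3, t4):
    """Estimate clock offset and propagation delay from one two-way exchange.

    t1, t4: send/receive times on node A's clock.
    t2, t3: receive/send times on node B's clock.
    Returns (offset, delay), where offset is B's clock minus A's clock.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2.0
    delay = ((t2 - t1) + (t4 - t3)) / 2.0
    return offset, delay
```

Any asymmetry between the forward and return delays goes straight into the offset estimate, which is one source of the error accumulation across hops mentioned above.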

39) Wireless Environment Characterization: Conducted an experiment and analyzed the Path loss, noise and PRR variations of wireless links in a typical environment using TMOTE SKY motes employing TinyOS programming. Numerous other tests using TinyOS were performed.

40) Data Rate Optimization: Analyzed and optimized network data rates using Linear programming in AMPL and NEOS optimization server.

Note: The logos of the respective organizations have been used to identify sponsorship for the corresponding program or projects. This is not to be taken as endorsement of the personnel involved, outcomes, or other aspects of the projects by the organizations. Furthermore, this project list only summarizes information that is already available in the public domain through publications and other communication media and does not contain any proprietary or confidential information. Additional information for each of the projects will be available upon request, subject to commitments of non-disclosure.
