Projects
Skrutable
My first and most popular project. I wanted a minimal but effective workspace where I could assemble in one place the Sanskrit text-processing functions that mattered most to me, which have turned out to be transliteration, meter-related calculations, word splitting, and OCR. The features you’ll find here are those that have proven useful in my own academic work, like meter-agnostic scansion information. It’s powerful enough to process large amounts of text, but also flexible enough for one-off, day-to-day use.
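To give a sense of what "meter-agnostic scansion information" means, here is a minimal Python sketch that marks each syllable of an IAST line as light (L) or heavy (G) without trying to identify a meter. It is not Skrutable's code or API: the function name `scan`, the token pattern, and the simplifications (IAST input only, "ai" and "au" inside a word always treated as diphthongs, the final syllable weighed by its own shape) are all assumptions made for illustration.

```python
import re
import unicodedata

# Hypothetical helper names; this is an illustrative sketch, not Skrutable's API.
LONG_VOWELS = {"ā", "ī", "ū", "ṝ", "e", "o", "ai", "au"}
SHORT_VOWELS = {"a", "i", "u", "ṛ", "ḷ"}
CLOSERS = {"ṃ", "ḥ"}  # anusvāra and visarga make the preceding syllable heavy

# Diphthongs and aspirated stops are matched first so they count as one unit.
TOKEN = re.compile(r"ai|au|[kgcjṭḍtdpb]h|.")


def scan(line: str) -> str:
    """Return a string of L (light) and G (heavy) marks, one per syllable."""
    text = unicodedata.normalize("NFC", line.lower())
    # Tokenize before dropping spaces so digraphs never form across words.
    tokens = [t for t in TOKEN.findall(text) if t.isalpha()]
    weights = []
    for i, tok in enumerate(tokens):
        if tok not in LONG_VOWELS and tok not in SHORT_VOWELS:
            continue  # only vowels start a syllable weight decision
        following = 0  # consonant "weight" before the next vowel
        for nxt in tokens[i + 1:]:
            if nxt in LONG_VOWELS or nxt in SHORT_VOWELS:
                break
            following += 2 if nxt in CLOSERS else 1
        heavy = tok in LONG_VOWELS or following >= 2
        weights.append("G" if heavy else "L")
    return "".join(weights)


print(scan("dharmakṣetre kurukṣetre"))
```

A pattern like this can be computed for any line, whether or not it matches a known meter, which is what makes it useful as a first step before meter identification.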
Blog posts:
Pramāṇa NLP
This curated corpus of Sanskrit philosophy texts is my model of simple but informative text digitization for facilitating NLP work. Like my other work, it’s open source and licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, so go ahead, take a look, and download it for your own use if you want.
Blog post: Pramāṇa NLP
Vātāyana
The digital centerpiece of my dissertation work, this text-mining system and reading environment builds on the Pramāṇa NLP corpus and presents text and insights in an interactive front-end. It uses LDA topic modeling, TF-IDF, and local alignment to find parallel passages automatically, and a database of precomputed results makes it possible to get an intertextuality summary in seconds, even for an entire work.
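As an illustration of the local alignment step mentioned above, here is a minimal Python sketch of textbook Smith-Waterman alignment over word tokens. It is not Vātāyana's actual implementation: the function name `local_align`, the scoring values, and the example passages are assumptions chosen only to show how a shared stretch of text can be located inside two longer passages.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment over two token lists.

    Returns (best_score, aligned_slice_of_a, aligned_slice_of_b).
    Scoring values are illustrative assumptions, not tuned parameters.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; scores never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + step,  # align the two tokens
                score[i - 1][j] + gap,       # skip a token in a
                score[i][j - 1] + gap,       # skip a token in b
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score reaches zero.
    i, j = best_pos
    end_i, end_j = i, j
    while i > 0 and j > 0 and score[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + step:
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, a[i:end_i], b[j:end_j]


# Made-up token sequences sharing a three-word stretch.
passage_a = "x y pramāṇa phalam eva z".split()
passage_b = "q pramāṇa phalam eva w v".split()
print(local_align(passage_a, passage_b))
```

Because the matrix is quadratic in passage length, a cheaper filter (TF-IDF similarity, for example) would plausibly run first to pick candidate pairs, with alignment applied only to those.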
Blog post: Vātāyana
Pāṇḍitya & SETI
I’m a great admirer of the Pandit Prosopographical Database of Indic Texts, which consolidates extensive information on Sanskrit authors, their works, and their interconnections. Pāṇḍitya builds on Pandit by leveraging its data to generate interactive network visualizations of these authors and works. Users can explore entities of interest via autocomplete dropdown menus, customize and resize graphs as needed, and interact with nodes to reposition them, recenter the graph, or navigate directly to entries in Pandit. The associated SETI metadata aggregation project also makes it possible to link directly to online e-texts from within Pāṇḍitya visualizations.
Blog posts:
Conversations:
- New Books Network podcast: Pāṇḍitya: Mapping Sanskrit Texts Online (interviewed by Raj Balkaran for Indian Religions, also cross-listed on Digital Humanities, Literary Studies, Ancient History, South Asian Studies) 