data_hacking is a collection of examples that use IPython, Pandas, and Scikit Learn to get the most out of your security data.
How to get this tool
To get this tool, use one of the methods listed below.
On Linux (for example, Debian), run the following commands:
git clone https://github.com/SuperCowPowers/data_hacking.git
cd data_hacking
sudo python setup.py install
Download directly from the following link:
How to execute
Most of the notebooks use relative paths to resources such as data files or images. In general, the easiest way we found to run IPython on the notebooks is to change into the project directory and run ipython with this alias (put it in your .bashrc or equivalent):
alias ipython='ipython notebook --FileNotebookManager.notebook_dir=`pwd`'
cd data_hacking/fun_with_syslog
ipython  # as aliased above
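For readers who prefer not to shadow the ipython command with an alias, the same idea can be sketched as a small Python launcher. This is only an illustration, not part of data_hacking: the function names notebook_command and launch_notebook are hypothetical, and the command line is the one from the alias above. Newer installations moved the notebook server out of IPython and into Jupyter, so the flag may need adjusting there.

```python
import os
import subprocess


def notebook_command(project_dir):
    """Build the argument list that the alias expands to for project_dir."""
    notebook_dir = os.path.abspath(project_dir)
    return [
        "ipython",
        "notebook",
        # Same option as in the alias, with `pwd` replaced by an explicit path.
        "--FileNotebookManager.notebook_dir=" + notebook_dir,
    ]


def launch_notebook(project_dir):
    """Start the notebook server from inside project_dir so relative paths resolve."""
    return subprocess.call(notebook_command(project_dir), cwd=project_dir)


if __name__ == "__main__":
    # Only print the command here; call launch_notebook() to actually start the server.
    print(" ".join(notebook_command("data_hacking/fun_with_syslog")))
```

Running the server with the project directory as its working directory is what lets the notebooks find their data files and images through relative paths.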
Python Modules Used:
- IPython: Architecture for interactive computing and presentation
- Pandas: Python Data Analysis Library
- Scikit Learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.
- Matplotlib: Python 2D plotting library
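Before opening a notebook, it can help to confirm that the four modules above are importable. The following is a minimal sketch, assuming Python 3.4 or later; the names REQUIRED_MODULES and missing_modules are made up for this example and are not part of data_hacking. Note that Scikit Learn is imported as sklearn.

```python
import importlib.util

# Display name -> import name for the modules listed above.
REQUIRED_MODULES = {
    "IPython": "IPython",
    "Pandas": "pandas",
    "Scikit Learn": "sklearn",
    "Matplotlib": "matplotlib",
}


def missing_modules(modules=REQUIRED_MODULES):
    """Return the display names of modules that cannot be found."""
    return [
        name
        for name, import_name in modules.items()
        if importlib.util.find_spec(import_name) is None
    ]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All listed modules are installed.")
```

The check only looks the modules up without importing them, so it stays fast even when the libraries are large.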
Notebooks and where they were presented:
- Detecting Algorithmically Generated Domains (BSidesDFW 2013)
- Hierarchical Clustering of Syslogs (BSidesDFW 2013)
- Exploration of data from Malware Domain List (BSidesDFW 2013)
- SQL Injection (Shmoocon 2014)
- Browser Agent Fingerprinting (Shmoocon 2014)
- PE File Classification (BSides 2014)
- PCAP Exploration (BSidesATX 2014)
- Drive-By PCAP Analysis (ISSW 2014)
- Mach-O Classification (SANS DFIR 2014)
- Yara Clustering (BSides Las Vegas 2014)
- SWF Classification (ShmooCon 2015)
- Java Class File Classification (ShmooCon 2015)
This article was contributed by Jason Jacobs from Guyana. Jason is a member of the Caribbean CSPA.