Clojure Data Analysis Cookbook


eBook: $28.04 (list price $32.99; save 15%)
Formats: PDF, PacktLib, ePub, and Mobi
Print + free eBook + free PacktLib access to the book: $54.99 (list price $87.98; save 37%)
Free shipping to the UK, US, Europe, and selected countries in Asia.
  • Get a handle on the torrent of data the modern Internet has created
  • Recipes for every stage from collection to analysis
  • A practical approach to analyzing data to help you make informed decisions

Book Details

Language : English
Paperback : 342 pages [ 235mm x 191mm ]
Release Date : March 2013
ISBN : 178216264X
ISBN 13 : 9781782162643
Author(s) : Eric Rochester
Topics and Technologies : All Books, Data, Cookbooks, Open Source

Table of Contents

Preface
Chapter 1: Importing Data for Analysis
Chapter 2: Cleaning and Validating Data
Chapter 3: Managing Complexity with Concurrent Programming
Chapter 4: Improving Performance with Parallel Programming
Chapter 5: Distributed Data Processing with Cascalog
Chapter 6: Working with Incanter Datasets
Chapter 7: Preparing for and Performing Statistical Data Analysis with Incanter
Chapter 8: Working with Mathematica and R
Chapter 9: Clustering, Classifying, and Working with Weka
Chapter 10: Graphing in Incanter
Chapter 11: Creating Charts for the Web
Index
  • Chapter 1: Importing Data for Analysis
    • Introduction
    • Creating a new project
    • Reading CSV data into Incanter datasets
    • Reading JSON data into Incanter datasets
    • Reading data from Excel with Incanter
    • Reading data from JDBC databases
    • Reading XML data into Incanter datasets
    • Scraping data from tables in web pages
    • Scraping textual data from web pages
    • Reading RDF data
    • Reading RDF data with SPARQL
    • Aggregating data from different formats
  • Chapter 2: Cleaning and Validating Data
    • Introduction
    • Cleaning data with regular expressions
    • Maintaining consistency with synonym maps
    • Identifying and removing duplicate data
    • Normalizing numbers
    • Rescaling values
    • Normalizing dates and times
    • Lazily processing very large data sets
    • Sampling from very large data sets
    • Fixing spelling errors
    • Parsing custom data formats
    • Validating data with Valip
  • Chapter 3: Managing Complexity with Concurrent Programming
    • Introduction
    • Managing program complexity with STM
    • Managing program complexity with agents
    • Getting better performance with commute
    • Combining agents and STM
    • Maintaining consistency with ensure
    • Introducing safe side effects into the STM
    • Maintaining data consistency with validators
    • Tracking processing with watchers
    • Debugging concurrent programs with watchers
    • Recovering from errors in agents
    • Managing input with sized queues
  • Chapter 4: Improving Performance with Parallel Programming
    • Introduction
    • Parallelizing processing with pmap
    • Parallelizing processing with Incanter
    • Partitioning Monte Carlo simulations for better pmap performance
    • Finding the optimal partition size with simulated annealing
    • Parallelizing with reducers
    • Generating online summary statistics with reducers
    • Harnessing your GPU with OpenCL and Calx
    • Using type hints
    • Benchmarking with Criterium
  • Chapter 5: Distributed Data Processing with Cascalog
    • Introduction
    • Distributed processing with Cascalog and Hadoop
    • Querying data with Cascalog
    • Distributing data with Apache HDFS
    • Parsing CSV files with Cascalog
    • Complex queries with Cascalog
    • Aggregating data with Cascalog
    • Defining new Cascalog operators
    • Composing Cascalog queries
    • Handling errors in Cascalog workflows
    • Transforming data with Cascalog
    • Executing Cascalog queries in the Cloud with Pallet
  • Chapter 6: Working with Incanter Datasets
    • Introduction
    • Loading Incanter's sample datasets
    • Loading Clojure data structures into datasets
    • Viewing datasets interactively with view
    • Converting datasets to matrices
    • Using infix formulas in Incanter
    • Selecting columns with $
    • Selecting rows with $
    • Filtering datasets with $where
    • Grouping data with $group-by
    • Saving datasets to CSV and JSON
    • Projecting from multiple datasets with $join
  • Chapter 7: Preparing for and Performing Statistical Data Analysis with Incanter
    • Introduction
    • Generating summary statistics with $rollup
    • Differencing variables to show changes
    • Scaling variables to simplify variable relationships
    • Working with time series data with Incanter Zoo
    • Smoothing variables to decrease noise
    • Validating sample statistics with bootstrapping
    • Modeling linear relationships
    • Modeling non-linear relationships
    • Modeling multimodal Bayesian distributions
    • Finding data errors with Benford's law
  • Chapter 8: Working with Mathematica and R
    • Introduction
    • Setting up Mathematica to talk to Clojuratica for Mac OS X and Linux
    • Setting up Mathematica to talk to Clojuratica for Windows
    • Calling Mathematica functions from Clojuratica
    • Sending matrices to Mathematica from Clojuratica
    • Evaluating Mathematica scripts from Clojuratica
    • Creating functions from Mathematica
    • Processing functions in parallel in Mathematica
    • Setting up R to talk to Clojure
    • Calling R functions from Clojure
    • Passing vectors into R
    • Evaluating R files from Clojure
    • Plotting in R from Clojure
  • Chapter 9: Clustering, Classifying, and Working with Weka
    • Introduction
    • Loading CSV and ARFF files into Weka
    • Filtering and renaming columns in Weka datasets
    • Discovering groups of data using K-means clustering
    • Finding hierarchical clusters in Weka
    • Clustering with SOMs in Incanter
    • Classifying data with decision trees
    • Classifying data with the Naive Bayesian classifier
    • Classifying data with support vector machines
    • Finding associations in data with the Apriori algorithm
  • Chapter 10: Graphing in Incanter
    • Introduction
    • Creating scatter plots with Incanter
    • Creating bar charts with Incanter
    • Graphing non-numeric data in bar charts
    • Creating histograms with Incanter
    • Creating function plots with Incanter
    • Adding equations to Incanter charts
    • Adding lines to scatter charts
    • Customizing charts with JFreeChart
    • Saving Incanter graphs to PNG
    • Using PCA to graph multi-dimensional data
    • Creating dynamic charts with Incanter
  • Chapter 11: Creating Charts for the Web
    • Introduction
    • Serving data with Ring and Compojure
    • Creating HTML with Hiccup
    • Setting up to use ClojureScript
    • Creating scatter plots with NVD3
    • Creating bar charts with NVD3
    • Creating histograms with NVD3
    • Visualizing graphs with force-directed layouts
    • Creating interactive visualizations with D3

Eric Rochester

Eric Rochester enjoys reading, writing, and spending time with his wife and kids. When he's not doing those things, he programs in a variety of languages and platforms, including websites and systems in Python and libraries for linguistics and statistics in C#. Currently, he's exploring functional programming languages, including Clojure and Haskell. He works at the Scholars' Lab in the library at the University of Virginia, helping humanities professors and graduate students realize their digitally informed research agendas.

What you will learn from this book

• Create beautiful, insightful graphs that you can publish to the Internet
• Apply powerful clustering and data mining techniques to better understand your data
• Use powerful data analysis libraries like Incanter, Hadoop, and Weka to get things done quickly
• Interface with Mathematica and R to use the powerful analysis features they provide
• Process data concurrently and in parallel for faster performance
• Transform data to make it more useful and easier to analyze

In Detail

Data is everywhere, and it's increasingly important to be able to gain insights that we can act on. Using Clojure for data analysis and collection, this book will show you how to gain fresh insights and perspectives from your data with an essential collection of practical, structured recipes.

The "Clojure Data Analysis Cookbook" presents recipes for every stage of the data analysis process. Whether scraping data off a web page, performing data mining, or creating graphs for the web, this book has something for the task at hand.

You'll learn how to acquire data, clean it up, and transform it into useful graphs that can then be analyzed and published to the Internet. Coverage includes advanced topics like processing data concurrently, applying powerful statistical techniques like Bayesian modeling, and even applying data mining algorithms such as K-means clustering, neural networks, and association rules.

Approach

Full of practical tips, the "Clojure Data Analysis Cookbook" will help you fully utilize your data through a series of step-by-step, real-world recipes covering every aspect of data analysis.

Who this book is for

Prior experience with Clojure and with data analysis techniques and workflows is beneficial but not essential.
