Programming Posts

Upward Mobility: To Storyboard or Not to Storyboard

That is the question for iOS developers

Storyboarding was introduced in iOS 5, and it offers a way to consolidate all of your disparate Interface Builder files into a single overarching whole. Although it's tempting to jump on board and use it just because it's the new thing, there are some things to keep in mind.

  1. Storyboards require you to raise your minimum target to iOS 5; there's no backwards compatibility with earlier versions. While this isn't as much of a factor as it was a year ago, if you still have legacy customers on devices running iOS 4 or earlier, moving to storyboards will lock them out of your app.
  2. Having all your XIB files consolidated may sound good, but if you have a lot of them, you can end up with a single storyboard so big that it's unwieldy.
  3. Storyboards are more than just a consolidation move; you also have to adopt a whole new programming style to move between your screens. Instead of explicitly pushing and popping view controllers on the navigation stack, or presenting modals, you fire off segues that cause new view controllers to be created and transitions to occur.

The last point isn’t a bad thing per say. It’s a much more MVC-like paradigm: for example, where the overall controller knows what a transition means, and the individual views are only responsible for requesting a transition. It’s also much more like the way that Android does things. But it’s a different style of programming from traditional iOS UI development, and has a learning curve associated with it.
Read more…


The Ever-Changing Landscape of Mobile Gaming

Unity, iOS 7, and the Quest for a Great Mobile Game Experience

Jon Manning (@desplesda) and Paris Buttfield-Addison (@parisba) talk with me about where mobile gaming is going in the next 12 months.

Key highlights include:

  • Game-specific APIs and standardized gaming accessories in iOS 7 [Discussed at 0:20]
  • Android needs to catch up [Discussed at 1:55]
  • Are tablets putting handheld consoles from Nintendo and Sony out of business? [Discussed at 3:13]
  • Independent developers vs big game studios – fight! [Discussed at 4:53]
  • Unity is now free for mobile game development [Discussed at 6:02]

You can view the full interview here:

Read more…


Open Source convention considers situational awareness in cars, and more

A report from OSCON

Every conference draws people who come to make contacts, but the Open Source convention also inspires them with its content. I had one friend withdraw from an important business meeting (sending an associate instead) in order to attend a tutorial.

Lots of sessions and tutorials had to turn away attendees. This was largely fallout from the awkward distribution of seats in the Oregon Convention Center: there are just half a dozen ballroom-sized spaces, forcing the remaining sessions into smaller rooms more appropriate for casual meetings of a few dozen people. When the conference organizers measure the popularity of the sessions, I suggest that any session at or near capacity have its attendance counted as infinity.

More than 3,900 people registered for OSCON 2013, and a large contingent kept attending sessions all the way through Friday.

Read more…


NoSQL Choices: To Misfit or Cargo Cult?

Retreading old topics can be a powerful source of epiphany, sometimes more so than simple extra-box thinking. I was a computer science student; of course I knew statistics. But my recent years as a NoSQL (or, better stated, distributed systems) junkie have irreparably colored my worldview, filtering every metaphor with a tinge of information management.

Lounging on a half-world plane ride has its benefits, namely the opportunity to read. Most of my Delta flight from Tel Aviv back home to Portland lacked both Wi-Fi and (in my case) a workable laptop power source. So instead, I devoured Nate Silver's book, The Signal and the Noise. When Nate reintroduced me to the concept of statistical overfit and, relatedly, underfit, I could not help but consider these cases in light of the modern problem of distributed data management and its operators (you may call these operators DBAs, but please, not to their faces).

When collecting information, be it a psychological profile of chimp mating rituals or datapoints in search of the Higgs boson, the ultimate goal is to find some sort of usable signal, some trend in the data. Not every point is useful, and in fact, any individual point could be downright abnormal. This is why we need several points to spot a trend. The world rarely gives us anything clearer than a jumble of anecdotes, but plotted together, occasionally a pattern emerges. This pattern, if repeatable and useful for prediction, becomes a working theory. This is science, and it is generally considered a good method for making decisions.

On the other hand, when we lack experience, we tend to overvalue the experience of others when we assume they have more. This works in straightforward cases, like learning to cook a burger (watch someone make one, copy their process). It is less useful as the similarities diverge: watching someone make a cake won't tell you much about the process of crafting a burger. Folks like to call this cargo cult behavior.

How Fit are You, Bro?

You need to extract useful information from experience (for which I'll use the math-y sounding word datapoints). Having a collection of datapoints to choose from is useful, but that's only one part of the process of decision-making. I'm not speaking of a necessarily formal process here; in the case of database operators, it is merely a collection of experience. Reality tends to be fairly biased toward facts (despite the desire of many people for this not to be the case). Given enough experience, especially if that experience is factual, we tend to make better and better decisions, more in line with reality. That's pretty much the essence of prediction. Our mushy human brains are more or less good at that, at least, better than other animals. It's why we have computers and Everybody Loves Raymond, and my cat pees in a box.

Imagine you have a sufficient number of relevant datapoints to plot on a chart. Assuming the axes have any relation to each other, and the data is sound, a trend may emerge, such as a line or some other bounding shape. A signal is relevant data that corresponds to the rules we discover by best fit; noise is everything else. It's somewhat circular-sounding logic, and it's genuinely hard to know what is really signal. This is why science is hard, and so is choosing a proper database. We're always checking our assumptions, and one solid counter-signal can be disastrous for a model: we may have been wrong all along, having simply lacked enough data to see it. As Einstein famously said in response to the book 100 Authors Against Einstein: "If I were wrong, then one would have been enough!"

Database operators (and programmers forced to play this role) must make predictions all the time, against a seemingly endless series of questions. How much data can I handle? What kind of latency can I expect? How many servers will I need, and how much work will it take to manage them?

So, like all decision-making processes, we refer to experience. The problem is that, as our industry demands ever greater scale, very few people actually have much experience managing giant-scale systems. We tend to draw our assumptions from our limited (or biased) smaller-scale experience and extrapolate outward. The theories we then concoct are not the optimal fit that we desire, but instead tend to be overfit.

Overfitting is what happens when we have a limited amount of data and overstate its general implications. If we imagine a plot of likely failure scenarios across a limited number of servers, we may be tempted to believe our biggest risks of failure are insufficient RAM or disk failure. After all, my network has never given me problems, but I sure have lost a hard drive or two. We take these assumptions, which are only somewhat relevant to the realities of scalable systems, and divine rules for ourselves that entirely miss the point.

[Figures: an overfit model vs. a well-fit model]

In a real distributed system, network issues tend to consume most of our interest. Single-server consistency is a solved problem, and most (worthwhile) distributed databases have some sense of built-in redundancy (usually replication, the root of all distributed evil).
Read more…


Upward Mobility: A Web of Dependencies

The App Store model has increased the uncertainty of the software release process

The recent unavailability of the Apple Developer's Portal underscores just how dependent developers have become on third parties during the software lifecycle. For those who are not following the fun and games, the developer.apple.com sites, which include much of the functionality needed to develop Mac and iOS applications, have been unavailable for more than a week as of this writing. Although iTunes Connect, the portal used to actually deploy apps to the App Stores, has remained available, the remainder of the site has been off-limits. This is all thanks to a security intrusion (evidently by an over-zealous researcher).

The App Store model has fundamentally changed how software is distributed, mostly for the better (IMHO), but it has also removed some control of the release process from the hands of developers and the companies they work for. As I have spelled out previously in my book on iOS enterprise development, the fact that Apple has the final say on whether and when software goes into the store has required more conservative release timelines. If you want to release on the first of September, you need to count back at least two weeks for "gold master," because you need to upload the app, potentially go through a round of rejection from Apple, and then upload a fixed version.

Android apps don’t suffer from this lag, because most of the Android stores don’t do any significant checking of the applications uploaded to them. The Devil’s Deal that Apple developers have made with Apple is that in return for the longer wait time to get apps in the store (and having to follow Apple’s rules), they get a de facto seal of approval from Apple. In other words, it is assumed that apps in the iTunes store are more stringently policed and less likely to crash or do harm (deliberately or else-wise.)

The current downtime has brought that deal into question, however. Suddenly, developers who need new provisioning certificates, Passbook certificates, or push notification certificates find themselves with nowhere to go. Even if iTunes Connect is available, it doesn't do you any good if you can't get a distribution certificate to sign your app for the store. I'm sure there are developers at this moment whose finely tuned release strategies have been thrown into disarray by the unavailability of the developer portal.

Being essentially at the mercy of Apple's whims (or Google's, for that matter) can't be a pleasant sensation for a company or individual trying to get a new piece of software out the door. The question the developer community will have to answer is whether, in the long run, the benefits of the App Store model are worth the hassles.


Open Source In The Classroom

OSCON 2013 Speaker Series

To teach computer programming to a young person with no experience, you must imagine what it's like to know nothing about languages, algorithms, data structures, design patterns, object orientation, development tools, etc. Yet the kids I've seen in high school over the past several years are so immersed in a world of computing, they have interesting partial understandings of how things work and usually far deeper knowledge of what's being done with the technologies at a consumer level than their teacher. The contemporary adolescent, a child of the late 1980s and early 1990s, has lived a life that parallels the advent of the web with no experience of a world before such powerful new verbs as email, Google, or blog. On the other hand, most have never seen the HTML of a web page or the IP address of a networked computer, and none of them can guess what a compiler does. As a high school computer programming teacher, I am usually the first to help them bridge that enormous and growing gap, and it doesn't help that the first thing they ask is "How do I write an app for my phone?"

So I need to remember what it was like for me to learn programming for the first time, given that, as a child of the early 1970s, my own life parallels the advent of the home computer. I go backwards in time, way before my modern toolkit mainstays like Java, XML, or Linux (or their amalgam in Android). My programming days came before my webmaster days, before my analyst days, before I employed scripting languages like Perl, JavaScript, Tcl, or even the Unix shells. I was programming before college, where I used older languages like C, Lisp, or Pascal. When I fondly reminisce about banging out BASIC programs on my beloved Commodore 64, I think surely I've arrived at the beginning, but no. When I really do recall the very first time I wrote an original computer program, it was even longer ago: 1984. I remember my seventh grade math class, in a Massachusetts public school, being escorted into a lab crammed with no more than a dozen new Apple IIe computers. There we wrote programs in a language called Logo, although everybody just called it "turtle".

Logo, first created in Cambridge, MA in 1967, was designed from the outset as a programming language for teaching about programming. The open source KDE education project includes an application, KTurtle, for writing programs in Turtlescript, a modern adaptation of Logo. I have found KTurtle to be an extremely effective introduction to computer programming in my own classroom. The complete and well-written Turtlescript language reference can be printed in a small packet of about eight double-sided pages, which I can distribute to each pupil. The first view is a simple IDE, with a code editor that does multi-color highlighting, an inspector for tracing variables and functions, and a canvas for displaying the output of your program. I can demonstrate commands in seconds, gratifying students with immediate results as the onscreen turtle draws lines on the canvas. I can slow the speed of the turtle or step through a program to show what's going on at each command of a larger program. I can save my final output as an image file, which has limited but definite value to a young person. The kids get it, and with about 15 minutes of explanation, a few examples, and a handful of commands, they are ready and eager to try it themselves.

A modern paradigm of teaching secondary mathematics is to give students a sequence of tasks that builds a progressive understanding of concepts by putting them in situations where they struggle just enough to realize the need for new insights. If I ask the students to use KTurtle to draw a square, having provided them only the basic commands for moving the turtle, they can quickly come up with something like this:

reset
forward 100
turnright 90
forward 100
turnright 90
forward 100
turnright 90
forward 100

If the next tasks involve a greater number of adjacent squares, this approach becomes tiresome. Now I have the opportunity to introduce looping control structures (along with the programming virtue of laziness) to show them a better way:

reset
repeat 4 {
  forward 100
  turnright 90
}


As I give them a few more tasks with squares, some students may keep trying to draw each picture in one giant block of commands, tracing complicated Eulerian trails through each figure, but the obvious benefits, and their peers, eventually convince them to make the switch. When they graduate to cutting and pasting their square loop repeatedly, I can introduce subroutines.
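A minimal sketch, assuming KTurtle's learn keyword for defining new commands (square and $size are names of our choosing):

learn square $size {
  repeat 4 {
    forward $size
    turnright 90
  }
}

square 100
square 50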
Read more…


Controlling Drones with Clojure

OSCON 2013 Speaker Series

Carin Meier (@carinmeier) is an Artisan at Neo, founder of Gigsquid Software, and an OSCON 2013 speaker. In this interview we talk about her love of Clojure and how she created a library to control her AR drone with it!

Key highlights include:

  • Clojure, a modern Lisp? [Discussed at 0:20]
  • Immutable data structures make Clojure powerful [Discussed at 1:01]
  • Yes, you can program an AR drone in Clojure [Discussed at 2:04]
  • But, how do you get started? It just takes three lines of code [Discussed at 3:47]

For the code behind the drone and Carin's language Babar (inspired by Elephant 2000), check out her GitHub page.

You can view the full interview here:

Read more…


Go Programming Language for System Administration

OSCON 2013 Speaker Series

Go is the first major systems language to emerge in over a decade, even as computing has continued to change at a rapid pace—computers are smaller and faster and execute operations in parallel on multicore processors. Meanwhile, languages like Python and Ruby have grown in popularity in recent years among system administrators, operations, and DevOps personnel. Yet even as a relatively new kid on the block, Go is a versatile and robust language with plenty to offer.

Let’s go through the list:

Open: It’s Open Source—Go has been open source software since November 2009, reaching Version 1 in March of 2012. It includes a language specification, standard libraries, and custom tools. Being open, Go has long-term stability.

Concurrency: Go provides first-class support for concurrent execution and communication. There is no need to learn multiple ways of dealing with threads; Go greatly simplifies concurrent programming with goroutines and channels.
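As a minimal sketch of that model, using nothing beyond the standard library, a few goroutines can share one channel of work:

package main

import "fmt"

// worker doubles each job it receives and sends the result back on a channel.
func worker(jobs <-chan int, results chan<- int) {
    for j := range jobs {
        results <- j * 2
    }
}

func main() {
    jobs := make(chan int, 5)
    results := make(chan int, 5)

    // Three goroutines compete for work from the same channel.
    for w := 0; w < 3; w++ {
        go worker(jobs, results)
    }
    for j := 1; j <= 5; j++ {
        jobs <- j
    }
    close(jobs) // lets the range loops in the workers terminate

    for i := 0; i < 5; i++ {
        fmt.Println(<-results)
    }
}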

Fast compilation: Go compiles at breakneck speed. It has robust dependency analysis and a strict dependency specification that avoids wasting time on unused dependencies.

One binary to rule them all: Have you ever had to distribute your script or binary to multiple systems, and then worry about libraries and dependencies? With Go, you simply don't have to: gc, Go's default compiler, statically links its binaries. You can use go build to compile your code and then distribute it to multiple machines with minimal effort.
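A hypothetical deployment might be as simple as this (mytool and the server path are placeholder names):

go build -o mytool                         # produces one statically linked binary
scp mytool admin@server:/usr/local/bin/    # copy it over; there is nothing else to install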

Feature-rich standard library: The language has a great standard library and many third-party Go packages maintained at Bitbucket, GitHub, Launchpad, or Google Project Hosting.

Readability: Go settles the debate about the best formatting style by providing a code formatter tool (gofmt) and using it throughout its standard library. The code you write today will be much easier to read and maintain in a few months or even years if you simply stick to gofmt.
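For example, a (hypothetical) source file can be rewritten in place in the canonical style with a single command:

gofmt -w mytool.go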

And Go is a language that grows with you. Take the tour to see for yourself.


Why Choose a Graph Database

Collaborative filtering with Neo4j

By now, chances are good that you've heard of NoSQL, and of graph databases like Neo4j.

NoSQL databases address important challenges that we face today, in terms of data size and data complexity. They offer a valuable solution by providing particular data models to address these dimensions.

On one side of the spectrum, these databases address scale-out and high data volumes by storing data as compound aggregate values; on the other side sits a relationship-based data model that allows us to model real-world information with high fidelity and complexity.

Neo4j, like many other graph databases, builds upon the property graph model: labeled nodes (for informational entities) are connected via directed, typed relationships. Both nodes and relationships hold arbitrary properties (key-value pairs). There is no rigid schema, but with node labels and relationship types we can attach as much meta-information as we like. When importing data into a graph database, the relationships are treated with as much value as the records themselves. This allows the engine to navigate the connections between nodes in constant time, which compares favorably to the steep slowdown of multi-JOIN SQL queries in a relational database.
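In Cypher, Neo4j's query language, a tiny slice of such a graph might be created like this (the labels, relationship type, and properties are hypothetical examples):

CREATE (alice:Person {name: "Alice"})-[:FRIEND_OF {since: 2010}]->(bob:Person {name: "Bob"})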


How can you use a graph database?

Graph databases are well suited to model rich domains. Both object models and ER-diagrams are already graphs and provide a hint at the whiteboard-friendliness of the data model and the low-friction mapping of objects into graphs.

Instead of denormalizing for performance, you would normalize interesting attributes into their own nodes, making it much easier to move, filter, and aggregate along these lines. Content and asset management, job finding, and recommendations based on weighted relationships to relevant attribute nodes are some use cases that fit this model very well.

Many people use graph databases because of their high-performance online query capabilities. They process high volumes of raw data with Map/Reduce in Hadoop or with event processing (Storm, Esper, etc.) and project the computed results into a graph. We've seen examples of this in many domains, from finance (fraud detection in money-flow graphs) and biotech (protein analysis on genome-sequencing data) to telecom (mobile-network optimization based on signal-strength measurements).

Graph databases shine when you can express your queries as a local search using a few starting points (e.g., people, products, places, orders). From there, you can follow relevant relationships to accumulate interesting information, or project visited nodes and relationships into a suitable result.
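As a sketch of that style of query, here is a simple collaborative-filtering recommendation in Cypher (the Person label, BOUGHT relationship type, and names are hypothetical): starting from one customer, it follows purchases out to similar customers and back out to products the first customer doesn't already own.

MATCH (me:Person {name: "Alice"})-[:BOUGHT]->(product)<-[:BOUGHT]-(peer)-[:BOUGHT]->(rec)
WHERE NOT (me)-[:BOUGHT]->(rec)
RETURN rec.name AS recommendation, count(*) AS score
ORDER BY score DESC
LIMIT 5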

Read more…


A Hands-on Introduction to R

OSCON 2013 Speaker Series

R is an open source statistical computing environment, similar to SAS and SPSS, that allows for the analysis of data using various techniques like subsetting, manipulation, visualization, and modeling. There are versions that run on Windows, Mac OS X, Linux, and other Unix-compatible operating systems.

To follow along with the examples below, download and install R from your local CRAN mirror, found at r-project.org. You'll also want to place the example CSV file (raysData3.csv, used below) into your Documents folder (Windows) or home directory (Mac/Linux).

After installation, open the R application. The R Console will pop up automatically; this is where R code is processed. To begin writing code, open an editor window (File -> New Script on Windows or File -> New Document on a Mac) and type the following code into your editor:

1+1

Place your cursor anywhere on the "1+1" line, then hit Control-R (on Windows) or Command-Return (on a Mac). You'll notice that your "1+1" code is automatically executed in the R Console. This is the easiest way to run code in R. You can also run R code by typing it directly into the R Console, but using the editor is much easier.

If you want to clear your R Console, click anywhere inside of it and hit Control-L (on Windows) or Command-Option-L (on a Mac).

Now let’s create a Vector, the simplest possible data structure in R. A Vector is similar to a column of data inside a spreadsheet. We use the combine function to do so:

raysVector <- c(2, 5, 1, 9, 4)

Running the line above creates the object but doesn't print it. To view the contents of raysVector, double-click its name in the editor and run the highlighted selection (or just execute a line containing only raysVector). You will then see the contents of raysVector in your R Console.

The object we just created is now stored in memory, and we can confirm this by listing the contents of the workspace:

ls()

R is an interpreted language with support for procedural and object-oriented programming. Here we use the mean function to calculate the arithmetic mean of raysVector:

mean(raysVector)

Getting help on the mean function is easy using:

?mean

We can create a simple plot of raysVector using:

barplot(raysVector, col = "red")

Importing CSV files is simple too:

data <- read.csv("raysData3.csv", na.strings = "")

We can subset the CSV data in many different ways. Here are two different methods that do the same thing:

data[ 1:2, 2:4 ]
data[ 1:2, c("age", "weight", "height") ]

There are many ways to transform your data in R. Here’s a method that doubles everyone’s age:

dataT <- transform( data, age = age * 2 )

The apply function lets us apply a standard or custom function without writing loops. Here we apply the mean function column-wise to the first three rows of the age and height columns, ignoring missing values during the calculation:

apply( data[1:3, c("age", "height")], 2, mean, na.rm = TRUE )
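The same call works with a function of our own. As a small sketch (rng is a helper we define here, not part of base R), this computes the range of each column:

rng <- function(x) { max(x, na.rm = TRUE) - min(x, na.rm = TRUE) }
apply( data[, c("age", "height")], 2, rng )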

Here we build a linear regression model that predicts a person’s weight based on their age and height:

raysModel <- lm(weight ~ age + height, data = data)

We can plot our residuals like this:

plot(raysModel$residuals, pch = 15, col = "red")

We can install the Predictive Model Markup Language (PMML) package to quickly deploy our predictive model in a Business Intelligence system without custom SQL.
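A sketch, assuming the pmml package from CRAN (its pmml function converts a fitted model like raysModel into portable XML):

install.packages("pmml")   # one-time install from CRAN
library(pmml)              # load the package
pmml(raysModel)            # emit the regression model as PMML

Read more…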
