Planet Apache


Bryan Pendleton: The Name of the Wind: a very short review

Mon, 2015-01-19 22:08

At some point late last fall, I was anticipating having some time to spend with my Kindle, so I bought Patrick Rothfuss's The Name of the Wind.

As usual with me, I am about 10 years behind the times, as this book came out some time ago.

But I was looking for a page-turner (is it fair to say that when you are reading an e-book on an e-reader?).

At any rate, I turned the pages.

And kept turning them (there are a lot of pages...).

And I turned them all the way until the end.

Which is not always the case with me, and a book. I have too little time and too many distractions, and many is the book that I nobly start yet do not finish.

Rothfuss's style appealed to me, because he knows how to take his time with his story. Sometimes books rush along, hurrying to force the tale to be told, cramming adventures and villains and escapades willy-nilly into every page.

But Rothfuss is trying to tell the story of a boy growing up (even though that boy may become a mighty wizard).

And, as every boy knows (and surely, every girl as well), growing up takes its own time, and proceeds on its own schedule.

So, the long and short of it is: I enjoyed The Name of the Wind, and felt it lived up to my expectations.

Rothfuss has written a sequel, and promises that he will complete his story.


When the time comes.

And, down the road, when I find that I again have some time with my Kindle, I expect that I will continue reading Rothfuss, moving on to The Wise Man's Fear.

And see how I like turning those pages.

Categories: FLOSS Project Planets

Nick Kew: Dodgy Data

Mon, 2015-01-19 20:11

Oxfam grabs a headline with a report telling us the richest 1% will own half the world’s wealth in 2016.

As with many reports coming from lobbying organisations, this one provokes scepticism.  Not outright dismissal, but a “really?”, and a need to know what they’re actually measuring before I can treat it as meaningful.  It also provokes mild curiosity: how rich do you have to be to be in that 1%?  Not least because I have a sneaking suspicion it includes a great many people whom our chattering classes don’t consider rich at all.

The Oxfam report itself is a mere twelve pages and disappointingly light on data.  If there’s any attempt to substantiate the headline claim then I missed it.  But googling “World Wealth” finds this report, which tells me total world wealth is projected to be $64.3 trillion in 2016.  OK, that’ll do for a ballpark calculation.  $64.3 trillion between 7 billion people is an average of about $9k per head.  If the top 1% own half of it, that’s $32.15 trillion between 70 million people: an average of $459k per head within that top 1%.
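
The back-of-the-envelope figures above are easy to reproduce. A quick sketch, using only the numbers quoted in this post (the totals are 2016 projections, so treat the results as ballpark only):

```python
# Reproduce the ballpark calculation (inputs are the figures quoted above).
total_wealth = 64.3e12            # projected 2016 world wealth, USD
population = 7e9                  # world population

avg_per_head = total_wealth / population
print(f"average per head: ${avg_per_head:,.0f}")        # ~ $9,186

top1_wealth = total_wealth / 2    # Oxfam's headline claim: half of it
top1_people = population / 100    # 70 million people
avg_top1 = top1_wealth / top1_people
print(f"average within the top 1%: ${avg_top1:,.0f}")   # ~ $459,286
```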

That’s £300k.  There must be millions in Blighty with that much in housing wealth alone (and others correspondingly locked out).  Not to mention in other high-cost countries around Europe, America, Asia, and I expect even a few in the third world.  All above the average of that fabled top 1%.

But of course housing isn’t our only asset.  In Blighty and around the developed world, a big chunk of our wealth takes the form of Entitlements.  One such in the UK is the Basic State Pension, which is worth £200k, and even the poorest Brit is entitled to it.  It seems you can be in that top 1% without being rich enough to buy a house in Blighty!

Hmmm.  Oh dear.  Maybe Oxfam’s spin isn’t really very meaningful at all.  Except perhaps to highlight how incredibly egalitarian we are within Blighty – and probably all developed countries – once you include the effect of government actions.

Categories: FLOSS Project Planets

Igor Galić: equality

Mon, 2015-01-19 19:00

i’ve started reading a book on lisp. as Alan Perlis said:

A language that doesn’t affect the way you think about programming, is not worth knowing.

but this chapter title:

"Truth, Falsehood, and Equality" — sounds like a chapter from legend of korra

— The Wrath of PB™ (@hirojin) January 19, 2015

made me think beyond programming. i’ve been contemplating this in terms of political systems & stories, and i’m thinking that there’s no chance to achieve radical equality:

societies change over generations, as do their stories. and while, as societies, we frown at those (ancient or contemporary) societies that use the murder of prisoners and slaves as entertainment, our stories are filled with such things.

the fight for power.
the struggle against corrupt power.
we even have to fight for love.

we have no need for equality, because the stories we are raised with neither prepare us for what such an equal society would look like, nor do they raise a desire to achieve it.

we are inching ourselves towards it. that’s societal change over generations. i’m starting to fear the only way we know how to radically change is to erase the past, and that would be profoundly dangerous.

even more dangerous than forgetting the (often recent) past, and regress into “good old” patterns.

Categories: FLOSS Project Planets

Justin Mason: Links for 2015-01-19

Mon, 2015-01-19 18:58
  • carbon-c-relay

    A much better carbon-relay, written in C rather than Python. Linking as we’ve been using it in production for quite a while with no problems.

    The main reason to build a replacement is performance and configurability. Carbon is single threaded, and sending metrics to multiple consistent-hash clusters requires chaining of relays. This project provides a multithreaded relay which can address multiple targets and clusters for each and every metric based on pattern matches.

    (tags: graphite carbon c python ops metrics)

  • Surveillance of social media not way to fight terrorism – Minister

    Blanket surveillance of social media is not the solution to combating terrorism and the rights of the individual to privacy must be protected, Data Protection Minister Dara Murphy said on Monday. [He] said Ireland and the European Union must protect the privacy rights of individuals on social media. “Freedom of expression, freedom of movement, and the protection of privacy are core tenets of the European Union, which must be upheld.”

    (tags: dara-murphy data-protection privacy surveillance europe eu ireland social-media)

Categories: FLOSS Project Planets

Jim Jagielski: Telaen 2.0 Status

Mon, 2015-01-19 17:19

As noted in a previous blog post, I've started working on the 2.0 version of Telaen: a simple but powerful PHP-based webmail system. Quite a bit has been changed, fixed and added under the covers, including baselining PHP 5.4, a more robust installation checker, and some significant performance improvements.

However, as I was working to make the backend as up-to-date as possible, it became increasingly obvious that Telaen's UI was extremely dated. It was functional, yes, but made very limited use of CSS, HTML5, JavaScript, etc., all of which combine to affect the user experience. Luckily, a very good friend of mine, Mike Hill, has started work on a new UI for Telaen, making it not only more streamlined and attractive, but also much more functional.

Now I know, of course, that there are a number of other PHP webmail offerings out there, so some may be questioning the need for yet another. I can think of a few reasons:

  • Telaen is designed to have as few dependencies as possible; the goal is that any typical PHP setup will be able to run Telaen.
  • No external database is required.
  • Extensive support for both IMAP and POP; to be honest, most webmail systems don't support POP at all, or are extremely limited in their support.
  • Consistent functionality, no matter which IMAP/POP server is used; most webmail systems are simple "front ends" for IMAP servers, meaning the capability of the webmail system depends on what IMAP server is being used. Telaen puts that capability within the webmail system for a consistent feature set.
  • Fast caching
  • Designed to serve both as someone's primary email client and as a supplemental one.
  • Lots of what you need/want, and none of what you don't: Telaen is as simple as it can be, but no more so.
  • A fast and secure upgrade path for all those people still using UebiMiau
  • Open to ALL contributions!

The last point is important: we really want as many people as possible to use, contribute, drive and develop Telaen. It's a great project for someone just starting out as well as for more experienced developers. Or if your passion is documentation, we could definitely use your help! In fact, however you want to be involved, we want to welcome you to the project.

Our goal is to have a beta available within the next month. Stay tuned!

Categories: FLOSS Project Planets

Colm O hEigeartaigh: Apache Santuario - XML Security for Java 2.0.3 and 1.5.8 released

Mon, 2015-01-19 10:53
Versions 2.0.3 and 1.5.8 of Apache Santuario - XML Security for Java have been released. Version 2.0.3 fixes a critical security vulnerability (CVE-2014-8152) in the new streaming XML Signature support introduced in version 2.0.0:
For certain XML documents, it is possible to modify the document and the streaming XML Signature verification code will not report an error when trying to validate the signature.

Please note that the "in-memory" (DOM) API for XML Signature is not affected by this issue, nor is the JSR-105 API. Web service stacks that use the streaming functionality of Apache Santuario (such as Apache CXF/WSS4J) are likewise unaffected. Apart from this issue, version 2.0.3 contains a significant performance improvement, and both releases contain minor bug fixes and dependency upgrades.
Categories: FLOSS Project Planets

Shawn McKinney: Fortress RBAC Accelerator PDP Benchmark Report

Mon, 2015-01-19 10:34
Benchmark Overview

This post provides a summary of a recent benchmark of the Fortress RBAC Accelerator.  The RBAC accelerator uses LDAPv3 extended operations to perform the following access control functions:

  1. Create Session – attempts to authenticate client; if successful, initiates an RBAC session by activating one or more user roles
  2. Check Access – determines if user has access rights for a given resource
  3. Add Active Role – attempts activation for a given role into user’s RBAC session
  4. Drop Active Role – deactivates a given role from user’s RBAC session
  5. Delete Session – deletes the given RBAC session from the server
  6. Session Roles – returns the active roles associated with the current session

The result of each of the above functions is persisted to LMDB for an audit trail.
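
As a rough illustration of how the six operations fit together, here is a toy session model in plain Python. The class and method names are hypothetical, not the Fortress API; the real accelerator implements these as LDAPv3 extended operations against OpenLDAP, and Delete Session (#5) corresponds to discarding the server-side session:

```python
class RbacSession:
    """Toy model of the RBAC session lifecycle (illustrative names only)."""

    def __init__(self, user, roles):
        # 1. Create Session: authenticate the client, then activate roles.
        self.user = user
        self.active_roles = set(roles)

    def check_access(self, perms, resource, op):
        # 2. Check Access: does any active role grant (resource, op)?
        return any((resource, op) in perms.get(role, set())
                   for role in self.active_roles)

    def add_active_role(self, role):
        # 3. Add Active Role: activate an additional role in this session.
        self.active_roles.add(role)

    def drop_active_role(self, role):
        # 4. Drop Active Role: deactivate a role from this session.
        self.active_roles.discard(role)

    def session_roles(self):
        # 6. Session Roles: list the currently active roles.
        return sorted(self.active_roles)


# role -> set of (resource, operation) permissions
perms = {"teller": {("account", "deposit")}}
session = RbacSession("alice", ["teller"])
print(session.check_access(perms, "account", "deposit"))  # True
print(session.check_access(perms, "account", "close"))    # False
```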

Benchmarks were performed using a JMeter test client to drive load for CheckAccess (#2).  The server hosts the OpenLDAP daemon with the RBAC accelerator overlay.

Client Machine
  • operating system: ubuntu 13.04
  • kernel: 3.8.0-32-generic
  • processor: Intel® Core™ i7-4702MQ CPU @ 2.20GHz × 8
  • memory: 16GB (doesn’t use anywhere close to that)
  • Java version 7
Server Machine
  • operating system: ubuntu 14.04
  • kernel: 3.13.0-32-generic
  • processor: Intel® Core™ i7-4980HQ CPU @ 2.80GHz × 4
  • memory: 8GB
  • OpenLDAP version: 2.4.39
Test Details
  • 25 threads running on client
  • each thread runs checkAccess 50,000 times
  • 1,250,000 checkAccess calls in total
  • Client CPU load: approximately 50%
Test Results
  • Response time: 1 millisecond
  • Throughput: 11,533 transactions per second
  • Server CPU load: approximately 85%
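
A quick arithmetic check of these figures, using only the numbers reported above:

```python
# Sanity check of the reported load and throughput.
threads = 25
calls_per_thread = 50_000
total_calls = threads * calls_per_thread
print(total_calls)                       # 1250000 checkAccess calls

throughput = 11_533                      # reported transactions per second
print(round(total_calls / throughput))   # roughly 108 seconds of wall-clock time
```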

Categories: FLOSS Project Planets

Bryan Pendleton: Modern TV: a very short review

Mon, 2015-01-19 10:18

We've been watching some very good TV recently.

A few that stuck out to me:

  • Vera follows Detective Vera Stanhope, following the books of Ann Cleeves. It's set in Newcastle, England, and it is wonderfully compelling. It's gritty yet human, and the sights and sounds of Newcastle fit the show perfectly.
  • The Fall is a police procedural set in Belfast, Northern Ireland, as an English policewoman is brought in to take charge of a case that has gone cold. The astonishing thing about The Fall is its pace: it takes two full seasons, about 15 hours of watching, to tell a story that many other shows might spend 90 minutes on. By really slowing down and digging in, the series becomes riveting; you simply cannot stop watching it once you start.
  • Jack Taylor is a strong show made from the books of Ken Bruen, set in Galway, Ireland. The character of Taylor is heart-breakingly self-destructive, but oh! the shows are so strong.
  • Longmire is based on Craig Johnson's Sheriff Walt Longmire books. It's set in rural Wyoming, on the Wyoming / Montana border (though actually filmed in New Mexico, I believe), and although the lead character is good, what makes the show is the superb richness of the supporting characters and cast.
  • In Plain Sight is sort of a one-woman show. Mary McCormack plays Marshal Mary Shannon, an inspector in the Witness Protection Program who is stationed in Albuquerque, New Mexico. Again, the wild west feel of the show is great, but we've also grown to love the supporting cast of this show, even through its rough edges.
  • Continuum is a fascinating Sci-Fi Channel show that riffs upon the time travel concept with some great writing and an interesting plot. It's got its flaws, but we've certainly enjoyed it.
  • Orphan Black is another fascinating science fiction show, with a completely different plot. Most of what makes Orphan Black so interesting is hard to reveal without spoiling it, but the world of the show is completely timely and believable, leading to lots of interesting discussions while you watch.
  • The League is a comedy about a group of friends who stay close by participating in a fantasy football league. But that gives such short shrift to a wonderfully funny and human show.
  • And Community is simply the funniest show you've never heard of. At least 5 laugh-out-loud moments in every 30 minute episode; great writing combined with a cast who clearly are having a delightful time with the show.

A friend commented to me recently that he barely watches movies anymore, because the quality of TV series has become so high.

Perhaps it's just a burst of activity, but it's nice to get such great entertainment at the touch of a button at the end of a long hard day.

Categories: FLOSS Project Planets

Asankha Perera: How the UltraESB and AdroitLogic was born..

Sun, 2015-01-18 21:15
The UltraESB and AdroitLogic were born five years ago today! In my personal blog, I have captured some of the history behind starting them up!

The "personal" blog of Asankha Perera: How the UltraESB and AdroitLogic was born..
Categories: FLOSS Project Planets

Justin Mason: Links for 2015-01-18

Sun, 2015-01-18 18:58
  • Amazing comment from a random sysadmin who’s been targeted by the NSA

    ‘Here’s a story for you. I’m not a party to any of this. I’ve done nothing wrong, I’ve never been suspected of doing anything wrong, and I don’t know anyone who has done anything wrong. I don’t even mean that in the sense of “I pissed off the wrong people but technically haven’t been charged.” I mean that I am a vanilla, average, 9-5 working man of no interest to anybody. My geographical location is an accident of my birth. Even still, I wasn’t accidentally born in a high-conflict area, and my government is not at war. I’m a sysadmin at a legitimate ISP and my job is to keep the internet up and running smoothly. This agency has stalked me in my personal life, undermined my ability to trust my friends attempting to connect with me on LinkedIn, and infected my family’s computer. They did this because they wanted to bypass legal channels and spy on a customer who pays for services from my employer. Wait, no, they wanted the ability to potentially spy on future customers. Actually, that is still not accurate – they wanted to spy on everybody in case there was a potentially bad person interacting with a customer. After seeing their complete disregard for anybody else, their immense resources, and their extremely sophisticated exploits and backdoors – knowing they will stop at nothing, and knowing that I was personally targeted – I’ll be damned if I can ever trust any electronic device I own ever again. You all rationalize this by telling me that it “isn’t surprising”, and that I don’t live in the [USA,UK] and therefore I have no rights. I just have one question. Are you people even human?’

    (tags: nsa via:ioerror privacy spying surveillance linkedin sysadmins gchq security)

  • DRI’s Unchanged Position on Eircode

    ‘Broadly, they are satisfied with what we are doing’ versus: ‘We have deep concerns about the Eircode initiative… We want to state clearly that we are not at all ‘satisfied’ with the postcode that has been designed or the implementation proposals.’

    (tags: dri ireland eircode postcodes privacy data-protection quotes misrepresentation)

Categories: FLOSS Project Planets

Justin Mason: Links for 2015-01-17

Sat, 2015-01-17 18:58
  • Misogyny in the Valley

    The young women interns [in one story in this post] worked in a very different way. As I explored their notes, I noticed that ideas were expanded upon, not abandoned. Challenges were identified, but the male language so often heard in Silicon Valley conference rooms – “Well, let me tell you what the problem with that idea is….” – was not in the room.  These young women, without men to define the “appropriate business behavior,” used different behaviors and came up with a startling and valuable solution. They showed many of the values that exist outside of dominance-based leadership: strategic thinking, intuition, nurturing and relationship building, values-based decision-making and acceptance of other’s input. Women need space to be themselves at work. Until people who have created their success by worshipping at the temple of male behavior, like Sheryl Sandberg, learn to value alternate behaviors, the working world will remain a foreign and hostile culture to women. And if we do not continuously work to build corporate cultures where there is room for other behaviors, women will be cast from or abandoned in a world not of our making, where we continuously “just do not fit in,” but where we still must go to earn our livings.

    (tags: sexism misogyny silicon-valley tech work sheryl-sandberg business collaboration)

  • Are you better off running your big-data batch system off your laptop?

    Heh, nice trolling.

    Here are two helpful guidelines (for largely disjoint populations): If you are going to use a big data system for yourself, see if it is faster than your laptop. If you are going to build a big data system for others, see that it is faster than my laptop. [...] We think everyone should have to do this, because it leads to better systems and better research.

    (tags: graph coding hadoop spark giraph graph-processing hardware scalability big-data batch algorithms pagerank)

  • BBC uses RIPA terrorism laws to catch TV licence fee dodgers in Northern Ireland

    Give them the power, they’ll use that power. ‘A document obtained under Freedom of Information legislation confirms the BBC’s use of RIPA in Northern Ireland. It states: “The BBC may, in certain circumstances, authorise under the Regulation of Investigatory Powers Act 2000 and Regulation of Investigatory Powers (British Broadcasting Corporation) Order 2001 the lawful use of detection equipment to detect unlicensed use of television receivers… the BBC has used detection authorised under this legislation in Northern Ireland.”‘

    (tags: ripa privacy bbc tv license-fee uk northern-ireland law scope-creep)

  • Australia tries to ban crypto research – by ACCIDENT • The Register

    Researchers are warned off [discussing] 512-bits-plus key lengths, systems “designed or modified to perform cryptanalytic functions, or “designed or modified to use ‘quantum cryptography’”. [....] “an email to a fellow academic could land you a 10 year prison sentence”. notes ‘the DSGL 5A002 defines it as >512bit RSA, >512bit DH, >112 bit ECC and >56 bit symmetric ciphers; weak as fuck i say.’

    (tags: law australia crime crypto ecc rsa stupidity fail)

Categories: FLOSS Project Planets

Luciano Resende: Building a Yarn cluster using containers

Sat, 2015-01-17 18:26
In my previous post, we went through the basic steps of building a basic standalone container image. Now, let's explore a more advanced scenario: building an Apache Hadoop Yarn cluster similar to the topology described below. Using Docker containers is proving to be a very viable and lightweight way to build/simulate a local Yarn cluster, compared with using heavy VMs. Below are all the steps you need to get started and build your own Yarn cluster on your desktop.

Dockerfile - The recipe for building the Image

While building a Yarn cluster image, we have to take care of a few main things:
  • Configure passwordless ssh across all cluster containers.
  • Download, install and configure Java.
  • Download, install and configure Apache Yarn:
    • Configure Namenode and Datanode connectivity.
    • Enable dynamic Datanodes to connect to Namenode.
  • Configure Network:
    • Network connectivity.
    • Expose Yarn ports required by Administration UI and Node communication.
Below is a sample Dockerfile that handles most of the items above, with the exception of some network connectivity, which is handled during container initialization.


USER root

# install dev tools
RUN yum install -y curl which tar sudo openssh-server openssh-clients rsync
RUN yum update -y libselinux

# passwordless ssh
RUN ssh-keygen -q -N "" -t dsa -f /etc/ssh/ssh_host_dsa_key
RUN ssh-keygen -q -N "" -t rsa -f /etc/ssh/ssh_host_rsa_key
RUN ssh-keygen -q -N "" -t rsa -f /root/.ssh/id_rsa
RUN cp /root/.ssh/ /root/.ssh/authorized_keys

# java
RUN curl -LO '' -H 'Cookie: oraclelicense=accept-securebackup-cookie'
RUN rpm -i jdk-7u71-linux-x64.rpm
RUN rm jdk-7u71-linux-x64.rpm

ENV JAVA_HOME /usr/java/default

# hadoop
RUN curl -s | tar -xz -C /usr/local/
RUN cd /usr/local && ln -s ./hadoop-2.6.0 hadoop

ENV HADOOP_PREFIX /usr/local/hadoop
ENV HADOOP_COMMON_HOME /usr/local/hadoop
ENV HADOOP_HDFS_HOME /usr/local/hadoop
ENV HADOOP_MAPRED_HOME /usr/local/hadoop
ENV HADOOP_YARN_HOME /usr/local/hadoop
ENV HADOOP_CONF_DIR /usr/local/hadoop/etc/hadoop

RUN sed -i '/^export JAVA_HOME/ s:.*:export JAVA_HOME=/usr/java/default\nexport HADOOP_PREFIX=/usr/local/hadoop\nexport HADOOP_HOME=/usr/local/hadoop\n:' $HADOOP_PREFIX/etc/hadoop/
RUN sed -i '/^export HADOOP_CONF_DIR/ s:.*:export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop/:' $HADOOP_PREFIX/etc/hadoop/

RUN mkdir $HADOOP_PREFIX/input
RUN cp $HADOOP_PREFIX/etc/hadoop/*.xml $HADOOP_PREFIX/input

# pseudo distributed
ADD core-site.xml $HADOOP_PREFIX/etc/hadoop/core-site.xml
#RUN sed s/HOSTNAME/localhost/ /usr/local/hadoop/etc/hadoop/core-site.xml.template > /usr/local/hadoop/etc/hadoop/core-site.xml
ADD hdfs-site.xml $HADOOP_PREFIX/etc/hadoop/hdfs-site.xml

ADD mapred-site.xml $HADOOP_PREFIX/etc/hadoop/mapred-site.xml
ADD yarn-site.xml $HADOOP_PREFIX/etc/hadoop/yarn-site.xml

RUN $HADOOP_PREFIX/bin/hdfs namenode -format

# replace the bundled native libraries like a boss
RUN rm /usr/local/hadoop/lib/native/*
RUN curl -Ls | tar -x -C /usr/local/hadoop/lib/native/

ADD ssh_config /root/.ssh/config
RUN chmod 600 /root/.ssh/config
RUN chown root:root /root/.ssh/config

ADD /etc/
RUN chown root:root /etc/
RUN chmod 700 /etc/


# working around a build error
RUN ls -la /usr/local/hadoop/etc/hadoop/*
RUN chmod +x /usr/local/hadoop/etc/hadoop/*
RUN ls -la /usr/local/hadoop/etc/hadoop/*

# fix the 254 error code
RUN sed -i "/^[^#]*UsePAM/ s/.*/#&/" /etc/ssh/sshd_config
RUN echo "UsePAM no" >> /etc/ssh/sshd_config
RUN echo "Port 2122" >> /etc/ssh/sshd_config

CMD ["/etc/", "-d"]

EXPOSE 50020 50090 50070 50010 50075 8031 8032 8033 8040 8042 49707 22 8088 8030
DIY - Building the image
sudo docker build -t yarn-cluster .
Getting Started - Launching Yarn nodes

In order to simplify which process to start when launching a NameNode/NodeManager versus a DataNode, a bootstrap shell script is used; it supports --namenode and --datanode parameters, used in conjunction with the docker run command to launch the Yarn node. When launching the NameNode/NodeManager, we also need to map the ports used by the Yarn administration UI so it can be accessed outside of the containers. Below is the command to launch a NameNode/NodeManager node. Note that we use -p to map the ports, and --namenode to start the proper Yarn services.
sudo docker run -i -t -p 8088:8088 -p 50070:50070 -p 50075:50075 --name namenode -h namenode yarn-cluster /etc/ -bash -namenode
Now that the master node is up and running, let's add some DataNodes to our cluster. A peculiarity of launching the DataNodes is that they need to be aware of the NameNode location; for this, Docker enables containers to be linked, which causes the local /etc/hosts to be updated with the address of the linked container. Below is the command to launch a DataNode. Note how the --link parameter links the DataNode container to the NameNode container, and how the bootstrap script now receives a different parameter (--datanode) to start only the Yarn DataNode-related services.
sudo docker run -i -t --link namenode:namenode --workdir /usr/local/hadoop yarn-cluster /etc/ -bash -datanode
After launching a few images, the DataNode administration UI will look like the one below.

Conclusion

Using containers is a very good and lightweight option for building a Hadoop Yarn cluster, but in order to take it to the next level, there are a few other items that need to be thought through and solved, such as those described below:
  • Managing machine resources available for each container : cpu, memory, etc.
  • Strategy for non-transient persistent data.
  • Rack-aware data replication when in a container environment.
  • etc.
Also, note that all the source code used to build this Yarn Cluster is also available in the github repository: docker-yarn-cluster.
Categories: FLOSS Project Planets

Luciano Resende: Running your applications in a Container

Sat, 2015-01-17 14:04
In my previous post, I described how you can build a template application to use as a starting point for your node.js applications. In this post, we will learn how to run that application in a container.

So, what is Docker? Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications. Consisting of Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows, Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments. As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud.

How to build your docker image: In order to build the docker image, we need to create the "recipe" for the image in a Dockerfile.
FROM centos:centos6

MAINTAINER Luciano Resende

# Enable EPEL repository for GIT, Node.js and npm
RUN rpm -Uvh

# Install Git, Node.js and npm
RUN yum install -y git nodejs npm

# Checkout node-app-template from github
RUN git clone /opt/node-app-template

# Install app dependencies
RUN ls /opt/node-app-template
RUN cd /opt/node-app-template; npm install

# Node app is running on port 3000
EXPOSE 3000

# Define what to run when container is started
CMD ["node", "/opt/node-app-template/bin/www"]
Now that we have the "recipe" for building the docker image, we can build it with
sudo docker build --rm --no-cache -t node-app .
To run the application, we need to start a docker container based on the image we have just created. Note that, when we start the container, we are redirecting the public port 8080 to the exposed internal port 3000.
sudo docker run -p 8080:3000 -d node-app
Now, we are ready to access the application: just start your browser and point it at port 8080 on the Docker host.
Hope this helps you get started with Docker containers. All the source code is also available in the github repository: node-app-container.
Categories: FLOSS Project Planets

Nick Kew: I won’t be going to FOSDEM

Sat, 2015-01-17 13:32

Belgian cities full of trigger-happy armed troops, with orders to shoot to kill, and a recent track record of doing so.

In reality, probably a lower risk than regular vehicular traffic, even for those of us with an ample beard and a big backpack.  Though surely a far higher risk than the supposed terrorist threat.  But that level of security theatre is hardly welcoming to visitors.  Since I have the choice, I’m staying away, and withholding the support that might be inferred from my travelling to Brussels for a weekend in the near future.

It’s a bit of a shame: I missed last year’s FOSDEM too due to family commitments.  Maybe next year?

[edit] That last sentence is a bit disingenuous, insofar as it suggests this is a big change of plan.  In reality I hadn’t decided one way or the other.  I’ve been doing that of late: I only got around to signing up for ApacheCon in Budapest the day before it started!

Categories: FLOSS Project Planets

Justin Mason: Links for 2015-01-16

Fri, 2015-01-16 18:58
  • A Case Study of Toyota Unintended Acceleration and Software Safety

    I drive a Toyota, and this is scary stuff. Critical software systems need to be coded with care, and this isn’t it — they don’t even have a bug tracking system!

    Investigations into potential causes of Unintended Acceleration (UA) for Toyota vehicles have made news several times in the past few years. Some blame has been placed on floor mats and sticky throttle pedals. But, a jury trial verdict was based on expert opinions that defects in Toyota’s Electronic Throttle Control System (ETCS) software and safety architecture caused a fatal mishap.  This talk will outline key events in the still-ongoing Toyota UA litigation process, and pull together the technical issues that were discovered by NASA and other experts. The results paint a picture that should inform future designers of safety critical software in automobiles and other systems.

    (tags: toyota safety realtime coding etcs throttle-control nasa code-review embedded)

Categories: FLOSS Project Planets

Steve Loughran: InfoSec risks of android travel applications

Fri, 2015-01-16 16:43

I was offline for 95% of the xmas break, instead investing my keyboard time into: (a) the exercises in Structure and Interpretation of Computer Programs and (b) writing some stuff on the implications of the Sony debacle for my home network security architecture.

I'm going to start posting the latter articles in an out-of-order sequence, with this post: InfoSec risks of android travel applications

1. Airline check-in & travel apps demand so many privileges that you can't trust corporate calendar/contact data to stay on the devices. Nor, in the absence of audit logs, can you tell whether the information has leaked.

2. Budget airline applications are the least invasive; "premium" airlines demand access to confidential calendar info.

3. Even train timetable apps want to know things such as your contact list.

However hard you lock down your network infrastructure (mandating 26-digit high-unicode passwords rolled monthly, encrypted phones, and PIN-protected SIM cards), if those phones say "android" when they boot you can't be confident that sensitive corporate data isn't leaking out of them whenever users expect to use their phones to check on buses, trains or airplanes.


Normally, the fact that Android apps can ask for and get near-unlimited data access is viewed as a privacy concern. It is, for home users. But once you do any of the following, it becomes an InfoSec issue:
  • Synchronise calendar with a work email service.
  • Maintain a contact list which includes potentially confidential contact/customers
  • Bond to a work Wifi network which offers network access to HTTP(S) sites without some form of auth.
  • Do the same via VPN
What is fascinating is that access to calendar info, especially "confidential event information", is something that mainstream airline travel apps demand in exchange for giving you the ability to check in to a flight on your phone, look at schedules and your tickets. Android does not provide a way to directly prevent this.

Demands of Applications

Noticing that one application update wanted more information than I expected, I went through all the travel apps on my android phone and looked at what permissions they demanded. These weren't installed explicitly for the experiment; they are simply what I use to fly on airlines, plus some train and bus apps in the UK. I'm excluding tripit on the basis that their web infrastructure requests (optional) access to your google emails to autoscan for trip plans, which is in a different league from these.

Entity                     | Calendar                   | Contacts | Network                              | Location
British Airways            | confidential, participants | No       | Yes                                  | Precise
United Airlines            | confidential, participants | No       | Yes; view network connections        | Precise
Easyjet                    | No                         | No       | Yes                                  | Precise
Ryanair                    | No                         | No       | Yes                                  | Precise
National Rail              | Add, modify, participants  | No       | Yes                                  | Precise
National Express Coach     | No                         | Yes      | Yes; view network connections & wifi | Precise
First Great Western trains | No                         | No       | Yes                                  | Precise
trainline                  | No                         | No       | Yes; view network connections        | Precise
First Bus                  | No                         | No       | Yes; view network connections        | Precise
When you look at this list, it's appalling. Why does the company that I use to get a bus to LHR need to know my contact list? Why does BA need my confidential appointment data? Why does the UK National Rail app need to be able to enumerate the calendar and send emails to participants without the owner's knowledge?

British Airways: wants access to confidential calendar info and full network access. What could possibly go wrong?

United: wants to call numbers, take photos and access confidential calendar info

National Express Bus Service
This is a bus company. How can they justify reading my contact list -business as well as personal?

UK National Rail
Pretty much total phone control, though not confidential appointment info. Are event titles considered confidential though?

Google's business model is built on knowing everything about your personal life, but this isn't about privacy: it is about preventing data leakage from an organisation. If anyone connects to your email services from an android, your airline check-in apps get to see the title, body and participants of all calendar appointments, whether that is "team meeting" or "plans for takeover of Walmart" where the participants include Jim Bezos and Donald Trump(*).

What could be done?
  1. Log accesses. I can't see a way to do this today, yet it seems a core feature IT security teams would want: without an audit log you can't tell what information apps have read.
  2. Track provenance of calendar events and restrict calendar access only to events created by the airline apps themselves. This would require the servers to add event metadata; as google own gmail they could add a new BigTable column with ease.
  3. Restrict network access to HTTPS sites on specific subdomains. Requiring HTTPS is good for general wifi security, and stops (most) organisations from playing DNS games to get behind the firewall.
Above and beyond that: allow users to easily restrict what privileges applications actually get. Don't want to give an app access to your contacts? Flip a switch and have the API call return an empty list. Want to block confidential calendar access? Another switch, another empty payload. Apple lets me do that with foreground/background location data collection on their devices through a simple list of apps, but Google doesn't.
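No such switch exists on stock Android, but the "empty payload" behaviour is easy to model. A minimal sketch in plain Python (the class and names are hypothetical illustrations, not Android APIs): a gate consults a per-app, per-permission switch and hands the app either the real data or an empty list, so the app keeps working but learns nothing.

```python
class PermissionGate:
    """Illustrative model of per-app permission switches: a revoked
    permission yields an empty payload rather than an error."""

    def __init__(self):
        self._granted = {}  # (app, permission) -> bool

    def set(self, app, permission, allowed):
        self._granted[(app, permission)] = allowed

    def fetch(self, app, permission, loader):
        # Hand over real data only if the switch is on; an empty
        # list otherwise, so the app can't tell it has been blocked.
        if self._granted.get((app, permission), False):
            return loader()
        return []

gate = PermissionGate()
gate.set("airline-app", "READ_CALENDAR", False)
gate.set("airline-app", "READ_CONTACTS", True)

contacts = ["alice@example.com", "bob@example.com"]
calendar = [{"title": "plans for takeover of Walmart"}]

print(gate.fetch("airline-app", "READ_CALENDAR", lambda: calendar))  # []
print(gate.fetch("airline-app", "READ_CONTACTS", lambda: contacts))
```

The point of returning an empty list rather than raising an error is exactly the one made above: the check-in app keeps working, it just sees a phone with no appointments.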

In the absence of that feature, if you want to be able to check in on your android phone on a non-budget airline, you have to give up expectations of the security of your confidential calendar data and contact list.

And in a world of BYOD, where the IT dept doesn't have control of the apps on a phone, that means they can't stop sensitive calendar/contact data leaking at all.

(*) FYI, there are no appointments in my calendar discussing taking over Walmart that include both Jim Bezos and Donald Trump. I cannot confirm or deny any other meetings with these participants, or plans for Walmart involving other participants. Ask British Airways or UAL if you don't believe me.
Categories: FLOSS Project Planets

Tom White: Hadoop for Science

Fri, 2015-01-16 10:57
Some of the largest datasets are generated by the sciences. For example, the Large Hadron Collider produces around 30PB of data a year. I'm interested in the technologies and tools for analyzing these kinds of datasets, and how they work with Hadoop, so here's a brief post.

Open Data

Amazon S3 seems to be emerging as the de facto solution for sharing large datasets. In particular, AWS curates a variety of public data sets that can be accessed for free (from within AWS; there are egress charges otherwise). To take one example from genomics, the 1000 Genomes project hosts a 200TB dataset on S3.

Hadoop has long supported S3 as a filesystem, but recently there has been a lot of work to make it more robust and scalable. It’s natural to process S3-resident data in the cloud, and here there are many options for Hadoop. The recently released Cloudera Director, for example, makes it possible to run all the components of CDH in the cloud.
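As a sketch of what "S3 as a filesystem" means in practice: in Hadoop releases of this era, the s3n connector was typically wired up with two properties in core-site.xml (the key values here are placeholders):

```xml
<!-- core-site.xml: credentials for Hadoop's s3n:// filesystem -->
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>YOUR_SECRET_KEY</value>
</property>
```

With that in place, S3-resident data can be listed and read in place with the usual tools, e.g. `hadoop fs -ls s3n://bucket/path` (bucket path illustrative).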

Notebooks

By "notebooks" I mean web-based, computational scientific notebooks, exemplified by the IPython Notebook. Notebooks have been around in the scientific community for a long time (they were added to IPython in 2011), but increasingly they seem to be reaching the larger data scientist and developer community. Notebooks combine prose and computation, which is great for exposition and interactivity. They are also easy to share, which helps foster collaboration and reproducibility of research.

It’s possible to run IPython against PySpark (notebooks are inherently interactive, so working with Spark is the natural Hadoop lead-in), but it requires a bit of manual set up. Hopefully that will get easier; ideally Hadoop distributions like CDH will come with packages to run an appropriately-configured IPython notebook server.
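For reference, the manual set up in the Spark 1.x releases of the time amounted to a couple of environment variables when launching PySpark (assuming SPARK_HOME points at a local Spark install):

```shell
# Launch the PySpark shell inside an IPython Notebook server
# (Spark 1.x convention; requires IPython to be installed).
IPYTHON=1 IPYTHON_OPTS="notebook" $SPARK_HOME/bin/pyspark
```

The notebook kernel then starts with a SparkContext available as `sc`, ready for interactive work.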

Distributed Data Frames

IPython supports many different languages and libraries. (Despite its name IPython is not restricted to Python; in fact, it is being refactored into more modular pieces as a part of the Jupyter project.) Most notebook users are data scientists, and the central abstraction that they work with is the data frame. Both R and pandas, for example, use data frames, although both systems were designed to work on a single machine.

The challenge is to make systems like R and pandas work with distributed data. Many of the solutions to date have addressed this problem by adding MapReduce user libraries. This is unsatisfactory for several reasons, primarily because the user has to think explicitly about the distributed case and can’t use the existing libraries on distributed data. Instead, what’s needed is a deeper integration so that the same R and pandas libraries work on local and distributed data.
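The gap is easy to see in miniature. With a MapReduce-style user library, even a per-group mean forces the user to write the distributed plumbing themselves. A plain-Python sketch (lists of partitions standing in for distributed blocks; all names are illustrative, not any particular library's API):

```python
from collections import defaultdict

# A "distributed" dataset: (key, value) rows spread across partitions.
partitions = [
    [("a", 1.0), ("b", 2.0)],
    [("a", 3.0), ("b", 4.0), ("a", 5.0)],
]

def map_phase(partition):
    # Per-partition partial aggregates: key -> (sum, count).
    out = defaultdict(lambda: (0.0, 0))
    for key, value in partition:
        s, n = out[key]
        out[key] = (s + value, n + 1)
    return out

def reduce_phase(partials):
    # Merge the partial aggregates, then finish the means.
    totals = defaultdict(lambda: (0.0, 0))
    for partial in partials:
        for key, (s, n) in partial.items():
            ts, tn = totals[key]
            totals[key] = (ts + s, tn + n)
    return {k: s / n for k, (s, n) in totals.items()}

means = reduce_phase(map_phase(p) for p in partitions)
print(means)  # {'a': 3.0, 'b': 3.0}
```

A distributed data frame would hide all of that plumbing behind the one-liner analysts already know, along the lines of pandas' df.groupby("key").mean(), regardless of whether the data is local or spread over a cluster.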

There are several projects and teams working on distributed data frames, including Sparkling Pandas (which has the best name), Adatao’s distributed data frame, and Blaze. All are at an early stage, but as they mature the experience of working with distributed data frames from R or Python will become practically seamless. Of course, Spark already provides machine learning libraries for Scala, Java, and Python, which is a different approach to getting existing libraries like R or Pandas running on Hadoop. Having multiple competing solutions is broadly a good thing, and something that we see a lot of in open source ecosystems.

Combining the Pieces

Imagine if you could share a large dataset and the notebooks containing your work in a form that makes it easy for anyone to run them; it’s a sort of holy grail for researchers.

To see what this might look like, have a look at the talk by Andy Petrella and Xavier Tordoir on Lightning fast genomics, where they used a Spark Notebook and the ADAM genomics processing engine to run a clustering algorithm over a part of the 1000 Genomes dataset. It combines all the topics above—open data, cloud computing, notebooks, and distributed data frames—into one.

There’s still work to be done to expand the tooling and to make the whole experience smoother; nevertheless, this demo shows that it's possible for scientists to analyse large amounts of data, on demand and in a way that is repeatable, using powerful high-level machine learning libraries. I'm optimistic that tools like this will become commonplace in the not-too-distant future.
Categories: FLOSS Project Planets

Steve Loughran: "It is not necessary to have experience of Hadoop"

Fri, 2015-01-16 10:24

I don't normally post LinkedIn approaches, especially from our competitors, but this one was so painful, and failed my "do your research" criterion so dramatically, that it merits coverage.

FWIW my reply was: this is some kind of spoof, no?

On 01/16/15 2:56 AM, Jessica <surname omitted to avoid embarrassment> wrote:    
Hi Steve,

I hope you are well?

We are currently hiring at Cloudera to expand our Customer Operations Engineering team.

We are looking to build this team significantly over the coming months and this is a rare opportunity to become involved in Cloudera's Engineering department.

The role is home based with very little travel required (just for training).

We are looking for people with strong Linux backgrounds and good experience with programming languages. It is not necessary to have experience of Hadoop - we will teach you !!

For the chance to be part of this team please send me your CV to <email omitted to avoid embarrassment> alternatively we can organise a time to speak for me to tell you more about the role?



As an aside, I am always curious why recruiter emails start with "I hope you are well?".

a) We both know that the recruiter doesn't care about my health as long as it doesn't impact my ability to work with colleagues, customers and, once my training in Hadoop is complete, maybe even to understand how to use things like DelegationTokenAuthenticatedURL, that being what I am staring at right now. (*)

b) We both know that she doesn't actually want details like "well, the DVLA consider my neurological issues under control enough for me to drive again —even down to places like the Alps, and the ripped up tendon off my left kneecap is manageable enough for me to do hill work when I get there"

(*) If anyone has got Jersey+SPNEGO hooked up to UserGroupInformation, I would love that code.

Categories: FLOSS Project Planets

Danny Angus: Privacy In Public III

Fri, 2015-01-16 07:58
Google Glass Explorer Program Shuts Down

I'm not going to say a lot about this, except that I'm glad, and I hope that the next step for this tech takes seriously into account the privacy-in-public implications of people walking around with cameras streaming whatever they see.
Anyone who was ever concerned with the level of surveillance in modern society by CCTV, helmet cams, the hacking of web-cams, and the use this can be put to by the nefarious activities of GCHQ and the US NSA, will be pleased that the headlong rush to turn us all into autonomous surveillance drones has paused for thought.
Let's hope Google use the pause to reflect on this.
Categories: FLOSS Project Planets

Justin Mason: Links for 2015-01-15

Thu, 2015-01-15 18:58
  • Group warns of postcode project dangers | Irish Examiner

    “We have spoken to the National Consumer Agency, logistics companies and Digital Rights Ireland, with which we have had an indepth conversation to see if there is anything in the proposal that might be considered to have an impact on anyone’s privacy. Broadly, they are satisfied with what we are doing,” [Patricia Cronin, head of the Department of Communications’ postcodes division] told the committee. However in his letter, [DRI's] O’Lachtnain said the group “want to state clearly that we are not at all ‘satisfied’ with the postcode that has been designed or the implementation proposals”. Some nerve!

    (tags: dri nca privacy patricia-cronin goverment postcodes eircode dpc ireland)

Categories: FLOSS Project Planets