Planet Apache

Ortwin Glück: [Code] Gentoo to require initramfs

Mon, 2014-11-10 04:30
A Portage news item announced today that Gentoo is about to upgrade to udev-217. This upgrade will eliminate the userspace firmware loader. The change requires not only a kernel config change (CONFIG_FW_LOADER_USER_HELPER=N); it may also require an initramfs if the kernel has built-in drivers that want to load firmware before the root filesystem is mounted. Examples of such drivers are iwlwifi (Intel WiFi chips) and bnx2 (Broadcom NetXtreme II). Without userspace firmware loading, these drivers will fail to load their firmware. The standard solution to this problem is to use an initramfs.

A module-less initramfs is sufficient for that case (modules are built-in already), and it can easily be shared among different kernels.
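
To check whether a given kernel configuration already has the userspace firmware helper disabled, something like the following could be used. It is a minimal sketch, not part of the Portage news item: the function name and the sample config text are made up for illustration, and it relies only on the usual .config convention that a disabled option appears as a "# ... is not set" comment.

```python
import re


def option_state(config_text: str, option: str) -> str:
    """Return the value of a kernel config option ('y', 'm', ...),
    or 'n' when the option is marked as not set or is absent."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Disabled options are written as a comment in .config files.
        if line == f"# {option} is not set":
            return "n"
        match = re.fullmatch(rf"{re.escape(option)}=(.*)", line)
        if match:
            return match.group(1)
    return "n"


# Illustrative sample only, not taken from a real system.
sample_config = """\
CONFIG_FW_LOADER=y
# CONFIG_FW_LOADER_USER_HELPER is not set
"""

if __name__ == "__main__":
    print(option_state(sample_config, "CONFIG_FW_LOADER_USER_HELPER"))
```

On a real system the text would come from the kernel's .config file rather than from the sample string.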

Categories: FLOSS Project Planets

Bryan Pendleton: An early lead for Carlsen

Sun, 2014-11-09 12:21

After two games in the 2014 World Chess Championship, Magnus Carlsen leads, having drawn game one and won game two.

In game one, Anand had white, and although it was hard-fought, Carlsen held the draw.

In game two, Carlsen had white, and won.

The details of the games, of course, are far above my ability to comprehend, but they are certainly beautiful to watch.

Matt Raible: Happy Birthday Abbie!

Sun, 2014-11-09 03:09

It's hard to believe that my daughter, Abbie, is now 12 years old. She's in 6th grade now, attending middle school and loving that she gets to choose her classes. I'm particularly happy to see her studying video game design and programming simple games. The picture on the right is of Abbie and her horse, Medallion. Unfortunately, he had to be put down the very next day because of colic. Trish and Abbie were leasing him, but were planning on buying him. It was a very sad day in the Raible household.

Abbie is still taking horseback riding lessons and might start participating in horse shows next year. For the ski season, we got her the 6th-Grade Passport. For $100, she can ski a few days at all the resorts in Colorado. We don't plan to travel as much as we did last year, but we do plan on skiing a bunch.

Happy Birthday Abbie! You're an awesome 12-year-old and we had a great time celebrating your birthday with you tonight.

Igor Galić: a friend died last night

Sat, 2014-11-08 19:00

A friend died last night. I write these words after the first gush of tears has stopped.

He said his good-bye to most of his friends at a dinner party. An unusual good-bye.

I love you all.

Three hours later he jumped in front of a train.

I’ve talked to two people on the phone since I got the message. One of them couldn’t believe it. One of them couldn’t help him.

We asked each other if we have someone to hug, and assured each other that we’re not alone, and we cried, and we cried again.

I cry, because he was alone. Despite all the people at that party, despite all those close friends.

And I cry because I fear he won’t be the last. I can’t think of a single person in my circle of close friends who has a functional family, or who is not mentally ill.

I cry because I can so vividly imagine what he went through. And I cry because I can also imagine what he went through in those three hours.

In loving memory of a dear friend, I just wish to say:

Don’t let depression lie to you.

You are not alone.

We love you.

Justin Mason: Links for 2014-11-07

Fri, 2014-11-07 18:58

Colm O hEigeartaigh: Apache Syncope 1.2 tutorial - part II

Fri, 2014-11-07 11:32
The previous tutorial on the new features of Apache Syncope 1.2 showed how to use the new UI installer to deploy Apache Syncope to Apache Tomcat, using MySQL for persistent storage. Last year we covered how to import users (and roles) from backend resources such as a database or a directory. An important new feature of Apache Syncope 1.2 is the ability to import non-cleartext passwords into Syncope when synchronizing from backend resources (and also the ability to propagate non-cleartext passwords to resources). The default behaviour is to hash the password according to the global configuration parameter 'password.cipher.algorithm' (defaults to SHA-1). This is problematic if the password is already hashed, as user authentication via the Syncope REST API will then fail.

1) Create policies in Apache Syncope

The first step is to start Apache Syncope and to create some policies for account and password creation, as well as synchronization. Start Syncope and go to the Configuration tab. Select "Policies" and create new "global" policies of the types "Account", "Password" and "Synchronization", with some sensible default values.

2) Synchronizing non-cleartext passwords from Apache Derby

This is an update from the previous blog entry on importing users from Apache Derby using Syncope 1.1. Follow step 1 "Creating a Schema attribute" and step 2 "Apache Derby" in the previous blog. However, in section 2.b, rather than adding users with plaintext passwords, use the following user value instead when creating a table:

INSERT INTO USERS VALUES('dave', '8eec7bc461808e0b8a28783d0bec1a3a22eb0821', 'true', 'yellow');

Here the second field is not a plaintext password but the hex-encoded SHA-1 hash of "security". In section 3.a "Define a Connector", it is necessary to change the "Password cipher algorithm" value from "CLEARTEXT" to "SHA1". In step 3.b "Define a Resource", it is necessary to specify an external attribute for the Username mapping of "NAME". Finally, in step 3.c "Create a synchronization task", use the "DBSyncPasswordActions" action class. This class treats the password retrieved from the table as already encoded according to the "Password cipher algorithm" parameter of the Connector ("SHA1" in this case), and stores it directly in Syncope without hashing it again, which is what would happen in the plaintext case. Note that the (hashed) password is presumed to be hex-encoded in the table.

After executing the synchronization task, start a browser and navigate to "http://localhost:8080/syncope/rest/users/self", logging on as "dave" with the password "security".
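
For anyone who wants to prepare further test users, a hex-encoded SHA-1 value of the kind stored in the USERS table above can be produced with a few lines of Python. This is only a sketch for generating test data, not something Syncope itself requires; the function name is invented here, and UTF-8 is assumed as the password encoding.

```python
import hashlib


def sha1_hex(password: str) -> str:
    """Return the lowercase hex-encoded SHA-1 digest of a password.

    Assumes the password is encoded as UTF-8 before hashing.
    """
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Prints a 40-character hex string suitable for the password column.
    print(sha1_hex("security"))
```

The output is always 40 hex characters, which is the form the synchronization task expects when the Connector's cipher algorithm is set to "SHA1".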

3) Synchronizing non-cleartext passwords from Apache DS

This is an update from the previous blog entry on importing users and roles from an LDAP backend such as Apache DS into Apache Syncope 1.1. Follow the first step in the previous tutorial to set up Apache DS and import users and groups. Add some users, e.g. "colm", this time with a SHA-256 encoded password. Importing users with encoded passwords from LDAP is a bit more sophisticated than the DB case above, because individual users can have different digest algorithms with the LDAP synchronization case, whereas all users must have the same digest algorithm for the DB synchronization case.

Start up Syncope, and follow the steps given in the previous tutorial to create a new connector and resource. The only difference with Syncope 1.2 is that you need to specify the external attribute for both the Username and Rolename mapping ("cn" in both cases for this example). Finally, create the Synchronization task as per the previous tutorial. However, this time add both the LDAPPasswordSyncActions and LDAPMembershipSyncActions classes as "Actions classes". Then execute the task, and check that the users and roles were imported successfully into Syncope. You can then log on via "http://localhost:8080/syncope/rest/users/self" using any of the users imported from Apache DS, regardless of the internal cipher algorithm that was used.
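
The reason individual LDAP users can have different digest algorithms is that directories commonly store each password together with a scheme prefix, in the form "{SCHEME}base64-digest". The sketch below illustrates that idea only: it is not the code of LDAPPasswordSyncActions, the function names are invented, and the exact scheme names a given directory uses are an assumption that should be checked against the server.

```python
import base64
import hashlib
import re


def make_ldap_password(scheme: str, algorithm: str, password: str) -> str:
    """Build a '{SCHEME}base64' value from a cleartext password (test data only)."""
    digest = hashlib.new(algorithm, password.encode("utf-8")).digest()
    return "{%s}%s" % (scheme, base64.b64encode(digest).decode("ascii"))


def split_ldap_password(value: str):
    """Split a stored value into (scheme, hex digest).

    Values without a '{SCHEME}' prefix are returned as (None, value),
    i.e. they are treated as cleartext.
    """
    match = re.fullmatch(r"\{([A-Za-z0-9-]+)\}(.+)", value)
    if match is None:
        return None, value
    scheme = match.group(1).upper()
    digest_hex = base64.b64decode(match.group(2)).hex()
    return scheme, digest_hex


if __name__ == "__main__":
    stored = make_ldap_password("SHA256", "sha256", "security")
    print(split_ldap_password(stored))
```

Because the scheme travels with each value, a synchronization step can decide per user how the digest was produced, which is not possible when a whole table shares one connector-level setting.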

Claus Ibsen: Apache Camel please explain me what these endpoint options mean

Fri, 2014-11-07 06:03
In the upcoming Apache Camel 2.15, we have made Camel smarter. It is now able to act as a teacher and explain to you how it's configured and what those options mean.

The first lesson Camel can give is to tell you how all the endpoints have been configured and what these options mean.

The next lesson we are working on is to let Camel explain what the options for the EIPs are.

Okay, a picture is worth a thousand words, so let me show a screenshot from Apache Karaf, where you can use the new endpoint-explain command to explain how the endpoints have been configured.

The screenshot is from the SQL example, which I have installed in Karaf. This example uses a number of endpoints, among them a timer that triggers every 5 seconds. As you can see from above, the command lists the endpoint uri: timer://foo?period=5s and then explains the option(s) below. As the uri has only 1 option, only one is listed. We can see that the option is named period. Its Java type is long. The JSON schema type is integer. We can see the value is 5s, and below it the description which explains what the value does.

So why are there two types listed? The idea is that there is a type that is suitable for tooling etc., as it has a simpler category of types according to the JSON Schema specification. The actual type in Java is listed as well.

The timer endpoint has many more options, so we can use the --verbose option to list all the options, as shown below:

The explain endpoint functionality is also available over JMX or as a Java API on the CamelContext. For JMX, each endpoint MBean has an explain operation that returns tabular data with the same information as above. This is illustrated in the screenshot below from jconsole:

In addition, there is a generic explainEndpointJson operation on the CamelContext MBean, which can explain any arbitrary uri that is provided. So you can explain endpoints that are not in use by Camel.
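
To make the idea concrete, here is a rough sketch of what "explaining" a uri amounts to: split the uri into its options and look each one up in per-component metadata. It is written in Python purely as an illustration and is not how Camel implements it; the function name and the metadata dictionary are hypothetical, and only the period option with its two types is taken from the timer example above.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical metadata for the timer component. In Camel the real data
# comes from the schema generated for each component at build time.
TIMER_OPTIONS = {
    "period": {
        "java_type": "long",
        "json_type": "integer",
        "description": "(description taken from the component documentation)",
    },
}


def explain_endpoint(uri: str, options: dict) -> list:
    """Return one row per option found in the uri's query string."""
    rows = []
    for name, value in parse_qsl(urlsplit(uri).query):
        meta = options.get(name, {})
        rows.append({
            "option": name,
            "value": value,
            "java_type": meta.get("java_type", "unknown"),
            "json_type": meta.get("json_type", "unknown"),
            "description": meta.get("description", ""),
        })
    return rows


if __name__ == "__main__":
    for row in explain_endpoint("timer://foo?period=5s", TIMER_OPTIONS):
        print(row)
```

Because the lookup needs only the uri and the metadata, it works for endpoints that are not part of any running route, which is the point of the generic operation described above.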

So how does this work? During the build of the Apache Camel release, for each component we generate an HTML and JSON schema in which each endpoint option is documented with its name, type, and description. For enums we list the possible values.

Here is an example of such a json schema for the camel-sql component:

Now for this to work, the component must support the uri options, which requires annotating the endpoint with @UriEndpoint. The Camel team has not yet migrated all the 160+ components in the Camel release, but we plan to migrate the components over time.

And certainly, now that we have this new functionality, it encourages us to migrate all the components.

So where do we get the documentation? Well, it's just Java code, so all you have to do is have a getter/setter for an endpoint option. Add the @UriParam annotation, and for the setter you just add javadoc. Yes, we grab the javadoc as the documentation. So it's documented in just one place, in the source code, as standard javadoc.

I hope that in the future we can auto-generate the Camel website documentation for the components, so we do not have to maintain that separately in its wiki system. That would take hard work to implement, but eventually we should get there, so that every component is documented in the source code. For example, we could have a documentation file for each component that holds all the component documentation, with the endpoint options injected into that file automatically by the Camel build system. Having such files would also allow GitHub users to browse the Camel component documentation nicely, GitHub style ;o

So what is next? The hawtio web console will integrate this as well, so users with Camel 2.15 onwards have that information in the web console out of the box.

And then it's onwards to include documentation about the EIPs in the XML schemas for Spring/Blueprint users, and to improve the javadoc for the EIPs, as that then becomes the single source of documentation as well. This then allows tooling such as Eclipse / IDEA / Netbeans and whatnot to show the documentation when people develop their Camel routes in the XML editor, as the documentation is provided in the XSD as xsd:documentation tags.

We have captured some thoughts on what else to do in the CAMEL-7999 ticket. If you have any ideas about what else to improve, we would love feedback from the community.

Justin Mason: Links for 2014-11-06

Thu, 2014-11-06 18:58

Matt Raible: Why I prefer IntelliJ IDEA over Eclipse

Thu, 2014-11-06 16:03

Over the last couple months, I've received a few emails asking why I prefer IntelliJ IDEA over Eclipse. They usually go something like this:

I keep seeing you recommending IntelliJ. I keep trying it intermittently with using Eclipse, but I feel like I'm missing something obvious that makes so many people think it's better. Granted having the usual plugins incorporated is nice, but other things like the build process and debugger sometimes seems a step back from Eclipse. Could you please blog a '10 reasons why I love IntelliJ' or point me to something that would clue me in?

I grew to love IntelliJ for a few reasons. It all started in 2006 when I decided to migrate AppFuse from Ant to Maven. Before that, I was a huge Eclipse fan (2002 - 2006). Before Eclipse, I used HomeSite, an HTML Editor to write all my Java code (1999-2002). Eclipse was the first IDE that didn't hog all my system's memory and was pleasant to work with.

The reason I started using IntelliJ in 2006 was its multi-module Maven support. Eclipse's Maven support was terrible, and m2e hasn't gotten a whole lot better in recent years AFAIK.

Back then, I used to think everything should be built and run from the command line. A couple years later, I realized it was better to run tests and debug from an IDE. Now I'm more concerned with the ability to run tests and debug in an IDE than I am with the build system.

In 2009, I started doing a lot more front-end work: writing HTML, CSS and JavaScript. I also started digging into alternate languages for these: Jade, GWT, CoffeeScript, LESS, SASS - even Scala. I found IntelliJ's support, and plugins, to be outstanding for these languages and really enjoyed how it would tell me I had invalid JavaScript, HTML and CSS.

My original passion in software was HTML and JavaScript and I found that hasn't changed in the last 15 years. AFAIK, Eclipse still has terrible web tools support; it excels at Java (and possibly C++ support). Even today, I write most of my HTML code (for InfoQ and this blog) in IntelliJ.

In reality, it probably doesn't matter which IDE you use, as long as you're productive with it. Once you learn one IDE well, the way others do things will likely seem backwards. I'm so familiar with debugging in IntelliJ, that when I tried to use Eclipse's debugger a few weeks ago, it seemed backwards to me.

In a nutshell: the technologies I've worked with have been better embraced by IntelliJ. Has this happened to you? Have certain technologies caused you to use one IDE over another?

Colm O hEigeartaigh: Apache Syncope 1.2 tutorial - part I

Thu, 2014-11-06 07:26
Apache Syncope is a powerful and flexible open source tool to manage and orchestrate user identities for the enterprise. Last year, I wrote a series of four tutorials on Apache Syncope. The first covered how to create an Apache Syncope project, how to set up a MySQL database for internal storage, and how to deploy Apache Syncope to Apache Tomcat. The second covered how to import user identities and attributes from a database (Apache Derby) into Syncope. The third covered how to import users and roles from an LDAP backend (Apache DS) into Syncope. Finally, the fourth tutorial covered the REST API of Apache Syncope, as well as a set of Apache CXF-based testcases to demonstrate how to use the REST API of Apache Syncope for authentication and authorization.

This will be the first post in a new set of tutorials for Apache Syncope, which updates the previous set (based on Syncope 1.1) with the new features and functionality available in the recently released 1.2.0 release. In this post we will cover how to use the new UI installer to create an Apache Syncope project and deploy it to a container. This tutorial can be viewed as a more user-friendly alternative to the first tutorial of the previous series. Please also see the Syncope documentation on using the installer.

1) Set up a database for Internal Storage

The first step in setting up a standalone deployment of Apache Syncope is to decide what database to use for Internal Storage. Apache Syncope persists internal storage to a database via Apache OpenJPA. In this article we will set up MySQL, but see here for more information on using PostgreSQL, Oracle, etc. Install MySQL in $SQL_HOME and create a new user for Apache Syncope. We will create a new user "syncope_user" with password "syncope_pass". Start MySQL and create a new Syncope database:

  • Start: sudo $SQL_HOME/bin/mysqld_safe --user=mysql
  • Log on: $SQL_HOME/bin/mysql -u syncope_user -p
  • Create a Syncope database: create database syncope; 

2) Set up a container to host Apache Syncope

The next step is to decide which container to deploy Syncope to. In this demo we will use Apache Tomcat, but see here for more information about installing Syncope in other containers. Install Apache Tomcat to $CATALINA_HOME. Now we will add a datasource for internal storage in Tomcat's 'conf/context.xml'. When Syncope does not find a datasource called 'jdbc/syncopeDataSource', it will connect to internal storage by instantiating a new connection per request, which carries a performance penalty. Add the following to 'conf/context.xml':

<Resource name="jdbc/syncopeDataSource" auth="Container"
    testWhileIdle="true" testOnBorrow="true" testOnReturn="true"
    validationQuery="SELECT 1" validationInterval="30000"
    maxActive="50" minIdle="2" maxWait="10000" initialSize="2"
    removeAbandonedTimeout="20000" removeAbandoned="true"
    logAbandoned="true" suspectTimeout="20000"
    timeBetweenEvictionRunsMillis="5000" minEvictableIdleTimeMillis="5000"
    username="syncope_user" password="syncope_pass"

Uncomment the "<Manager pathname="" />" configuration in context.xml as well. The next step is to enable a way to deploy applications to Tomcat using the Manager app. Edit 'conf/tomcat-users.xml' and add the following:

<role rolename="manager-script"/>
<user username="manager" password="s3cret" roles="manager-script"/>

Next, download the JDBC driver jar for MySQL and put it in Tomcat's 'lib' directory. As we will be configuring a connector for a Derby resource in a future tutorial, also download the JDBC driver jar for Apache Derby and put it in Tomcat's 'lib' directory as well.

3) Run the Installer

Download and run the installer via 'java -jar syncope-installer-1.2.0-uber.jar'. You need to enter some straightforward values such as the installation path of the project, the Apache Maven home directory, the groupId/artifactId of the project, and the directories where logs/bundles/configuration are stored.

Next, select "MySQL" as the database technology from the list, and give "syncope_user" and "syncope_pass" as the username + password, or whatever you have configured earlier when setting up MySQL. Select "Tomcat" as the application server (make sure the 'syncopeDataSource' is checked), and enter values for the address, port, manager username and password:

The installer will then create an Apache Syncope project and deploy it to Tomcat:

When the installer has finished, start up a browser and go to "localhost:8080/syncope-console", logging in as "admin/password". You should see the following:

Rob Davies: Fabric8 version 2.0 released - Next Generation Platform for managing and deploying enterprise services anywhere

Thu, 2014-11-06 05:09
The Fabric8 open source project started 5 years ago as a private project aimed at making large deployments of Apache Camel, CXF, ActiveMQ etc. easy to deploy and manage.

At the time of its inception, we looked at lots of existing open source solutions that we could leverage to provide the flexible framework that we knew our users would require. Unfortunately, at that time nothing was a good fit, so we rolled our own - with core concepts based around:

  • Centralised Control
  • runtime registry of services and containers
  • Managed Hybrid deployments from laptop, to open hybrid (e.g. OpenShift)

All services were deployed into an Apache Karaf runtime, which allowed for dynamic updates of running services. The modularisation using OSGi had some distinct advantages around the dynamic deployment of new services, container service discovery, and a consistent way of administration. However, this also meant that Fabric8 was very much tied to the Karaf runtime, and forced anyone using Fabric8 and Camel to use OSGi too.

We are now entering a sea-change for immutable infrastructure, microservices and open standardisation around how this is done. Docker and Kubernetes are central to that change, and are being backed with big investments. Kubernetes in particular, being based on the considerable experience that Google brings to clustering containers at scale, will drive standardisation across the way containers are deployed and managed. It would be irresponsible for Fabric8 not to embrace this change, but we want to do it in a way that makes it easy for Fabric8 1.x users to migrate. By taking this path, we are ensuring that Fabric8 users will be able to benefit from the rapidly growing ecosystem of vendors and projects that are providing applications and tooling around Docker, and also freeing Fabric8 users to move their deployments to any of the growing list of platforms that support Kubernetes. However, we are also aware that there are many reasons users may want to use a platform that is 100% Java - so we support that too!

The goal of Fabric8 v2 is to utilise open source and open standards: to enable the same way of configuring and monitoring services as Fabric8 1.x, but to do it for any Java based service, on any operating system. We also want to future proof the way users work, which is why adopting Kubernetes is so important: you will be able to leverage this style of deployment anywhere.
Fabric8 v2 is already better tested, more nimble and more scalable than any previous version we've released, and as Fabric8 will also be adopted as a core service in OpenShift 3, it will be hardened at large scale very quickly.

So some common questions:

Does this mean that Fabric8 no longer supports Karaf ?
No - Karaf is one of the many container options we support in Fabric8. You can still deploy your apps in the same way as Fabric8 v1, it's just that Fabric8 v2 will scale so much better :).

Is ZooKeeper no longer supported ?
In Fabric8 v1, ZooKeeper was used to implement the service registry. This is being replaced by Kubernetes. Fabric8 will still run with ZooKeeper, however, to enable cluster coordination, such as master-slave elections for messaging systems or databases.

I've invested a lot of effort in Fabric8 v1 - does all this get thrown away ?
Absolutely not. It will be straightforward to migrate to Fabric8 v2.

When should I look to move off Fabric8 v1 ?
As soon as possible. There's a marked improvement in features, scalability and manageability.

We don't want to use Docker - can we still use Fabric8 v2?
Yes - Fabric8 v2 also has a pure Java implementation, where it can still run "java containers".

Our platforms don't support Go - does that preclude us from running Fabric8 v2 ?
No - although Kubernetes relies on the Go programming language, we understand that won't be an option for some folks, which is why fabric8 has an optional Java implementation. That way you can still use the same framework and tooling, but it leaves open the option to simply change the implementation at a later date if you require the performance, application density and scalability that running Kubernetes on something like Red Hat's OpenShift or Google's Cloud Platform can give you.

We are also extending the services that we supply with fabric8, including metric collection, alerting, auto-scaling, application performance monitoring and other goodies:

Over the next few weeks, the fabric8 community will be extending the quick starts to demonstrate how easy it is to run microservices, as well as application containers, in Fabric8. You can run Fabric8 on your laptop (using 100% Java if you wish), on your in-house bare metal (again running 100% Java if you wish), or on any PaaS running Kubernetes.

Justin Mason: Links for 2014-11-05

Wed, 2014-11-05 18:58

Chris Hostetter: What Could Go Wrong? – Stump The Chump In A Rum Bar

Wed, 2014-11-05 17:56

The first time I ever did a Stump The Chump session was back in 2010. It was scheduled as a regular session (in the morning, if I recall correctly) and I (along with the panel) was sitting behind a conference table on a dais. The session was fun, but the timing, and setting, and seating, made it feel very stuffy and corporate.

We quickly learned our lesson, and subsequent “Stump The Chump!” sessions have become “Conference Events”. Typically held at the end of the day, in a nice big room, with tasty beverages available for all. Usually, right after the winners are announced, it’s time to head out to the big conference party.

This year some very smart people asked me a very smart question: why make attendees who are having a very good time (and enjoying tasty beverages) at “Stump The Chump!”, leave the room and travel to some other place to have a very good time (and enjoy tasty beverages) at an official conference party? Why not have one big conference party with Stump The Chump right in the middle of it?

Did I mention these were very smart people?

So this year we’ll be kicking off the official “Lucene/Solr Revolution Conference Party” by hosting Stump The Chump at the Cuba Libre Restaurant & Rum Bar.

At 4:30 PM on Thursday (November 13), there will be a fleet of shuttle buses ready and waiting at the Omni Hotel’s “Parkview Entrance” (on the South East side of the hotel) to take every conference attendee to Cuba Libre. Make sure to bring your conference badge: it will be your golden ticket to get on the bus and into the venue. And please: Don’t Be Late! If you aren’t on one of the shuttle buses leaving the Omni by 5:00 PM, you might miss the Chump Stumping!

Beers, Mojitos & Soft Drinks will be ready and waiting when folks arrive, and we’ll officially be “Stumping The Chump” from 5:45 to 7:00-ish.

The party will continue even after we announce the winners, and the buses will be available to shuttle people back to the Omni. The last bus back to the hotel will leave around 9:00 PM — but as always, folks are welcome to keep on partying. There should be plenty of taxis in the area.

To keep up with all the “Chump” news fit to print, you can subscribe to this blog (or just the “Chump” tag).

The post What Could Go Wrong? – Stump The Chump In A Rum Bar appeared first on Lucidworks.

Jim Jagielski: Starting on Telaen 2.x

Wed, 2014-11-05 12:14

After a somewhat long sabbatical, I'm energized about rebooting the Telaen Project.

Partly, this is due to jaguNET migrating to using Dovecot for both POP3 and IMAP, and the realization that an upgraded webmail system would be the perfect complement. Now for sure, Telaen is a great PHP-based webmail system, and, in fact, has served (from what I can tell) as inspiration and source for numerous other webmail systems as well (such as T-Dah Webmail, for example), but I had let it lie fallow for quite a while and, well, it's showing its age. And to be honest, except for some of the larger, more complex and dependency-ridden offerings out there, it seems that no real PHP webmail packages are being actively developed.

So, I've gone ahead and created the telean_1.x branch, and master on the git repo will be the source of Telaen 2.0 development. In no particular order, I plan for the 2.0 version to include the following:

  • Removal of PHP4 support and baselining PHP5.3 at a minimum.
  • Faster indexing by utilizing sqlite3 instead of PHP arrays
  • Better and more complete IMAP interaction
  • Better SPAM handling, especially related to auto-population of the Telaen internal SPAM folder (right now, if the user creates a real, IMAP SPAM folder, Telaen gets awfully confused)

In all cases, the design goals of keeping Telaen as simple and streamlined as possible, and avoiding as many dependencies as possible, will be kept and honored. In fact, the only dependency "added", that I can foresee at this time, is sqlite3 capability, which is default for PHP5.x anyway. However, I do plan on adding some hooks so that if people want to use MySQL or Postgres, they will be able to.

If interested, check out the GitHub page, and help develop the code, add features or wish lists, find and patch bugs, etc.

Bryan Pendleton: Carlsen Anand II

Wed, 2014-11-05 10:03

There are just three days until Carlsen-Anand begins.

The match will be held in Sochi, Russia: here's the official site.

The games are played at 3 PM Sochi time; according to WorldTimeBuddy, this will be 4 AM my time.

So I can wake up each day with the very best of chess!

Michael Aigner has a nice preview here.

Edward J. Yoon: The difference between Korea and Silicon Valley's Tier 1 major investment firms

Wed, 2014-11-05 09:40
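Silicon Valley's Tier 1 major investment firms look at exactly two things:

First, how large is the user base (the number of users of some product or service)?
Second, is there original core technology?

In fact, I think a company's reputation assets or experience assets do not mean much unless they are unrivaled.

Returning to my own field for a moment, here is what I think about SIGMOD14 "Are We Experiencing a Big Data Bubble?"[1].

Early on, the cards the big data camp played in order to secure a user base quickly were SQL and R, and I see this as what ultimately diluted the essence of the technology and gave rise to talk of a bubble. Rather than concentrating on M&A that merely adds bulk and on exits, the big data camp seems to be at a point where it needs to focus on the essentials and change its thinking.

Anyway, back to the topic: what about Korea? What I have heard mostly goes like this:

First, people
Second, people
Third, people

In addition, reputation and experience assets and cash flow come first, and investors react very sensitively to trend shifts such as cloud > big data > Internet of Things.

And remarks like "you have to listen well to what others say": what on earth is that? Heh. At one time, as a rookie startup founder, I found this investment philosophy somehow romantic, but my thinking has changed a little now. Is the time coming for me, too, to settle things once and for all!?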

Justin Mason: Links for 2014-11-04

Tue, 2014-11-04 18:58
  • Zookeeper: not so great as a highly-available service registry

    Turns out ZK isn’t a good choice as a service discovery system, if you want to be able to use that service discovery system while partitioned from the rest of the ZK cluster:

    I went into one of the instances and quickly did an iptables DROP on all packets coming from the other two instances. This would simulate an availability zone continuing to function, but that zone losing network connectivity to the other availability zones. What I saw was that the two other instances noticed the first server “going away”, but they continued to function as they still saw a majority (66%). More interestingly the first instance noticed the other two servers “going away”, dropping the ensemble availability to 33%. This caused the first server to stop serving requests to clients (not only writes, but also reads).

    So: within that offline AZ, service discovery *reads* (as well as writes) stopped working due to a lack of ZK quorum. This is quite a feasible outage scenario for EC2, by the way, since (at least when I was working there) the network links between AZs, and the links with the external internet, were not 100% overlapping. In other words, if you want a highly-available service discovery system in the face of network partitions, you want an AP service discovery system rather than a CP one, and ZK is a CP system.

    Another risk, noted on the Netflix Eureka mailing list:

    ZooKeeper, while tolerant against single node failures, doesn’t react well to long partitioning events. For us, it’s vastly more important that we maintain an available registry than a necessarily consistent registry. If us-east-1d sees 23 nodes, and us-east-1c sees 22 nodes for a little bit, that’s OK with us.

    I guess this means that a long partition can trigger SESSION_EXPIRED state, resulting in ZK client libraries requiring a restart/reconnect to fix. I’m not entirely clear what happens to the ZK cluster itself in this scenario though.

    Finally, Pinterest ran into other issues relying on ZK for service discovery and registration; sounds like this was mainly around load and the “thundering herd” overload problem. Their workaround was to decouple ZK availability from their services’ availability, by building a Smartstack-style sidecar daemon on each host which tracked/cached ZK data.

    (tags: zookeeper service-discovery ops ha cap ap cp service-registry availability ec2 aws network partitions eureka smartstack pinterest)
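
    The majority arithmetic behind that experiment can be sketched in a few lines. This is only a toy model of the strict-majority rule, not ZooKeeper code; the function names `has_quorum` and `partition_outcome` are made up for illustration.

```python
def has_quorum(reachable: int, ensemble_size: int) -> bool:
    """True if `reachable` servers (counting yourself) form a strict majority."""
    return reachable > ensemble_size // 2


def partition_outcome(ensemble_size: int, isolated: int) -> dict:
    """Model a clean split: `isolated` servers on one side, the rest on the other."""
    rest = ensemble_size - isolated
    return {
        "isolated_side_serves": has_quorum(isolated, ensemble_size),
        "other_side_serves": has_quorum(rest, ensemble_size),
    }


if __name__ == "__main__":
    # Three-server ensemble, one server cut off (the iptables DROP case above).
    print(partition_outcome(ensemble_size=3, isolated=1))
```

    With three servers and one cut off, the isolated server sees 1 of 3 and stops serving, while the other two see 2 of 3 and carry on, which matches the 33% / 66% observation quoted above.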

  • Why We Didn’t Use Kafka for a Very Kafka-Shaped Problem

    A good story of when Kafka _didn’t_ fit the use case:

    We came up with a complicated process of app-level replication for our messages into two separate Kafka clusters. We would then do end-to-end checking of the two clusters, detecting dropped messages in each cluster based on messages that weren’t in both. It was ugly. It was clearly going to be fragile and error-prone. It was going to be a lot of app-level replication and horrible heuristics to see when we were losing messages and at least alert us, even if we couldn’t fix every failure case. Despite us building a Kafka prototype for our ETL — having an existing investment in it — it just wasn’t going to do what we wanted. And that meant we needed to leave it behind, rewriting the ETL prototype.

    (tags: cassandra java kafka scala network-partitions availability multi-region multi-az aws replication onlive)
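
    To see why the end-to-end check was considered fragile, here is a minimal sketch of the comparison, assuming every message carries a unique ID (an assumption; the article does not say how messages were matched). `find_dropped` is a hypothetical helper, not anything from Kafka's API.

```python
def find_dropped(cluster_a_ids, cluster_b_ids):
    """Compare the message IDs seen in two independently written clusters.

    A message present in only one cluster was dropped by the other.
    A message dropped by BOTH clusters is invisible to this check.
    """
    a, b = set(cluster_a_ids), set(cluster_b_ids)
    return {
        "missing_from_a": sorted(b - a),
        "missing_from_b": sorted(a - b),
    }


if __name__ == "__main__":
    # Message 3 never reached cluster B, message 4 never reached cluster A.
    print(find_dropped([1, 2, 3, 5], [1, 2, 4, 5]))
```

    The check can only alert on messages that survived in at least one cluster; anything lost in both goes unnoticed, and the comparison has to wait long enough for in-flight messages to arrive, which is presumably where the heuristics come in.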

  • Madhumita Venkataramanan: My identity for sale (Wired UK)

    If the data aggregators know everything about you — including biometric data, healthcare history, where you live, where you work, what you do at the weekend, what medicines you take, etc. — and can track you as an individual, does it really matter that they don’t know your _name_? They legally track, and sell, everything else.

    As the data we generate about ourselves continues to grow exponentially, brokers and aggregators are moving on from real-time profiling — they’re cross-linking data sets to predict our future behaviour. Decisions about what we see and buy and sign up for aren’t made by us any more; they were made long before. The aggregate of what’s been collected about us previously — which is near impossible for us to see in its entirety — defines us to companies we’ve never met. What I am giving up without consent, then, is not just my anonymity, but also my right to self-determination and free choice. All I get to keep is my name.

    (tags: wired privacy data-aggregation identity-theft future grim biometrics opt-out healthcare data data-protection tracking)

  • Linux kernel’s Transparent Huge Pages feature causing 300ms-800ms pauses

    bad news for low-latency apps. See also its impact on redis:

    (tags: redis memory defrag huge-pages linux kernel ops latency performance transparent-huge-pages)
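
    A quick way to check whether a box has THP switched on is to read the kernel's sysfs setting, where the active option is shown in square brackets. A small sketch, assuming the standard sysfs path; the helper names are made up.

```python
from pathlib import Path
from typing import Optional

# Standard sysfs location on Linux kernels built with THP support.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_setting(text: str) -> str:
    """Return the bracketed (active) option from e.g. 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active option found in {text!r}")


def current_thp_setting(path: Path = THP_ENABLED) -> Optional[str]:
    """Read the active THP mode, or None if the file cannot be read."""
    try:
        return parse_thp_setting(path.read_text())
    except OSError:
        return None  # not Linux, or THP not available on this kernel


if __name__ == "__main__":
    print(current_thp_setting())
```

    A result of `always` is the mode the latency reports are about; `madvise` or `never` limits or disables it.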

  • Please grow your buffers exponentially

    Although in some cases x1.5 is considered good practice. YMMV I guess

    (tags: malloc memory coding buffers exponential jemalloc firefox heap allocation)
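
    The argument is easy to check by counting how many bytes get copied while a buffer grows one byte at a time. A rough simulation, assuming every reallocation copies the whole existing contents (real allocators can sometimes extend in place); the growth policies below are illustrative only.

```python
def bytes_copied(final_size: int, grow) -> int:
    """Total bytes copied by reallocations while appending one byte at a time."""
    capacity, copied = 1, 0
    for size in range(1, final_size + 1):
        if size > capacity:
            copied += size - 1            # existing contents move to the new buffer
            capacity = grow(capacity)     # policy must return at least capacity + 1
    return copied


def doubling(capacity: int) -> int:
    return capacity * 2


def one_and_a_half(capacity: int) -> int:
    return max(capacity + 1, capacity * 3 // 2)


def fixed_step(capacity: int) -> int:
    return capacity + 16


if __name__ == "__main__":
    n = 10_000
    for policy in (doubling, one_and_a_half, fixed_step):
        print(policy.__name__, bytes_copied(n, policy))
```

    Both exponential policies copy a small constant multiple of the final size, while the fixed-step policy copies an amount that grows quadratically. The x1.5 variant trades somewhat more copying for less over-allocation.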

  • How I created two images with the same MD5 hash

    I found that I was able to run the algorithm in about 10 hours on an AWS large GPU instance bringing it in at about $0.65 plus tax. Bottom line: MD5 is feasibly attackable by pretty much anyone now.

    (tags: crypto images md5 security hashing collisions ec2 via:hn)

Categories: FLOSS Project Planets

Bryan Pendleton: 3 tools for system software testing

Tue, 2014-11-04 16:36

I very much enjoyed reading three papers which all happened to be part of the same session of the 11th USENIX Symposium on Operating Systems Design and Implementation.

The session that captivated me was called "Pest Control," and it included these papers:

  • Torturing Databases for Fun and Profit

    Here we propose a method to expose and diagnose violations of the ACID properties. We focus on an ostensibly easy case: power faults. Our framework includes workloads to exercise the ACID guarantees, a record/replay subsystem to allow the controlled injection of simulated power faults, a ranking algorithm to prioritize where to fault based on our experience, and a multi-layer tracer to diagnose root causes.

  • All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications

    We find that applications use complex update protocols to persist state, and that the correctness of these protocols is highly dependent on subtle behaviors of the underlying file system, which we term persistence properties. We develop a tool named BOB that empirically tests persistence properties, and use it to demonstrate that these properties vary widely among six popular Linux file systems. We build a framework named ALICE that analyzes application update protocols and finds crash vulnerabilities, i.e., update protocol code that requires specific persistence properties to hold for correctness.

  • SKI: Exposing Kernel Concurrency Bugs through Systematic Schedule Exploration

    In this paper, we propose SKI, the first tool for the systematic exploration of possible interleavings of kernel code. SKI finds kernel bugs in unmodified kernels, and is thus directly applicable to different kernels. To achieve control over kernel interleavings in a portable way, SKI uses an adapted virtual machine monitor that performs an efficient analysis of the kernel execution on a virtual multiprocessor platform. This enables SKI to determine which kernel execution flows are eligible to run, and also to selectively control which flows may proceed.

In a lot of ways, these papers are all very similar. They all examine the very challenging problem of finding bugs in extremely complex system software.

And they all take the approach of building a tool to find such bugs.

And all of their tools use lower-level capabilities to examine and interact with the software under test: one tool captures the SCSI commands that are sent to the storage system, and is capable of then manipulating those captures to simulate various power failures and disk errors that can occur; another tool captures the system calls that are made from the application to the operating system, and can manipulate those system calls in various ways; the third tool uses virtualization infrastructure to capture and manipulate the actions between the operating system and the (virtual) hardware.
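
The capture-and-manipulate idea can be shown with a toy: record the writes an application issues, then replay every prefix of that trace to simulate a power cut after each write, checking an invariant on what is left. This is my own much-simplified sketch, not how any of the three tools work; the trace format and the names `replay`, `find_crash_bugs` and `committed_implies_data` are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

Write = Tuple[str, str]   # (block name, contents): a hypothetical trace record
Disk = Dict[str, str]


def replay(trace: List[Write], upto: int) -> Disk:
    """Rebuild the disk state as if power failed after the first `upto` writes."""
    disk: Disk = {}
    for block, contents in trace[:upto]:
        disk[block] = contents
    return disk


def find_crash_bugs(trace: List[Write], invariant: Callable[[Disk], bool]) -> List[int]:
    """Return every crash point (number of completed writes) that breaks the invariant."""
    return [n for n in range(len(trace) + 1) if not invariant(replay(trace, n))]


def committed_implies_data(disk: Disk) -> bool:
    """If the commit marker is on disk, the data it refers to must be too."""
    return disk.get("commit") != "yes" or "data" in disk


safe_trace = [("data", "payload"), ("commit", "yes")]
buggy_trace = [("commit", "yes"), ("data", "payload")]

if __name__ == "__main__":
    print(find_crash_bugs(safe_trace, committed_implies_data))
    print(find_crash_bugs(buggy_trace, committed_implies_data))
```

The toy only considers crashes that preserve write order. As the abstracts above suggest, the real tools work at much lower levels, where ordering itself depends on the file system or device.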

This sort of tool-based testing is wonderful, I think.

I've seen similar tools (in concept) that do things with captured/replayed network traces; that do things with captured/replayed transaction logs; that do things with captured/replayed web server logs; etc.

It's extremely hard to go "the last mile" when testing complex system software, so I'm always enthusiastic when I see people building powerful testing tools.

Categories: FLOSS Project Planets

David Reid: The A9

Tue, 2014-11-04 13:29

The A9 is an unusual road – it has its own website!

OK, to be fair, the website is actually for the A9 Road Safety Group (RSG) but their sole focus is on making the A9 safer. The site provides a lot of details and shows their suggestions (now implemented) to make the road safer together with the various documents they have used to make their decisions. Many of these are the usual documents provided by political bodies, such as the RSG, and are therefore of limited interest. One or two are useful and worth a read.

One fact that quickly comes to light is the reliance on the experiences of the A77 speed camera implementation for comparisons. The roads are very different in nature and usage, but it’s unclear how much allowance for these facts has been made.

The A9 can be a frustrating road. Large sections are single carriageway, with limited visibility through woodland. It’s a busy road with a large proportion of users unfamiliar with the road and travelling long distances. The unfamiliarity combined with the distances inevitably leads to frustration, which in turn leads to many instances of poor overtaking – usually at low speed! For regular A9 travellers the experience of rounding a corner and finding a car coming towards you on the same side of the carriageway isn’t unusual. Often the slow speeds involved are the saving grace, but the frustrations and dangers are only too apparent.

Over the past few months average speed cameras have been added to much of the A9 with the aim of reducing the number of accidents. As speed has rarely been a factor in the nearest misses I’ve experienced I find the decision a little strange.

By way of comparison, the A77 already has large sections “protected” by average speed cameras. Like many people, I found myself spending too much time watching my speed rather than looking at the road when using the A77, which, given the complexity of the road, struck me as being a negative for safety.

One aspect shared by both the A9 and A77 is the confusing and overwhelming number and placement of signs. Approaching junctions it’s not uncommon to find 5 or more signs, all essentially giving the same information. The placement of the signs seems decreed by committee and often signs cover each other or are obscured by vegetation. Given the obsession that exists on the A77 (and in Perth and some parts of the A9) for limiting turn options for lanes, correct lane discipline is important, but because of the sign issues it is often awkward and a last-minute decision unless you are familiar with the junction. Couple this with obsessively watching the speed and it’s a wonder more accidents don’t happen.

Average speed cameras are “fairer” than the instant ones that used to be used, but are they really a good solution for the A9? Monitoring the speed of a vehicle provides a single data point, albeit one that can be objectively measured. Police patrols provide a more subjective measurement of a vehicle’s behaviour, but they require police officers with all the issues that they bring. It’s a shame that the cameras, with their continuous monitoring of traffic and ability to generate as many tickets as required, have become the only solution now considered by many organisations.

Of course, alongside the speed cameras the A9 group have also lifted the speed limit for HGVs in an effort to reduce tailbacks and the frustrations that accompany them. It’s an interesting approach, but the usual relationship between speed and energy applies to accidents involving HGVs, so any accidents that take place involving HGVs will be more likely to cause injury. Where the balance lies between reducing the number of accidents and the additional injuries caused cannot be known at present, but it will be interesting to reflect on.

Another aspect of the introduction that seems strange is the placement of some of the cameras. One of the average speed zones has its finish just before one of the most dangerous junctions I regularly pass. The addition of warning signs for turning traffic (that only rarely work and are dazzlingly bright when it’s dark) has been rendered irrelevant as cars now accelerate away from the average speed zone straight into the path of right turning traffic. Moving the zones by a small amount would have avoided this – so why was it not done? Such inattention to detail does not bode well for a multi-million-pound project that is meant to save lives.

As anyone who drives regularly will attest, the safest roads are those with a steady, predictable stream of traffic. Introducing anything that interrupts the predictability of traffic increases the risk of accidents. Sticking speed cameras at seemingly random locations on roads seems like a sure-fire way of doing just that. The sudden braking and rapid acceleration that accompany such sites are often the trigger for accidents. Following the installation of the cameras on the section of road I travel almost daily, changes in behaviour have been obvious and the near collision between a van and car that I witnessed a few days ago was the first – and closest – I’ve seen in months. Hopefully it’s just a transitional thing and people will adjust.

I’m certain that the reports published will support the decisions made by the RSG; after all, that’s the beauty of statistics. It would be nice to think that they would publish the “raw” detailed information about incidents and accidents, but so far I’ve been unable to find any place online that has such data. If anyone knows of such data then I’d love to have a look at it and try and do something with it, though I suspect that this will be a pipe dream.

All these changes have been described as temporary, meant to provide additional safety while the plans for changing the entire A9 into a dual carriageway are developed and implemented. The fact that several of the average speed camera sites are on existing dual carriageway sections would tend to imply that they will be a permanent fixture. The continuing income from the cameras will no doubt be welcome, even if they don’t provide much improvement in safety.

Categories: FLOSS Project Planets

Justin Mason: Links for 2014-11-03

Mon, 2014-11-03 18:58
Categories: FLOSS Project Planets