Wednesday, December 23, 2009

screw you, facebook

I use facebook. It's handy. The interface is very clean and usually pretty responsive, and I like the way it presents a stream of online social interaction between people. But what I don't like are their "tactics" for dealing with people who try to interface with the site programmatically.

Recently I wanted to try to scrape my own facebook profile for my friends' information so I could aggregate it and view it on my own terms. If your facebook account were suddenly gone, wouldn't you want links back to your previous friends, or contact information to get ahold of them again? A link to one such perl script here shows the typical response Facebook gives to anyone who writes such a script. It was even removed from the Wayback Machine at archive.org, even though the rest of the site from when the script existed is still there.

Another site here details how a developer had his profile disabled for trying to export his friends' contact information for an app he develops. It's not unusual for an application to log into a service and extract e-mail addresses so you can find the same people on a different service. Getting your account dropped because of it, however, is annoying to say the least.

At least there's some content out there that hasn't been taken down. This blog post shows how to log in and update your status. Hopefully this will get me close to a workable script; I'll have to create a fake profile to keep my real one from being "deactivated", though.
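For reference, the rough shape of that approach with curl would be something like the lines below. This is only a sketch: the login and status URLs and the form field names are placeholders you'd have to lift from the actual pages, and (per the above) running it against a profile you care about risks getting it "deactivated".
# keep a cookie jar so the login session carries over to the next request;
# LOGIN_URL/STATUS_URL and the field names are placeholders, not real endpoints
curl -s -c cookies.txt -d "email=me@example.com" -d "pass=secret" "$LOGIN_URL" > /dev/null
# then POST the new status to whatever URL the status form on the home page submits to
curl -s -b cookies.txt -d "status=hello from curl" "$STATUS_URL"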

Monday, December 14, 2009

on job requirements and resource utilization

If i'm hired as a sysadmin, make me do sysadmin work. Don't make me do programming work.
If i'm hired as a programmer, make me do programmer work. Don't make me do sysadmin work.
Allow me to do either if necessary, but don't task me with something I wasn't hired for.

There's a very logical and simple reason for this. Would you hire a plumber to fix your shower drain and then instruct him to teach a high school English class? He might be able to perform the basic duties of a teacher. But don't expect the kids to make it into any nice colleges.

Likewise, if you hire a sysadmin and make him take on a programming project, you're going to end up with a half-baked program. If you really want him to take on a programming project, evaluate him as you would a software developer you were about to hire (because that's effectively what you've just changed his title to). Is this the person I'd want to hire full-time to develop software? Am I confident he is the person best qualified to take on this job? If not, you're barking up the wrong tree.

I'd trust a developer with my system about as far as I could throw them. I think most sysadmins would agree they would rather a developer have less access over a system than more. We've seen the mistakes they make when they don't fully understand the platform they're developing for. The same mistakes are made by sysadmins when developing with a language or framework they don't fully understand, or without the fundamental knowledge of how to develop and maintain software properly.

The job of a "sysadmin/coder" is effectively asking for half a sysadmin and half a programmer (unless you're paying me double the salary). And that's probably what you'd get in the end. We gain experience and strength in our fields by working in them for the majority of our time, not by dabbling here and there and becoming a jack of all trades, master of none.

building a robot

i'd like to build an R2D2-like robot.

the goal would be a 3-4ft robot which is sturdy enough to fall down stairs and survive, can push buttons, go over uneven terrain, be remote-controlled via cellular modem, be equipped with a webcam and screen and contain a locking storage compartment.

i had some fancy plans of how i'd make it work just like a real R2D2. it'd probably be very costly and take something like 3 years to build, based on my current knowledge. so i simplified.

  • do away with complexity and just strip a golf cart - make it 2-wheel drive with the 3rd wheel just a caster
  • body could be two trash cans for more support
  • a rectangular frame to simplify fabbing - though mandrel-bent isn't complicated or expensive these days
  • borrow someone's mechanic shop, i know people with welding stuff
  • all solid state electronics
  • there'd need to be significant ballast on the caster wheel, and lowest center of gravity possible; probably put the battery by the caster wheel and cpu behind it
  • driving would be firmware-controlled, tied to the proximity sensors, with the CPU able to override
  • buying a pre-existing arm would be best. otherwise, one pneumatic lifter up out of the can, and one that shoots out forward to push a button, with the webcam on the head
  • firmware would probably take the longest to finish
  • could put a hole in the middle of the can and turn this thing into a sex toy, rent it out to parties to subsidize the costs
  • an LCD "face"
  • to simplify things, the "CPU" could be an android phone or netbook. if there's a power failure you could still communicate with it (and locate it via GPS)
  • by using trash cans for the body and reinforcing the outside with a mandrel-bent tubular steel frame i could put a keg inside and make it a self-propelled kegerator

Now all I need is about $3000 and a nerd with a mechanic shop and a big heart.

Wednesday, December 9, 2009

open source... infrastructure?

so, back on the topic of tools we'd all like to have but nobody'd like to write. that basically covers "tools" and generic glue to provide specific functions. but what about real world (soft and hard) infrastructure used to provide an enterprise-sized service and maintain it? i'm talking about soup-to-nuts planning for a network with thousands of machines, 100,000 users and millions of clients/customers/visitors.

first off, what hardware do you use? what network gear for the internet pipe, switches for the racks and storage devices? what is your expected backup capacity? do you even need hard drives in your servers? (cutting power cuts cooling cuts costs, and moving parts can be sinful to replace) how are you accounting for failover and redundancy?

what software do you use? is your web farm serving a litany of different interpreted languages and required versions and support libraries or is your one site purpose-built with one framework and language? do you use multiple http servers and proxies or one with an incredibly flexible configuration? what kind of operating system(s) are you going to support, and how do you maintain them? how much customization of open source apps are you going to need and how are you managing it? how many teams work with your platform and what methods of authentication and authorization will you use? how are you deploying changes, reverting them, auditing, etc?

how are you dealing with people? what are your coding best practices and do you enforce them? what is the one version control system you're going to use for the next 10 years and how have you set up your repositories? what's your ticketing and bug-reporting systems like? do you have anyone working between teams to make sure communication remains tight and everyone knows what everyone is doing (more or less)?

of course this is basically "how do i run an enterprise network?" plus glue code and hints/tips. but the point is to make most of the decisions and ensure common problems are avoided while reducing the duplication of work inherent in building out any enterprise network using open source tools. i think this is part of the point of open source: get your ideas down on paper, implement them, show your work and let people modify it and contribute their work back to let everyone benefit. this would apply to everything from the planning stages of building out a large site to the little glue scripts that rotate logs and build code changes.

what i'm talking about would not make a lot of people happy. it's probably a lot like car tuners: the people who do the work of tuning cars are not happy when the people with tuned cars share their tuning maps. their source of income (tuning cars) is supposedly threatened by people just downloading a pre-made map file. but there's more than one way to map a car, and more than one way to build a network. i'd be happier not having to duplicate my work and just being able to get stuff done. managers might like not having to throw away more time and money on duplicated work. and it'd be a fun project to work on =)

Tuesday, December 8, 2009

breaking up with BK & RESTful APIs with curl

This article sparked a mini-firestorm on BrightKite. As a couple of the comments at the end of the story mention, several users were apparently deleted from BrightKite due to their negative reactions to it. A number of users were also deleted because of some kind of AT&T-related "removal" notifications, even though some of those users didn't use AT&T. In the end BrightKite fucked up yet again.

I am finally getting around to deleting my BrightKite account. For some reason all of my pixelpipe posts have been failing to Twitter but succeeding on BK, so people have actually been commenting on my BK account recently. I've also been getting some strange "love spam" through BK, which is yet another lesson in why this new BK 2.0 sucks: there is no "report" button for posts or users, so I can't flag something as spam. Wonderful new system, guys.

So anyway, this was a nice quick introduction to working with RESTful APIs with curl. I found one example of how to delete all brightkite posts and decided a bash one-liner would be simpler. This page describes how to specify the method and other REST-related options using curl. Here is the one-liner I ended up using:
curl -s -G -u user:pass http://brightkite.com/people/user/objects.xml | \
grep "<id>" | sed -e 's/^.*<id>\(.\+\)<\/.*$/\1/g' | \
xargs -n 1 -I {} curl -s -G -u user:pass -X DELETE http://brightkite.com/objects/{}.xml

Of course, being BrightKite, it doesn't work the way it should. Only a small number of posts get deleted at a time; I don't know why. Intermittently you'll receive a spam of HTML 404 pages, and every single request (authenticated or not) returns a '403 Forbidden' error. No matter; running the one-liner over and over gets them all eventually.
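For reference, the REST-relevant curl options from that page boil down to a handful of flags (the URLs below are placeholders, not real endpoints):
curl -u user:pass http://example.com/objects.xml                        # GET (the default)
curl -u user:pass -d "note[body]=hi" http://example.com/objects.xml     # POST form data
curl -u user:pass -X PUT -d "note[body]=hi" http://example.com/objects/1.xml
curl -u user:pass -X DELETE http://example.com/objects/1.xml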

When I'm done I'll post a couple of entries about how BK sucks rather than deleting the account outright: I want my opinions of their suckitude to linger for others to discover. It's pretty sad, too: BK actually worked in the beginning and was pretty useful when the few other users down here used it frequently.

Monday, December 7, 2009

everybody wants to work, but not for free

It's fun to code something new or challenging. It's not fun to code a boring piece of glue just to simplify some mundane task like deployment, rollback, authorization or peer-review. That stuff doesn't have a real "payback" in terms of emotional satisfaction with completing a project. Design a new file transfer tool and protocol? That's sexy. Making a QA tool to handle code changes or deployments before they hit a live site? Not so sexy.

So we do this work for our companies as part of the job. We write our little wrappers and snippets of code, push them out and let the company retain these hacks to keep things running smoothly. But you'll notice nobody really designs open-source enterprise code/web-app management systems. That's not sexy and not relevant to an open-source hacker's dreams of getting the tools they want completed and into the hands of developers and admins world-wide. The end result is reinventing the wheel.

How many times has a complicated deployment system been crafted at a large company, only ever to be used there by a handful of developers? What's the gain in keeping this internal tool locked up? What are you really gaining by spending these man-hours and seeing it go unmaintained and undeveloped? Shouldn't we perhaps be exploring every opportunity to expose our internal tools to the world so that they may be fleshed out, improved upon and eventually re-used by ourselves and others?

But many of these tools aren't suitable for a general purpose. We don't want to write a deployment system that works in any site in the world, so we make it work for us. This'll work for svn but not necessarily git. That'll work for Apache and mod_perl but not with Tomcat and java. You may need to tar up your work and keep it in a repository while i'll use a quick interface to my filer's snapshots and source code to manage rollbacks. Don't even ask about change management.

When is the focus going to shift from the latest, coolest language and framework to the gritty glue that keeps it all flowing? To make a site that really works you need it to be well-oiled, not just in form but in function. I'd love to get paid just to make tools that work with any site, anywhere, any time, one single way. But that will never happen. I'll get paid to make it just work, and that's what i'll end up implementing. I find it a waste to spend your time polishing a grommet when you've got the operation of a whole machine to see to. Some day perhaps we can identify what we lack and knock it out bit by bit. In the meantime i'm just reinventing the wheel.

Sunday, November 29, 2009

spin detector

Unlike the previous Wikipedia spin detector, I'd like to make a web tool that can analyze any web page or editorial content for spin or bias toward a particular subject matter or ideology.

The mechanism should be simple. Isolate words and phrases which indicate a clear bias toward a particular opinion and assign them 'tags'. The tags will indicate whether something is more liberal or conservative, republican or democrat, racist or politically correct, patriotic or revolutionary, etc. A series of filters can first identify the spin words individually, then cluster them into increasingly broader groups. Eventually the page can be highlighted to show larger groupings of content with an affinity toward a certain subject type or bias, and summaries can be generated.
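As a toy illustration of the word-counting stage, even something this dumb would produce a first-pass score (left.txt and right.txt are hypothetical word lists, one 'spin' word per line):
#!/bin/sh
# crude first pass: strip tags, split into lowercase words, count hits from each word list
curl -s "$1" | sed -e 's/<[^>]*>//g' | tr -cs '[:alpha:]' '\n' | tr 'A-Z' 'a-z' > /tmp/words.$$
left=`grep -cxFf left.txt /tmp/words.$$`
right=`grep -cxFf right.txt /tmp/words.$$`
echo "left-tagged words: $left    right-tagged words: $right"
rm -f /tmp/words.$$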

Certainly this could be abused to the point people make snap judgments or rule out content based on its perceived bias, but at the same time one could use the tool on a large swath of content and determine in a general way whether it had a majority of its content leaning in one way or another. I'd partly like to use this to flag content that I don't want to read, but also as an indicator of whether something is filled with cruft. I don't care if it's left-leaning or right-leaning as much as I just want to read unbiased opinions not littered with mindless rhetoric.

http/socks proxies and ssh tunneling

It seems there exists no tool to simply convert one type of tunnel into another. SSH supports both tcp port forwarding and a built-in SOCKS proxy, both of which are incredibly useful. But it lacks a native HTTP proxy. The sad truth is, most applications today only seem to support HTTP proxies or are beginning to support SOCKS. Until such time that SOCKS proxies are universally accepted in networking apps, I need an HTTP proxy for my SSH client.

There are some LD_PRELOAD apps which will overload network operations with their own and send them through a proxy (like ProxyChains). This seems like a less portable and hackier solution than I'd want to rely on (are you going to change all your desktop links for 'audacious' to be prefixed with 'proxychains'?). I am willing to write an HTTP-to-SOCKS proxy but don't have the time just yet. CPAN seems to have a pure-perl HTTP proxy server, and I can probably leverage IO::Socket::Socks to connect to the SOCKS5 server in ssh.

In any case, my immediate needs are fulfilled: I can use 'curl' to tunnel through ssh's SOCKS for most of my needs. This blog entry on tunneling svn was a useful quick hack to commit my changes through an ssh forwarded port, but much more of a hack than i'm willing to commit to; i'd rather just enable or disable an http proxy.
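For the record, the pieces that do work today look roughly like this (hosts and ports are just examples):
# ssh's built-in SOCKS proxy on localhost:1080, plus a plain TCP forward for svn
ssh -N -D 1080 -L 3690:svn.internal.example.com:3690 user@gateway.example.com
# curl can talk to the SOCKS proxy directly...
curl --socks5 localhost:1080 http://intranet.example.com/
# ...while svn just uses the forwarded port
svn co svn://localhost:3690/repo/trunk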

Sunday, November 22, 2009

checkup: a minimal configuration management tool

This is a braindump of an idea. I want a minimal, simple 'tool' to manage my configuration across multiple hosts. I have configs I share between work, home and my colo box, all with minor differences. Cfengine and Puppet take too much time for me to set up properly, so i'll just write something which will take slightly longer but be simpler in the end. (added) Above all, this tool should not make silly assumptions about how you want to use it or what it should do - it should just do what you tell it to, without needing to know finicky magic or special parameters/syntax.

I already have subversion set up so i'll stick with that, otherwise i'd use cvs. I need the following functionality:

* auto-update sources (run 'svn up' for me)
* check if destination is up to date, and if not, apply configuration
* search-and-replace content in files
* try to maintain state of system services
* send an alert if an error occurs
* (added) run an external command to perform some action or edit a file post-delivery

That's about it for now. Mostly I just want it to copy files based on the host "class" (colo, home, laptop, work). Since I want it to run repeatedly, the alert is only because it'll be backgrounded; otherwise it will obviously report an error on stderr. It should also be able to run from the command line, in the background, etc. and lock itself appropriately. Logging all actions taken will be crucial to making sure shit is working and to debugging errors.

I don't want it to turn into a full-fledged system monitoring agent. The "maintain state of system services" item is more of a "check if X service is enabled, and if not, enable it" thing - not restarting sendmail if sendmail isn't running. On Slack this is about as complicated as "is the rc.d file there? is it executable?" but on systems like Red Hat it's more like "does chkconfig show this as enabled for my runlevel?". I don't know what tool i'll use to monitor services; I need one but it isn't in scope of this project.

Now for a name... it's going to be checking that my configuration is sane, so let's go with "checkup". There doesn't seem to be an open source name conflict.

For config files i'm usually pretty easygoing. Lately i've been digging on .ini files so i'll continue the trend. To make the code more flexible to change i'll make sections match subroutines, so I can just add a new .ini section and subroutine any time I wanna expand functionality. Syntax will be straightforward and plain-english, with no strange punctuation unless it makes things more readable. Multiple files will be supported, though if a file fails its syntax check it'll be ignored by default, a warning thrown and the rest of the files scanned. If any file references anything which is missing, an error will be thrown and the exit status will be non-zero. Each section will also be free-form: the contents of the section determine what it's doing. If it defines hostnames and a hostgroup name, the name of the section should probably implicitly be a hostgroup class name. A list of files and their permissions, etc. would mean that section imposes those restrictions on those files. A set of simple logic conditionals will determine whether that class can be evaluated (ex. "logic = if $host == 'peteslaptop'"; there is no "then blah blah" because this is just a conditional to be evaluated, and if it's false that class is ignored). (added) The conditionals will evaluate left-to-right and include 'and' and 'or' chaining (short-circuit boolean logic, in language terms). While we're at it, a regex match is probably acceptable to include here ('=~' in addition to '=='). Hey, it is written in Perl :)
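Purely as an illustration of the above - none of this syntax exists yet, it just follows the conventions i've been describing - a couple of sections might end up looking like:
; hypothetical hostgroup class, only evaluated if the logic line is true
[laptop]
logic = if $host == 'peteslaptop' or $host =~ 'laptop'

; hypothetical file-delivery section tied to that class
[laptop dotfiles]
class = laptop
files = /home/pete/.bashrc /home/pete/.vimrc
owner = pete
mode = 0644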

(added) The tool should be able to be run by a normal user, like any other unix tool. It should be able to be fed configs via stdin, for example. It will not serve files as a daemon, as that is completely out of scope of the tool; in fact, it shouldn't really do any network operations at all except for tasks performed as part of the configs. In addition, all operations should be performed locally without the need to retrieve files or other information - all updates should happen first, before any configs are parsed. If for some reason that poses a scalability or other problem, we may allow configs to be parsed first, then files copied to local disk, then the system examined to execute the configs, and so on.

The only thing i'm not really sure about is roll-back. I always want roll-back, but it's hard to figure out how to perform it when all you've basically got is a lot of rules saying how stuff should be configured right now - not how it should look at some point in time. Rolling back a change to the tool's configs does not necessarily mean your system will end up that way, unless you wrote your configs explicitly so they always overwrite the current setting. For example, you might have a command that edits a file in place - how will you verify that edit is correct in the future, and how would you get back to what the file was before you made the edit?

Probably the easiest way to get at this would be to back up everything - record exactly the state of something before you change it and make a backup copy. stat() it, then if it's a file or directory, mark that you'll back it up before doing a write operation. Really, the whole system should determine exactly what it's going to do before it does it, instead of executing things as it walks down the parsed config.

Let's say you've got your configs and you're running an update. One rule says to create file A and make sure it's empty. The next rule says to copy something into file B. The next rule says to copy file B into file A. Well, wait a minute - we already did an operation on file A. What the fuck is wrong with you? Was this overwrite intentional or not? Why have the rule to create file A if you were just going to copy file B on top of it? This is where the 'safety' modes come in. In 'paranoid' mode, any operation like this results in either a warning or an error (you should be able to set it to one or the other globally). Alternatively there will be a 'trusting' mode wherein anything that happens is assumed to happen for a reason. Even in 'trusting' mode, if one rule explicitly conflicts with another - you told it to set permissions to 0644 in one rule and 0755 in another - a warning will be emitted, but not nearly as many as in 'paranoid warning' mode. Of course all of this will be separate from the inherent logging of every test or operation performed, which is there for debugging purposes.

All this is a bit grandiose already for what I need, but I might as well put it in now rather than have to add it later.

(added) The config's sections can be "include"d into other sections to provide defaults or a set of instructions wherever they are included. In this way we can modularize and reuse sections. To keep the 'paranoid mode' above happy we may also need to add a set of commands to perform basic I/O operations. We should also be able to define files or directories which get explicitly backed up before an operation (such as a mysterious 'exec' call) might modify them. Of course different variables and data types may be necessary. Besides the global '$VARIABLE' data type, we may need to provide lists or arrays of data and be able to reproduce them within a section. '@ARRAY' is one obvious choice, though for data which applies only to a given section at evaluation time (such as an array that changes each time it is evaluated) we may need a more specific way to declare that data type and its values.

(added) The initial configuration layout has changed to a sort of filesystem overlay. The idea is to lay out your configs in the same place they'd be on the filesystem of your target host(s), and to place checkup configs in those directories to determine how the files get delivered. I started out with a puppet-like breakout of configs and templates, modularizing everything, etc. But it quickly became tedious to figure out the best way to organize it all. Putting everything in the place you'd expect it on disk is the simplest. You want an apache config? Go edit the files in checkup/filesystem/etc/apache/. You want to set up a user's home directory configs? Go to checkup/filesystem/home/user/. Just edit a .checkuprc in the directory you want and populate files as they'd be on the filesystem (your .checkuprc will have to reference them for them to be copied over, though).
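Something like this, to make the overlay idea concrete (paths are only examples):
# configs live exactly where they'd land on the target host
mkdir -p checkup/filesystem/etc/apache checkup/filesystem/home/user
cp httpd.conf checkup/filesystem/etc/apache/
vi checkup/filesystem/etc/apache/.checkuprc    # says which files in this directory get delivered, and how
vi checkup/filesystem/home/user/.checkuprc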

Thursday, November 19, 2009

repeating your mistakes

We ran into an issue at work again where poor planning ended up biting us in the ass. The computer does not have bugs - the program written by the human has bugs. In this case our monitoring agent couldn't send alerts from individual hosts because the MTA wasn't running, and we had no check to ensure the MTA was running.

This should have been fixed in the past. When /var would fill up, the MTA couldn't deliver mail. We added checks to alert before /var fills up (which is a pretty weak check if you ask me; a process can create a file, seek out past the remaining free space and write something, and /var is filled instantly, so it's possible this alert could be missed too).

So the fix here is to add a check on another host for whether the MTA is running. Great. Now we just need to assume nothing else prevents the MTA from delivering the message and we're all good. But what's the alternative? Remote syslog, plus a remote check to see if the host is down, and when it's back up, determining why it was down and reaping the unreceived syslog entries? I could be crazy, but something based on Spread seems a little more lightweight and just about as reliable, and because you're removing the requirement of a mail spool (you keep the logs on the client if it can't deliver the message) it reduces the complexity a tad.
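For what it's worth, even a dumb cron job on a second host would have caught this particular failure - something along these lines (hostnames and addresses are made up):
#!/bin/sh
# run from another host via cron: complain if the MTA on $target won't answer on port 25
target=mailhost.example.com
if ! printf 'QUIT\r\n' | nc -w 5 $target 25 | grep -q '^220'; then
    echo "MTA on $target not answering" | mail -s "MTA check failed" oncall@example.com
fi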

At the end of the day we should have learned from our mistake the first time. Somebody should have sat down and thought of all the ways we may miss alerts in the future and work out solutions to them, document it and assign someone to implement it. But our architect didn't work this way and now we lack any architect. Nobody is tending the light and we're doomed to repeat our mistakes over and over.

Also, we shouldn't have reinvented a whole monitoring agent when cron scripts, Spread (or collectd) and Nagios could have handled alerting just as well and a lot more easily and quickly.

quick and dirty sandboxing using unionfs

On occasion i've wanted to perform some dev work using a base system and not care what happens to the system. Usually VM images are the easiest way to do this; keep a backup copy of the image and overwrite the new one when you want to go back. But what about debugging? And diffing changes? Using a union filesystem overlay you can keep a base system and copy its writes to a separate location without affecting the base system.

Herein lies a guide to setting up a union sandbox for development purposes using unionfs-fuse. This is the quickest, dirtiest way to perform operations in a sandbox which will not affect the base system. All writes will end up in a single directory which can be cleaned between uses. With debugging enabled one can see any write actions that take place in the sandbox, allowing a more fine-grained look at the effects of an application on a system.

Note that unionfs-fuse is not as production-ready as a kernel mode unionfs (aufs is an alternative) but this method does not require kernel patching. Also note that this system may provide unexpected results on a "root" filesystem.

Also note that this guide is for a basic 'chroot' environment. The process table and devices are shared with the host system, so anything done by a process could kill the host system's processes or damage hardware. Always use caution when in a chroot environment. A safer method is replicating the sandbox in a LiveDVD with writes going to a tmpfs filesystem. The image could be booted from VMware to speed development.

Unfortunately it seems like the current unionfs-fuse does not handle files which need to be mmap()'d. A kernel solution may be a better long-term fix, but for the short term there is a workaround included below.

  1. set up unionfs
     # Make sure kernel-* is not excluded from yum.conf
    yum -y install kernel-devel dkms dkms-fuse fuse fuse-devel
    /etc/init.d/fuse start
    yum -y install fuse-unionfs

  2. cloning a build box
     mkdir sandbox
    cd sandbox
    rsync --progress -a /.autofsck /.autorelabel /.bash_history /bin /boot /dev /etc \
    /home /lib /lib64 /mnt /opt /sbin /selinux /srv /usr /var .
    mkdir proc sys tmp spln root
    chmod 1777 tmp
    cd ..

  3. setting up the unionfs
     mkdir writes mount
    unionfs -o cow -o noinitgroups -o default_permissions -o allow_other -o use_ino \
    -o nonempty `pwd`/writes=RW:`pwd`/sandbox=RO `pwd`/mount

  4. using the sandbox
     mount -o bind /proc `pwd`/mount/proc
    mount -o bind /sys `pwd`/mount/sys
    mount -o bind /dev `pwd`/mount/dev
    mount -t devpts none `pwd`/mount/dev/pts
    mount -t tmpfs none `pwd`/mount/dev/shm
    chroot `pwd`/mount /bin/bash --login

  5. handling mmap()'d files
     mkdir mmap-writes
    cp -a --parents sandbox/var/lib/rpm mmap-writes/
    mount -o bind `pwd`/mmap-writes/sandbox/var/lib/rpm `pwd`/mount/var/lib/rpm
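(A teardown/reset step isn't part of the original notes, but for completeness it would look roughly like this:)

  6. tearing down and resetting the sandbox
     exit                                   # leave the chroot
     umount `pwd`/mount/var/lib/rpm         # only if you used the mmap() workaround above
     umount `pwd`/mount/dev/shm `pwd`/mount/dev/pts `pwd`/mount/dev
     umount `pwd`/mount/sys `pwd`/mount/proc
     fusermount -u `pwd`/mount
     find writes -type f                    # every write made inside the sandbox lands here
     rm -rf writes && mkdir writes          # wipe the overlay to get back to a pristine base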

Wednesday, November 4, 2009

Bad excuses for bad security

This document explains Pidgin's policies on storing passwords. In effect, they are:

  • Most passwords sent over IM services are plain-text, so a man-in-the-middle can sniff them.
  • Other IM clients are equally insecure.
  • You should not save the password at all because then nobody can attempt to decipher it.
  • Obfuscating passwords isn't secure. (Even though a properly encrypted stored password isn't mere obfuscation.)
  • You shouldn't store sensitive data if there is a possibility someone might try to access it.
  • It won't kill you to type your password every time you log in.
  • We would rather you use a "desktop keyring" which isn't even portable or finished being written yet.

These explanations are really a verbose way of saying "we don't feel like implementing good security." I've been using Mozilla and Thunderbird for years now with a master password, which works similarly to a desktop keychain.

The idea is simple: encrypt the passwords in a database with a central key. When the application is opened or a login is attempted, ask the user for the master password. If it is correct, unlock the database and get the credentials you need. This way only the user of the current session of the application can access the stored passwords.
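The same shape can be demonstrated with nothing but openssl - this is not what Mozilla actually does internally, just an illustration of the idea:
# encrypt the credential store with a key derived from a master password (prompted for)
openssl enc -aes-256-cbc -salt -in passwords.txt -out passwords.db && shred -u passwords.txt
# at "login" time, prompt for the master password again and pull out only what you need
# (the line format of the decrypted file is entirely up to you)
openssl enc -d -aes-256-cbc -in passwords.db | grep '^jabber:'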

What you gain is "security on disk": that is, the data on the hard drive is secure. There are still plenty of ways to extract the passwords from a running system, but if the system is compromised it's less likely an attacker can get the password if they had to extract it from disk. This is most useful for laptops and corporate workstations where you don't necessarily control access to the hard drive.

Policies like the one described above should not be tolerated in the open-source community. It's clear to anyone who actually cares about the integrity of their data that these developers are simply refusing to implement a modicum of good security because they have issues with people's perception of security. I don't agree with obfuscating passwords - if you're just scrambling it on the disk without a master password that's no security at all. But a master password allows true encryption of the password database and thus secures the data on disk.

It would be nice if we could all have encrypted hard drives and encrypted home directories. Alas, not every environment is so flexible.

Tuesday, October 20, 2009

mistakes by developers when creating a... whatever

  1. Not communicating with sysadmins. You want to discuss technical issues with your sysadmins early on so they can figure out what kind of hardware will be needed to handle the load, and they can propose methods that work within the current infrastructure to do what you want. I know you like working with Storables, but loading them off an NFS filer vs getting just the data you want from a MySQL cluster is an easy pick for a sysadmin. And can we say "application servers" (i.e. FastCGI)?

  2. Picking the latest tech, or any tech based on interest or novelty or perceived design gains. Do the opposite: start with the oldest tech and go up as you look at your requirements. The reason is again with the system in mind: does the tech you want to use scale well? Does it have a long history as a stable production-quality system? Does it support all the methods you'll need to work with it, in Dev, QA and Production? Is there any straightforward deployment model that works well with it? Most importantly: Do you know it backwards and forwards and can you actually debug it once it breaks on the live site?

  3. Not keeping security in mind at design-time. I still meet web app developers who don't know what XSS or SQL injection is. You need to take this seriously as a hosed website can cost you your job.

  4. Not using automated tests for your code. You need to know when an expected result fails so it doesn't find its way into your site. Not QAing or *laugh* not syntax checking your code before pushing it also falls into this category. Test, test, test.

  5. Writing crappy code. Oh yes - I went there. There's nothing more annoying than doing a code review 2 years in and looking at all the bloated, slow, confusing, undocumented, unreliable, crappy code from the early guys who just wanted to get the site off the ground. There's always going to be bit rot and nobody's code is perfect. Just try your best not to cut corners. There is never a good time to rewrite, so try to make it stand the test of time. A good example is some of the backend code i've seen at some sites: the same app being used for over 10 years without a single modification and never breaking once. Also good to keep in mind is portability. When the big boss says it's time to run your code on Operating System X on Architecture Z, what'll it take to get it running? (hint: you'll probably be doing that work on the weekends for no extra pay)

large scale ganglia

Ganglia is currently the be-all and end-all of open source host metric aggregation and reporting. There are a couple of other solutions slowly emerging to replace it, but nothing as well entrenched. Hate on RRDs as much as you want, but they're basically the de facto standard for storing and reporting numeric metrics. It can be hairy trying to figure out how to configure it all, though, so here's an overview of getting it set up on your network.

First install rrdtool and all of the ganglia tools on every host. Keep the web interface ('web' in the ganglia tarball) off to the side for now. The first thing you should consider is configuring IGMP on your switch to take advantage of multicast groups. If you don't, you run the risk of causing broadcast storms which could potentially wreak havoc on your network equipment, causing ignored packets and other anomalies depending on your traffic.

In order to properly manage a large-scale installation you'll need to juggle some configuration management software. You need to configure gmond on each cluster node so that the cluster multicasts on a unique port for that lan (in fact, keep all of your clusters on unique ports for simplicity's sake). If you have a monitoring or admin host on each lan you'll configure gmond to listen for multicast [or unicast] metrics from one or more clusters on that lan. Then you'll need to configure your gmetad's to collect stats from either each individual cluster node or the "collector" gmond nodes on the monitoring/admin hosts. Our network topology is as follows:
Monitor box ->
    DC1 ->
        DC1.LAN1 ->
            Cluster 1
            Cluster 2
        DC1.LAN2 ->
            Cluster 3
        DC1.LAN3 ->
            Cluster 4
    DC2 ->
        DC2.LAN1 ->
            Cluster 1
            Cluster 2
            Cluster 3
            Cluster 4
        DC2.LAN2 ->
            Cluster 1

The monitor box runs gmetad and only has two data_source entries: DC1 and DC2.
DC1 and DC2 run gmetad and each has a data_source for each LAN.
Each LAN has its own monitor host running gmond which collects metrics for all clusters on their respective LAN.
The clusters are simply multicasting gmond's configured with a specific cluster name and multicast port running on cluster nodes.
The main monitor box, the DC boxes and the LAN boxes all run apache2+php5 with the same docroot (the ganglia web interface). The configs are set to load gmetad data from localhost on one port.
Each gmetad has its "authority" pointed at its own web server URL.

(Tip: in theory you could run all of that off a single host by making sure all the gmetad's use unique ports and modifying the web interface code to load the config settings based on the requesting URL, to change the gmetad port as necessary)
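Until I get around to posting the full config examples, here's roughly how the pieces described above hang together; hostnames, ports and the multicast address are all made up, and on the LAN monitor hosts you'd run one collector gmond per cluster/port:

# gmond.conf on the nodes of Cluster 1 in DC1.LAN1
cluster { name = "Cluster 1" }
udp_send_channel { mcast_join = 239.2.11.71  port = 8701 }
udp_recv_channel { mcast_join = 239.2.11.71  port = 8701  bind = 239.2.11.71 }
tcp_accept_channel { port = 8701 }

# gmetad.conf on the DC1.LAN1 monitor host: one data_source per cluster on that LAN
authority "http://lan1-monitor.dc1.example.com/ganglia/"
data_source "Cluster 1" localhost:8701
data_source "Cluster 2" localhost:8702

# gmetad.conf on the DC1 box: one data_source per LAN, polling the LAN gmetads
gridname "DC1"
authority "http://dc1-monitor.example.com/ganglia/"
data_source "DC1.LAN1" lan1-monitor.dc1.example.com:8651
data_source "DC1.LAN2" lan2-monitor.dc1.example.com:8651
data_source "DC1.LAN3" lan3-monitor.dc1.example.com:8651

# gmetad.conf on the main monitor box: just the two DCs
gridname "Main"
data_source "DC1" dc1-monitor.example.com:8651
data_source "DC2" dc2-monitor.example.com:8651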

In the end what you get is a main page which only shows the different DCs as grids. As you click one it loads a new page which shows that DC's LANs as grids. Clicking those will show summaries of that LAN's clusters. This allows you to lay out your clusters across your infrastructure in a well-balanced topology and gives you the benefit of some additional redundancy if one LAN or DC's WAN link goes down.

We used to use a single gmetad host and web interface for all clusters. This makes it extremely easy to query any given piece of data at once from a single gmetad instance and see all the clusters on one web page. The problem was we had too much data. Gmetad could not keep up with the load and the box was crushed by disk IO. We lessened this by moving to a tmpfs mount, archiving RRDs and pruning any older than 60 days. Spreading out to multiple hosts also lessened the need for additional RAM and lowered network use and latency across long-distance links.

If you think you won't care about your lost historical data, think again. Always archive your RRDs to keep fine-grained details for later planning. Also keep in mind that as your clusters change your RRD directory can fill up with clutter. Hosts which used to be in one cluster and are now in another are not cleaned up by ganglia; they will sit on the disk unused. Only take recently-modified files into account when determining the total size of your RRDs. Also, for big clusters make your poll time a little longer to prevent load from ramping up too often.
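The archive-and-prune part can be as simple as a nightly cron job along these lines (the RRD path varies by install; ours lived on a tmpfs mount):
#!/bin/sh
# tar up RRDs untouched for 60+ days, then remove them from the live rrds directory
cd /var/lib/ganglia/rrds || exit 1
find . -name '*.rrd' -mtime +60 -print > /tmp/stale-rrds.$$
tar czf /backup/ganglia-rrds-`date +%Y%m%d`.tar.gz -T /tmp/stale-rrds.$$
xargs rm -f < /tmp/stale-rrds.$$
rm -f /tmp/stale-rrds.$$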

I'll add config examples later. There are many guides out there that show how to set up the fine details. As far as I can tell this is the simplest way to lay out multiple grids and lessen the load on an overtaxed gmetad host.

Monday, October 19, 2009

reverse versioning system

this is one of those ideas that is either retarded or genius.

so, you know how everyone versions their software off the cuff? this is 0.1, that is 2.3.4, this is 5.00048, etc. it all seems so arbitrary. package management has to attempt to deal with that in a sane way, and if you've ever tried to manage the versions of CPAN modules... just forget it. there should be a simpler way.

why not go backwards? you could keep counting up into infinity, bumping the major number whenever someone thinks their change is important enough to deserve it. or you could count down all the way to 0. it all hinges on the idea that you clearly define what your software does, the goals you want to reach and the tasks you need to complete to get there. in theory, once you have completed it all you should be done with your program and it should never need another version, because it accomplishes everything you set out to do. thus you will have "counted down" to zero.

first take your goals: these will be the major numbers and are broad ideas. your tasks are the minor numbers. you can have a third field of numbers for revisions but these would have to count up to make logical sense (if they are in fact only touched for revisions). as you complete each task and goal your version goes down by that much, so completing 5 tasks would bring your minor version down by 5.

this of course would not work for applications which keep increasing their goals. in theory you could add a billion tasks to your next goal so you can still add features, but eventually all your goals and tasks will be 0, so you need to design it for a purpose and not just keep throwing in junk with each new goal. this system may also only work to provide a "stable base": upon reaching version zero you know that the system is complete and ready for use. perhaps negative versions after that, since in theory those would only indicate new features?

package management/version control systems would all need to be modified to fit such a system, but in the end i think it would be a more sane standard than just "these numbers went up" and having to figure out for yourself what that means for that application.

Monday, October 12, 2009

why brightkite is about to die

They had it all. A nice niche in the social networking world. A [semi-]long history of service that was stable and efficient. A sizeable global user base. They were a leg up on their competitors with years of head start. And then they made the one fatal mistake of any start-up: They upgraded.

The riskiest thing you can do as a start-up (or online venture in general) is to upgrade your system. Even something as small as a "graphical revamp" can lead to droves of your users leaving for something less pretentious or easier to use. Because nobody really depends on these kinds of sites, your whole business is based on keeping your users happy and making sure the competition's site isn't more attractive.

Brightkite has been working on a revamp of their site ("BrightKite 2.0") for a while now. They finally unveiled it sometime last week. It crashed. They said they'd get some of the bugs worked out pretty quickly. About one week later the site was still down, when at the end of the week they announced it was finally back. Random outages over the weekend continued to plague them and users slowly filtered back in. Even today the service was still spotty. All this downtime caused a massive surge onto competing services such as Twitter, and though I don't have a definite reason why, i'd bet Twitter's downtime was in part due to over-stressing of their system by fleeing BK users (the 503 errors basically confirm their backend web servers couldn't handle what the proxies were throwing at them and were probably toppling over from load).

They're sticking with the new 2.0 site, of course, and still trying to work out the bugs. The system is slow. The mechanisms which made their old site usable (such as searching for a business near you) go from mildly broken to nonexistent. The new features on the site seem like a laundry list of nice-to-have features from other social networks which nobody needs.

We all know Twitter is coming out with their own geotagging system Real Soon Now(TM). Google Maps has GeoRSS and other sites are slowly developing their location-aware services. As soon as a viable alternative appears, the BKers will try it out, and as long as it doesn't crash for a week straight they'll probably jump ship entirely. The new site just needs to have an "Add your BrightKite friends!" option and the titanic 2.0 will be rendered below the surface.

It's really sad because they've basically fallen for all the common traps inherent to a site upgrade like this. First and foremost you need a beta site. There may have been one, but I never heard of it. You need a *long* period of beta, incrementally adding new users to see the trends in system load and to look for hidden, load-driven bugs. They should have known everything they'd need to scale in the future based on that beta site and had 0 bug reports.

Secondly, you need a backup. If you try to launch the new site and it ends up being down for 24 hours, you *have* to go back to the original site. There's no excuse for this one. If you can't handle switching your site from the new code base to the old one you've made some major planning errors.

I don't know where they plan to go from here. The reviews of the new site were somehow positive - I guess they had a preview copy of the site or they'd never seen the old one. But they fucked up by turning away all their customers. The biggest lesson you could take away from this is how web services have to work: release early, release often. Constant dynamic development on the live site. You simply can't afford to launch a new release after a long cycle of development. It requires too much testing and one or two things missed can sink the whole ship. Test your features one at a time and make sure your code is modular yet lightweight.

From my limited experience, the biggest obstacle to this method of design is in keeping your app reasonably scalable. The same hunk of code hacked on for years will result in some pretty heinous stuff if you don't design it right and keep a sharp eye on your commits.

A side note: this is a good lesson in backing up your files. People who sent their pics to BK and deleted them from their phones may eventually want the pics back. If BK dies they may find themselves wishing they had set up a Flickr or Picasa account to catch their photos. Facebook probably has the biggest free allocation of picture storage ("Unlimited") but they also don't cater to fotogs. As for the geotagging, well... This might not be a bad time for someone to create a lightweight geotagging app or library.

Thursday, September 17, 2009

how to build a high-traffic website

this is all common sense information based on some simple principles. i am not an engineer, but i feel i understand what makes things fast or slow in general and how to apply them in a world of software and enterprise systems.

let's start with cars. something light is going to be faster than something heavier, given the same horsepower. if it's lighter it takes less force to move it. you can compare this to computing as anything 'light' in memory use, disk operations or cpu cycles will take less time to finish 'computing.' keep everything light so you can scale 1000x at a moment's notice.

next, you need your 'frame' to be strong enough to handle the horsepower you throw at it. sheet metal and aluminum will fold up pretty quickly when the torque of 500 horses jars and twists it all at once. this applies not only to the framework of your software, but to the framework of your systems and your network which your systems operate on. they need to scale infinitely (yes, as in to infinity) and they need to be built to handle all the future needs/requirements. when you try to add a hack or reinvent your architecture later you'll find yourself in quite a tight spot, and it will be better if you can just move your existing resources around to handle additional load.

of course, if a car is exorbitantly fast you'll need lots of safety to keep all that power and speed in check. this can mean redundancy and also testing/integrity checking. first you need to consider your technology: are you using Hadoop? if so, you've got a big gaping single point of failure you need to ensure never goes down, or is always able to be replicated or hot-failed quickly. what happens when any given part of your infrastructure goes down? will it cause the whole thing to come to a screeching halt, or will your car still drive OK on just 7 cylinders? you'll also need to write tests for your software if it changes frequently, to ensure it's operating as expected. you would also do well to implement a configuration management and tripwire solution to track any change on your systems and apply the correct known-good configuration. this isn't a security requirement, this is essential to just keeping everything running.

next, you'll need lots of dials, knobs and lights to know everything is running smoothly. when you have a race car or even a humble sports car, you need to know your oil is OK, your fuel is plentiful and that your engine isn't knocking. this means lots and lots of stats and graphs and monitors. more than you could ever need. you don't want to be caught with your pants down and your car stalled out in the middle of the track because you weren't watching the oil temp level. you need to make sure your disks aren't filling up and know before they do. you need to know the *rate* at which your resources are being used so you have ample warning before things get too close to fix. you need to know when some service is down, and if at all possible get it back up automatically.

finally: the car's performance is dependent upon its mechanics and engineers. you need the people working on it to know what they're doing, and you have to make sure they're doing their job right, or you could end up with a catastrophe. you can't afford to have a developer making sql queries that take 50 seconds to complete. you also can't afford to have a sysadmin making local edits or ignoring system faults, or really anybody faltering on their layer. you *need* to police everyone and make sure everything is done *right*, with no shortcuts or bad designs. you cannot ignore the fact that diligence will always result in a more reliable system than the lack of it.

Friday, September 11, 2009

how much memory is my application really using?

In order to properly visualize how much memory is being used by an application, you must understand the basics of how memory is managed by the Virtual Memory Manager.

Memory is allocated by the kernel in units called 'pages'. Memory in userland applications is lumped into two generic categories: pages that have been allocated and pages that have been used. This distinction is based on the idea of 'overcommittal' of memory. Many applications attempt to allocate much more memory for themselves than they ever use. If they attempt to allocate themselves memory and the kernel refuses, the program usually dies (though that's up to the application to handle). In order to prevent this, the kernel allows programs to over-commit themselves to more memory than they could possibly use. The result is programs think they can use a lot of memory when really your system probably doesn't even have that much memory.

The Linux kernel supports tuning of several VM parameters, including overcommit. There are 3 basic modes of operation: heuristic, always and never. Heuristic overcommit (the default) allows programs to overcommit memory, but checks whether the allocation is wildly more than the system contains, and if so refuses it. Always overcommit will pretty much guarantee an allocation. Never overcommit will strictly only allow allocations up to the amount of swap plus a configurable percentage of real memory (RAM). The benefit of never overcommit is that applications won't simply be killed because the system ran out of free pages; instead they receive a nice friendly error when trying to allocate memory and *can* decide for themselves what to do.
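For reference, these modes are exposed through sysctl as vm.overcommit_memory (0 = heuristic, 1 = always, 2 = never):
sysctl vm.overcommit_memory            # show the current mode
sysctl -w vm.overcommit_memory=2       # strict accounting: commit limit = swap + ratio% of RAM
sysctl -w vm.overcommit_ratio=80       # the "configurable percentage" of real memory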

Now, let's say the program has allocated all it needs. The userland app now uses some of the memory. But how much is it using? How much memory is available for other programs? The dirty truth is, this is hard to figure out mostly because the kernel won't give us any easy answers. Most older kernels simply don't have the facility to report what memory is being used so we have to guesstimate. What we want to know is, given one or more processes, how much ram is actually being used versus what is allocated?

With the exception of very new kernels, all you can really tell about the used memory of a program is how much memory is in the physical RAM. This is called the Resident Set Size, or RSS. Since the 2.6.13 kernel we can use the 'smaps' /proc/ file to get a better idea of the RSS use. The RSS is split into two basic groups: shared (pages used by 2 or more processes) and private (pages used by only one process). We know that the memory that a process physically takes up in RAM combines the shared and private RSS, but this is misleading: the shared memory is used by many programs, so if we were to count this more than once it would falsely inflate the amount of used memory. The solution is to take all the programs whose memory you want to measure and count all the private pages they use, and then (optionally) count the shared pages too. A catch here is that there may be many other programs using the same shared pages, so this can only be considered a 'good idea' about the total memory your application is using (it may be an over-estimation).

With recent kernels some extra stats have been added which can help give a good idea of the amount of memory used. Pss is a value that can be found in smaps with kernels greater than 2.6.24. It contains the private RSS plus the amount of shared memory divided by the number of processes using it, which (when summed with the other processes sharing the same memory) gives you a closer idea of the amount of memory really being used - but again this is not completely accurate, due to the many programs which may each be using different shared memory. In very recent kernels the 'pagemap' file allows a program to examine all the pages allocated and get a very fine-grained look at what is allocated by what. This is also useful for determining what is in swap, which would otherwise be impossible to find out.
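If you just want the raw numbers for a single process on a new enough kernel, they can be pulled straight out of smaps by summing every mapping - shown here against the current shell, with sizes in kB:
awk '/^Private_(Clean|Dirty):/ { priv += $2 }
     /^Pss:/                   { pss  += $2 }
     END { printf "private: %d kB   pss: %d kB\n", priv, pss }' /proc/$$/smaps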

Based on all this information, one thing should be clear: without a modern kernel, you will never really know how much memory your processes are using! However, guesstimation can be very accurate and so we will try our best with what we have. I have created a perl script which can give you a rough idea of memory usage based on smaps information. You can pass it a program name and it will try to summarize the memory of all processes based on that program name. Alternatively you can pass it pid numbers and it will give stats on each pid and a summary. For example, to check the memory used by all apache processes, run 'meminfo.pl httpd'. To check the memory of all processes on the system, run 'ps ax | awk '{print $1}' | xargs meminfo.pl'.

Some guidelines for looking at memory usage:

* Ignore the free memory, buffers and cache figures. 99% of the time they will not apply to you, and they are misleading. The buffers and cache may be reclaimed at any time if the system is running out of resources, and free memory is almost meaningless - it refers to pages of physical memory not yet allocated, and realistically your RAM should be mostly allocated most of the time (for buffers, cache, etc. as well as miscellaneous kernel and userland memory).

* If you don't have Pss or Pagemap in your kernel, a rough guess of used memory can be had by either adding up all the Private RSS of every process or subtracting the Shared RSS from the RSS total for every process. This still doesn't account for kernel memory which is actually in use and other factors but it's a good start.

* Do not make the mistake of confusing swap use with 'running out of memory'. Swap is always used by the kernel, even if you have tons of free memory. The kernel tries to intelligently move inactive memory to swap to keep a balance between responsiveness and the speed of memory allocation and buffer/cache reclamation. Basically, it's better to have lots of memory you don't use sitting in swap than in physical RAM. You can tune your VM to swap less or more, but it depends on your application.

Monday, September 7, 2009

personal preferences for managing large sets of host configuration

after dealing with a rather befuddling cfengine configuration for several years i can honestly say that nobody should ever be allowed to maintain it by themselves. by the same token, a large group should never be allowed to maintain it without a very rigorous method of verifying changes, and policies that will *actually* be enforced by a manager, team lead or other designated person(s).

the problem you get when managing large host configurations with something like cfengine is the flexibility in determining configuration and how people differ in their approach of applying the configuration. say you want to configure apache on 1000 hosts, many of them having differences in the configuration. generally one method will be set down and *most* people will do it the same way. this allows for simplicity in terms of making an update to an existing config. but what about edge cases? those few, weird, new changes that don't work with the existing model? perhaps it requires such a newer or older version of the software that the configuration method changes drastically.

how can you trust your admins to make it work 'the right way' when they need to make a major change? the fact is, you can't. it's not like they're setting out to create a bad precedent. most of the time people just want to get something to work and don't have the foggiest how, but they try anyway. this results in something which is much different and slightly broken compared to your old working model, but since it's new nobody else knows how it works.

i don't have a time-tested solution to this, but i know how i'd do it if i had to set down the original model. it comes down to the same logic behind managing open source software. it's important that every contributed change works, yes? and though you trust your committers, you want to ensure no problems crop up, no drastic design changes sneak in, and in general that things are being done the right way. the only way to do that involves two policies.

1. managerial oversight. this is similar to peer review, except there are only one or two peers whose task it is to look at every single commit, understand why it was done the way it was, and fix or revert it if it violates the accepted working model. this requires a certain amount of someone's time, so a team lead makes more sense than a manager. it's not a critical role but it is an important one, and anyone who understands the model and the inner workings of your configuration management language should be capable of it.

2. strict procedures for changing configuration management. this means you take the time to enumerate every approved way your admins may modify your configuration management. typically this also includes a "catch-all" instruction to get sign-off from a manager, team lead, or other person in charge before making a change outside the scope of the original procedures. this requires a delicate touch; bark too hard at offending, misdirected changes and people will prefer not to contact you in the future. on the other hand, if you don't hold people to the original procedures and to using good judgment, you'll get called all the time.

at the end of the day it all comes down to how you lay out your config management. it needs to be simple and user-friendly while also being extensible and flexible. you want it to be able to grow to cover uncommon uses without becoming over-designed or clunky. in my opinion, the best way to go about this is to break everything up into sections, just like you would a filesystem full of irregular material.

the parent directories should be vague/general and become more specific as you go down, while always leaving room to group similar configuration into those sub-directories. a depth of 4 or 5 is a good target. don't worry too much about being specific enough; the more general you are at each step, the easier it is to expand the configuration there in the future, and the easier it is for users to find and apply configuration where it makes sense.
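purely as an illustration (these names are made up, not from any tree i actually manage), a general-to-specific layout might start out like this:

# hypothetical inputs tree: general at the top, specific at the bottom
mkdir -p inputs/services/web/apache/vhosts
mkdir -p inputs/services/web/apache/modules
mkdir -p inputs/services/mail/postfix/relayhosts
mkdir -p inputs/os/linux/rhel5/sysctl

anyone hunting for "where do i put the new vhost" can guess their way down a tree like that without reading a page of documentation first, which is the whole point.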

you also need to consider: "how practical is it to manage this configuration on this host?" any given change on any given host should take no longer than 5 minutes to figure out how to express in your configuration management system. that way anyone from the NOC to the system engineers or architects can modify the system in real time when it counts. documentation is no substitute for intuitive user-friendliness; documentation should explain why something is the way it is, not how to do it or where to find it.

note puppet's style guide and the reasons it gives for its formatting: "These guidelines were developed at Stanford University, where complexity drove the need for a highly formalized and structured stylistic practice which is strictly adhered to by the dozen Unix admins"

*update* i don't think i even touched on it, but in cfengine, modules should probably be leveraged heavily rather than sparingly. the more code you push into your modules, the less you need in your inputs, and thus the easier it is to manage the hulking beast of lengthy input scripts.

Tuesday, September 1, 2009

e-mail compared to the postal service

I think it's interesting to compare the way e-mail works to the way the Postal Service works. Let's look at it from the perspective of the Postal Service handling our e-mail.

First off, everyone pays to send and receive e-mail. Even if it's indirect (an advertisement in your mail or on your mailbox), someone's paying for your mail. Instead of buying postage for each item you send, you pay a lump sum every month - around $2 per month, let's say. Each item you send costs the same under that flat rate - a postcard and a package are equal under your monthly fee. However, you can only send packages up to 10lbs in weight, and you can only store up to 100lbs (not to scale) of mail in your mailbox at any time. If you want to send more or keep more mail, you'll have to pay more per month.

When you mail a letter [e-mail], it goes to your local post office. But what then? In the world of e-mail, every post office is basically a separate fiefdom and subject to its own laws. Your post office will try to deliver your mail to where it needs to go, but it may have to go through several other post offices along the way, and all of them have different rules about whether it is allowed to be delivered. Most of it comes down to whether or not they determine your message is unsolicited bulk mail [spam]. If they think your mail is too suspect, or you're sending too many pieces of mail at once, or you sent it from a post office that may send a lot of spam-like letters, or a number of other reasons: they will not accept your mail. You or your post office will have to convince the other post offices that indeed your letter is a real and honest letter and should be accepted.

Companies that advertise via mail in this way are subject to a common problem: post offices refusing their mail. Sometimes it's genuine; a company may be sending what amounts to spam, and any given post office may not want to pass that on since it may be filling up their queues with excessive mail. A lot of the time the mail is genuine but the post office still deemed it to be spam. Either they're just sending too much and seem suspicious, or some of the recipients don't live at those addresses anymore, who knows. Maybe they just don't have the right official papers [SPF records] and the post office wants them to prove they're real. Everybody has their own criteria and they can do what they want in terms of sending your mail through or not.
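To make the bracketed SPF bit concrete: the "official papers" are just a DNS TXT record that the sending domain publishes, listing which machines are allowed to send its mail. A purely illustrative example with a placeholder domain and address:

# what a receiving "post office" can check before trusting mail claiming to be from example.com
dig +short TXT example.com
# "v=spf1 mx ip4:192.0.2.10 -all"   (only the MX hosts and that one IP may send for the domain)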

It's not their fault. The poor post offices are under siege by tons and tons of mail which most of their users don't want. If the post offices weren't so selective, you'd have full mail bags dumped at your mailbox every day and it'd be nearly impossible for you to sort out an important document from the IRS or something like that. They need to verify that the mail coming in is vaguely genuine or it'll just clog up the whole system.

This isn't an exact representation of how the system works, but it's close. The sending and receiving of mail is basically a game of can-I-send-it, should-I-accept-it, with lots of negotiating and back-and-forth just to get some letters through. But how do we solve it? Lots of suggestions have been put forth, even to the point of requiring additional fees to receive or send mail. But I think the heart of the problem lies in the unregulated nature of what is now an essential, core method of communication for people all over the world.

Postal mail is regulated, and though that body is woefully out of date and badly managed, it does deliver the mail reliably and in a fairly timely manner. The system also allows for 3rd parties who can guarantee delivery of a message. Our e-mail systems should take this into account: we should have a regulated method of reliably delivering e-mail, while still allowing 3rd parties that can ensure the delivery of a message. A similar system would require each internet user's mailbox to be independent of any particular provider, so that anyone anywhere in the world could come and drop mail in it, without the recipient paying for the privilege of receiving said mail and without worrying about whether their post office will allow them to receive it [spam blocking]. E-mail should become the next necessary component of communication amongst humans, but to get there you need to break down the barriers of cost and technicality.

I think the idea of requiring a small fee to send some mail is time-tested and effective. Would you spend $0.10 to send a letter to an editor telling them how they're a nazi because their news story had a certain slant? And would spam companies be able to deliver such huge mountains of unsolicited bulk mail if each one cost them $0.10? It wouldn't fix the problem because you'd still have people who have it in their budget to spend a gross amount of money on targeted advertising/marketing, but they couldn't afford to simply mail every person in the world for something they're probably not going to buy. This could also help fund a central regulated organized system of vetting mail and ensuring it gets delivered, instead of having all different post offices decide if they'll deliver you your mail or not.

Friday, August 14, 2009

fixing the problem with encrypted root boxes

previously i blogged about my laptop with an encrypted root drive and the possibility of someone installing a malicious program to collect login credentials when booting the system. this guide might provide a solution to that problem: http://www.scribd.com/doc/3499565/IndustrialStrength-Linux-Lockdown-Part-2-Executing-Only-Signed-Binaries

i only skimmed it, but basically it shows how combining application signing with a special kernel module lets you enforce that only properly-signed binaries execute. tie that into your /boot and initrd and it may be much more difficult to exploit the unencrypted boot partition.

Saturday, August 1, 2009

how secure can you get?

The recent theft of my EeePC netbook has brought on the usual round of 20/20 hindsight for this scenario. I should have backed up my files; I kept putting that off because of the cost of doing it right, compounded by the likely future need for more storage and thus more backup. I also should have had some hidden application to run and report the laptop's location when it booted up. But the files themselves are secure - the Linux OS's root drive and swap partition were both encrypted by default.

The backup and lojack pieces are easy to accomplish. Burn or copy files every once in a while to some backup medium and put it in a safe or a storage center. Write a script that grabs my GeoIP information from some site and tweets it or uploads it to a pastebin. Even the encryption is easy to use - follow the simple guide provided by Slackware on their install CD and you can have a fully-functional encrypted root OS a few minutes after a full install. But what they don't discuss is the possibility of tampering with the system.
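The lojack script really is only a few lines. Here's a minimal sketch of the idea, run from an init script or a cron @reboot entry - the lookup and upload URLs are placeholders, not services I actually use:

#!/bin/sh
# phone home on boot: report hostname, time and public IP somewhere I control
IP=$(curl -s http://whatismyip.example.com)                               # placeholder what's-my-ip service
NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)
curl -s -d "text=$(hostname) $NOW $IP" http://paste.example.com/api/new   # placeholder pastebin endpoint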

The whole thing works by allowing an unencrypted program in a boot partition to load up a kernel and some drivers and prompt me for a password. It doesn't take a computer forensics expert to figure out that such a program could be replaced with a trojan with keystroke-logging capabilities, or even more complicated & nefarious tricks. If someone has access to your machine for just a few minutes - even powered off - they can potentially infect your boot partition with a malicious application and have your encryption password revealed to them.

So what's the safeguard? As far as I can tell there is little real assurance of trust. You could remove the boot partition and carry a thumb-drive with the bootloader, kernel and initrd. However, even this could be thwarted by BIOS or CMOS level malware that lies in wait. A BIOS password can provide some protection, but even then I can imagine further attacks. An attacker could save the state of the CMOS, reset the battery, install their malware, then restore most of the CMOS along with the old password, making it very difficult to tell if an attack took place.
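One cheap partial measure - a sketch only, and it does nothing against firmware-level tricks or a logger that has already captured the passphrase - is to keep checksums of /boot inside the encrypted root and compare them after each boot:

# the manifest lives on the encrypted root, so an attacker can't quietly update it to match
cd /boot && sha256sum -c /root/boot.sha256 || echo "WARNING: /boot changed since last check"
# regenerate after a legitimate kernel or bootloader update:
# cd /boot && find . -type f -exec sha256sum {} + > /root/boot.sha256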

In the end it's obvious you need multiple layers of security to even begin to really feel 'safe'. But realistically, the people who took my laptop will probably be thwarted by the Linux logo alone. Oh well; time to go buy some backup media...

Monday, July 27, 2009

how to sound stupid to tech people

Jolicloud.
Let's review it using a couple choice sentences.

"At Jolicloud we believe a movement has started. A movement that will change the computer industry forever"

Well, dang, i'm sure glad you're here to tell us about it! Here I was just sitting around Twatting wondering when the next big thing in the computer industry will come along. Thank god you told us on your "/idea" page.

"Jolicloud is an Internet operating system."

The Internet is a general term for the millions upon millions of interconnected networks and the infrastructure that allows them to function seamlessly. An Operating System is software designed to let people run programs more easily. How the fuck do these things have anything in common?

"It combines the two driving forces of the modern computing industry: the open source and the open web."

Nothing confirms that you're bullshitting like stringing simple buzzwords together in a nonsensical way.

"Jolicloud transforms your netbook into a sophisticated web device that taps into the cloud"

There. RIGHT THERE. See that last word? That's where you killed it. A web device I can deal with. Technically something like a Nokia 770 Internet Tablet is a 'web device', but a netbook *can* be a 'web device' so i'll let that slide. But that's assuming I was stupid enough to buy the idea that your backend of scaling virtual allocations of disk, bandwidth or CPU power is going to be so amazing that my netbook's traditional OS becomes obsolete...

"We feel privileged to witness this rebirth of the computer culture"

WOW! Finally people are using computers for more than just Twitter! It's a dream that those of us actively engaged in computer culture have had for a million years.

"We come from the web"

I come from a land down under.

"With our API, developers will have the ability to let their website communicate with the computer directly with no need to code specific native applications."

Kickass! Now we can just own machines directly through your shitty API's 0-days instead of writing shellcode or exploiting some setuid binary on the local system! Thanks Web Operating System!

"Netbooks are very new. They are still bulky"

They come in sizes as small as 7 inches up to 12 inches and weigh about 2 pounds. With a hard drive. What the fuck do you want? Any lighter and they'll fly away in a brisk wind. I do not want to have to buy a fucking paperweight for my laptop.

"No one has yet entirely switched his or her life online"

This isn't fucking Ghost In The Shell. Even in Surrogates or The Matrix they don't completely "switch online". Protip: don't talk in terms of actions that are either impossible or nonsensical.


So my first question is: why do I need this bullshit to replace my operating system? If it's really all run off the internet, why can't I just download your client and run it inside my normal operating system? Assuming I turn off most of the useless bells and whistles of a common desktop environment like KDE or Gnome, it's really not using many resources at all.

Are you telling me you've discovered a magical way to get Flash apps to stop sucking up 99% of my CPU, or to get misbehaving JavaScript to stop locking up the browser, or to get Firefox to stop eating half my RAM over a handful of tabs? If so, BY ALL MEANS gimme. But if this is just some accumulation of wireless drivers, 3d acceleration and a pretty GUI wrapped around a user interface to web pages: keep your operating system. Mine works fin^H^H^Hacceptably. And I can do more with it than your 'cloud' will allow.

Friday, July 24, 2009

hackers: a class by themselves

I've heard interviews and stories from the "old hackers," the originals - Richard Stallman, Eric S. Raymond, the MIT Artificial Intelligence Lab crowd. They always seemed a bit self-important, talking about their experiences in grandiose ways as if any of it mattered. I always thought it seemed like they didn't really do anything productive. The whole "hacker" mythos seems to come from the punkish, prankish nature of the MIT students combined with knowledge nobody else had. Now I think I realize why hackers have always been unique.

If you've ever met a young modern-day self-described hacker, they're usually of a certain type. Big egos, big labels, big brains. Dark clothing. Not very handsome. Not very friendly. Lots of them have grown up and become more adult and less cyber-industrial-goth. A lot of them that get married will probably start getting those "married friends" or friends with kids. But there is a pervasive "look" that seems to permeate the underground hacker scene (or did in the past anyway). The original hackers might have been a lot less "dark" but certainly they must have had a propensity to keep to themselves. But there's something I see that the original hackers may have in common with many modern ones: wealth.

To get into MIT you have to be very smart, but you also need a little money. I don't think many of the people who go to such a prestigious college are broke or truly impoverished. A lot of modern-day hackers seem to be in a similar situation. Of course there are plenty who have very little, and you'll see them all over the place. But wouldn't it make sense that intelligent, knowledgeable people would have needed at least a decent education, or a little money, to afford the books and technology to teach themselves their skills?

I think that a good number of hackers today aren't as "elite" as they make themselves out to be. Perhaps they know a thing or two about technology. But i'm willing to bet you a good number of them wouldn't know their ass from assembler. That isn't to say a well-off hacker couldn't be equally dumb and still be in the community - but I think the economic class of the individual may play a significant role in how much knowledge they could apply in a variety of circumstances. Hackers are sort of a weird breed because they don't have a specific job title, so they have to know everything about everything or they aren't a "good hacker" (whatever that means). There are certainly lots of specific security titles but a lot of being a hacker has nothing to do with security.

I know I would be working at McDonalds or something if my parents hadn't bought our family a computer with internet access. If I hadn't gotten my parents to force my brother to show me how to create web pages (even geocities can be a mysterious technical entity to a newb), it might have taken me years longer to develop a curiosity about technology. Though I was curious about hardware, I was clearly a "software person". Still, with just a little direction and example at that early age I could have done almost anything. But all of this requires money, and especially back then it wasn't a small amount of money. A typical desktop cost around $2,000 and I have no idea what the internet access cost. The internet was also smaller, so finding documentation and other resources for learning was difficult.

Except for some gracious metropolitan areas and retail outlets who front the bill, you still have to pay to use the internet. The only free computer can be found in the bowels of that endangered species known as the public library. But things are definitely much different now. It's probably an order of magnitude easier to learn something on the internet than it used to be. You no longer have to pay expensive professors' salaries or purchase rare books. So the classes seem to be evening up. Maybe in 20 years we'll have free access to information for even the most impoverished Americans. For now, it still seems like poor people are stupid and rich people are slightly less stupid. And hackers who have the free time and resources to become well educated will be more of a hacker than the unemployed struggling artist who just wants to get rich doing something they love.

Monday, July 20, 2009

mobile development: the ugly truth

So you want to build an application for a mobile device (read: cellphone). Great! There are lots of powerful platforms out there you can start with, and you can port your app to any other device you want with a minimum of work. Almost everyone gives their SDK away for free and provides lots of forums, documentation and other sources of information to help you write your app. So you pick a platform, develop your code, test it in an emulator and everything's ready to go. Now just open up your wallet, bend over and let the fun begin...

Yes, the sad truth is that for many modern devices you are *forced* to either jump through a bunch of hoops to "sign" your application for one specific device (meaning the 1 cell phone that you own) or pay a large sum in developer fees for the right to a certificate to sign your applications with. Otherwise your application will never install, or it will install and not run, or it will run with very limited "credentials" - it won't do everything you tell it to do.

For Symbian devices, the two routes developers have had to get their apps signed are Symbian Signed and Java Verified. These services let you submit your application for review, after which you're shipped back your binary signed with their magic certificates for the IMEI of your device. Lovely. If I want to give my application to others, I need to either self-sign it and use the few paltry credentials they let me have (internet access, possibly file access) or pay hundreds of dollars for a 1-year signed cert.
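For reference, self-signing is just two commands from the SDK toolchain. This is from memory, so treat the exact flags as approximate and check the SDK docs; the names and password here are obviously made up:

# generate a self-signed cert, then sign the package with it
makekeys -cert -password hunter2 -len 2048 -dname "CN=Me OU=Hobbyist CO=US" mykey.key mycert.cer
signsis MyApp.sis MyApp_signed.sisx mycert.cer mykey.key hunter2

Signed like that it will install, but it only ever gets the basic user-grantable credentials; anything more and you're back to paying.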

This has culminated in the creation of Symbian Horizon, a new middle-man created by Symbian to distribute your application to lots of other "app stores" automatically and "reduce developer cost". As far as I can see, i'm still paying about the same as before to develop the damn thing; publishing it is a completely separate issue. I'm more than happy to provide a download URL on my own site and let people install the app themselves - it's really not difficult *if your application is signed*. I don't want to make money off people. I don't create applications for pay. I do it for myself, for fun, for freedom, and to share my work with the general public. I don't have to pay costly fees to develop a desktop application or a web application. Why am I forced to pay to develop for a device i've already paid for - and keep paying for, for the luxurious benefit of being able to use the internet with it most of the time?

I can't believe the wool has been pulled over my eyes for so long with respect to these development practices. This is just another result of the mindset of organizations like the RIAA and MPAA and their long-running battle to keep you from using the things you purchase the way you choose. It makes me very sad to see this kind of oppression so well hidden inside a system that encourages the breakdown of community and the closing-off and restricting of creative content for the sheer purpose of capitalizing on the ignorant masses.

Tuesday, May 12, 2009

vm fun

apparently you can't kickstart a recent red hat/centos install without at least 512MB allocated to the vm, or you get nonsensical errors that don't indicate the actual problem.

also, the world of virtualization is very confusing. all these different types of vm's with different implementations and different limitations, and no real simple guide on how they relate to one another or when you need one versus another. esxi isn't really free since you have to pay for a tool to configure it; their only linux support tool can't actually change any settings - you have to pay for esx to get that. (sure, you can unlock the hidden ssh daemon and import an existing vm, but then you have to create all your vm's on your own machine? that's kinda retarded)

xen looks like the best bet for a free cluster of VM's, with openvz as the runner-up. i'm still trying to figure out how they work at a low level, but all in all it seems you have to run a custom patched kernel, and getting the exact kernel version or patches you want may be quite difficult if not impossible.

Monday, May 4, 2009

thanks for all the coffee

Starbucks is like a number of retail outlets with an antiquated technological business model: charge people for internet access. Plenty of people don't find that strange; to them, the convenience of getting on the web in the comfort of their favorite java dealer's boutique is worth a small hourly price. But for the rest of the world, who just want a brief look at their mail or the news before heading out the door, belly full of bagels and iced latte, the price isn't worth the convenience. That's why many stores such as Dunkin Donuts and even Dennys have incorporated free wireless internet into their business model, driving revenue by keeping customers planted in their seats and hoping they get hungry.

For the old model there is a price, and it is usually pretty standard in how it is addressed. You go to a store, you find wireless, you go to a website - and you are redirected to a portal demanding $10 an hour. Once receipt of your payment is confirmed, access controls in a proxy allow you to once again peruse the internet "fairly" unrestricted. This model works best in hotels or other establishments where it's preferable to pay a premium to get some kind of comfortable access to data services.

Being a hacker, I dislike the idea of paying to attain information. I don't think all information should be free per se - some information is of limited scope and a sensitive nature, and i'd rather not have everyone seeing all the "information" I access from my personal computer at home, for example. But in general, I don't think it's right to charge for something the world has come to accept as free just because it's a convenience some people find valuable enough to pay for. It's similar to how software is sold - it may cost the same to develop a piece of software that processes e-commerce payments as it does to develop an image-manipulation program, yet one is billed as far more expensive than the other merely because people are willing to pay that much more to attain it. It's simply not fair to hold someone economically hostage because you have something they want or need, even though it really cost you nothing to develop it. This is how the rich become the super-rich in our excellent system of capitalism - exploiting people's weaknesses and passing something off as more expensive than it really is.

So, keeping this in mind, let me explain how you can get free internet access at Starbucks. There are several methods. The first two require an outside server running a specific program, which you often must pay for - unless someone has granted you a high level of access (and subsequently trusts you very much), or someone else sets up the software for you.

The first piece of software is called iodine - this also requires name server entries to be set up for the domain you will be connecting to (and yes, you must own a domain). The theory is this: Most
- to be continued -

Friday, April 3, 2009

going retro: old firefox fixes new issues

Linux kinda sucks. There are many reasons (some you can read at The Linux Haters Blog and Why Linux Sucks), but for me it's always been firefox. Everyone claims it's just the best browser ever, but I think it's possibly worse than Internet Exploder.

First there are the memory issues. Since about version 2.0, firefox has been incredibly bloated and slow, introducing god knows what under the covers to soak up as much RAM as possible until it eventually stops working. It was so bad for a while that they put out official notices about when new versions would arrive to fix this persistent issue. To this day you cannot browse around a few websites without 500 megs worth of shit clogging up your system.

Then there's the cpu. Try viewing 3 or more myspace band profiles in a recent firefox on a 3GHz Pentium 4 without it slowing down your whole system. (My load average soars up to 4 or 5 at least) Opera with the same pages loaded is much more manageable, and the same goes for most websites with flash audio and video. Is Opera doing some mystical incantations to make flash faster? I doubt it.

Today was the final straw. My system was left completely immobile by firefox 3. I opened a webpage and the VM went berserk, sucking up all available memory and swapping until the system was completely unresponsive. I rebooted instead of waiting around for the OOM killer to realize the system was fucked and kill the offending process.

To (hopefully) prevent this from happening again I have done the only thing I could think of to still use a halfway-decent browser while not blowing up my box: install an old version. I downloaded and packaged the 0.8 release of firefox (really still phoenix at that point) *without gtk2 or xft support* and the 1.0.8 release of firefox. I also have the latest version (3.0.8) on stand-by if I reeeally need a webpage which is Web 2.0-only, but so far only facebook has required it, and facebook sucks. /usr/bin/phoenix will point to 0.8 and /usr/bin/firefox-old/firefox-1.0.8 will point to 1.0.8, while /usr/bin/firefox goes to 3.0.8.

This should work for most cases as firefox 0.8 still uses the $HOME/.phoenix directory for user settings while 1.0.8 uses $HOME/.mozilla. I should really recompile 1.0.8 with a new user settings directory so I can use firefox 3.0.8 without worrying about clobbering old files in .mozilla, but i'm going to wait it out and see what happens. Alternatively I could run them in a chroot jail or just use the newest release of Seamonkey or the debian-inspired really-truly-free version of firefox.
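As a stopgap that avoids recompiling, I could also just point 1.0.8 at a throwaway home directory from a wrapper script. A rough sketch (the fake-home path is arbitrary; the binary path is the symlink mentioned above):

#!/bin/sh
# run firefox 1.0.8 under a throwaway home dir so it can't clobber ~/.mozilla
FAKEHOME="$HOME/.firefox-1.0.8-home"
mkdir -p "$FAKEHOME"
HOME="$FAKEHOME"
export HOME
exec /usr/bin/firefox-old/firefox-1.0.8 "$@"

The downside is that downloads and the like default into the fake home, but for a fallback browser that's livable.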

The end result is fantastic. Firefox 0.8 page rendering times scream and the user experience is incredibly snappy and responsive (as a gtk1 app should be). Using firefox 1.0.8 is much the same as firefox 3, but it is noticeably faster at page rendering and much snappier in general use. "Modern" pages such as yahoo! mail work just fine, with only one or two minor page errors (which really could have been avoided with a better design choice).

My browser probably resembles swiss cheese from a security standpoint, but i'm willing to take the risk of someone exploiting my desktop to steal my incredibly sensitive camera photos that didn't make it onto my facebook. Lord knows the hacker underground has been after my flickr account's session cookies for years.

(psst... don't try looking for archived releases on Mozilla's main page. you'll have to dig through the ftp to find them: ftp://ftp.mozilla.org/pub/firefox/releases/)

Wednesday, April 1, 2009

some tricks to stay hidden

[1] If you're in a shell typing commands which you don't want preserved in your .bash_history, just run "kill -9 $$" when you're done with the shell. $$ is the PID of the shell itself, so the shell is killed before it gets a chance to write anything to the .bash_history file.

[2] If you want to log on to a box and use it interactively but don't feel like showing up in the wtmp, use "ssh -v -T user@host" to log in (the -T skips allocating a tty, which is what normally triggers the wtmp entry). The verbose output will tell you once you've successfully logged in. Now run 'last | head -n 5'. Your current login should not show up in the list. (Your login will probably still be logged by sshd in a separate file, but it's less noticeable.)

[3] Encrypt your shell scripts: http://blogs.koolwal.net/2009/01/20/howto-encrypting-a-shell-script-on-a-linux-or-unix-based-system/
[3.1] Also use my script: http://themes.freshmeat.net/projects/encsh

[4] Encrypt all the files in a directory with cryptdir(1).