[{"content":"Medium has been going downhill fast. I didn\u0026rsquo;t notice for a while, since I have been heads down coding, but the other day when I needed to share a post with someone, it didn\u0026rsquo;t even load my blog.\nI have moved my whole blog over to GitHub Pages, using Hugo, medium-to-hugo and the PaperMod theme.\nWriting posts and sharing them is very motivating for me and finally Medium broke that enough for me to move my decade-old blog for the third time.\nAnother enshittification of a service that was good.\n","permalink":"https://maori.geek.nz/posts/my-new-site/","summary":"\u003cp\u003eMedium has been going downhill fast. I didn\u0026rsquo;t notice for a while, since I have been heads down coding, but the other day when I needed to share a post with someone, it didn\u0026rsquo;t even load my blog.\u003c/p\u003e\n\u003cp\u003eI have moved my whole blog over to GitHub Pages, using Hugo, medium-to-hugo and the PaperMod theme.\u003c/p\u003e\n\u003cp\u003eWriting posts and sharing them is very motivating for me and finally Medium broke that enough for me to move my decade-old blog for the third time.\u003c/p\u003e","title":"My New Site"},{"content":"I really like Zig. I have never wanted to write manually memory-managed code before. I don’t like C because it seems opaque and dangerous and I don’t like Rust for purely aesthetic reasons; but Zig is shiny and new and pretty and I can understand it intuitively.\nI wanted to write a small HTTP service in Zig (0.15.2) that used SQLite as a data store and could scale up. To do this I used the [http.zig](https://github.com/karlseguin/http.zig) and [zig-sqlite](https://github.com/vrischmann/zig-sqlite) libraries which are pretty stable. 
I spent a couple weeks on this project, but kind of hit a few walls all at once which has led me to the decision to abandon the spike.\nI am just going to record for my own sake the problems that I had, so when I come back I can pick Zig up where I left off.\nThings that are awesome It is so fast, soooo fast. In testing it is easily 2x faster than the exact same Go service deployed on fly.io. Allocators are very understandable. Also finding memory leaks is pretty easy since you can report unfreed memory at the end of a program. Using arena allocators during an HTTP call also removes a lot of stress. The concurrency with threads is not too complicated, with mutexes, Signals and Events being pretty straightforward. comptime is super powerful because it is easy to use. Generating code that can pick up errors like std.debug.print(\u0026quot;{s}\u0026quot;, .{}) at compile time removes runtime decisions and points of failure. I used comptime to generate an HTTP handler for static files, generate a struct for DB migrations, and generate prepared SQL statements. Zig’s learning curve is not too steep. There were a few days I got stuck on a problem, but I was usually able to get through it with an increased understanding (instead of increased dislike) of Zig. Problems: Not having a string type was less of an issue, but []const u8, []u8, [_]u8, [:0]const u8 all being different really tripped me up a few times. Lack of packages was fine for most things. I needed a SQLite pool, so I built it; needed a .env parser, built it; needed a rate limiter, built it… A few things though were a bit more complicated, like an AWS S3 client. There are a few packages of unknown quality, and AWS does have a C S3 client. Finding something reliable without a giant amount of work might be impossible. When http.zig is receiving a large volume of traffic, calling server.stop causes a segfault. I think this is because it frees a request that is currently being processed by a handler. 
This might be my fault, if I am holding it wrong, but reliably shutting down was important for what I was building. zig build run swallows SIGTERM so I have to use zig build \u0026amp;\u0026amp; ./zig-out/bin/cmd. If you don’t, the zig build process will keep running when the application closes and you have to kill it manually. With the current 0.16.0 rewrite of the IO layer, I think they removed some functionality like gzip. This made it difficult to implement compression for static HTTP handlers to save on bandwidth costs. Simple mistakes caused a lot of slowness, especially in the DB pool layer. Some thread mutex was too broad so one endpoint became very slow. Zig without my code is fast, but I can lose all benefit with small mistakes. Finding where memory is used is difficult. The process uses about 2x the amount of memory I think it should. It is not a memory leak, but maybe I am allocating too many things just to be safe. It’s not that bad, but I would like to be able to inspect the allocator during runtime and find where my memory is actually being used. AI/LLM/Smart Autocomplete is a great tool to learn a programming language. This is limited with Zig and its fast moving standard library and API. docker build breaks for zig on my mac and I can’t fix it. When searching for the problem a common response is “you don’t need docker with zig”, but if you are deploying to fly.io (or a lot of other places) you do need docker. BTW it deploys fine, so I know the docker file is good.\nA lot of these problems may be my lack of understanding, or doing something wrong. I am sure a better programmer or someone who is more familiar with Zig could solve some of these issues. But I am just going to wait a bit before writing an HTTP service in Zig. That being said, I still love Zig and want to use it for a project, I just don’t know what yet.\n","permalink":"https://maori.geek.nz/posts/2026/i-tried-to-write-a-http-service-in-zig-and-failed/","summary":"\u003cp\u003eI really like Zig. 
I have never wanted to write manually memory-managed code before. I don’t like C because it seems opaque and dangerous and I don’t like Rust for purely aesthetic reasons; but Zig is shiny and new and pretty and I can understand it intuitively.\u003c/p\u003e\n\u003cp\u003eI wanted to write a small HTTP service in Zig (0.15.2) that used SQLite as a data store and could scale up. To do this I used the \u003ccode\u003e[http.zig](https://github.com/karlseguin/http.zig)\u003c/code\u003e and \u003ccode\u003e[zig-sqlite](https://github.com/vrischmann/zig-sqlite)\u003c/code\u003e libraries which are pretty stable. I spent a couple weeks on this project, but kind of hit a few walls all at once which has led me to the decision to abandon the spike.\u003c/p\u003e","title":"I Tried to Write a HTTP Service in Zig and Failed"},{"content":"Managing my 5yo son’s Type 1 Diabetes (T1D) at night is a lot like being on call as a software engineer. Both are absolute shite. Waking up to alarms at all hours, trying to fix something to only make it worse, not understanding why anything is happening, alarm fatigue, lost sleep, burnout, and the constant stress that you will sleep through an emergency.\nIt is a bit guff to compare the two. Though, there are some things that do apply to both, like good tools can reduce the burden.\nSo I built an app called “Carer” to simplify decision making and reduce alarm fatigue. This post is about building Carer and about the thinking that went into it. This post is not healthcare advice.\nWho is Carer for? Carer is an app that displays information and triggers alarms based on data from a Nightscout instance. Nightscout is an open-source, self-hosted individual repository of T1D data that has wide adoption in the DIY diabetes community for monitoring and remote care.\nCarer can also send information to Nightscout, like consumed carbs or temporary targets. 
This information is synced down to the artificial pancreas system (APS), like AndroidAPS, to make decisions about insulin delivery.\nIf you use Nightscout, and especially AndroidAPS, Carer is for you. Carer is available on iOS, macOS and Android (15+). It is free forever with no ads.\nThe Basic T1D Management Loop To understand why Carer exists, you must know the basics of being on call for T1D. The main goal is keeping Blood Glucose Level (BGL) between 4–10 mmol/L by:\nGiving Carbs (e.g. gummy bears) to make BGL go up in 20–30 mins. Giving Insulin to make BGL go down in 20–60 mins. The time it takes for insulin and carbs to take effect is imprecise and can affect BGL for many hours.\nUsing these two levers to achieve stable blood sugars results in the T1D management loop:\n1. Look at Blood Glucose Level (BGL) every 5 minutes, which is the frequency of BGL readings from a continuous glucose monitor (CGM). 2. If Low (\u0026lt; 4) or going low: Give carbs. 3. If High (\u0026gt; 10) or going high: Give insulin. 4. Wait 20–60 mins for carbs or insulin to take effect. 5. GOTO 1. This loop is a balancing act between insulin and carbs 24/7; sleep is not a respite. This is a pretty basic description of a complicated process, but it gives a good starting point for designing an app.\nBasic App Requirements To implement a tool to help with the above loop, it would need to show:\nBGL so that it is easily readable. BGL trend up/down both as a diff between the last two readings and as a longer trend, to help predict highs or lows. History of carbs and insulin to calm nerves when waiting and to stop any double dosing. Information that affects decision making like insulin and carbs on board. Time since last reading so you know when the next reading will be. Making this information available by turning the phone into an always-on display will make checking BGL every 5 minutes much less disruptive. 
At night this display should dim as low as possible while still being legible as studies [1][2] show that light at night may be detrimental to health.\nThe app should alarm so that you don’t have to focus on the numbers during the day and so that you wake up when asleep:\nAlarm the user when high, low, urgent high, urgent low, fast drop and fast rise. Alarm when no BGL data have been received for a time. This is a catch-all alarm indicating a lot of problems. An alarm’s sound should be an increasingly loud noise and there should be a bright flashing screen so as not to wake the whole house. Ability to snooze alarms when waiting for carbs or insulin to work. Predictions of BGL can be used to implement a predicted low alarm and smart-snooze when predicted to soon be in range. Customisations for all alarms and alerts should be available, as each person manages their T1D differently. Adding carbs and insulin is tricky. In the future we may be able to raise blood glucose remotely using a dual hormone pump, but these are unavailable at the moment. Giving insulin remotely is possible, but very dangerous, so instead you will have to use other methods for this. What the app can do is adjust the APS algorithm by:\nWriting down given carbs, which will increase insulin delivery. Changing the target BGL, which will increase or decrease insulin based on target and current BGL. Swapping to a preset profile, which changes the settings that the APS algorithm uses to calculate dosage. Carer Screens for Display, Alarms, and Care Experience to Features When I started using Carer I was always on the lookout for how to make it better. Here are a few examples of features that came from something going wrong:\nOne night our son unplugged his phone’s charging cord, so his phone slowly ran out of battery overnight. Eventually it shut down, causing a no data alarm after 20 minutes. Carer shows if his phone isn\u0026rsquo;t charging, and alarms if it gets to a low battery. 
On a few occasions I have woken up to an alarm that I had zero recollection of snoozing. Apparently, I had learnt to snooze an alarm in my sleep. Carer forces the snooze button to be a long press, making it just annoying enough to wake up a little bit. I woke up to a low alarm, gave carbs then snoozed the alarm. The low turned into an urgent low and my son needed more carbs, but the alarms were snoozed and I was asleep. Carer will wake up from a snooze if a low turns into an urgent low. Once I gave too much insulin, then snoozed for too long. My son’s BGL dropped until he was low 30 minutes later. It was snoozed so it didn’t alarm for the low. Carer will turn snooze off if not alarming for 20 mins. Every 3 days Sam needs his Omnipod insulin pump replaced. One day we forgot and only realised after he had fallen asleep. Carer shows upcoming changes for pumps (every 3 days) and CGMs (every 10 days). Fell asleep with notifications silenced: Show when notifications are turned off. Fell asleep with volume down: Alarms override system volume. Misclicked buttons: Better UI and UX. Phone was locked but still want to be notified of alarms: Use a background task to notify of alarms (only runs at most every 15 mins, so is not reliable). As I added each one of these features, I was able to relax a bit more as something that was stressful and that I had to have in the back of my mind all the time was resolved.\nTechnical Details Here are a few of the technical details of Carer’s implementation:\nI implemented Carer in Flutter, because I wanted a cross platform app and did not feel like going back down into the JavaScript depths again with React Native. There is virtually no difference in the code across the various platforms. Flutter is great. Carer’s (non-private) data is stored using SQFLite. Using SQL like a web app made it feel like home, especially with testing. I use the Signals library for state management. 
I found it much more intuitive to understand than other competing libraries like Bloc. I started building Carer in Jan-2024. Started using it in June-2024 (with backup phone running NightGuard). Started using it exclusively November-2024. Beta testers in April-2025. Published in July-2025. It has taken a while. I still use Dexcom’s Follow app for push notifications, and NightGuard for widgets and watch integrations. I hope in future versions to implement these features within Carer. Carer does not have its own backend server; it talks directly to individual Nightscout instances. This makes development harder because Nightscout has pretty big differences in formats and data types based on integrations and versions. Having no server to manage does mean I could write Carer without worrying about hosting and costs. What is Next? There are a few features I would like to add to Carer:\nSupport for more non-audio alarms. Sugar Pixel supports a vibrating puck (e.g. this) like those used as hearing impaired alarms. Having Carer support this would make alarms much less intrusive. Push notifications. At the moment, alarms only work when the app is open, and local notifications are not reliable. Push notifications would require a server with user data, which is a lot of work to build securely. Open Source? At the moment I see little benefit in open sourcing this, only downsides. However, if you want to work on this with me I am more than happy to share the source code directly. Better prediction algorithm. I have already written about the limitations of Carer’s prediction algorithm. Getting a solid prediction algorithm would be good, but it is very difficult to do well. Integration with other Nightscout tools like auto-tune. Support for other languages (i18n) and accessibility (a11y) features would make Carer available to many more people.\nSimilar Apps/Devices: Carer is not unique and has borrowed many ideas from other great tools.\nNightGuard is probably the tool Carer is most similar to. 
I still use NightGuard for my watch and widget displays. We also use a sugar-pixel, a custom piece of hardware that is more like a retro alarm clock. We do not use it for alarms but as a display beneath our TV to easily see numbers. I use, but dislike, Dexcom Follow. It is only useful for push notifications which sometimes do alert me to an issue. I have used APSClient as a remote control but it is difficult to use and explain. Here are more apps like Carer that I have not used:\nGluroo, Spike, Sugar Mate, Loop Follow and Caregiver, Sweet Dreams, Shuggah. If one of these tools suits you better than Carer that is awesome. Managing T1D is hard and you deserve help.\nThanks to my wife for being the first user; Diego for helping me clean up the UX; beta testers for finding so many issues.\nFYI: A video showing how to get a Nightscout Token for use with Carer\n","permalink":"https://maori.geek.nz/posts/2025/2025-07-15_being-on-call-for-type-1-diabetes-with-carer/","summary":"\u003cp\u003eManaging my 5yo son’s Type 1 Diabetes (\u003cstrong\u003eT1D\u003c/strong\u003e) at night is a lot like being on call as a software engineer. Both are absolute shite. Waking up to alarms at all hours, trying to fix something to only make it worse, not understanding why anything is happening, alarm fatigue, lost sleep, burnout, and the constant stress that you will sleep through an emergency.\u003c/p\u003e\n\u003cp\u003eIt is a bit guff to compare the two. Though, there are some things that do apply to both, like good tools can reduce the burden.\u003c/p\u003e","title":"Being On Call for Type 1 Diabetes with Carer"},{"content":"I like shiny new things. Zig is very new, and very shiny, and very fast. I have no experience writing non-garbage collected languages, so I want to test it out and compare Zig with GoLang. 
Benchmark The benchmark is an HTTP API wrapper around SQLite: `\n/write takes a JSON { \"name\": \u003cname\u003e } and writes it to the DB\n/read returns 100 items\n`\nThe SQL: `\n-- TABLE\nCREATE TABLE IF NOT EXISTS test (id INTEGER PRIMARY KEY AUTOINCREMENT, name REAL, timestamp INTEGER);\n-- READ\nSELECT id, name, timestamp FROM test ORDER BY id DESC LIMIT 100;\n-- WRITE\nINSERT INTO test(name, timestamp) VALUES (?, ?);\n-- SETTINGS\nPRAGMA journal_mode = WAL; -- To increase write concurrency\nPRAGMA busy_timeout = 5000; -- To wait on busy requests\n`\nThe tests on my MacBook Pro: `\nab -c 100 -n 100000 -p postdata.json http://127.0.0.1:3000/write\nab -c 100 -n 100000 http://127.0.0.1:3000/read\n`\nZig I am using the Zap HTTP framework and zig-sqlite wrapper.\nFirst I did the stupid things:\nDidn’t allocate enough characters for the return message, so it silently did nothing. Didn’t use the right version of Zig, everything is changing quickly. Didn’t understand the ArenaAllocator so I had a memory leak. Also, I used a single sqlite.Db instance across the threads spawned by Zap. This caused everything to blow up under load, reporting arcane error messages. To fix this I used the threadlocal keyword: threadlocal var mainDB: ?sqlite.Db = null;\nI couldn’t find a hook in Zap to call when a thread is spawned, so on the first request to a thread it checks if it is null, connects to the DB and builds all the prepared statements. So the server does require some warming up.\nWith zig build -Doptimize=ReleaseSafe run we get the results:\nWrite: 16,770 rps, 67% CPU, 12 MB Memory. Read: 21,398 rps, 500% CPU, 12 MB Memory, 8 Errors. The errors seem to be from requests not correctly disconnecting. These errors occasionally killed ab, which is weird.\nGolang I have written too many Golang services, so I couldn\u0026rsquo;t be bothered with this one and told ChatGPT to write it. 
With some small edits, that frankly ChatGPT should be embarrassed about, I had a working service. If you have seen a Go service before you know what this looks like.\nWrite: 16,239 rps, 180% CPU, 65 MB Memory. Read: 15,470 rps, 670% CPU, 71 MB Memory. Reactions Con #1: Zig is New Zig is changing fast and some dependencies target master and some the stable 0.13.0. This means you pretty much have to run master. Since tools like ChatGPT train on dated content, Zig suggestions or code generation are old or non-existent. Other tools like VSCode have gaps in their tooling, like linking to source from dependencies. Pro #1: Zig is Fast Zig is significantly faster than Go for the /read endpoint. Zig also uses much less RAM and much less CPU. The /write endpoint is probably HDD bound, which would explain the similar results. The Go binary is 7.6 MB and the Zig binary is 2.2 MB. Con #2: Zig is Low Level The Zig service was 150-ish lines of code and the Go service was 100. Each line of Zig was very dense, and there were little landmines everywhere. Go was pretty plain, very difficult to fuck it up. A web service was probably not the best comparison between the languages, but it is what I am working on. Zig error codes are hard to run down. For a bunch of time I was getting weird errors caused by threads. Although I was able to solve them, it took a long time. Zig is erroring on read, but only when there are LOTS of requests. I never ran down the problem, not sure if I am just holding it wrong. That isn’t the point though, Go just works and Zig has these little issues. Fixing bugs is fun when you are learning, but torture if you want to produce something. Pro #2: Zig is shiny Zig is fun to code. It is the first low level language that I looked at and liked the aesthetics (Rust and C look like work). Picking it up and trying it out was very easy. Last time I tried C, I spent an afternoon stuck in dependency hell. New things are fun to learn. 
Learning a language that is new is like homesteading, where you have to do everything because nobody has done it yet. Conclusions I don’t think a web service with SQLite is a very fair comparison. Zig would be better for applications that are performance or size bound like embedded. For my upcoming web service projects I will use GoLang, but I am on the lookout for a Zig project to use the language in anger with.\n","permalink":"https://maori.geek.nz/posts/2025/2025-03-01_sqlite-http-api-zig-vs-golang/","summary":"\u003cp\u003eI like shiny new things. Zig is very new, and very shiny, and very fast. I have no experience writing non-garbage collected languages, so I want to test it out and compare Zig with GoLang. \u003ca href=\"https://github.com/grahamjenson/zigvsgo\"\u003e\u003ccode here\u003e\u003c/a\u003e\u003c/p\u003e\n\u003ch3 id=\"benchmark\"\u003eBenchmark\u003c/h3\u003e\n\u003cp\u003eThe benchmark is a HTTP API wrapper around SQLite:\n`\u003c/p\u003e\n\u003cdl\u003e\n\u003cdt\u003e\u003ccode\u003e/write\u003c/code\u003e\u003c/dt\u003e\n\u003cdt\u003etakes a\u003c/dt\u003e\n\u003cdt\u003eJSON\u003c/dt\u003e\n\u003cdt\u003e{\u003c/dt\u003e\n\u003cdt\u003e\u0026ldquo;name\u0026rdquo;\u003c/dt\u003e\n\u003cdd\u003e\u003cname\u003e} and write is to the\nDB\u003c/dd\u003e\n\u003c/dl\u003e\n\u003cp\u003e\u003ccode\u003e/read\u003c/code\u003e\nreturns\n100\nitems\n`\u003c/p\u003e\n\u003cp\u003eThe SQL:\n`\u003c/p\u003e\n\u003cp\u003e\u0026ndash; TABLE\u003c/p\u003e\n\u003cp\u003eCREATE\u003c/p\u003e\n\u003cp\u003eTABLE\nIF\nNOT\u003c/p\u003e","title":"Sqlite HTTP API: Zig vs Golang"},{"content":"Our 4yo T1D son, Sam, needs to carry a phone everywhere he goes to act as his pancreas. This phone needs to:\nrun Android APS and a few other apps which need Android 11 or above (14 is preferred). have good bluetooth, Wifi and cell signal. It needs to have constant communication with his Omnipod pump, Dexcom G7 and remote systems like NightScout. 
be sturdy, light and small. It is strapped to a very active 4yo. be easy to use. Clicking the wrong button can be pretty dangerous. be very cheap. Things break, buy a couple phones as backups. The phone doesn’t need fast graphics, large battery, big screen or good camera; the things most phones are marketed on. It just needs to work and not break when a 4yo falls off their bike.\nIn this post I am comparing Sam’s past phones with phones he may have in the future.\nSam with his Phone in his colourful belt\nPixel 4a 5G When Sam was diagnosed he didn’t move around a lot, so we could just put the phone near where he was playing. We chose the Pixel 4a 5G mostly because it had a large OLED screen making it a good way to display his BGL from across a room.\nThe Pixel 4a is an excellent phone, but once Sam started to move around more we needed to strap a smaller phone to him.\nCubot King Kong Mini 3 The King Kong Mini 3 (KKM3) was the recommendation from the Bionic Wookie post “Pancreas Phone”.\nThe KKM3 has been Sam’s pancreas phone for the last 2+ years, and the only reason we are thinking of changing (and the reason for this post) is that we can’t buy them any more. Although they are tough, Sam is getting through them; one had its screen smashed and one drowned in the ocean. When Sam breaks his last KKM3, we will need to pick a new phone.\nSmart E25 When Sam drowned his previous KKM3 we were on Christmas holiday at my parents’. I did not pack a spare phone, so I had to quickly buy a new phone off the shelf.\nThe cheapest phone in NZ is the Smart E25 ($80). It is some white-label generic locked phone. Everything about it is a bit crap, but for $80 it worked for a few weeks.\nUnihertz Jelly 2E The Jelly 2E is the more limited version of the Jelly Star, which has good reviews for running AAPS. The Jelly 2E has lower specs than the Star (camera, RAM, storage) making it $100 cheaper.\nDoogee S Mini The Doogee S Mini looks and feels like an upgraded Cubot KKM3. 
It has the features of the KKM3 that I like. It is a bit bigger and a bit more upgraded.\nPhoneMax Q9 Mini There is not too much info about the PhoneMax Q9 Mini online. There are a few posts, including a review and a rumour that it is a rebranded Foneopia phone. What I know is it is very cheap, very light, and has a plastic body that feels cheap.\nComparison Pixel 4a 5G, KKM3, Smart E25, Jelly 2E, S Mini, Q9 Mini\nAll prices are in $NZD including shipping, measurements in mm, and CPU Score is from GeekBench where higher means a faster phone. `\nPhone | Price | Weight | Dimensions | Version | CPU Score\nPixel 4a 5G | $300 | 174.1g | 154 x 74 x 8.2 | Android 14 | 1797\nCubot KKM3 | $250 | 149.8g | 131 x 58 x 13.4 | Android 12 | 1358\nSmart E25 | $80 | 156.5g | 147 x 72 x 8.5 | Android 14 | N/A\nUnihertz Jelly 2E | $300 | 123.3g | 95 x 49 x 16.5 | Android 12 | 525\nDoogee S Mini | $300 | 156.1g | 133 x 60 x 13.5 | Android 13 | 2025\nPhoneMax Q9 Mini | $182 | 112.2g | 114 x 54 x 11.9 | Android 14 | 1370\n`\nSam’s Next Phone The Jelly 2E feels rugged, but is a bit thick so will bulge in Sam’s belt. This might be uncomfortable sitting down. I am also worried that the screen is so small that it might make it difficult to use.\nThe Doogee S Mini is basically an upgraded KKM3. That makes it a solid, safe choice if I were trying to replace Sam’s existing phone.\nThe PhoneMax Q9 Mini is sitting in a sweet spot; it is cheap, it is light, it is fast and has Android 14. It is not as rugged as the other phones, so I am most worried about this breaking, but hopefully the price being 60% of the other two makes it worth the risk.\nThe biggest problem with all the options is that we have to ship them from overseas because apparently a small, strong phone is a niche product. What I would like is at least one large manufacturer to build such a phone. Who knows? 
Maybe the folding phone trend will give us that option.\n","permalink":"https://maori.geek.nz/posts/2025/2025-02-13_small-light-robust-phones-for-a-type-1-diabetic-child/","summary":"\u003cp\u003eOur 4yo T1D son, Sam, needs to carry a phone everywhere he goes to act as his pancreas. This phone needs to:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e\u003cstrong\u003erun\u003c/strong\u003e \u003ca href=\"https://androidaps.readthedocs.io/en/latest/\"\u003e\u003cstrong\u003eAndroid APS\u003c/strong\u003e\u003c/a\u003e and a few other apps which need \u003ca href=\"https://androidaps.readthedocs.io/en/latest/Getting-Started/Phones.html\"\u003eAndroid 11 or above (14 is preferred)\u003c/a\u003e.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003ehave good bluetooth, Wifi and cell signal.\u003c/strong\u003e It needs to have constant communication with his Omnipod pump, Dexcom G7 and remote systems like \u003ca href=\"https://nightscout.github.io/\"\u003eNightScout\u003c/a\u003e.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003ebe sturdy, light and small.\u003c/strong\u003e It is strapped to a very active 4yo.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003ebe easy to use.\u003c/strong\u003e Clicking the wrong button can be pretty dangerous.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003ebe very cheap.\u003c/strong\u003e Things break, buy a couple phones as backups.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThe phone doesn’t need fast graphics, large battery, big screen or good camera; the things most phones are marketed on. It just needs to work and not break when a 4yo falls off their bike.\u003c/p\u003e","title":"Small, Light, Robust Phones for a Type 1 Diabetic Child"},{"content":"I use an app called NightGuard to track my Type 1 Diabetic (T1D) son’s blood glucose levels (BGL) and alert me if he needs some food or insulin. I love NightGuard. It is one of the most important applications I use to manage T1D.\nOne of NightGuard’s best features is its predictive alerts. 
These alerts use regression models to try to predict BGL and notify of future problems. In this post I am going to explore the way NightGuard uses regression and see if I can improve its outcomes.\nNightGuard A Continuous Glucose Monitor (CGM) reads Sam’s BGL every 5 minutes and sends it to a service called NightScout, where it is read by NightGuard on my phone. A core feature of NightGuard is that it alerts me if Sam’s BGL is too low or too high:\nLow is below 72mg/dl (4mmol/L). High is above 180mg/dl (10mmol/L). In Range is between Low and High. The actions I can take when getting an alert are:\nReduce BGL with Insulin: this can take an hour to have proper effect. Increase BGL with Carby Food: this can take 20 mins to take effect. Wait and see if a previous action will take effect. Note: If a diabetic is acting strange, give food (if conscious) then call an ambulance!\nGiven the delays insulin and food have, managing diabetes requires treating problems that have not yet occurred. Being able to accurately predict future issues would make life with T1D much easier and safer.\nNightGuard has a prediction algorithm that uses the last couple (2 or 3) BGL readings as training data for multiple regression models. By calculating the error against the training data, it selects a model to predict future values.\nThis method has one big problem. Using the training data as a means of selection might result in a model that fits the training data perfectly, but does not accurately predict the future.\nLet’s look at the actual models.\nRegression with Least Squares Estimation The algorithm used by NightGuard to create the regression models is Least Squares Estimation using LU decomposition. 
The models NightGuard uses are:\nLinear: ax + c. Quadratic: ax² + bx + c. Exp: log(y) = x. Log: y = log(x). LogLog: log(y) = log(x). Sqrt: y = sqrt(x). Let’s look at an example of each of these models: given three readings in a row 180, 165, 165, what will each of these models predict?\nLinear and Exp both predict BGL will continue to drop, the rest predict it to rise, with Quadratic being the most extreme. In this example, NightGuard would select the Quadratic model because it can construct a quadratic curve that perfectly hits all the data points.\nAs someone who has managed T1D for a few years now, I would not think the Quadratic model is correct, but let’s look at the data.\nResults I got 90 days (over 20k readings) of Sam’s data from TidePool, then implemented the models, including varying the number of training readings, e.g. Linear(2) is a Linear regression using 2 previous readings. I was able to simulate and compare these models using root mean squared error (RMSE) against real data 1–6 readings in the future (5–30 minutes).\nBest Models w.r.t. RMSE. Log, LogLog and Sqrt were removed because they were trash.\nThis shows that Exp(2) and Linear(2) have the least error, while all Quadratic models have high errors.\nFeature Implementations Evaluating the models by only RMSE may not result in the best models w.r.t. the features NightGuard has. To make sure the models are good, I tested them against real features:\nPredictive Low Alert: alert when the model predicts BGL will go low. Smart Low Snooze: if already low, snooze alert if the model predicts BGL to rise into range. I looked at these features as if they were Binary Classifiers; the models are used to calculate the number of True Positive (TP), False Positive (FP), True Negative (TN) and False Negative (FN) values. For our features:\nA FN for Predictive Low Alert would be not alerting if about to go low (unsafe), a FP would be alerting when it was not needed (annoying). 
A FP for Smart Low Snooze would be snoozing an existing low alert incorrectly assuming BGL will rise (unsafe), a FN would be not snoozing an alert when the BGL will recover (annoying). Each of these factors has a very different level of risk:\nIf a Predictive Low Alert is always incorrect, then I will be annoyed by too many alarms. If Smart Low Snooze is always incorrect, then it might snooze real alarms and put Sam in a life threatening situation. This risk must be taken into consideration.\nI was able to run these features against the above collected 90 days worth of Sam’s data. A pretty good metric to compare the models for Predictive Low Alert is F-Score:\nF-Score favouring aggressive models\nThe F-Score for Smart Low Snooze:\nF-Score for both features selects the most aggressive models, the ones that return True the most, Quadratic(3) and Linear(7). Using predictions 20mins out, over the 80 days:\nQuadratic(3) would alert 5135 times, 308 being correct. Linear(7) would snooze 930 alarms, 563 correctly. Both Quadratic(3) and Linear(7) had the highest TP values and the highest FP values. I would prefer a less actively reckless model.\nI tried a bunch of different metrics that (IMHO) overweighted TP values vs. the risks of FN and FP. I ended up measuring the models with a simple weighted sum of their positions based on 20mins (4 readings), e.g. Exp(2) has the 1st least FP and the 8th least FN, so p(FP) = 1 \u0026amp; p(FN) = 8.\nFor Predictive Low Alert the weight is calculated as 3*p(FN) + 2*p(FP) + p(TP) For Smart Low Snooze the weight is calculated as p(FN) + 10*p(FP) + p(TP) Here are the results:\nLook at that, Exp(2) is near the top of both these lists. I think that inaccurate models, like Quadratic(3), are at the top because they assume Sam will come out of a Low quickly, after we have already treated the low. 
I think if I removed data that occurred after a treatment we would see the value of the Quadratic(3) model decrease.\nComparison with other models This 2020 paper has a useful table to compare my models against:\nExp(2) has an RMSE at 30 minutes (for just Sam) of 50.05. That is nearly 7x WORSE than the best model using only CGM data as input. These models have a long way to go before they are top notch.\nConclusions This is not a scientific study, it is not broadly applicable, and it might be wrong. In this post, I am trying to understand the problems with predicting future BGL with the minimum amount of data available. I think this is a pretty good start.\nExp(2) is a good enough model for now. Last night I was woken twice by NightGuard predicting lows (one of them correct). If I were using Exp(2) it would have only woken me for the actual low. This is a good enough reason to go ahead with this model.\nReferences: Evaluation of Binary Classifiers F-Score NightGuard: Alarm Rules, Regression Implementation Machine Learning model with low RMSE ","permalink":"https://maori.geek.nz/posts/2024/2024-06-20_problems-with-predicting-blood-glucose-with-regression/","summary":"\u003cp\u003eI use an app called \u003ca href=\"https://github.com/nightscout/nightguard\"\u003eNightGuard\u003c/a\u003e to track my Type 1 Diabetic (\u003cstrong\u003eT1D\u003c/strong\u003e) son’s blood glucose levels (\u003cstrong\u003eBGL\u003c/strong\u003e) and alert me if he needs some food or insulin. \u003cstrong\u003eI love NightGuard.\u003c/strong\u003e It is one of the most important applications I use to manage T1D.\u003c/p\u003e\n\u003cp\u003eOne of NightGuard’s best features is its predictive alerts. These alerts use \u003cstrong\u003eregression models\u003c/strong\u003e to try to predict BGL and notify of future problems. 
In this post I am going to explore the way NightGuard uses regression and see if I can improve its outcomes.\u003c/p\u003e","title":"Problems with Predicting Blood Glucose with Regression"},{"content":"My Laser Cutter bed is 300mm x 500mm, and I need to make a shape that is about 900mm x 750mm. I could just use squares, but then the joinery gets complicated making sure everything lines up properly.\nUsing jigsaw pieces might be a fun way to make such large shapes, so here is how to do that.\nInstall “Lasercut Jigsaw” extension Create the puzzle you want with Extensions \u0026gt; Render \u0026gt; Lasercut Jigsaw\nUsing Path \u0026gt; Combine join all the X and Y axis paths together Move the XY path to ABOVE the border path\nUse Path \u0026gt; Division to split out all the pieces\nNow there are lots of individual puzzle pieces to cut.\nOther links: https://inkscape.org/ru/forums/questions/cut-object-using-a-template/ https://github.com/Neon22/inkscape-jigsaw/issues/19 ","permalink":"https://maori.geek.nz/posts/2024/2024-05-21_using-inkscape-to-make-individual-puzzle-pieces-for-laser-cutting/","summary":"\u003cp\u003eMy Laser Cutter bed is 300mm x 500mm, and I need to make a shape that is about 900mm x 750mm. 
I could just use squares, but then the joinery gets complicated making sure everything lines up properly.\u003c/p\u003e\n\u003cp\u003eUsing jigsaw pieces might be a fun way to make such large shapes, so here is how to do that.\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e\n\u003cp\u003eInstall “\u003ca href=\"https://inkscape.org/~Neon22/%E2%98%85lasercut-jigsaw\"\u003eLasercut Jigsaw\u003c/a\u003e” extension\n\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2024/2024-05-21_using-inkscape-to-make-individual-puzzle-pieces-for-laser-cutting/images/1.png#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003c/li\u003e\n\u003cli\u003e\n\u003cp\u003eCreate the puzzle you want with \u003ccode\u003eExtensions \u0026gt; Render \u0026gt; Lasercut Jigsaw\u003c/code\u003e\u003c/p\u003e","title":"Using Inkscape to make Individual Puzzle Pieces for Laser Cutting"},{"content":"I am building a Blood Glucose Display and just wanted to collect a few other apps’ examples together to compare and get ideas:\nDexcom\nLibre\nNightGuard Nightscout SugarPixel I don’t 100% like any of them.\nThere are two ways in which I use this information:\nAs a display to “glance” numbers and to alarm me if something is wrong. Like pull and push information. Using the information to calculate treatments: once something is wrong, or I think something is going wrong, I need to work out a remedy. 
A lot of them contain graphs and other information that I don’t really need for either purpose. This is just a quick post about this to force my ideas into a box.\n","permalink":"https://maori.geek.nz/posts/2024/2024-02-20_design-blood-glucose-levels/","summary":"\u003cp\u003eI am building a Blood Glucose Display and just wanted to collect a few other apps’ examples together to compare and get ideas:\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDexcom\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2024/2024-02-20_design-blood-glucose-levels/images/1.gif#layoutTextWidth\"\u003e\n\u003cstrong\u003eLibre\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2024/2024-02-20_design-blood-glucose-levels/images/2.png#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003ch4 id=\"nightguard\"\u003eNightGuard\u003c/h4\u003e\n\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2024/2024-02-20_design-blood-glucose-levels/images/3.png#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003ch4 id=\"nightscout\"\u003eNightscout\u003c/h4\u003e\n\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2024/2024-02-20_design-blood-glucose-levels/images/4.png#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003ch4 id=\"sugarpixel\"\u003eSugarPixel\u003c/h4\u003e\n\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2024/2024-02-20_design-blood-glucose-levels/images/5.png#layoutTextWidth\"\u003e\nI don’t 100% like any of them.\u003c/p\u003e\n\u003cp\u003eThere are two ways in which I use this information:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eAs a display to “glance” numbers and to alarm me if something is wrong. 
Like pull and push information.\u003c/li\u003e\n\u003cli\u003eUsing the information to calculate treatments: once something is wrong, or I think something is going wrong, I need to work out a remedy.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eA lot of them contain graphs and other information that I don’t really need for either purpose. This is just a quick post about this to force my ideas into a box.\u003c/p\u003e","title":"Design: Blood Glucose Levels"},{"content":"Sam, our 3.5yo T1D, just started preschool. In order to keep Sam safe with his BGL in a good range, reduce our stress and to help the staff at his new preschool we devised a plan that focused on:\nmaking custom Diabetes Action and Management Plans (called 504 plans in the US) Automating with AndroidAPS’s SMB (super-micro-bolus) and Remotely controlling AAPS through SMS commands. This post will go into each briefly to hopefully help anyone else going through similar stress.\nAction and Management Plan\nThe default action and management plans contain a lot of information not relevant to Sam’s T1D management. The preschool staff shouldn’t have to filter through irrelevant or wrong information in these documents, so we made our own.\nSam’s Diabetes Action Plan\nThe action plan is a cheat sheet for managing T1D. It should be easy to find the information you need; the emergency information is at the top, followed by the most commonly needed information.\nManagement Plan, page 1/4\nThe Management plan is a more wordy document outlining Sam’s T1D and what is expected of the staff to manage it. The goal is to make the staff comfortable with T1D management by providing a complete guide.\nManagement Plan, page 2/4\nWe went over this plan in a meeting with the staff and our diabetes nurse. Making everyone comfortable with T1D management and their roles in it gives confidence in making decisions.\nManagement Plan, page 3/4\nFood is of course the biggest issue. 
Each day Sam has 2 meals, and Sam’s preschool also does cooking as an activity, e.g. baking cake or bread or making smoothies. We told them that Sam can eat anything, we just need to know what it is and how much.\nManagement Plan, page 4/4\nSetting expectations is important for staff to know that T1D management is never perfect. My personal target is 70% time in range (TIR) while at day care. Last week we hit 68%, that is pretty close so I am happy.\nAutomation + Remote Control\nTo implement this plan we use AAPS’s Super-Micro-Bolusing (SMB) feature and set targets using SMS commands.\nAn example of this interaction is:\nthe teacher messages us 15 mins before he eats. we set the “target meal”, which is 5.0mmol/L for 60 minutes the teacher then messages us the carbs in the lunch box (it is written on the lunch box) We add the carbs with “carbs 30” SMB will then slowly give small boluses every 5 minutes based on his rising BGL. For example, our preschool said he would be eating in 20 minutes, so we put him in a meal target.\nSoon after they will send us the messages:\nWhen he started eating we added 30 carbs with the SMS command “carbs 30”. But he didn\u0026rsquo;t eat everything, that is fine because SMB didn’t give him all the insulin at once. Then, about 40 minutes later he went back and ate the rest, that is when SMB gave him more insulin.\nAbout 20% of the time SMB is not enough. If he is rising too quickly, or he gets too high, we will help SMB out by giving another small bolus, e.g. SMS command “bolus 0.5”.\nHe has been dropping too fast once or twice but the alarms on NightGuard alerted the staff and they proactively messaged us about it, for example:\nIf he does need a hypo snack, we usually advise on what to give, but with experience we expect to get fewer messages asking what to give, and more telling us what they gave.\nSettings + Results\nWe were very conservative on our initial settings, because it is easy to give insulin remotely, but not carbs. 
Preschool is between 8am and 2pm, during this time\nSam’s target has been 6–8mmol/L increased his basal from what it was increased “Max minutes of basal to limit SMB” to 60mins Carb impact is maxed out at 12, max meal absorption time min 4 hours. Results:\nThe main things we want to make sure of are that Sam is in a good range and that we are not overburdening ourselves or the staff:\n68% in range 26% high, 5% very high \u0026lt;1% low, 0% very low 6–7 SMS commands per day 20ish messages per day between staff and parents We have room to make the settings a little more aggressive without risking lows. I also expect the number of messages from staff to decrease as they are asking a lot of questions at this early stage.\nThe information in this post is not health advice, since so much of it is based specifically on Sam and our management of his T1D. I hope this might give ideas or help you with your management of a T1D preschooler. If you have any questions, please reach out, I am more than happy to help.\n","permalink":"https://maori.geek.nz/posts/2024/2024-02-18_managing-a-type-1-diabetic-preschooler-with-androidaps/","summary":"\u003cp\u003eSam, our 3.5yo T1D, just started preschool. 
In order to keep Sam safe with his BGL in a good range, reduce our stress and to help the staff at his new preschool we devised a plan that focused on:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003emaking a custom \u003ca href=\"https://www.diabetes.org.nz/diabetes-action-and-management-plans\"\u003eDiabetes Action and Management Plan\u003c/a\u003es (called \u003ca href=\"https://www.jdrf.org/socentralohio/wp-content/uploads/sites/12/2017/08/Back2School-504_D2.pdf\"\u003e504 plan\u003c/a\u003es in the US)\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eAutomating\u003c/strong\u003e with AndroidAPS’s SMB (super-micro-bolus) and \u003cstrong\u003eRemotely\u003c/strong\u003e controlling AAPS through \u003ca href=\"https://androidaps.readthedocs.io/en/latest/Children/SMS-Commands.html\"\u003eSMS commands\u003c/a\u003e.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThis post will go into each briefly to hopefully help anyone else going through similar stress.\u003c/p\u003e\n\u003ch4 id=\"action-and-management-plan\"\u003eAction and Management Plan\u003c/h4\u003e","title":"Managing a Type 1 Diabetic Preschooler with AndroidAPS"},{"content":"I needed to implement JavaScript .length in Golang, the solution was pretty fun so I will tell you how.\nFor some strings Golang’s len function will do, e.g.\n\"Hello World!\".length // JS 12\nlen(\"Hello World!\") // Golang 12\nThe problems start when you include non-ASCII unicode characters, e.g.\n\"👍\".length // JS 2\nlen(\"👍\") // Golang 4\nThis is because JavaScript strings are encoded in UTF-16 (or UCS-2) and Golang strings are encoded in UTF-8.\nTo convert a string to and from UTF-16 in Golang use:\nfunc toUTF16(s string) []uint16 {\n    return utf16.Encode([]rune(s))\n}\nfunc fromUtf16(b []uint16) string {\n    return string(utf16.Decode(b))\n}\nNow len will work properly:\nlen(toUTF16(\"👍\")) // 2\nMore reading:\n“JavaScript’s internal character encoding: UCS-2 or UTF-16?” ","permalink":"https://maori.geek.nz/posts/2023/2023-09-14_how-to-implement-javascript-string.length-in-golang/","summary":"\u003cp\u003eI needed to implement JavaScript \u003ccode\u003e.length\u003c/code\u003e in Golang, the solution was pretty fun so I will tell you how.\u003c/p\u003e\n\u003cp\u003eFor some strings Golang’s \u003ccode\u003elen\u003c/code\u003e function will do, e.g.\u003c/p\u003e\n\u003cpre\u003e\u003ccode\u003e\u0026quot;Hello World!\u0026quot;.length // JS 12\nlen(\u0026quot;Hello World!\u0026quot;) // Golang 12\u003c/code\u003e\u003c/pre\u003e\n\u003cp\u003eThe problems start when you include non-ASCII unicode characters, e.g.\u003c/p\u003e\n\u003cpre\u003e\u003ccode\u003e\u0026quot;👍\u0026quot;.length // JS 2\nlen(\u0026quot;👍\u0026quot;) // Golang 4\u003c/code\u003e\u003c/pre\u003e\n\u003cp\u003eThis is because JavaScript strings are encoded in UTF-16 (or \u003ca href=\"https://mathiasbynens.be/notes/javascript-encoding\"\u003eUCS-2\u003c/a\u003e) and Golang strings are encoded in UTF-8.\u003c/p\u003e","title":"How to implement JavaScript String.length in Golang"},{"content":"I was asked by an Insulet (manufacturers of Omnipod) representative if I would like to share our T1D story with Pharmac (the government organisation that decides which medicines and pharmaceutical products are subsidised) via Diabetes New 
Zealand. Pharmac recently put out a “Request for Proposals” to fund CGMs and insulin pumps. I love the Omnipod and our Dexcom CGM and wish all New Zealand T1Ds had access to them.\nBelow is the letter I wrote.\nTo whomever will listen,\nOur 3 year old son, Sam, was diagnosed with Type 1 Diabetes when he was 18 months old. We are fortunate enough to be able to “self fund” many T1D treatments not currently subsidised in New Zealand. For the past year and a half my wife and I have been trying different Type 1 Diabetes technologies and treatments to improve Sam’s short and long term health and simplify our day to day management.\nThe treatments we have tried so far with Sam are (in chronological order):\n2 weeks of Multiple Daily Injections (MDI) with only finger sticks to measure Blood Glucose Levels (BGL). 6 months of MDI with Dexcom G6 Continuous Glucose Monitor (CGM) (self funded). 6 months of Dana-i Insulin Pump (self funded), Dexcom G6 CGM (self funded), CamAPS hybrid closed loop (self funded). 4 months of Dana-i Insulin Pump (self funded), Dexcom G7 CGM (self funded), AndroidAPS (free). 2 months of Omnipod Dash Insulin Pump (self funded), Dexcom G7 CGM (self funded), AndroidAPS (free). Although most of these therapies have pros and cons, MDI with no CGM is by far the worst therapy we have experienced. A CGM is required to manage the diabetes of a child, anything else is horrible for both the child and the caregivers. CGMs have been around for two decades and New Zealand not subsidising them is shameful.\nOur current treatment setup using Dexcom G7, Omnipod Dash and AndroidAPS has yielded Sam’s best results. We have had excellent control over his blood glucose levels while giving him the freedom to be a child.\nDexcom CGMs have helped Sam with:\nAlerts and remotely following: We can sleep and only be woken if we need to act. Algorithms: AndroidAPS uses Dexcom CGMs to adjust insulin delivery to reduce both high and low blood sugars. 
Sensors last 10 days and are tiny. Sam barely notices his G7 being inserted, and doesn’t notice it while it is attached. Omnipod Dash pods have helped Sam because they are:\nTubeless, waterproof, and out of the way on his arm: The Omnipod Dash is small enough he barely notices it, and since it is waterproof and out of the way he is free to do many activities without being restricted by diabetes. Controlled from a phone. Catching a child and holding them down to deliver insulin interrupts life. The Omnipod Dash is controlled remotely from a phone, Sam doesn\u0026rsquo;t even know when he is being given insulin. Easy to use and insert. The Omnipod is replaced every three days in an intuitive and easy to learn process. This reduces the chances of issues which can impact insulin delivery. If Dexcom CGMs and Omnipod Dash were subsidised more people would have the benefits Sam has:\nSam’s HbA1c levels are nearly in the non-diabetic range. Sam’s last HbA1c was 6.1%, his time in range most days is above 70%. This gives him the best chance at not developing long term diabetic complications. Sam’s lows are detected and treated early. The CGM reports his blood glucose levels frequently and accurately enough to predict lows and react accordingly. We sleep through most nights. This is something that is only possible with a CGM and pump and the technology to connect them together to adjust insulin delivery automatically. Sam is like any child. He gets a new Omnipod every 3 days, a new CGM every 10 days, and a few finger sticks in between. Other than that he doesn\u0026rsquo;t know he has diabetes. The only reason why Sam has these benefits is because we have access to technology not readily available to most New Zealand diabetics. 
Subsidising Dexcom CGMs and Omnipods would dramatically improve many lives, like it has Sam’s.\nGraham Jenson\nI hope this helps.\n","permalink":"https://maori.geek.nz/posts/2023/2023-08-13_omnipod-dash-dexcom-g7-androidaps/","summary":"\u003cp\u003eI was asked by an \u003ca href=\"https://www.omnipod.com/about-insulet\"\u003eInsulet\u003c/a\u003e (manufacturers of Omnipod) representative if I would like to share our T1D story with \u003ca href=\"https://en.wikipedia.org/wiki/Pharmac\"\u003ePharmac\u003c/a\u003e (the government organisation that decides which medicines and pharmaceutical products are subsidised) via \u003ca href=\"http://diabetes.org.nz\"\u003eDiabetes New Zealand\u003c/a\u003e. Pharmac recently put out a “\u003ca href=\"https://pharmac.govt.nz/news-and-resources/news/cgms/\"\u003eRequest for Proposal\u003c/a\u003es” to fund CGMs and insulin pumps. I love the Omnipod and our Dexcom CGM and wish all New Zealand T1Ds had access to them.\u003c/p\u003e\n\u003cp\u003eBelow is the letter I wrote.\u003c/p\u003e\n\u003cp\u003eTo whomever will listen,\u003c/p\u003e","title":"Omnipod Dash + Dexcom G7 + AndroidAPS"},{"content":"I am thinking about using GoLang and SQLite for a project. Looking around I found a presentation by Ben Johnson from Fly.io about LiteFS which is working towards a more scalable and reliable SQLite. This post is just about me getting something working with these technologies and stressing them a little.\nBenchmark I have a simple GoLang app with one route that inserts a row into a SQLite database. I want to test calling this endpoint when:\nLocally hosted on my 2021 MacBook Pro. This will be so fast! Default Fly.io machine with ephemeral storage. When the machine restarts, all data is wiped. Fly.io machine with mounted volume persistent storage. This is where the Fly.io machine is attached to a single drive that survives restarts, but is not redundant in any other way(!) Fly.io machine with mounted volume and LiteFS storage. 
This is a single server with a volume, so not redundant, and it uses the LiteFS proxy. I am testing with ApacheBench (ab) making 1000 calls with 50 concurrent, i.e. ab -n 1000 -c 50.\nResults Locally hosted: 18,709 requests per second Ephemeral storage: 241 requests per second Persistent storage: 244 requests per second LiteFS storage: 62 requests per second Discussion What can we see:\nLocally hosted is fast. Great for testing. Ephemeral and Persistent storage are basically the same speed. Assuming ephemeral should be faster, there must be another bottleneck somewhere else. Distributed storage(ish) is about 4 times slower than persistent storage. This could be the LiteFS proxy taking time, but I bet it is the slow FUSE file system and the time to read and broadcast the SQLite WAL file. That being said, 60 requests per second is still awesome. Conclusion This is not a great benchmark. It took me a few hours to go from never using fly.io or SQLite to running this benchmark. That in itself is a pretty positive review of both these technologies. On that note, fly.io is awesome and I will be using them for my future projects!\nI think that LiteFS is a cool idea, but I just wish it was a little less complicated to set up. For example, maybe we could create a LiteFS volume that could be mounted to many machines. Either way I am prioritising simplicity and performance for my upcoming project, so I will probably leave LiteFS out of the mix, for the time being.\nOther things Litestream, another useful SQLite app Examples of a lite-fs app [1][2] ","permalink":"https://maori.geek.nz/posts/2023/2023-07-17_golang-sqlite-on-fly.io-with-litefs-a-quick-benchmark/","summary":"\u003cp\u003eI am thinking about using \u003ca href=\"https://go.dev/\"\u003eGoLang\u003c/a\u003e and \u003ca href=\"https://www.sqlite.org/index.html\"\u003eSQLite\u003c/a\u003e for a project. 
Looking around I found a \u003ca href=\"https://www.youtube.com/watch?v=XcAYkriuQ1o\"\u003epresentation\u003c/a\u003e by \u003ca href=\"https://github.com/benbjohnson\"\u003eBen Johnson\u003c/a\u003e from \u003ca href=\"https://fly.io/\"\u003eFly.io\u003c/a\u003e about \u003ca href=\"https://fly.io/blog/introducing-litefs/\"\u003eLiteFS\u003c/a\u003e which is working towards a more \u003cstrong\u003escalable\u003c/strong\u003e and \u003cstrong\u003ereliable\u003c/strong\u003e SQLite. This post is just about me getting something working with these technologies and stressing them a little.\u003c/p\u003e\n\u003ch4 id=\"benchmark\"\u003eBenchmark\u003c/h4\u003e\n\u003cp\u003eI have a simple GoLang app with one route that inserts a row into a SQLite database. I want to test calling this endpoint when:\u003c/p\u003e","title":"GoLang + SQLite on Fly.io with LiteFS: a Quick Benchmark"},{"content":"In this post I am going to discuss moving our Type 1 Diabetic (T1D) 3yo son Sam from CamAPS to AndroidAPS: why we moved, the logistics of the change, our current setup, and the results we are seeing.\nSam’s Bedroom door with phone mount\nNote: This is not medical advice or a recommendation, it is just our experience.\nLeaving CamAPS\nTo manage my son’s Type 1 Diabetes (T1D) I need data. All the data I can get, and I need it to be accurate and timely. I need to know:\nwhen he gets insulin and how much. when he eats food and how many carbs it has. his current blood glucose and if it’s trending up or down. CamAPS (the hybrid closed loop) uploads its data to Diasend, a portal for patients and clinicians to analyse diabetes data. Until a few months ago, CamAPS would upload data to Diasend every few minutes. After becoming overwhelmed with this data, I assume Diasend forced CamAPS to reduce the frequency of uploads to every few hours.\nWhen Sam was on CamAPS, we used Diasend to share data and make decisions about treatments. 
Reducing the frequency of data being uploaded to Diasend (from minutes to hours) caused havoc for us, and within a few days we decided to move off of CamAPS.\nBoth CamAPS and Diasend are trying to fix this issue; CamAPS launched Companion, and Diasend users are moving to Glooko. Neither solves our issue.\nAndroidAPS\nWe chose AndroidAPS because it works with Sam’s Dana-i pump and it has features we were really excited for:\nDexcom G7 support with Xdrip, so we can upgrade to the latest CGM. Uploads data to Nightscout, so we can easily access live information. SMS bolusing, so we can bolus him from almost anywhere. The worst thing about it (especially compared to CamAPS) is that it is a complex machine with a ton of settings. Understanding how it works, how to set it up, and how to move from CamAPS with minimal disruption took the better part of a month of research and planning.\nA brief list of the steps I took to migrate from CamAPS to AndroidAPS:\nGet Nightscout Running: we chose to use T1Pal because I didn’t want to run my own server. Then we synced Nightscout with Dexcom and Diasend (using this tool) to upload all of Sam’s data. Build and Install AndroidAPS on an old Android phone for testing and learning. Create an AndroidAPS profile with a guess at Sam’s settings. Go through the objectives. AndroidAPS forces you to achieve some goals before it unlocks some features. By doing these on the test phone, we can start Sam out with more knowledge and features. Tune the profile using the online autotune tool. By tuning Sam’s AndroidAPS profile using the real data from Diasend and Dexcom we started out with more accurate settings. Set up alerting and monitoring on our phones. We use both Dexcom Follow and NightGuard (which I 100% recommend). Install apps on Sam’s phone. Install AndroidAPS, xdrip, and the Dexcom G7 app. Then we imported the tuned profile from the test phone onto Sam’s actual phone. 
Put G7 on Sam and connect. G7s start to warm up as soon as they are applied. So this kicked off the final countdown to switch. Connect the pump to AndroidAPS, after disconnecting from CamAPS. With AndroidAPS controlling the pump, receiving data from the G7 via xdrip, and everything being sent to Nightscout, the switch was complete.\nOur final setup looked like:\nFirst days on AndroidAPS We started AndroidAPS on 3rd April 2023. To be safe we used the Low-Glucose-Suspend loop, which would turn off insulin delivery if low.\nThe first night on AndroidAPS he kept going high, so we changed some settings. The second night he kept going low, so we changed more settings. Then the third night was perfect!\nYou can see how impactful AndroidAPS has been by looking at my sleep:\nCamAPS sleeps (before April 3rd) are bad sleeps.\nAndroidAPS kept working and letting me sleep throughout April:\nAndroidAPS sleeps\nIt is pretty rare that upgrading a phone app improves your sleep so dramatically.\nStats of CamAPS vs AndroidAPS\nSam’s nights (and my sleeps) are better, which we can see in Sam’s glucose percentile graphs when comparing CamAPS (March) to AndroidAPS (May):\nBGL percentiles using CamAPS (March 2023)\nBGL percentiles using AndroidAPS (May 2023)\nThe difference in Sam’s glucose levels from midnight to 6am (when Sam wakes up) is stark. However, throughout the day you can see that AndroidAPS does seem to be a bit higher. This can be seen in the overall statistics comparing CamAPS (March) to AndroidAPS (May):\nCamAPS March Statistics\nAndroidAPS May 2023 Statistics\nAndroidAPS has fewer lows (4.4% to 3.5%) and more time in range (80.3% to 81.2%), but has a higher overall average (7.3 to 7.5mmol/L). Other than that CamAPS and AndroidAPS have very similar numbers.\nWe are still new to AndroidAPS, so we are still tweaking and learning. 
I am happy with these numbers, but I am hoping we can further improve them.\nProblems\nAndroidAPS (with xdrip and Nightscout) has a steep learning curve and requires a lot of technical know-how. Here are some issues that we found:\nXdrip companion mode had some issues. I posted a discussion (here) on GitHub and got the issue sorted using an undocumented solution. To get through some of the objectives and questions I had to actually dive into the source code of AndroidAPS. The Nightscout server in T1Pal went down, and I had to use a workaround (adding and removing a plugin) to restart it. Needing the technical ability to debug and fix issues yourself comes with using an unsupported open source product. The communities online are willing to help and are amazing, and I cannot thank the maintainers of these projects enough.\nA lot of the Dexcom G7s have had issues, from poor accuracy, to not lasting 10 days, to just straight out failing on insertion. The features, size and convenience are great, so we are hoping Dexcom is working to fix these issues as new batches roll out, otherwise we will have to change to G6 or Libre 3.\nConclusion\nAndroidAPS is great but has a steep learning curve. We are getting good numbers and much better sleeps, but there is room to improve. Nightscout, SMS bolusing, SMB and other features are making life much easier. There is still a lot more to learn and work to do.\nWe just started Sam on Omnipod Dash with AndroidAPS and it is going really well. 
I will write about that in my next post.\n","permalink":"https://maori.geek.nz/posts/2023/2023-06-06_moving-to-android-aps/","summary":"\u003cp\u003eIn this post I am going to discuss moving our Type 1 Diabetic (\u003cstrong\u003eT1D\u003c/strong\u003e) 3yo son Sam from \u003ca href=\"https://camdiab.com/\"\u003e\u003cstrong\u003eCamAPS\u003c/strong\u003e\u003c/a\u003e to \u003ca href=\"https://androidaps.readthedocs.io/en/latest/\"\u003e\u003cstrong\u003eAndroidAPS\u003c/strong\u003e\u003c/a\u003e: why we moved, the logistics of the change, our current setup, and the results we are seeing.\u003c/p\u003e\n\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2023/2023-06-06_moving-to-android-aps/images/1.jpeg#layoutTextWidth\"\u003e\nSam’s Bedroom door with phone mount\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eNote: This is\u003c/em\u003e \u003cstrong\u003e\u003cem\u003enot medical advice\u003c/em\u003e\u003c/strong\u003e \u003cem\u003eor a recommendation, it is just our experience.\u003c/em\u003e\u003c/p\u003e\n\u003ch4 id=\"leaving-camaps\"\u003eLeaving CamAPS\u003c/h4\u003e\n\u003cp\u003eTo manage my son’s Type 1 Diabetes (\u003cstrong\u003eT1D\u003c/strong\u003e) I need data. All the data I can get, and I need it to be \u003cstrong\u003eaccurate\u003c/strong\u003e and \u003cstrong\u003etimely\u003c/strong\u003e. I need to know:\u003c/p\u003e","title":"Moving to Android APS"},{"content":"I really love art by Martin Tomsky that is created with layers. 
I wanted to try something similar.\nThe source image:\nTui by Matt Binns [source]\nThe goal:\nThese are examples from previous attempts\nStep 1: Preprocess First we take the source image, cut out the subject and adjust the colour to be very vibrant.\nNote: Copy Subject in the Photos app and New From Clipboard in macOS Preview are useful here.\nIncreased saturation and sharpness, also balanced the colours for easier tracing\nStep 2: Import into Inkscape and Trace Bitmap Use Path \u0026gt; Trace Bitmap on the image selecting Multicolor using Colors as detection mode.\nNote: the Stack feature is useless in my experience.\nThis will create 4 paths for the colors in the image:\nEach of the paths that are created\nSort the paths smallest to biggest (just by looking) so the biggest paths are at the bottom. Also name them:\nStep 3: Join Paths Each layer we have must support the layer above it, but at the moment each layer is distinct. We need to join them in a way that each layer includes the layer above it as well.\nSelect paths 1 2 3 4:\nDuplicate them:\nThen Path \u0026gt; Union:\nThe new path (handily also called 1) should replace the old path.\nDo the above steps (duplicate, union, replace) for the paths 2 3 4 and 3 4.\nNow the image is the same as before but the paths are more layered.\nStep 4: Exclusion Currently we have paths that when added together make an image. We want paths that when subtracted from one another make the same image. This involves some Exclusions and reversing orders.\nDraw a border a little bigger than the image and use Path \u0026gt; Object to Path to make it a path. Place the border over the image and add it to the group.\nDuplicate the border and path 1 then use Path \u0026gt; Exclusion to cut that path out. Duplicate and exclude for each layer.\nAt the end of this you should have 5 paths:\nYou will have to reorder the layers to get an image:\nThe colours are wrong in this image.
Before we added the paths together to generate the image, now we subtract them. If you make each path the colour of the path above it, and make the top layer white (because it is the background) it should fix the image:\nFixed by changing the colours\nStep 5: Simplify The paths at the moment can be cut, but they are inefficient. We can simplify these paths pretty easily:\nselect a path Path \u0026gt; Break Apart select all paths (that are not the large rectangle) Path \u0026gt; Union select the 2 remaining paths Path \u0026gt; Exclusion This removes a lot of useless nodes:\nBefore simplification-\u0026gt; After simplification.\nYou could probably simplify a few steps earlier, or do it a different way.\nStep 6: Laser Cut It takes about 8 minutes to cut these paths:\nFinished Product The final product looks like:\n140mm x 100mm print on Laserbox\nIts layers look like:\nThoughts The final result looks fine, but it resembles a crow more than a Tui. This was a quick experiment, but I would like to try and improve the final product.\nA few things that I would like to try:\nSpend more time simplifying and optimising the paths. I think I could get much better results and remove unneeded detail in steps 1 and 2. Bigger images and more layers should improve detail. Try different materials like wood or acrylic. Etch some detail onto the card as well as cutting. Experiment with more colours and shades. I replaced the dark blue with a purple and I think it looks better: If you have any suggestions or improvements to this process please let me know. I am sure there are better ways of doing this kind of work out there. ","permalink":"https://maori.geek.nz/posts/2022/2022-12-21_layered-laser-cut-cardboard-art-with-inkscape/","summary":"\u003cp\u003eI really love art by \u003ca href=\"https://www.martintomsky.com/\"\u003eMartin Tomsky\u003c/a\u003e that is created with layers. 
I wanted to try something similar.\u003c/p\u003e\n\u003cp\u003eThe source image:\u003c/p\u003e\n\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2022/2022-12-21_layered-laser-cut-cardboard-art-with-inkscape/images/1.jpeg#layoutTextWidth\"\u003e\nTui by \u003ca href=\"https://www.flickr.com/photos/69029168@N00\"\u003eMatt Binns\u003c/a\u003e [\u003ca href=\"https://commons.wikimedia.org/wiki/File:Tui_on_flax.jpg\"\u003esource\u003c/a\u003e]\u003c/p\u003e\n\u003cp\u003eThe goal:\u003c/p\u003e\n\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2022/2022-12-21_layered-laser-cut-cardboard-art-with-inkscape/images/2.jpeg#layoutTextWidth\"\u003e\nThese are examples from previous attempts\u003c/p\u003e\n\u003ch4 id=\"step-1-preprocess\"\u003eStep 1: Preprocess\u003c/h4\u003e\n\u003cp\u003eFirst we take the source image, cut out the subject and adjust the colour to be very vibrant.\u003c/p\u003e\n\u003cp\u003eNote: \u003ccode\u003eCopy Subject\u003c/code\u003e in the Photos app and \u003ccode\u003eNew From Clipboard\u003c/code\u003e in macOS Preview are useful here.\u003c/p\u003e","title":"Layered Laser Cut Cardboard Art with Inkscape"},{"content":"This is the third post about my 2yo son’s Type 1 Diabetes (T1D). Previously, I wrote about his initial management coming out of the hospital, then his management with a Continuous Glucose Monitor (CGM). This post is about the last three months we have been using the Dana-i insulin pump with the CamAPS Hybrid-Closed-Loop (HCL) algorithm.\nObligatory: nothing here is medical advice.\nInsulin Pump\nThe pump we chose was the Dana-i pump. I wrote about the reasons we picked that pump here, but basically the Dana-i is small, remotely controllable and can be used with the CamAPS hybrid-closed-loop.\nOn its own the Dana-i pump is still a pretty good pump.
As well as a lot of the standard functionality (bolus calculator, basal rates, temporary basal…) it also can be remotely controlled from a phone via the AnyDana app:\nAnyDana-i app on iOS or Android\nNote: you cannot use the AnyDana app and CamAPS at the same time.\nSam has been on this pump for about 3 months; here are some quick pros/cons we have noticed using a pump vs MDI.\nSome Pros:\nNot as many needles (less stabbing). On MDI we injected Sam 8–10 times a day, now we change the infusion set once every 3 days. Remote bolus. We don’t have to catch him and hold him down to give insulin. Different basal rates can be set for every hour of the day and adjusted with temporary basal rates. Sam adapted quickly to having a pump strapped to him. We thought he would hate it, but after a few days he barely noticed it. More precise doses of insulin. On MDI we could only inject multiples of 0.5u at a time, on a pump we have 10x the resolution of 0.05u. We are more flexible with foods and times he eats. It is easy to give insulin, so we are less stressed about small snacks and new foods. Single insulin prescription. No more keeping track of basal insulin, bolus insulin and dilute insulin stocks. A few Cons:\nReplacing his infusion set and refilling insulin is very stressful. It is a very finicky task that requires one parent to refill the pump while the other looks after a toddler. The type of site is situation dependent. We need a short, steel needle, so we can use two; Soft or Easy **** release. Soft release is harder to place but can be easily disconnected, great for swimming days. Easy release is easier to place, lower profile so less likely to get ripped off, but more difficult to disconnect, so is good for more playful activities like trampolines. You don’t see the insulin go in. With MDI you see the needle inject the dose, but with a pump you are never 100% sure. You get a new stress in your life after the first site failure resulting in a massive high. 
Pump belts suck on a 2yo. We use a great pump belt from a facebook store called Bizy Lizy pump belts. But it took trying a bunch of different belts to find the right one. It can’t be too tight or it will annoy him, too loose and it will bounce off while on a trampoline. Lots of insulin waste. Sam is small so about a days worth of insulin is just in his tubing. When doing a site change you throw out a lot of insulin. Cost. Pumps are crazy expensive. Dana-i was about $5,000NZD and about $2,000NZD for a year’s supplies. There are some things that are really tradeoffs:\nMore control/More decisions. You can do more, so now you think of all the little tweaks you can make to improve. T1D is more hidden/visible. With remote bolus we no longer have to hold Sam down, pull out a needle and jab him with an insulin pen. But, he does have to carry his pump around with him everywhere. We have to carry less on short outings, more on long trips. On a pump you don’t need to take much if you go for a few hours. On long trips though we need to take spare sites, emergency needles, more insulin, setter, refiller… A CGM is basically all upside (except the cost), but using a pump is more complicated. The downsides are real and we sometimes do miss MDI, especially when we are worried a site might rip out while Sam is running around a playground.\nSam with his Dexcom G6 and pump belt\nThe biggest bonus for a pump is that it can be controlled by a loop algorithm like CamAPS. So let’s talk about that.### CamAPS Hybrid-Closed-Loop\nA closed-loop system is one where all insulin delivery is automated. A hybrid-closed-loop system is one where manual intervention is still required for food, but the basal (and to some extent corrections) are automated.\nCamAPS is a hybrid-closed-loop. We still have to count carbs, work out our own carb ratios, and pre-bolus for meals. 
It does manage his basal insulin and can help gently correct him if he is going high.\nRoman Hovorka, the main researcher behind CamAPS, is a mathematician who has been modelling how insulin works in patients since the 1980s [cite]. From the late 1990’s he has been building artificial pancreas algorithms with goals like improving outcomes and reducing stress for families managing T1D in young children [cite]. CamAPS development has now been going for nearly 2 decades and has a lot of clinical trials showing its value.\nCamAPS development [source]\nThe CamAPS FX app contains the algorithm that connects to Dexcom G6 and the Dana pump, and controls everything:\nCamAPS FX app\nThe main features are:\nAuto-mode: which turns the algorithm on and off Boost and Ease-off: which turns the aggressiveness of the algorithm up or down Personalised Glucose Targets: Change the target blood glucose level the algorithm is targeting at different times during the day. When auto-mode is on, CamAPS will adjust the basal rate to try to get your Blood Glucose Level (BGL) to the target glucose value. This looks like:\n[source]\nThe black dots are BGL values and the blue line is CamAPS adjusting the basal rate to compensate. It is really cool you can see the algorithm working.\nCamAPS will learn:\nYour body’s response to insulin: the time to activation, peak activation and your insulin sensitivity factor (ISF) throughout the day. Your body’s response to carbs: To an extent it will ramp up basal to cover carbs that you have consumed but not bolused for. Your schedule: it covers things like the dawn effect which happen about the same time every day. You still tell CamAPS:\nCarb ratios: CamAPS uses carb ratios you set on the pump. Glucose target: If you want to change the default 5.8 mmol/L. When to ease off or boost basal rates usually for exercise. You do not tell CamAPS how much to boost or ease off though like with temporary basal rates. 
Boluses for food: Still have to carb count and tell it how much you are eating. Un-bolused food: You need to tell CamAPS about carbs you don’t bolus for (e.g. hypo treatments) to make sure the algorithm has all the info needed to learn. Although you still have to do a lot with CamAPS, you are removing a lot of basal work that can be complicated.\nHow does CamAPS work?\nCamAPS works* by including many different models to predict things like BGL and ISF. At midnight each day, all the competing models are ranked according to how well they fit the last few days’ worth of data, and the models that fit that data best are used for the next day [cite].\nThis looks like a stacked generalisation (blended) approach which was most notably used to win the $1 million Netflix Prize in 2009 [cite]. Given everyone has different insulin needs, this seems like a pretty good approach. For example, you could have different models to predict BGL for toddlers and pregnant women, then CamAPS can choose the model that best suits you without you having to be asked.\n*: CamAPS is proprietary so this is just speculation based on public info. I cannot find the list of different models that are used by CamAPS or any more specific details about the algorithm. If you know anything more please point me in the right direction.\nUsing CamAPS For the first 10 days using CamAPS it is recommended that you do not turn on auto-mode, as it has to collect data for its models. The main problem during this time was that CamAPS does not have much functionality with auto-mode off, e.g. no temporary basal since boost and ease-off only work with auto-mode on.\nAfter 10 days we turned on auto-mode. The first day it was weak, the second day it was super aggressive, the third day it was weak again. It was clearly still tuning the models, so we were not trusting CamAPS too much.\nThe second and third weeks were better. We were learning that CamAPS was not a silver bullet, and when we needed to intervene.
We started to experiment a bit more with features like extended bolus, boost and ease-off.\nNow after 3 months of an algorithm being in charge of Sam’s pancreas, here are some pros:\nEasy to use. CamAPS makes it hard to screw up. If I wake up drowsy at 2am to correct a high, CamAPS makes it difficult to accidentally give too much insulin. Fewer decisions. CamAPS decides all the basal rates, no help from us. We still do lots of work around meals and give large corrections when needed. No more dawn effect. None. CamAPS has been able to catch and correct the dawn effect almost every day for months. Trust is earned. We are getting more trusting with the algorithm. This results in us widening our alarms, meaning we get more sleep. Personal glucose targets are great. This is the main lever we pull when trying to get the algorithm to do what we want. At the moment we have set high targets during sleep and lower targets during the day. A few Cons:\nFirst weeks are not good. You learn about CamAPS while CamAPS learns about you. No luck with extended meal bolus. There is a feature to have an extended meal bolus, but it has been useless for us. We end up just doing it manually. Still goes low and high overnight. The algorithm is not perfect, so there are nights where we have to intervene. Cost. CamAPS is a paid app and costs about $2,000NZD per year. Android APS is free. We chose CamAPS because of the amount of research on 1–5 year olds, but we are always considering other options (especially when AndroidAPS has features like SMS bolus).### Is CamAPS Better Than MDI? With Multiple-Daily-Injections (MDI) and a Continuous Glucose Monitor (CGM) Sam had good numbers, HbA1c 6.6% and time-in-range (TIR) 76%. But MDI is a lot of work! We consider CamAPS a success if we:\nhave the same or better HbA1c and TIR. get more sleep with fewer alarms and actions during the night. 
have more flexibility while making fewer decisions. TIR and GMI Let’s look at the numbers: Left: AGP graph from last month on MDI. Right: AGP graph from the last month on CamAPS\nYou can see CamAPS has completely removed the dawn effect and post-breakfast spike around 6am. The numbers are similar, but CamAPS is slightly better.\nGetting Better Sleep I wear a smart watch which tracks my sleep; let’s look at those numbers: Left: Sleep graph from last month on MDI. Right: Sleep graph from the last month on CamAPS\nThis shows my average sleep is about the same with MDI and CamAPS, slightly won by CamAPS (again).\nI also counted the number of times I had to wake up between 11pm and 4am:\nMDI: 20 times CamAPS: 17 times. Again a slight win to CamAPS.\nHaving More Flexibility and Making Fewer Decisions For CamAPS to be better than MDI, I want to think less about T1D. I don’t know how to measure this so it has to be subjective.\nWith all the negatives that come with using Dana-i with CamAPS, the big win here is sometimes you can just trust the algorithm and let it take the reins (even for just a minute). This sometimes gives a necessary breather.\nSo I think a win for CamAPS here as well.\nIs it a Success? Is moving to CamAPS with a Dana-i pump from MDI worth it? I think it is.\nCamAPS has given us a lot of small wins: slightly better TIR, slightly better sleep, slightly more flexible life. There are real drawbacks, so I am not sure every person would come to the same conclusion. Until something better comes along, Sam will continue having his pancreas replaced with CamAPS.\nI decided to tack on the end of this post the software setup we use with CamAPS.\nDexcom Follow and Xdrip Even though promised in multiple locations [cite, cite], Dexcom Follow integration with CamAPS has been delayed/postponed.
This is a massive issue for many, because it provides a solid and customisable alerting system CamAPS is lacking.\nCurrently, CamAPSs notifications are:\nSMS only, requiring cell reception (which is dodgy at our house) Not configurable on followers’ phone. All the limits and timings are setup on his phone, so changing alerts (e.g. day vs. night) is annoying, and different followers can’t have different alert thresholds. You can setup Dexcom Follow with CamAPS by using Xdrip. Xdrip can read the BGL from CamAPS if setup in “Companion mode” and then sends them to Dexcom.\nThis setup still has some problems, like:\nXdrip (currently) doesn\u0026rsquo;t let you add or remove Dexcom followers. I had to work around that with curl Because of how Xdrip is working there are gaps in the BGL readings (especially if there is no change in reading). Installing and setting up Xdrip is difficult CamAPS Sidekick A few other issues I had with CamAPS:\nThe phone falls asleep if CamAPS is open. When I am looking after Sam I usually prop up the phone so I can glance at his BGL. If the phone falls asleep I have to stop what I am doing to go check his BGL. It is difficult to find the exact timings of the active boluses and carbs. If you want to learn how insulin and carbs work, you need to know exact timings of when stuff happens. It doesn’t show the insulin-on-board from the basal. If I am going to manually correct Sam, I need to take into consideration how much the algorithm has been giving as well. This is why I made CamAPSSidekick as a small app that talks to diasend to display his data (without sleeping) on his (or any) phone.\nCamAPS Sidekick screen shot\nCamAPSSideKick is now how we read the data from CamAPS, it makes things just a little easier. I also have it on an old phone next to my bed, so I can easily see his numbers during the night.\nCamAPSSidekick is definitely not a polished app and it has a ton of features I want to add, but it has been useful for us. 
My next goal is to make it good enough that I think others can use it as well.\nIf you are interested in CamAPSSidekick please reach out, it will give me the motivation to keep improving it :)\nLinks Posts about CamAPS you should read:\nBionicWookie posts about CamAPS “Another closed-loop” and “Mixing CamAPS with xDrip+” Diabettech posts about CamAPS [1][2][3][4] Useful links:\nCamAPS Installation Guide [pdf] CamAPS Instruction Manual [pdf] CamAPS publications [zip file] Roman Hovorka talking about how CamAPS is actually implemented [video]. Study from Interviews of Parents of young children [1–7] using CamAPS [link] Review of parents’ mental health looking after T1D children [cite] Interview with Roman Hovorka creator of CamAPS [link] Medical device manufacturing information; it can cost $30 million to launch a medical device, where only 10% of that is spent on engineering costs ","permalink":"https://maori.geek.nz/posts/2022/2022-11-03_automating-my-sons-pancreas-with-camaps-and-danai-insulin-pump/","summary":"\u003cp\u003eThis is the third post about my 2yo son’s Type 1 Diabetes (\u003cstrong\u003eT1D\u003c/strong\u003e). Previously, I wrote about his \u003ca href=\"https://maori.geek.nz/the-unreasonable-math-of-type-1-diabetes-8c96bdf5b7fb\"\u003einitial management\u003c/a\u003e coming out of the hospital, then his \u003ca href=\"https://maori.geek.nz/6-months-as-a-full-time-pancreas-34ae09106293\"\u003emanagement with a Continuous Glucose Monitor (\u003cstrong\u003eCGM\u003c/strong\u003e)\u003c/a\u003e.
This post is about the last three months we have been using the \u003ca href=\"https://www.intuitivetherapeutics.co.nz/about/dana-i-insulin-pump\"\u003eDana-i insulin pump\u003c/a\u003e with the \u003ca href=\"https://camdiab.com/\"\u003eCamAPS Hybrid-Closed-Loop\u003c/a\u003e (\u003cstrong\u003eHCL\u003c/strong\u003e) algorithm.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eObligatory: nothing here is medical advice.\u003c/em\u003e### Insulin Pump\u003c/p\u003e\n\u003cp\u003eThe pump we chose was the \u003ca href=\"https://www.intuitivetherapeutics.co.nz/about/dana-i-insulin-pump\"\u003eDana-i pump\u003c/a\u003e. I wrote about the reasons we picked that pump \u003ca href=\"https://maori.geek.nz/which-insulin-pump-to-choose-be57fc81a4ca\"\u003ehere\u003c/a\u003e, but basically the Dana-i is small, remotely controllable and can be used with the CamAPS hybrid-closed-loop.\u003c/p\u003e","title":"Automating my Son’s Pancreas with CamAPS and Dana-i Insulin Pump"},{"content":"Managing our Son’s Type 1 Diabetes with Multiple Daily Injections and Dexcom G6 Continuous Glucose Monitor In a previous post, I wrote about my 18 month old son’s (Sam) Type 1 Diabetes (T1D) diagnosis and the first month when we were learning as much as possible and struggling. That post was received with so much love and positivity, I have decided to follow up with a 6 month recap.\nSam has now had diabetes for a quarter of his life (18–24 months). We have spent these first 6 months learning to be Sam’s pancreas using Multiple Daily Injections (MDI) and the Dexcom G6 Continuous Glucose Monitor (CGM). This post is about what we are doing and what we are focused on. 
It will hopefully be helpful for people in a similar situation and serve as a reminder to ourselves of how far we have come managing T1D.\nObligatory: nothing here is medical advice.\nThe Dark Days: MDI + finger pricks\nAfter leaving the hospital, we managed Sam’s T1D with Multiple Daily Injections (MDI), a blood glucose meter (CareSens Dual) and an on-call nurse helping us. This was it for 2 weeks, from 21st Jan until 4th Feb. It did not go well.\nHere is what the final week looked like:\nThe insulin injections (left) are P for Protaphane and N for Novorapid. The Blood Glucose Level (BGL) readings (right) are from a meter and manual finger prick test.\nA quick BGL reminder: anything under 4 is low and under 3 is very low and can result in hypoglycemic seizure and death. Anything over 10 is high, over 15 is very high and, if untreated, can result in diabetic ketoacidosis and death. mmol/L is the unit we use.\nYou can see from this page that:\nSam was getting about 5 insulin injections and 10 BGL tests per day. Each injection and test requires stabbing him (a little). Sam had wild readings ranging from urgent lows (2.9) to massive highs (20.3) coming out of nowhere and remaining undetected for hours. We were so afraid of lows that we overreacted to every low with too many carbs. Also, we underreacted to highs with not enough insulin. We were stressed every time we gave him food or insulin. Feeling afraid to feed your child is not good for anyone. The squashed columns on the right are where I would test him multiple times overnight. The nighttime routine was:\ndo a BGL test at 10pm and 2am. A BGL test involved stabbing his finger to draw blood. If low I would give him some sugar. If very high I would give him insulin (more stabbing). If I gave him sugar or insulin, he would need a follow up test in about an hour to make sure it was working. The night finished when Sam woke up at 5am(!) wanting to play.
All the testing, treating, retesting and stabbing would result in nights like:\nManaging T1D with a manual glucose meter is hard. We were really trying and still getting bad and confusing results. We were always afraid that he would go low without detection and the worst possibilities were constantly playing in our minds. This was not sustainable.\nThe Less Dark Days: MDI + Dexcom G6\nThe Dexcom G6 Continuous Glucose Monitor (CGM) changed everything. I cannot overstate how much better managing diabetes is with a CGM.\nSam at the beach with his Dexcom G6\nWith a manual blood glucose meter we would have 10 BGL readings a day; with a G6 we get 288(!): a reading every 5 minutes, 24/7. Also, we only have to stab Sam once every 10 days to replace the CGM, so that is 2,880 times fewer stabs per reading than manual testing.\n“Quantity has a quality all its own”; the amount of data that a CGM produces isn’t just more of the same data you get from a manual blood glucose meter. Let’s look at the difference:\nData from manual meter readings\nData from CGM\nThe additional data shows you exactly what is happening and allows you to learn and adapt with that knowledge.\nIt is also MUCH safer! Do you see that 2.2 mmol/L reading happening at midnight? Undetected lows are what we are most scared of. Even if I woke up every night and tested at 10pm and 2am, I would have still missed it by 2 hours!\nA CGM fundamentally changed the way we managed diabetes by:\nseeing BGL at any time. Among other things, this lets us quickly answer the question “is this tantrum because he is low?” without needing to hold down and stab a screaming toddler. showing the direction that BGL is going. 10mmol/L and going up means I might need to correct, 10mmol/L and going down means everything is going to plan. notifying us if BGL is too high or low. I have multiple alarms set to wake me if I need to do something. The last thing I do before going to bed each night is make sure the alarms are set correctly.
showing Time-In-Range (TIR), the % of the day spent between 3.9–10 mmol/L. This is the main metric we try to improve; a good day is 70%, our best day is 96%. showing patterns of how insulin and food affect BGL. CGMs show you directly what food and insulin do to your BGL, so you can better adapt.\nLong term T1D\nHaving a CGM has made us less worried about short term T1D problems. Now we have more time to worry about the long term complications like going blind (retinopathy), losing limbs (amputation) and organ failure (heart and kidney failure). This is a different kind of worry.\nA lot of research has shown HbA1c is the best predictor of long term effects of T1D.\nWhat is HbA1c? Glucose in the bloodstream will sometimes randomly link (glycate) with a red blood cell’s haemoglobin (Hb) creating “Glycated haemoglobin” or HbA1c. The more glucose there is in the bloodstream the more likely this linking will happen. A red blood cell is in circulation for about 120 days, so the amount of HbA1c in the bloodstream can be used to approximate average blood glucose level over that time. More on HbA1c here.\nIn the Diabetes Control and Complications Trial (DCCT) that involved 1,441 type 1 diabetics for 6 years (1983–1989), a high HbA1c was the best predictor for the long term effects of T1D. For example, every 1% drop in HbA1c gives a 30% drop in the chance of developing retinopathy [cite].\nSam’s HbA1c when on MDI was good:\nFinal HbA1c results for MDI\nI am really proud of 6.6%. It is well under the 7% that the 2020 guidelines from the American Diabetes Association (ADA) recommend for children.\nHowever, according to one study only about 10% of kids from 2–5 are under that 7% guideline. Why are so few kids in the recommended range? Socioeconomic status, parental education [cite] and use of a CGM [cite] are all related to better HbA1c results.\nSam’s good HbA1c results are because my wife and I can take care of Sam full time and we can afford a CGM.
I want to make it very clear that HbA1c results, like many things, are a measure of privilege and not of a lack of willpower or a moral failing. This disease hurts people more if they are already struggling, and that is just one more shitty thing about T1D.\nHbA1c is useful, though it is not the full story. For example, here are three people with an HbA1c of 7%:\n[source]\nAs you can see, a person can maintain an HbA1c of 7% but still have large BGL swings (Glucose Variability, GV) and low time-in-range (TIR). Both bad GV and TIR have been linked to health complications [cite, cite]. For example, here is the relationship between TIR and microalbuminuria (a sign of kidney disease):\nMicroalbuminuria is a sign of kidney disease\nThere is no medical test that can approximate GV or TIR, so GV and TIR have to be calculated from the CGM data. Here is Sam’s TIR and GV:\nDexcom Clarity results for last 90 Days on MDI + Dexcom.\nThis shows his:\nGlucose Variability (GV), measured as standard deviation, is 2.7 mmol/L. GMI, which is an approximation of the HbA1c, was 6.7%. So pretty close to the actual 6.6% result. Time-In-Range (TIR) is 76%, with 21% above range and 3% below. This is within the recommended range [cite] of over 70% TIR, less than 25% above range and less than 4% below. This is all more good news for Sam.\nAlthough Sam has a good TIR now, we definitely didn’t start there. A core benefit of a CGM is that it gives you the power to improve. You can actually see us getting better at managing T1D by looking at Sam’s TIR over many months:\nSam’s TIR over time\nThe biggest change we made to improve his TIR was “the loop” (which I discussed in my previous post).\nThe CGM Loop (aka Pancreas Job Description)\nThe CGM loop we use has been described as Sugar Surfing or bump and nudge.
The idea is to observe the BGL patterns and adapt your strategies to get blood sugars to where you want them.\nFirstly, using the CGM and experimenting with food and insulin, we were able to approximate his ratios as being:\nCarbohydrate Ratio (CR) is 1:20, i.e. 1 unit of insulin for 20g of carbs Insulin Sensitivity Factor (ISF) is 1:10, i.e. 1 unit of insulin will drop Sam’s BGL by 10 mmol/L ISF:CR is 10:20 or 1:2, i.e. 2g of carbs will raise Sam’s BGL by 1 mmol/L Note: These are only approximations because his insulin sensitivity changes a lot during the day; typically for dinner we give 10–20% less insulin.\nApplying those ratios is not that easy though, as our insulin pens (used to inject insulin) were not made for kids as small as Sam. They can only deliver at 0.5u intervals, so we can:\ncorrect for 5, 10 or 15 mmol/L with 0.5u, 1u, 1.5u of insulin give meal insulin for 10, 20 or 30g of carbs with 0.5u, 1u, 1.5u Since we can give finer grained carbs, e.g. a 2g gummy, we tend to round up the insulin doses and then bring him up later with small snacks. This makes our Meal Time Loop look like:\nLook at BGL. Add 0.5u to correction insulin for every 5 mmol/L we want to bring him down. Calculate the meal insulin by doing a quick carb count (and rounding up to the nearest 10g), then add 0.5u of insulin to meal insulin for every 10g of carbs. Inject the meal insulin + correction insulin units. However, if the meal dose is more than 1.5u we will split it into two injections and give the second one as he starts to eat. Anything more than 1.5u drops his BGL too fast. Wait until he is about 6 mmol/L OR for 20 minutes OR until he starts to drop quickly OR he starts demanding food before giving him the food. This is a gut call based on the situation. Eat the food. If he doesn’t eat everything, keep an extra close eye on his BGL. Repeat for every meal. There is a lot of inaccuracy in the above loop; we are doing a lot of rounding and guessing.
This does mean that we have to always keep an eye on Sam’s BGL with the Monitoring Loop:\nGlance at his CGM data and look at BGL reading, direction and the rate of change. Predict his future BGL. Try and guess what Sam’s BGL will be in about 5, 10, 30, 60 minutes out. You have to consider the CGM lag (CGMs are 10 mins behind), rate of change, when he last had some carbs or insulin, activity level, temperature, time of day, and anything else that might affect his BGL. This is a shot in the dark. You do seem to get better at it though, since you are practising every 5 minutes. Calculate dose. Now that we have a guess of what he will be in the future we can change that by either giving some food or some insulin. Sam’s BGL will go up 1 mmol/L with 2gs of sugar or drop by 5 mmol/L with 0.5u of insulin. Do we dose? We will always give carbs if he risks going very low. Otherwise, we have to decide how confident we are in the above guesses to give him insulin or carbs. There is always the option to wait another 5 minutes to see if your guess was right. Dose. It is easier to give Sam sugar than insulin. Wait 5 minutes for the next reading. Go to the step 1 We run these loops all day, every day. This is what the daytime of being a diabetic is like, why replacing a pancreas is a full time job.\nAlso part of the job, every night you are oncall. At night, we rely on alarms to wake us if we need to do something. My alarms wake me when:\nHigh alarms (over 10mmol/L for an hour) Low alarm alert (under 5.6mmol/L) Urgent Low Alarm (under 3.1mmol/L) We use the Dexcom Follow app for alarms:\nDexcom Follow allows only limited alarm setup\nIf one of these alarms goes off, we first wake up then we run through the same steps as the monitoring loop. The only differences are:\nAny treatment might wake Sam up, so we will more likely wait and see. FYI: giving him sugar is more likely to wake him than insulin; it seems Sam can sleep through stabbing but not chewing. If it isn’t urgent, wait. 
If you can wait, wait. For example, if I wake up to the low alarm but his BGL looks flat, I will drop the alarm to be at 5.0mmol/L and go back to sleep. The Meal Loop, the Monitoring Loop and the Overnight Oncall are the major aspects of being a full time pancreas. The more irregular activities include maintaining a stock of supplies, hiding sugary treats everywhere, replacing needles and CGMs and the doctor’s visits.\nOnce all that is done, you might have time to do other things, like a full time job that pays you money.\n#### Tips and tricks\nThis is just a bunch of stuff that we wish we had known earlier.\nMeal Loop:\nSplitting up meal insulin into two shots (split bolusing) was one of our biggest improvements. If we gave him 2u for a meal he would drop too fast and it would make the timing very stressful. By splitting the doses we have managed a much more stable rise and drop of blood sugars, and it has allowed us some breathing room if the toddler decides that today he hates pasta (for whatever reason they decide these things). We say “prickle” before any finger prick or injection to calm him down. Sam doesn’t actually mind injections so much, he just hates being held still. Saying “prickle” lets him know that once we are done he can go run around again. Find ISF and CR ratios in the morning. The only time during the day a toddler has not eaten something is right after they wake up. The rest of the day is full of snacks and running around. The morning is the best time to dial in those ratios. Keto sometimes. If Sam is high but he is hungry or deserves a reward we usually turn to keto bread, keto gummies, keto jelly, or low carb kombucha (juice). Sam is not on a low carb diet, but keto food has allowed us to feed him without feeling guilty. Monitoring Loop:\nIdentify basal insulin patterns. We noticed 2–3 hours after Protaphane he would go super low. We switched to Lantus which would drop much slower after 4–5 hours. This increased Sam’s TIR a ton. Keep records. 
Having a good record of when insulin or carbs are given helps a ton in timing and identifying patterns. An iPhone display set up with night mode and “guided access” will not go to sleep, so I can glance at BGL anytime during the night. Note: guided access will stop alarms from working, so I use an old iPhone as a display. Overnight Oncall:\nUse diluted insulin. If Sam is 12mmol/L at bedtime, a pen would deliver too much insulin. We got diluted insulin which we can give accurately from a needle and which we use only for nighttime corrections. Rules of thumb for quick action. At 2am doing calculations is hard. I recommend simplifying the maths to easy to follow rules, like: if the low alarm goes off, give 4g of carbs, set an alarm for 30 mins, then go back to sleep. Sam being a bionic kid with his CGM. My wife and I are both working full time as Sam’s pancreas to give him the best chance at a long and healthy life. It takes a lot of time and effort but Sam is doing great. Without a CGM we would be back in the fearful, unsustainable dark days and Sam would undoubtedly be doing worse. Every person diagnosed with type 1 diabetes should walk out of the hospital with a free CGM. I hope this will be the case soon.\n#### Next Steps\nI started writing this post at the end of Sam being on MDI, so it is a little out of date. Sam is now on the Dana-I pump looping with CamAPS; I wrote about our pump selection here.\nI have already started to write about our experiences with CamAPS. For the impatient, CamAPS + Dana-i:\nis good. is not a silver bullet. requires xDrip to get Dexcom Follow alarms to work. doesn’t show the information I want, so I made a custom app I plan on releasing. gives us better results with less guessing, fewer decisions and fewer bad nights. 
Results from the last month on CamAPS loop with Dana-I pump The Future is Now ","permalink":"https://maori.geek.nz/posts/2022/2022-09-27_6-months-as-a-full-time-pancreas/","summary":"\u003ch4 id=\"managing-our-sons-type-1-diabetes-with-multiple-daily-injections-and-dexcom-g6-continuous-glucose-monitor\"\u003eManaging our Son’s Type 1 Diabetes with Multiple Daily Injections and Dexcom G6 Continuous Glucose Monitor\u003c/h4\u003e\n\u003cp\u003eIn a \u003ca href=\"https://maori.geek.nz/the-unreasonable-math-of-type-1-diabetes-8c96bdf5b7fb\"\u003eprevious post\u003c/a\u003e, I wrote about my 18 month old son’s (Sam) Type 1 Diabetes \u003cstrong\u003e(T1D)\u003c/strong\u003e diagnosis and the first month when we were learning as much as possible and struggling. That post was received with so much love and positivity, I have decided to follow up with a 6 month recap.\u003c/p\u003e\n\u003cp\u003eSam has now had diabetes for a quarter of his life (18–24 months). We have spent these first 6 months learning to be Sam’s pancreas using Multiple Daily Injections (\u003cstrong\u003eMDI\u003c/strong\u003e) and the Dexcom G6 Continuous Glucose Monitor (\u003cstrong\u003eCGM\u003c/strong\u003e). This post is about what we are doing and what we are focused on. It will hopefully be helpful for people in a similar situation and to serve as a reminder to ourselves how far we have come managing T1D.\u003c/p\u003e","title":"6 Months as a Full Time Pancreas"},{"content":"And what is NGSP, IFCC, GMI and eAG? What is HbA1c? Glucose in the bloodstream will sometimes randomly link (glycate) with a red blood cell’s haemoglobin (Hb) creating “Glycated haemoglobin” or HbA1c. The more glucose there is in the bloodstream the more likely this linking will happen. 
A red blood cell is in circulation for about 120 days, so the amount of HbA1c in the bloodstream can be used to approximate average blood glucose level over that time.\nDiabetics typically have more glucose in their blood, so will have a higher HbA1c. Having a high HbA1c increases the chance of developing negative long term effects of diabetes like retinopathy, amputation or organ failure [cite].\nSo what is HbA1c? A method to judge how well diabetes is being managed to predict long term health.\nThe rest of this post is some history, calculations and caveats of HbA1c.\nHere is an example HbA1c reading from a clinic:\nHbA1c reading from clinic.\n49 mmol/mol (IFCC) is the number of HbA1c molecules per 1000 Hb molecules, i.e. 49/1000 or 4.9%. 6.6% (NGSP aka DCCT) is the same but reported as a percentage. Wait! NGSP is 6.6% and IFCC is 4.9%. So which is correct? The short answer is:\nIFCC (2003) is a newer, more accurate measurement, and clinics are moving towards that as the recommended measure. NGSP (1996) is older and related to clinical outcomes rather than accuracy. The conversion is:\nNGSP = 0.09148 × IFCC + 2.152\nIFCC = 10.93 × NGSP - 23.50\nNGSP vs IFCC Why are these two measurements different?\nNGSP (formerly the “National Glycohemoglobin Standardization Program”) was tasked with defining a standard way to report HbA1c after the Diabetes Control and Complications Trial (DCCT) showed in 1993 that:\nHbA1c had a linear relationship with average glucose. HbA1c was the highest risk factor for complications of diabetes. Lowering HbA1c is worth the risk of potential hypoglycemia. Accepting a high HbA1c (hyperglycemia) was 3x riskier. The DCCT involved 1,441 type 1 diabetics for 6 years (1983–1989), and massively impacted the way we manage diabetes today [pdf].\nAt that time each method, manufacturer and individual machine that measured HbA1c was reporting different results, even from the same blood sample. 
This made any global HbA1c recommendations impossible.\nThe NGSP standardised each machine by calibrating them on the same reference values. The values are not 100% accurate, but consistency was the main goal, not accuracy. Now advice like “all toddlers should have an HbA1c of less than 7%” is useful no matter where you get your results from.\nThe International Federation of Clinical Chemistry (IFCC) measurement is more accurate. It doesn’t use a % measurement so it is not confused with NGSP.\n#### HbA1c to Estimated Average Glucose (eAG)\nRelationship shown between HbA1c and Average Glucose from DCCT (1993)\nIt doesn’t really matter if you use IFCC or NGSP; for a diabetic the most useful metric is the estimated average glucose (eAG). Blood Glucose Level (BGL) is the unit a diabetic interacts with every day, so is more readily understood.\nSince the DCCT showed a linear relationship between HbA1c and average blood glucose, eAG could be calculated with:\neAG(mg/dl) = 28.7 × NGSP − 46.7\neAG(mmol/l) = 1.59 × NGSP − 2.59\nThis comes out to a table like:\nOr another way to look at it:\nWhat is GMI? Example reading from Dexcom Clarity\nYou can also use average glucose levels to approximate HbA1c, called Glucose Management Indicator (GMI).\nModern Continuous Glucose Monitors (CGMs) like the Dexcom G6 give a glucose reading every 5 minutes 24/7. With the mean glucose calculated from the CGM, GMI is calculated with:\nGMI(NGSP) = 3.31 + 0.02392 × [mean glucose in mg/dL]\nGMI(NGSP) = 3.31 + 0.43056 × [mean glucose in mmol/L]\nNote: The GMI is only an approximation of HbA1c readings and not 100% accurate; as seen above, NGSP is 6.6% where GMI is 6.7%.\n#### Limitations and caveats of HbA1c?\nHbA1c is not enough to predict health. Large spikes and drops in BGL can also increase diabetes complications [cite]. These swings are not visible from the HbA1c alone, as it only reflects the mean value. 
To show this we need to also use other metrics like time-in-range or standard deviation, both of which a CGM can show.\nHbA1c readings can also be affected because:\nHbA1c is weighted towards more recent events because not all red blood cells live a full 120 days. So glucose levels for the previous 30 days will impact results more than the levels from 90–120 days. HbA1c can be different between people because of sex, age, and/or BMI. Chronic illnesses affect HbA1c if they impact red blood cells, e.g. sickle cell anemia red blood cells last only 10–20 days. Environment and behaviour affect HbA1c, e.g. inhaled carbon monoxide binds with red blood cells. So smokers, who inhale more carbon monoxide, have higher HbA1c levels. FYI: There are a ton of studies about the relationship between smoking and HbA1c [1], [2], [3]. All this means that HbA1c readings are a useful tool to predict long term health for diabetics, but:\nA bad reading does not mean you are doing badly overall.\nA good reading doesn’t mean that you will have no long term consequences of diabetes.\n#### Links\nLessons From the Diabetes Control and Complications Trial, Julio V. Santiago (1993). This is a great article. It explains why HbA1c measurement is the way it is and the research behind it, and I recommend it as a great read about the topic.\nThe NGSP website answers a lot of questions about HbA1c here http://www.ngsp.org. 
The IFCC webpage not so much.\nThis article lists a bunch of ways they test for HbA1c.\n","permalink":"https://maori.geek.nz/posts/2022/2022-09-19_what-is-hba1c/","summary":"\u003ch4 id=\"and-what-is-ngsp-ifcc-gmi-and-eag\"\u003eAnd what is NGSP, IFCC, GMI and eAG?\u003c/h4\u003e\n\u003cp\u003e\u003cstrong\u003eWhat is HbA1c?\u003c/strong\u003e Glucose in the bloodstream will sometimes randomly link (\u003cstrong\u003eglycate\u003c/strong\u003e) with a red blood cell’s haemoglobin (\u003cstrong\u003eHb\u003c/strong\u003e) creating “\u003cstrong\u003eGlycated haemoglobin”\u003c/strong\u003e or \u003cstrong\u003eHbA1c\u003c/strong\u003e. The more glucose there is in the bloodstream the more likely this linking will happen. A red blood cell is in circulation for about 120 days, so the amount of HbA1c in the bloodstream can be used to approximate average blood glucose level over that time.\u003c/p\u003e","title":"What is HbA1c?"},{"content":"While catching up with some friends in 2021, they nerd sniped me by talking about all the interesting scaling problems they were having at Clubhouse. I quickly found myself in a meeting with Rohan, Clubhouse’s cofounder, discussing all the other novel challenges that such an ambitious product has. Before I knew it, I was on the infra team at Clubhouse.\nI joined Clubhouse in June 2021 and have decided to leave in September 2022. This is a quick post about my work there.\nWhile at Clubhouse I mostly worked on:\nImproving scalability by building caches where needed, improving DB queries, and evaluating technologies to improve performance. Improving reliability by adding metrics and alerts, migrating tables from Postgres to DynamoDB. This was a massively successful project where I learnt a ton about data store performance. Adding static typing to the codebase with MyPy annotations. By getting to 100% we were able to quickly find and fix many potential issues. 
I also wrote a small script that used MyPy types to find places in the codebase where Postgres queries were executed. Improving CI by speeding up and expanding tests. This was done (in part) by migrating from CodeBuild to GitHub Actions. Rearchitecting and refactoring notifications code to help make experimentation easier. The notification code was very scalable, developed in the furnace of exponential growth. But it was a bit difficult to change and improve. Adding tests and refactoring made it much easier to run experiments that resulted in fewer but more impactful notifications. All these projects were team efforts. Mostly I was learning on the job how to solve these problems, so I heavily relied on the mentorship of colleagues. Being able to jump on a call with an expert and chat to them about a problem is a massive perk I will miss.\nI am leaving Clubhouse at the beginning of September 2022.\n2022 has been the worst year. My 2yo son was diagnosed with Type 1 Diabetes (T1D); my wife’s home country, Ukraine, was invaded by a foreign military; COVID-19 finally made its presence endemic in New Zealand.\nI am in a privileged position where I get to choose what I want to do. These events have altered my priorities significantly. So, I am going to spend more time with my family, more time managing and learning about T1D, and more time building and using skills to contribute to open source T1D projects like Nightscout, xDrip, and AndroidAPS (I have already started learning Android development). I am looking forward to seeing if I can contribute to the community we recently/reluctantly joined.\nThroughout my time at Clubhouse, everyone has been amazing and supportive. I want to especially thank the infra team, Luke and Jordan; my manager Mircea; my colleagues Moh, Michelle, RK and Bin; and the founders Paul and Rohan.\nI wholeheartedly recommend working with these people if you can. 
You will be on a team of experts that are also wonderful people.\n","permalink":"https://maori.geek.nz/posts/2022/2022-09-10_my-awesome-year-at-clubhouse/","summary":"\u003cp\u003eWhile catching up with some friends in 2021, they \u003ca href=\"https://xkcd.com/356/\"\u003enerd snipe\u003c/a\u003ed me by talking about all the interesting scaling problems they were having at Clubhouse. I quickly found myself in a meeting with Rohan, Clubhouse’s cofounder, discussing all the other novel challenges that such an ambitious product has. Before I knew it, I was on the infra team at Clubhouse.\u003c/p\u003e\n\u003cp\u003eI joined Clubhouse in June 2021 and have decided to leave in September 2022. This is a quick post about my work there. While at Clubhouse I mostly worked on:\u003c/p\u003e","title":"My Awesome Year (+) At Clubhouse"},{"content":"This is just a short set of commands that I used to add a Dexcom Share follower with curl. I am writing this mostly so I remember if I have to do it again, but also if someone else needs it.\nThis is based off the commands listed here.\nNote: use https://share1.dexcom.com if you have a US account and https://shareous1.dexcom.com for a non-US account.\nAuthenticate SESSION_ID=$(curl -v -H \u0026quot;Accept: application/json\u0026quot; \\ -H \u0026quot;Content-Type: application/json\u0026quot; \\ -H \u0026quot;User-Agent: Dexcom Share/3.0.2.11 CFNetwork/672.0.2 Darwin/14.0.0\u0026quot; \\ -X POST \\ \u0026quot;https://shareous1.dexcom.com/ShareWebServices/Services/General/LoginPublisherAccountByName\u0026quot; -d '{ \u0026quot;accountName\u0026quot;:\u0026quot;\u0026lt;username\u0026gt;\u0026quot;, \u0026quot;applicationId\u0026quot;:\u0026quot;d8665ade-9673-4e27-9ff6-92db4ce13d13\u0026quot;, \u0026quot;password\u0026quot;:\u0026quot;\u0026lt;password\u0026gt;\u0026quot; }' | tr -d '\u0026quot;')\nSESSION_ID is now used to authenticate the other calls like…\nList Current Subscribers curl -v -H \u0026quot;Accept: 
application/json\u0026quot; \\ -H \u0026quot;Content-Type: application/json\u0026quot; \\ -H \u0026quot;User-Agent: Dexcom Share/3.0.2.11 CFNetwork/672.0.2 Darwin/14.0.0\u0026quot; \\ -X POST \\ \u0026quot;https://shareous1.dexcom.com/ShareWebServices/Services/Publisher/ListPublisherAccountSubscriptions?sessionId=$SESSION_ID\u0026quot; | jq\nIf the subscriber is not there then you need to first create a contact…\nCreate a Contact CONTACT_ID=$(curl -v -H \u0026quot;Accept: application/json\u0026quot; \\ -H \u0026quot;Content-Type: application/json\u0026quot; \\ -H \u0026quot;User-Agent: Dexcom Share/3.0.2.11 CFNetwork/672.0.2 Darwin/14.0.0\u0026quot; \\ -X POST \\ \u0026quot;https://shareous1.dexcom.com/ShareWebServices/Services/Publisher/CreateContact?sessionId=$SESSION_ID\u0026amp;contactName=\u0026lt;name\u0026gt;\u0026amp;emailAddress=\u0026lt;email\u0026gt;\u0026quot; \\ | tr -d '\u0026quot;')\nThen you need to send an invite to that contact…\nSend Invite to Contact curl -v -H \u0026quot;Accept: application/json\u0026quot; \\ -H \u0026quot;Content-Type: application/json\u0026quot; \\ -H \u0026quot;User-Agent: Dexcom Share/3.0.2.11 CFNetwork/672.0.2 Darwin/14.0.0\u0026quot; \\ -X POST \\ \u0026quot;https://shareous1.dexcom.com/ShareWebServices/Services/Publisher/CreateSubscriptionInvitation?sessionId=$SESSION_ID\u0026amp;contactId=$CONTACT_ID\u0026quot; -d ' { \u0026quot;AlertSettings\u0026quot;: { \u0026quot;HighAlert\u0026quot;: { \u0026quot;MinValue\u0026quot;: 200, \u0026quot;AlarmDelay\u0026quot;: \u0026quot;PT1H\u0026quot;, \u0026quot;AlertType\u0026quot;: 1, \u0026quot;IsEnabled\u0026quot;: false, \u0026quot;RealarmDelay\u0026quot;: \u0026quot;PT2H\u0026quot;, \u0026quot;Sound\u0026quot;: \u0026quot;High.wav\u0026quot;, \u0026quot;MaxValue\u0026quot;: 401 }, \u0026quot;LowAlert\u0026quot;: { \u0026quot;MinValue\u0026quot;: 39, \u0026quot;AlarmDelay\u0026quot;: \u0026quot;PT30M\u0026quot;, \u0026quot;AlertType\u0026quot;: 2, \u0026quot;IsEnabled\u0026quot;: false, 
\u0026quot;RealarmDelay\u0026quot;: \u0026quot;PT2H\u0026quot;, \u0026quot;Sound\u0026quot;: \u0026quot;Low.wav\u0026quot;, \u0026quot;MaxValue\u0026quot;: 70 }, \u0026quot;FixedLowAlert\u0026quot;: { \u0026quot;MinValue\u0026quot;: 39, \u0026quot;AlarmDelay\u0026quot;: \u0026quot;PT0M\u0026quot;, \u0026quot;AlertType\u0026quot;: 3, \u0026quot;IsEnabled\u0026quot;: true, \u0026quot;RealarmDelay\u0026quot;: \u0026quot;PT30M\u0026quot;, \u0026quot;Sound\u0026quot;: \u0026quot;UrgentLow.wav\u0026quot;, \u0026quot;MaxValue\u0026quot;: 55 }, \u0026quot;NoDataAlert\u0026quot;: { \u0026quot;MinValue\u0026quot;: 39, \u0026quot;AlarmDelay\u0026quot;: \u0026quot;PT1H\u0026quot;, \u0026quot;AlertType\u0026quot;: 4, \u0026quot;IsEnabled\u0026quot;: false, \u0026quot;RealarmDelay\u0026quot;: \u0026quot;PT0M\u0026quot;, \u0026quot;Sound\u0026quot;: \u0026quot;NoData.wav\u0026quot;, \u0026quot;MaxValue\u0026quot;: 401 } }, \u0026quot;Permissions\u0026quot;: 1, \u0026quot;DisplayName\u0026quot;: \u0026quot;\u0026lt;display_name\u0026gt;\u0026quot; } '\nAfter this the contact should receive an email with the invite link to start following.For more info see https://gist.github.com/StephenBlackWasAlreadyTaken/adb0525344bedade1e25\n","permalink":"https://maori.geek.nz/posts/2022/2022-08-03_add-a-dexcom-share-follower-with-curl/","summary":"\u003cp\u003eThis is just a short set of commands that I used to add a dexcom share follower with curl. 
I am writing this mostly so I remember if I have to do it again, but also if someone else needs it.\u003c/p\u003e\n\u003cp\u003eThis is based off the commands listed \u003ca href=\"https://gist.github.com/StephenBlackWasAlreadyTaken/adb0525344bedade1e25\"\u003ehere\u003c/a\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eNote: use\u003c/em\u003e \u003ccode\u003ehttps://share1.dexcom.com\u003c/code\u003e \u003cem\u003eif you have a US account and\u003c/em\u003e \u003ccode\u003ehttps://shareous1.dexcom.com\u003c/code\u003e \u003cem\u003efor a non-US account.\u003c/em\u003e \u003cstrong\u003eAuthenticate\u003c/strong\u003e\n\u003ccode\u003eSESSION_ID=$(curl -v -H \u0026quot;Accept: application/json\u0026quot; \\   -H \u0026quot;Content-Type: application/json\u0026quot; \\   -H \u0026quot;User-Agent: Dexcom Share/3.0.2.11 CFNetwork/672.0.2 Darwin/14.0.0\u0026quot; \\   -X POST \\   \u0026quot;https://shareous1.dexcom.com/ShareWebServices/Services/General/LoginPublisherAccountByName\u0026quot; -d '{   \u0026quot;accountName\u0026quot;:\u0026quot;\u0026lt;username\u0026gt;\u0026quot;,   \u0026quot;applicationId\u0026quot;:\u0026quot;d8665ade-9673-4e27-9ff6-92db4ce13d13\u0026quot;,   \u0026quot;password\u0026quot;:\u0026quot;\u0026lt;password\u0026gt;\u0026quot;   }' | tr -d '\u0026quot;')\u003c/code\u003e\u003c/p\u003e","title":"Add a Dexcom Share Follower with curl"},{"content":"A quick guide to pumps available in New Zealand We are tired. Managing our 2 year old son’s (Sam) type 1 diabetes (T1D) with Multiple Daily Injections (MDI) is a lot of work. It’s normal to get up and give food or insulin multiple times a night, and it is necessary to give multiple injections at precise times during meals.\nWe need a better tool! The best tool available is an insulin pump. MDI is coarse, we give 8–10 giant doses of insulin per day. 
A pump, on the other hand, is fine grained, it can give hundreds of tiny doses a day.\nA pump can also be driven by an algorithm that can read Blood Glucose Levels (BGL) from a Continuous Glucose Monitor (CGM) and decide how much insulin is needed. The best algorithms available at the moment are called Hybrid Closed Loop (HCL). Hybrid because it still requires user input the amount of carbs eaten; Closed Loop because it automatically delivers insulin according to BGL.\nWith a pump we should be able to improve time-in-range (TIR), where blood sugars are between 3.9–10mmol/L, while reducing the amount of burden and stress on us as parents who are managing a T1D toddler.\nHowever, choosing a pump is a huge commitment. The cost to purchase is between $5,000-$10,000, additionally hundreds a month in ongoing consumables and support. If you are lucky and get a pump funded through insurance or healthcare, they won’t let you change pumps for at least 4 years, potentially forcing you to live with a bad choice for a long time.\nI am writing this post as a quick guide/review/justification for pump selection. This is focused towards a T1D toddler in New Zealand, but should be of some use to others as well.\n_This is not medical advice! Also, double check all the presented specifications/features because they change quickly and we have not had first hand experience with any of these pumps._The pump requirements for a T1D toddler are a little different than an adult:\nSmall and light: as the pump will be attached to Sam 24/7. He will break it if he gets annoyed with it or even notices it. Infusion set: the port where the insulin is administered should be small, hard to rip off, easy to apply, and with a small needle for small Sam. Typically, they need to be replaced every 3 days. Continuous glucose monitor: because of limited interoperability, choosing a pump is also choosing a CGM. Like the infusion sets, the CGM should be small and hard to rip off. 
The measurements will impact insulin dosing so it should also be accurate. Dexcom G6 (what we are with currently) is probably the best, but we have limited experience with others. Easy to use: we should be able to easily explain to a carer how to use it and perform the needed functions. Good safe guards as well, like maximum insulin on board and a child lock are needed. Remote monitoring/use: pumps are usually built to be used by the person they are attached to. With a toddler this is not the case. If he is sleeping or running around, it would be good to not have to wake him up or chase him to give him insulin. If we are sleeping, I want to be able to quickly check his BGL and be alerted if there is something wrong. We can never have a good nights sleep if we are not 100% confident we will be woken up if something is wrong. Smart algorithm: An HCL algorithm would remove a ton of work. Each algorithm is unique so comparisons are mostly about features and clinical evidence. Support and training: We need support and training for us and our diabetes team. If something is going wrong, we need to be able to troubleshoot. There are a few other features that would be nice:\nWaterproof for baths, showers or swimming\nNo phone required to operate the HCL. Given the pitiful range of bluetooth and the speed Sam can run, this might be a good idea.\nReplaceable batteries.\nFuture developments. With things moving so fast, being stuck on an old system for 4 years would be painful.In New Zealand, there are two (only two!) funded pumps and one unfunded pump that fits our criteria:\nTandem t:slim X2 + Dexcom G6 (funded)\nMedtronic 780G + Guardian 4 (funded)\nDana-I + Dexcom G6 (unfunded)#### Tandem t:slim X2 + Dexcom G6\nSmall and light: Very thin, but heavy: 79 x 51 x 15mm and 112 grams Infusion set: We were recommended the TruSteel infusion set with 6mm metal needles. The size may limit the location of the sites. CGM: Dexcom G6, our preferred CGM. 
Lasts 10 days, easy to apply, accurate. Easy to use: The touch screen is a bit clunky and easy to mis click on. It had a good child lock, and the menu seemed intuitive. Remote monitoring/use: No remote bolus. The pump connects directly to the Dexcom transmitter, so no phone is needed. You can continue to use the Dexcom app with the Follow and Clarity apps for alerts and metrics. The pump itself is not connected to the phone (yet), so seeing what the pump is doing requires physical access. Smart algorithm: Control-IQ, which recently made its way to New Zealand, is not licensed for under 6 year old (or under 25kg). Basal-IQ can be used with toddlers, it will suspend basal doses if his BGL gets too low. This is an excellent safety feature, but not as smart as an HCL. Support and training: Excellent support from the team at NZMS. Other notes:\nIt is not water proof Internal battery requires charging every few days To get history the pump needs to be plugged into a computer It is widely used in New Zealand and we talked with some parents using it and they love it. Tandem are working on remote bolus, but it will probably take a while to get to New Zealand Links\nProduct Page Manual Brown SA, Kovatchev BP, Raghinaru D et al (2019) Six-month randomized, multicenter trial of closed-loop control in type 1 diabetes: This is the longest randomised controlled closed-loop study to date, involving 168 people with type 1 diabetes (age ≥ 14 years) showing mean improvement of time-in-range of more than 10%.#### Medtronic 780G + Guardian 4 Small and light: bulky and awkward. 54 x 97 x 25mm and 106 grams Infusion set: 6mm steel needle site was recommended. CGM: Guardian 4 is a bit finicky to apply, only lasts 7 days. It takes longer to activate than Dexcom G6 and has slightly worse accuracy. It isn’t bad, it is just slightly worse in many areas to the G6. Easy to use: Physical buttons, no touch screen. **** Looked easy to use. 
Has a child lock, and inputting carbs was pretty straightforward. Remote monitoring/use: No remote bolus. The pump connects to an app which has a share feature so we can see what the pump is doing anywhere, and be alerted if something is wrong. Smart algorithm: SmartGuard HCL looks amazing. It has no weight restriction and is available for kids 12+ months old. It removes a ton of the work and setup and will learn insulin sensitivity. SmartGuard works without being connected to a phone. Support and training: It has excellent support and training through Intermed. Other notes:\nReplaceable AA batteries. Waterproof sites. Even if the pump is funded, the Guardian 4 CGM won’t be(!). So using SmartGuard will still cost money. Medtronic are working on remote bolusing and longer lasting infusion sets [link] but these won’t be available in New Zealand anytime soon. Links:\nManual Product Page Silva JD, Lepore G, Battelino T, et al. Real-World Performance of the MiniMed™ 780G System: First Report of Outcomes from 4120 Users. This shows solid time-in-range results with minimal hypos.\n#### Dana-I + Dexcom G6\nSmall and light: lightest pump, not as thin as t:slim but not awkward like 780G. 85mm x 44mm x 19mm and 86 grams. Infusion set: Comes with the smallest 4.5mm needles. This is tiny for tiny Sam. CGM: Dexcom G6, our preferred CGM. Easy to use: The pump itself has a very simple interface, but in our case most of the interactions will be with the CamAPS app (which looks amazing!). Remote monitoring/use: Remote bolusing through CamAPS!!! Alerts via SMS to up to 5 phones. Uploads the pump and CGM data to diasend every 5 minutes, the only way to remotely monitor Sam. xDrip works with CamAPS as a workaround for sending information to other locations like Nightscout. Smart Algorithm: CamAPS is available and looks amazing. It seems to have the best clinical results out of any commercial HCL. It has a lot of features, like boost, which would make parenting much easier. 
It is possible to use AndroidAPS but that doesn\u0026rsquo;t have an HCL in stable release (yet). Support and training: Intuitive Therapeutics is a small company that sells and provides pump support and CamAPS provides app support from UK. This pump is not funded so does not have widespread use in New Zealand, that is a risk. Other notes:\n$5,500 NZD for the pump. Add to that, about $160 a month in consumables, $130 a month for CamAPS, and $400 for Dexcom G6, the first year will cost about $14,000. That takes this pump out of reach for many. We have private insurance which will hopefully cover about 1/2 of that. Replaceable AAA batteries CamAPS is only available on Android, and the phone is required for operation. So keeping a phone in range is a must otherwise the pump will fall back to default settings. CamAPS has said they will sync to Dexcom in the future (though I found an article saying that it was promised in 2020). This would allow Dexcom integrations like the Follow and Clarity apps, as well as other. I think Dana-i with CamAPS is an amazing system. Not being funded means not as many people can use it.\nLinks:\nManual Dana-i Product Page CamAPS FX app page Webinar introducing CamAPS FX Review/Overview of CamAPS and One Month Followup Tauschmann M, Allen JM, Nagl K et al (2019) Home use of day-and-night hybrid closed-loop insulin delivery in very young children: A multicenter, 3-week, randomized trial. __ This shows CamAPS to be good for young children AND that diluted insulin provided no benefit.Given the above information, our order of preferred pump is: Dana-i with CamAPS + Dexcom G6. This has solid results for an HCL for 2 year olds. Has a super easy interface. Is slightly let down on the remote monitoring and alerting. The big negative here is no funding. MiniMed 780G. SmartGuard looks amazing and can be used by a 2 year old. The main negatives are the awkward shape and size of the pump to be carried by a toddler, and no remote bolusing or control. 
Also, Guardian 4 is just slightly worse than Dexcom G6 in most areas. t:slim x2. Control-IQ not being available for 2 year olds, and not being able to remotely monitor the pump are the big negatives. The size and shape are positives. If Sam was older and managing his own T1D we might have selected this pump. This is of course a personal choice. Others will have different priorities for their pumps, but hopefully this helps.\nTwo honourable mentions for pumps are:\nYpsoPump + Dexcom G6: Is a small light pump that has some cool features and glass cartridges so they can be pre-filled and stored in the fridge. The version sold in New Zealand is not the two-way communication model, but that newer model will work with CamAPS soon (very soon). Hopefully PharmaCo, who supply the YpsoPump, will upgrade their model to be CamAPS compatible soon. Omnipod 5 + Dexcom G6: This is our dream pump. It being tubeless, app controlled, with an HCL(!) would be awesome for Sam. It is just very new, and it doesn\u0026rsquo;t look like it will ever be available in New Zealand. Moving to USA/Australia just to get access to this has crossed our minds.\nOther links Table comparison of pumps [link] Clinical studies for CamAPS [here, here] Boughton, C.K., Hovorka, R. New closed-loop insulin systems (2021) discusses all three (CamAPS, Control IQ, SmartGuard) HCLs above. This is the best one stop discussion of the current state of HCLs. Table 1 comparison of HCL systems The Algorithm for Precision Medicine by Matt Might ","permalink":"https://maori.geek.nz/posts/2022/2022-06-03_which-insulin-pump-to-choose/","summary":"\u003ch4 id=\"a-quick-guide-to-pumps-available-in-new-zealand\"\u003eA quick guide to pumps available in New Zealand\u003c/h4\u003e\n\u003cp\u003eWe are tired. Managing our 2 year old son’s (Sam) type 1 diabetes (\u003cstrong\u003eT1D\u003c/strong\u003e) with Multiple Daily Injections (\u003cstrong\u003eMDI\u003c/strong\u003e) is a lot of work. 
It’s normal to get up and give food or insulin multiple times a night, and it is necessary to give multiple injections at precise times during meals.\u003c/p\u003e\n\u003cp\u003eWe need a better tool! The best tool available is an \u003cstrong\u003einsulin pump\u003c/strong\u003e. MDI is coarse, we give 8–10 giant doses of insulin per day. A pump, on the other hand, is fine grained, it can give hundreds of tiny doses a day.\u003c/p\u003e","title":"Which Insulin Pump to Choose?"},{"content":"This is not medical advice, don’t base any treatments on this.\nIn January 2022, our 18 month old son, Sam, was diagnosed with Type 1 Diabetes (T1D). This was stressful, sad, and scary as we spent 5 days in hospital with him while he recovered from Diabetic Ketoacidosis (DKA). Within an hour of him being diagnosed a wonderful diabetes nurse gave us a literal backpack filled with books and information we needed to learn to keep him alive. We started to read and try to understand what it takes to manage T1D. Immediately the massive cognitive overhead it takes to just survive with this condition hit us.\nI find the best way to learn something is to try to explain it to someone else. This post is me trying to explain the maths involved in managing T1D, with a few small rants about how shit it is.\nFood Go Up, Insulin Go Down Insulin is a molecule created by the pancreas that lets glucose from blood enter cells to be used as energy. T1D is an autoimmune disease where the immune system attacks the insulin-creating cells until the pancreas stops creating insulin altogether. T1D means you have a faulty pancreas; there is no cure, no diet that fixes it and you don’t grow out of it. It is a lifelong condition that you have to manage 24 hours a day.\nGlucose enters the blood when you eat basically anything, but especially carbohydrates. 
Without insulin, glucose will build up in the blood, eventually causing your body to enter a state called Diabetic Ketoacidosis (DKA), then coma, then death. Insulin must be added to lower the amount of glucose in the blood. Too much insulin and your glucose level will go too low and you go Hypoglycaemic (hypo), then coma, then death.\nManaging T1D is walking on a knife’s edge between DKA and Hypoglycaemia by balancing blood glucose levels with insulin.\nThe units of T1D math are:\nInsulin is measured in units (u), typically 100 units per 1ml. Blood Glucose Level (BG or BGL) is measured in mmol/L (in the USA it’s mg/dL, which is mmol/L*18). Carbohydrates are measured in grams (g). Successfully managing T1D means keeping BGL between 4–8 mmol/L (72–144 mg/dL). Between 4 and 8 is the goal, but a full day in that range almost never happens. The two main levers to achieve this goal are:\nEat carbs to make BGL go up. Inject insulin to make BGL go down. Those are the basics of managing T1D, but there is so much more.\nCarbohydrate Counting Eating carbohydrates increases blood glucose, so counting carbs to know how much you eat is a requirement. Fortunately, most food has a label like this bread:\nThis bread is 40.1g of carb per 100g (i.e. 40.1% carb) and 2 slices of the bread contain 25g of carb.\nCarbohydrates are absorbed by the body at different rates. For example, here is the rate at which pure glucose is absorbed compared to bread (glycemic response curve):\nDotted lines are glycemic response curves for different foods for someone without T1D [Bellman et al.]\nThe glycemic curves for both glucose and bread peak at 30-40 minutes, but glucose raises BGL much faster. That rise is measured using the Glycemic Index (GI), where a higher GI food raises BGL faster. 
For example, the GI for white bread is 70 and wheat bread is 50, so white bread will raise BGL faster than wheat.\nBy knowing the total carbs and the GI of a food, we get an idea of the glycemic response curve and the impact on BGL.\nInsulin Curves Managing T1D means injecting insulin to reduce blood glucose. There are different types of insulin with different rates of release:\nWith the “onset of action”, “peak of action” and “duration of action” we can get an idea about how these insulins work.\nA unit of any type of insulin is equivalent. This means we can mix and match the different types to get the curves we want. The most common way to mix insulins is called Basal-Bolus therapy:\na Basal is long lasting insulin (e.g. protaphane) taken 1–2 times a day. a Bolus is fast acting insulin (e.g. novorapid) taken with each meal. Adding a basal and bolus together we can get an insulin curve closer to the glycemic response:\nEverybody is different So now we know that carbs make BGL go up; insulin makes BGL go down. But by how much?\nBefore we can do any calculations, there are three variables we need to define relationships between (insulin, carbs and BGL). Since “How much insulin do I need?” is the most common T1D question, it is practical to relate carbs and BGL to units of insulin:\nThe Carbohydrate Ratio (CR) is the ratio of 1 unit of insulin to grams of carb, e.g. a ratio of 1:25 (or just 25) means that if you eat 25g of carbs (2 slices of bread) you need 1u of insulin. The Insulin Sensitivity Factor (ISF) is the ratio of 1 unit of insulin to mmol/L BGL, e.g. an ISF of 1:6 (or just 6) means if you take 1u of insulin your BGL will drop by 6 mmol/L. Since ISF and CR are both related to 1u of insulin, ISF:CR is the third relationship. With ISF and CR we can answer the common questions for T1D:\nHow much insulin do I need for a 35g carb meal? **grams of carbs/CR**, e.g. 35g of carbs/25 CR = 1.4u of insulin. How much insulin do I need to reduce my BGL by 3? 
**BGL / ISF**, e.g. with an ISF of 6 that is 3/6 = 0.5u of insulin. How many carbs do I need to raise my BGL by 3? **CR/ISF * BGL**, e.g. 25/6*3 = 12.5g carbs to raise blood sugar by 3.\nEvery person has a different ISF and CR. The first thing you need to manage T1D is to find out your ISF and CR. This is not straightforward. The best way to find them is guessing their values and slowly increasing or decreasing them until you find something that works. There are a couple of rules of thumb to make it easier though:\nThe 100 rule: take the mean total daily dose of insulin (TDD) from a few recent days and ISF=100/TDD. The 350/450/500 rule: same as above but for CR, e.g. CR=450/TDD. You just pick the rule that sounds about right. Another way to calculate these values is through self-experimentation. By eating carbs or taking insulin after fasting, we can look at the BGL change and derive ISF and CR. But this approach is still error-prone and difficult (especially for hungry toddlers).\nThe Loop The T1D loop to get the target BGL of 4–8 mmol/L is:\nCount the carbs you are about to eat, and make sure the glycemic index isn’t too high so the insulin can handle it. Measure BGL and if it is too high or low calculate a correction to add or remove insulin from the meal to bring BGL to within range. Calculate the correction dose with **(BGL-target BGL)/ISF=correction insulin**. Calculate the insulin units from the carbs with **grams of carbs/CR=meal insulin**. Inject the **meal insulin + correction insulin** units. Wait a bit (20–40mins) so we can match the peaks of the insulin with the glycemic response. The higher the food GI the more important the timing. Eat ALL the food. Repeat for every meal.\nSam being distracted from the injection with cheese\nA real example of this loop is Sam’s lunch today; Peanut butter on 2 slices of wheat bread:\nCount carbs: There are 25 grams of carbs in a peanut butter sandwich. Wheat bread has a low GI of 50, so nice slow glucose rise. 
Measure BGL: Get Sam’s BGL with a finger prick. It comes out at 12; we want him to be at 6. His ISF is 8, so the correction will be (12-6)/8=0.75u of insulin. Calculate the insulin: Sam has a CR of 20 so the 25g of carbs in the sandwich require 25/20=1.25u of insulin. Inject: So the total insulin is 1.25 + 0.75 = 2u of insulin for lunch. Wait: Given novorapid insulin peaks in about 1 hour, and wheat bread has a low GI, we give Sam the food in 15mins. Pro toddler tip: don’t show him the food until he can eat it, patience is not a toddler’s virtue. Eat: We make sure he eats all the carbs, so the math is correct. Repeat: start talking about what Sam is having for dinner. This is a lot of work. Now let’s see why this isn’t as straightforward as it seems.\nEverything is Hard. Nothing Makes Sense. WTF?\nA person managing T1D will also have to think about a ton of different ways the above calculations are affected.\nHere are some day-to-day things you need to take into account:\nExercise can make BGL go UP or DOWN! Generally high intensity makes it go up, endurance makes it go down. This can also change with age and metabolism. The liver produces glucose as well, which can mean BGL goes up even when not eating anything. Adrenaline from excitement, stress, anxiety can raise BGL. A hot bath or shower can raise (then lower) the measured BGL. This might be an incorrect reading, so may be dangerous to treat. Every food impacts BGL, e.g. a large low carb protein shake will quickly raise BGL. Sickness (and all the horrible symptoms) throws off your ISF and CR. Rule of thumb is to measure BGL twice as often when sick. ISF and CR change with time of day, e.g. many people have a higher ISF in the morning. Sleeping changes ISF and CR; a nap can really throw off your day. If your BGL is high for a while (with high levels of ketones) you will need more insulin. Rule of thumb is 1.5 times insulin. 
Soon after diagnosis there may be a “honeymoon” period where your pancreas can still produce some insulin so you need to inject less. This can last a few weeks or months. ISF and CR change as we age, especially around puberty. Each day might be different than the day before so we must constantly adjust and evaluate. Insulin types have crazy wide possible values, e.g. protaphane peaks between 3–12 hours (!!!). Planning for this is impossible and has resulted in a few hypos for Sam. Also managing T1D means you have to deal with practical and logistical concerns:\nCarb counting while out, say in a restaurant, is usually a total guess. Even if you break out scales and ask for ingredients. Insulin will lose potency if it gets too warm or too cold. It might not even look any different, so the only way you find out is to inject it and hope. Some insulins are a mixture that separates (e.g. protaphane). If you don’t premix them enough, their resulting curve will be off. But don’t shake too hard, that can do weird things to insulin and make it less effective. A person doing the above calculations may be suffering the effects of T1D while trying to fix themselves. One day we gave too much insulin to Sam and he dropped pretty low. For the next couple of days we were gun-shy of giving him too much, so didn’t give enough. Giving insulin is scary, not giving enough insulin is scary. The horror that is the US health system forcing some people to ration insulin because of its high price. Even if a person has health insurance the price of the insulin needed for a McDonald’s milkshake might be more than the shake itself. This is all made more difficult because Sam is only 18 months old:\nExplaining to a toddler that they can’t have food is impossible. Often we are doing all the math under duress of a screaming toddler. We have to inject the insulin BEFORE he eats and we don’t know if he will actually eat the food. 
Have you tried convincing a toddler to eat something they don’t want to? If that happens we quickly find and prepare some food with the equivalent carbs, otherwise he will go into a hypo. To actually measure BGL requires stabbing a finger and drawing blood, and injecting insulin is another stab. Causing pain to an infant who can’t understand what is going on is horrible. Sam is very small so his doses are minuscule. Our insulin pens can only dose at intervals of 0.5u of insulin; a significant % of his total dose. Overshooting and undershooting are common occurrences, causing wild swings in BGL. A symptom of being high or low is aggression and mood swings. A symptom of being a toddler is aggression and mood swings. Working out if a tantrum is because of diabetic Sam or toddler Sam is impossible. The final exception to note is that T1D is just a shit disease and it makes BGL do weird stuff all the time.\nHere is a poster explaining lots of these and more:\nThe Unreasonable Maths of T1D The worst part about it is that; If you are better at math you will live longer. Who makes a disease where the good math people live longer? — Scott Hanselman T1D\nI call the maths of T1D unreasonable because I am finding all this stuff difficult and stressful. Even though I know all the relevant numbers, even if I have all the information, even if I am comfortable with the calculations, even if all the factors are accounted for; when we give Sam food or insulin (and sometimes when we do nothing at all) his BGL changes in wildly unexpected ways.\nManaging T1D is hard, even with a ton of support. I cannot imagine doing this alone, or managing my own T1D, or dealing with the US health system/insurance. 
If you are dealing with the physiological and psychological issues related to T1D, I want you to know that it is hard, and you are doing a great job.\nMe and Sam in Wellington Hospital ICU\nEpilogue: Insulin Pumps + Continuous Glucose Monitors (CGM) = Artificial Pancreas The solution to all the unreasonableness of T1D maths is obviously to get a computer to do it. Doing hard calculations in a loop is what computers do best.\nTo automate the delivery of insulin we need three things:\nA Continuous Glucose Monitor (CGM), e.g. the Dexcom G6, reads the BGL of a person every 5 minutes and can report it to other devices. An Insulin Pump, e.g. the Omnipod, continuously injects tiny doses of insulin throughout the day. A system/algorithm, e.g. OpenAPS or Loop, that continuously reads the output from the CGM, predicts future BGL, and tells a pump to deliver doses of insulin. These three items together create an artificial pancreas, or closed loop system. This removes the need for a human to do most calculations; you just set the BGL value you want and let the system keep it there.\nSo why should you still learn all the unreasonable math?\nCGMs and insulin pumps are not always funded by New Zealand health care (even though they are highly recommended). A CGM can cost $400 a month, and a pump has a large upfront cost (e.g. $10k) and also a high monthly cost (e.g. $200-$800). Even though patient outcomes have been shown to improve with these devices, the cost puts them out of reach for a lot of people. They are subsidised in Australia. There are not many closed loop systems available in New Zealand; I have found the Medtronic 770G pump + Guardian Sensor 3 and the Tandem t:slim X2 + Dexcom G6. The other option is to DIY your own with open-source. Sam is 18 months old, and most CGMs and insulin pumps can only be used on much older kids with a minimum weight, e.g. 6 years and older and 25kg minimum for Tandem + Dexcom. 
So, we have to wait for Sam to get older and the technology to improve to work with younger kids. Over reliance on technology for your life, without any backup, is not a good idea. Pumps break, CGMs can be ripped off, software breaks, batteries die. Having a backup is always good.\nDexcom G6 on Sam’s hip\nWe did just get Sam a Dexcom G6 CGM, and we are loving it. Before, we used to measure Sam’s BGL at each meal, 3 hours after the meal and at 10pm and 2am to check he wasn’t going low in his sleep. Now we can check our phones for his BGL and will get a notification from Dexcom and Home Assistant if he goes too low or too high. This has improved everyone’s life.\nThanks\nThanks to the doctors, nurses and staff of the Wellington Hospital Emergency Department, ICU, children’s ward and diabetes unit. Thank you to the family, friends, and coworkers who have reached out with support and experiences with T1D. Finally, thanks to the people in the T1D community for all the years you advocated for better healthcare and technology so that when we showed up there were so many choices.\nUseful links Calculating and predicting blood glucose levels in T1D is an active area of research [here] and [here] [Glycemic index search index](https://glycemicindex.com/gi-search/) for finding the GI of different foods Carbohydrates and Blood Sugar Carbohydrate Counting in Children and Adolescents with Type 1 Diabetes (2018) Calculating Insulin Dose Scott Hanselman — Solving Diabetes with an Open Source Artificial Pancreas Detailed description of how OpenAPS, a closed loop artificial pancreas, makes its decisions Home Assistant Dexcom integration Loop’s algorithm for predicting glucose News/Studies related to New Zealand funding CGMs and pumps [here] and [here] ","permalink":"https://maori.geek.nz/posts/2022/2022-02-17_unreasonable-math-of-type-1-diabetes/","summary":"\u003cp\u003e\u003cem\u003eThis is not medical advice, don’t base any treatments on 
this.\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eIn January 2022, our 18 month old son, Sam, was diagnosed with Type 1 Diabetes (T1D). This was stressful, sad, and scary as we spent 5 days in hospital with him while he recovered from Diabetic Ketoacidosis (DKA). Within an hour of him being diagnosed a wonderful diabetes nurse gave us a literal backpack filled with books and information we needed to learn to keep him alive. We started to read and try to understand what it takes to manage T1D. Immediately the massive cognitive overhead it takes to just survive with this condition hit us.\u003c/p\u003e","title":"The Unreasonable Math of Type 1 Diabetes"},{"content":"I use Redis and Redis uses Lua as a scripting language. Today was my first day using Lua in anger, and I am still angry.\nMy problem is that in Redis I have many sets of keys, e.g. s1 = {1,2,3} s2 = {3,4} and keys 1=a, 2=b, 3=c, 4=d. I want to return all values of all keys in the union of given sets, e.g. f(s1 s2) = a b c d\nI could SUNION s1 s2 which returns 1 2 3 4, then MGET 1 2 3 4 to get a b c d. This is kind of wasteful because it is 2 round trips to Redis when I want to do it in 1.\nEnter Lua and EVAL.\nRedis lets you send along a Lua script that can do a bunch of stuff all at once; I can call Redis with: EVAL \u0026quot; local indexes = redis.call('SUNION', unpack(KEYS)) return redis.call('MGET', unpack(indexes)) \u0026quot; 2 s1 s2\nNow this will call SUNION on the input KEYS, which are s1 s2, and then return the MGET result. unpack here is basically just splat (* in Ruby and Python), taking the Lua table (array) and splitting it out into arguments for the call method.\nThere, we have done it, and everything works.\nOhhhh, nooooo. It just broke in production!?!\nSo what I found out was that unpack has a max size (about 8,000). So, if either the number of sets OR the number of keys is greater than 8K, Lua throws an error.\nSo now I have to write more Lua. 
What is interesting to know is that MGET is actually pretty slow. I don’t know why, but many GETs are faster than 1 MGET [cite]. That makes this bit easy at least: local indexes = redis.call('SUNION', unpack(KEYS)) local values = {} for i=1,#indexes do local value = redis.call('get', indexes[i]) table.insert(values, value) end return values\nNow onto the harder problem, the number of KEYS being above 8,000.\nSo, since we are calling from the client, we could just split it up there, e.g. if the client has 18,000 keys we just call the above script three times. But that gets us back to where we were initially, calling Redis multiple times.\nSo let’s do something like: local splitby = 8000 local indexes = {} if #KEYS \u0026lt;= splitby then indexes = redis.call('sunion', unpack(KEYS, 1, #KEYS) ) elseif #KEYS \u0026lt;= (splitby * 2) then indexes = redis.call('sunion', unpack(KEYS, 1, splitby), unpack(KEYS, splitby + 1, #KEYS) ) ...\nOh wait, this doesn’t work! When unpack is not the last argument in a function call IT ONLY SELECTS THE FIRST ELEMENT OF THE LIST. I will repeat that: it breaks silently, and only sends the first element of its list as an argument.\nThis is like 2 hours of my life, including filing a bug against Redis because I didn’t understand this weird behaviour. Lua stole 2 hours of my life. 
If you would like a better description of this and other unpack pains go here.\nSo here we go again, this time with the correct solution: local all_indexes = {} local step = 0 for i=1,1000 do if #KEYS == step then break end local next_step = step + 8000 if next_step \u0026gt; #KEYS then next_step = #KEYS end local indexes = redis.call('sunion', unpack(KEYS, step+1, next_step) ) table.insert(all_indexes, indexes) step = next_step end local values = {} local seen = {} for i=1,#all_indexes do local indexes = all_indexes[i] for j=1,#indexes do local getkey = indexes[j] if seen[getkey] ~= true then seen[getkey] = true local value = redis.call('get', getkey) table.insert(values, value) end end end return values\nFirst we break apart the incoming KEYS into parts and sunion them in batches, adding the results to the all_indexes table.\nThen we loop over that table of tables, and get each key, making sure not to get the same key twice.\nDone.\nTo summarise: because unpack has a size limit and weird behaviour, my simple TWO LINE SCRIPT is now at least 30 complicated lines.\nI don’t like calling Lua a bad language, it clearly has its place in the world. But Lua hurt me today, and I just wanted to share.\n","permalink":"https://maori.geek.nz/posts/2021/2021-09-27_my-troubles-with-lua/","summary":"\u003cp\u003eI use Redis and Redis uses Lua as a scripting language. Today was my first day using Lua in anger, and I am still angry. My problem is that in Redis I have many sets of keys, e.g. \u003ccode\u003es1 = {1,2,3} s2 = {3,4}\u003c/code\u003e and keys \u003ccode\u003e1=a, 2=b, 3=c, 4=d\u003c/code\u003e. I want to return all values of all keys in the union of given sets, e.g. \u003ccode\u003ef(s1 s2) = a b c d\u003c/code\u003e\u003c/p\u003e","title":"My Troubles with Lua"},{"content":"I got a new graphics card because I want to play around with some machine learning using TensorFlow and PyTorch. 
Before I jump into all those high level concepts, with layers and layers of abstractions, I want to understand a little bit more about CUDA and how it works. I usually learn by doing, so I decided to do something pretty easy, implement Game Of Life (GOL) using CUDA.\nGOL is a good use case for CUDA. It is a simple mathematical operation that can be massively parallelised. In this post I get a pretty fast implementation of GOL working in CUDA, and compare it against another implementation in Golang. TLDR: CUDA wins… by a lot.\nCUDA lets you compile code to be run on a graphics card. You write two types of functions in a C-like language, one for the CPU (host) and one for the graphics card (device). The code starts with a main function executed on the CPU and can call out to device kernel functions declared with __global__.\nWhen you call a kernel function, it is run many times on the same set of arguments. The only difference between each execution is the assigned thread and block positions, each either in 1D, 2D or 3D space. This looks like:\n2D layout of threads and blocks from here\nRead more about this execution model here.\nA block can hold a maximum of 1024 threads (e.g. in 2D space that would be a 32x32 square), and a block position can be up to 65535³ (so, a lot of blocks). When a kernel is called the number of blocks and threads is given after the function name with \u0026lt;\u0026lt;\u0026lt;blocks, threads\u0026gt;\u0026gt;\u0026gt;, e.g. addVector\u0026lt;\u0026lt;\u0026lt;5,10\u0026gt;\u0026gt;\u0026gt;(). CUDA will then schedule these 5 blocks, each with 10 threads, on the GPU.\nAt this point I should say that I based most of this code off of “Conway’s Game of Life on GPU using CUDA”, an excellent article that you should read.\nGOL maps really well to this execution model. Each cell can be calculated by a thread, and we divide the world up into 32x32 blocks of cells. 
To calculate x and y for each cell we use: x = threadIdx.x + (blockDim.x * blockIdx.x); y = threadIdx.y + (blockDim.y * blockIdx.y);\nthreadIdx and blockIdx are the x and y position of the thread and block and blockDim is the total number of threads in a block dimension (here 32). So in thread (2,3) in block (4,5), x = 2 + (32 * 4) = 130 and y = 3 + (32 * 5) = 163.\nThe last real complication of CUDA here is that you can only pass 1D arrays as arguments (kinda, I am not 100% sure, I am new to C). This means that once we have the x and y co-ordinates we need to map them onto a 1D array. The mapping I am using looks like [(x0,y0),...,(x31,y0),(x0,y1)...].\nFor all the calculations in GOL we first need to find the rows y-1, y, y+1. We already have y so now we need y-1 (or y_down) and y+1 (or y_up), wrapping around the world: y_up = (y + 1) % size y_down = (y - 1 + size) % size\nThen we can find the offset for these rows in the array by multiplying by the world size: y_offset = y * size y_up_offset = y_up * size y_down_offset = y_down * size\nFor example, if y=4 in a world size 32x32 then y_offset would be 4*32.\nWe then need the x_left and x_right values and the array offset of the cell itself: x_left = (x - 1 + size) % size x_right = (x + 1) % size offset = x + y_offset\nNow we can inspect all cells around (x,y): aliveCells = world[x_left + y_up_offset] + world[x + y_up_offset] + world[x_right + y_up_offset] + world[x_left + y_offset] + world[x_right + y_offset] + world[x_left + y_down_offset] + world[x + y_down_offset] + world[x_right + y_down_offset];\nAccording to the GOL rules, a live cell with 2 or 3 neighbours survives, a dead cell with exactly 3 neighbours becomes alive, and anything else dies: buffer_world[offset] = aliveCells == 3 || (aliveCells == 2 \u0026\u0026 world[offset]) ? 
1 : 0;\nWe assign the value into a different array called buffer_world, because all other threads are using the world for their calculations.\nThe kernel function arguments are then world, buffer_world and the world size (to calculate offset). The function then looks like: __global__ void game_of_life_turn( ubyte *world, ubyte *buffer_world, short size )\nThe size of the world will change the number of threads and blocks we need. To make it super simple we just require the number of threads to be 32,32 and the size of the world to be a multiple of that. This makes it easy to divide the world into blocks: dim3 threadsPerBlock(32, 32); uint blockDimSize = size / 32; dim3 numBlocks(blockDimSize, blockDimSize);\nTo then calculate a whole world we just run: for (turn = 0; turn \u0026lt; turns; turn++) { game_of_life_turn\u0026lt;\u0026lt;\u0026lt;numBlocks, threadsPerBlock\u0026gt;\u0026gt;\u0026gt;( device_world, device_buffer_world, size ); std::swap(device_world, device_buffer_world); }\nThere is a bunch of boilerplate stuff I missed like cudaMalloc and cudaMemcpy. But basically this is a working GOL implementation in CUDA!\nI want to compare this CUDA implementation with something executed on the CPU and chose to do that in Golang. Now you may be asking\nGraham, why are you comparing CUDA with Golang rather than just plain C?\nThe answer is that the only reason I am writing C is because I want to learn me some CUDA. And, typically, if I wanted to write something in C, I would just write it in Go. Learning the parallelism model in C also looks like a pain.\nTo get the Go implementation I copied over the CUDA implementation and replaced the parallelism with goroutines. A goroutine per cell is a bit expensive, so we chunk the world up by the number of CPUs the system has and then execute each chunk in parallel. 
concurrency := runtime.NumCPU() for i := 0; i \u0026lt; turns; i++ { gameOfLifeWorldTurn(world, worldBuffer, size, concurrency) world, worldBuffer = worldBuffer, world }\nWith the goroutine call looking like: func gameOfLifeWorldTurn(univ []int, buffer []int, size int, concurrency int) { var wg sync.WaitGroup wg.Add(concurrency) for job := 0; job \u0026lt; concurrency; job++ { go func(job int) { defer wg.Done() x1 := job * (size / concurrency) x2 := (job + 1) * (size / concurrency) for x := x1; x \u0026lt; x2; x++ { for y := 0; y \u0026lt; size; y++ { gameOfLifeCellTurn(x, y, univ, buffer, size) } } }(job) } wg.Wait() }\ngameOfLifeCellTurn is the same implementation as the CUDA function except it doesn\u0026rsquo;t need to calculate the x and y from the thread/block model.\nResults\nA good way to compare these implementations is in “millions of cells calculated per second”. Starting at a world size of 256x256 and incrementing by 256 each time, I ran each implementation for 10k turns. These are the results:\nSo Golang is hovering around 1 billion cells per second being processed, whereas CUDA, once it really gets going, is around the 80 billion mark.\nIn direct comparison with one another:\nThis shows that CUDA is 50 to 100 times faster than the Golang implementation. The CUDA implementation could even be further optimised by selecting better thread and block dimensions.\nCUDA is a beast. I think the programming model is intuitive, and pretty easy to pick up. The sharp edges are all in C with things like pointers. I was always ending up in unallocated space and breaking everywhere.\nUp next I want to see if I can soften some of those sharp edges by comparing the two implementations in this post to a Golang implementation that calls out to CUDA. 
This will hopefully mix the best of both worlds; CUDA speed with Golang for the easy implementation.\nAll code is available here\n","permalink":"https://maori.geek.nz/posts/2021/2021-08-16_game-of-life-cuda-vs-golang/","summary":"\u003cp\u003eI got a new graphics card because I want to play around with some machine learning using TensorFlow and PyTorch. Before I jump into all those high level concepts, with layers and layers of abstractions, I want to understand a little bit more about CUDA and how it works. I usually learn by doing, so I decided to do something pretty easy, implement Game Of Life (GOL) using CUDA.\u003c/p\u003e\n\u003cp\u003eGOL is a good use case for CUDA. It is a simple mathematical operation that can be massively parallelised. In this post I get a pretty fast implementation of GOL working in CUDA, and compare it against another implementation in Golang. TLDR: CUDA wins… by a lot. CUDA lets you compile code to be run on a graphics card. You write two types of functions in a C-like language, one for the CPU (\u003ccode\u003ehost\u003c/code\u003e) and one for the graphics card (\u003ccode\u003edevice\u003c/code\u003e). The code starts with a \u003ccode\u003emain\u003c/code\u003e function executed on the CPU and can call out to \u003ccode\u003edevice\u003c/code\u003e kernel functions declared with \u003ccode\u003e__global__\u003c/code\u003e.\u003c/p\u003e","title":"Game Of Life: CUDA vs Golang"},{"content":"There are a couple of different types of security; preventative will stop something bad from happening and detective will alert you when something bad does happen.\nI have a box that is SSH accessible on the internet that I want to secure. There are lots of good sources for preventative SSH security (e.g. no passwords, tight config…) but not many for detective security. 
To be more secure (and decrease my paranoia) I want to be notified when someone SSH’s into that box from the wide internet, and I usually read my email.\nThe first part is being able to send an email from a script. I could use any old SMTP server, but using an authorised one will decrease the chance my email goes straight to spam. I use Gmail’s SMTP server since it is pretty easy to set up.\nSSMTP is a pretty simple tool to send mail from the command line, it just needs a config like:\n# /etc/ssmtp/ssmtp.conf\n# Use Gmail SMTP\nmailhub=smtp.gmail.com:587\nUseSTARTTLS=YES\nhostname=localhost\n# Set up Gmail\nroot=\u0026lt;user@gmail.com\u0026gt;\nAuthUser=\u0026lt;user@gmail.com\u0026gt;\nAuthPass=\u0026lt;gmail_password\u0026gt;\n# Do not let SSMTP override FROM address\nFromLineOverride=NO\n\u0026lt;user@gmail.com\u0026gt; is the user that authenticates to the SMTP server. I recommend setting up a new Gmail account not used for anything else. \u0026lt;gmail_password\u0026gt; is either the normal Gmail password (no 2FA and you have to enable “less secure apps” access) or a generated “unique app password”. With this you can send a test email with:\necho \u0026quot;Test Email\u0026quot; | ssmtp -vv \u0026lt;your_email@gmail.com\u0026gt;\nNext is to set up a PAM script to be called when someone creates an SSH session. 
I followed this answer on askubuntu.com which I will outline below.\nCreate the file /etc/pam.d/email-on-login.sh:\n#!/bin/bash\nrecipient=\u0026quot;\u0026lt;your_email@gmail.com\u0026gt;\u0026quot;\n# Skip session close events and logins from the local network\nif [ \u0026quot;$PAM_TYPE\u0026quot; != \u0026quot;close_session\u0026quot; ] \u0026amp;\u0026amp; \\\n   [[ \u0026quot;$PAM_RHOST\u0026quot; != 10.0.0.* ]] ; then\n  host=\u0026quot;$(hostname)\u0026quot;\n  subject=\u0026quot;Subject: SSH Login: $PAM_USER from $PAM_RHOST on $host\u0026quot;\n  message=\u0026quot;You should check on that\u0026quot;\n  printf \u0026quot;$subject\\n$message\u0026quot; | ssmtp \u0026quot;$recipient\u0026quot;\nfi\nThis is the script to execute when an SSH session is opened. It sends an email on the right PAM_TYPE and if the address is not from the local network.\nTo execute this on each session, add to /etc/pam.d/sshd the line:\nsession optional pam_exec.so seteuid /etc/pam.d/email-on-login.sh\nThis was a post with very little detail in it, leaving many questions unanswered like:\nWhat is SMTP?\nWhat is PAM?\nHow many emails per second does it take to get banned from Gmail’s SMTP server?\nThey will remain unanswered unless you read one of these:\nPluggable Authentication Modules (PAM)\nSimple Mail Transfer Protocol (SMTP)\nGmail SMTP server limits (about 100 per day)\n","permalink":"https://maori.geek.nz/posts/2021/2021-07-19_get-an-email-if-someone-sshs-into-your-box/","summary":"\u003cp\u003eThere are a couple of different types of security; \u003cstrong\u003epreventative\u003c/strong\u003e will stop something bad from happening and \u003cstrong\u003edetective\u003c/strong\u003e will alert you when something bad does happen.\u003c/p\u003e\n\u003cp\u003eI have a box that is SSH accessible on the internet that I want to secure. There are lots of good sources for preventative SSH security (e.g. no passwords, tight config…) but not many for detective security. 
To be more secure (and decrease my paranoia) I want to be notified when someone SSH’s into that box from the wide internet, and I usually read my email. The first part is being able to send an email from a script. I could use any old SMTP server, but using an authorised one will decrease the chance my email goes straight to spam. I use Gmail’s SMTP server since it is pretty easy to set up.\u003c/p\u003e","title":"Get an Email if Someone SSHs into Your Box"},{"content":"AWS DynamoDB is popular because it is super fast \u0026amp; scalable. However, when using the Python client boto3 to fetch a large number of documents we started to notice some unexplained slowness. This was super annoying as some of our queries were taking 20s to process BUT the actual dynamo query returned in only 6s. 
/.../botocore/parsers.py:309(_parse_shape)\nSo it spent nearly 1/2 the time in the _parse_shape method in [botocore/parsers.py](https://github.com/boto/botocore/blob/develop/botocore/parsers.py#L799). This method (called by _handle_json_body) takes the already parsed JSON and looks for more suitable types. For Dynamo this is useless because it already has its own type system, and the payloads are huge so it does lots of recursion.\nLets monkey patch this method away and see what happens: # Monkey Patch from botocore.parsers import PROTOCOL_PARSERS parser = PROTOCOL_PARSERS.get(\u0026quot;json\u0026quot;) def new_fn(self, raw_body, shape): return self._parse_body_as_json(raw_body) setattr(parser, \u0026quot;_handle_json_body\u0026quot;, new_fn)\nThe speed without _parse_shape:\nWhat!?! Without _parse_shape boto3 actually outperforms Golang. So much wasted CPU time!\nUser beware! This monkey patch affects all boto3 clients you create and some (like KMS) do require this better typing. So beware before you ship this hack.This post is not saying boto3 or DynamoDB are bad, just that (especially for large amounts of documents) they are not well suited for one another.\nDAX (the DynamoDB Accelerator) uses a different parser and datatype (CBOR) and has its own client to avoid such issues. Maybe DynamoDB should have its own optimised client as well?Thanks to Bin and Jordan for helping with this post\n","permalink":"https://maori.geek.nz/posts/2021/2021-07-11_make-pythons-dynamodb-client-faster-with-this-one-simple-trick/","summary":"\u003cp\u003eAWS DynamoDB is popular because it is super fast \u0026amp; scalable. However, when using the Python client \u003ccode\u003eboto3\u003c/code\u003e to fetch a large number of documents we started to noticed some unexplained slowness. This was super annoying as some of our queries were taking 20s to process BUT the actual dynamo query returned in only 6s. 
So we wanted to find out where that 14s was going.Using \u003ca href=\"https://hub.docker.com/r/amazon/dynamodb-local\"\u003edynamodb-local\u003c/a\u003e I \u003ccode\u003eput\u003c/code\u003e 250,000 basic items:\n\u003ccode\u003ewith table.batch_writer() as batch:   for i in range(250000):   batch.put_item(Item={   'id': i,   'updated_at': i,   'name': \u0026quot;some data\u0026quot;,   'status': \u0026quot;some other stuff\u0026quot;   })\u003c/code\u003e\u003c/p\u003e","title":"Make Python’s DynamoDB client faster with this one simple trick"},{"content":"For the life of me I could not find a python decorator that memoizes methods on an instance. Something that replaces the pattern: class A: def x(self): if self._x is not None: return self._x self._x = something_difficult() return self._x\nSure there is [functools](https://docs.python.org/3.9/library/functools.html) [] (https://docs.python.org/3.9/library/functools.html)`[@cache](https://docs.python.org/3.9/library/functools.html)` [and] (https://docs.python.org/3.9/library/functools.html)`[@cached_property](https://docs.python.org/3.9/library/functools.html)` but these use a global lru_cache for everything. I want the cache to live and die with the instance.\nI am still pretty new to python, but slowly getting the hang of it. 
That being said I came up with this:\nimport inspect\nimport types\n\nclass memoize:\n    def __init__(self, func):\n        self.func = func\n        fargspec = inspect.getfullargspec(func)\n        if len(fargspec.args) != 1 or fargspec.args[0] != \u0026quot;self\u0026quot;:\n            raise TypeError(\u0026quot;@memoize must be (self)\u0026quot;)\n        # set key for this function\n        self.func_key = str(func)\n\n    def __get__(self, instance, cls):\n        if instance is None:\n            raise TypeError(\u0026quot;@memoize methods must be bound\u0026quot;)\n        # Set the cache\n        if not hasattr(instance, \u0026quot;_memoize_cache\u0026quot;):\n            setattr(instance, \u0026quot;_memoize_cache\u0026quot;, {})\n        # return function bound to instance\n        return types.MethodType(self, instance)\n\n    def __call__(self, *args, **kwargs):\n        # args[0] will always be self\n        instance = args[0]\n        cache = instance._memoize_cache\n        # return if cached\n        if self.func_key in cache:\n            return cache[self.func_key]\n        # calculate and cache\n        result = self.func(*args, **kwargs)\n        cache[self.func_key] = result\n        return result\nThe quick breakdown of the memoize decorator is:\n__init__ first makes sure the method has signature (self) and creates the func_key for the cache\n__get__ creates the instance cache if it does not exist, then returns the bound function\n__call__ gets the instance cache, returns the value if the function is already cached, otherwise calls the wrapped function and caches the result\nAnd now the function looks much nicer:\nclass A:\n    @memoize\n    def x(self):\n        return something_difficult()\nThis might be a bad idea for some reason (given that I couldn’t find anyone implementing this). Friendly comments are welcome :)\n","permalink":"https://maori.geek.nz/posts/2021/2021-06-30_python-decorator-to-memoize-instance-methods/","summary":"\u003cp\u003eFor the life of me I could not find a python decorator that memoizes methods on an instance. 
Something that replaces the pattern:\n\u003ccode\u003eclass A:   def x(self):   if self._x is not None:   return self._x   self._x = something_difficult()   return self._x\u003c/code\u003e\u003c/p\u003e\n\u003cp\u003eSure there is \u003ccode\u003e[functools](https://docs.python.org/3.9/library/functools.html)\u003c/code\u003e [] (\u003ca href=\"https://docs.python.org/3.9/library/functools.html%29%60%5B@cache%5D%28https://docs.python.org/3.9/library/functools.html%29%60\"\u003ehttps://docs.python.org/3.9/library/functools.html)`[@cache](https://docs.python.org/3.9/library/functools.html)`\u003c/a\u003e [and] (\u003ca href=\"https://docs.python.org/3.9/library/functools.html%29%60%5B@cached_property%5D%28https://docs.python.org/3.9/library/functools.html%29%60\"\u003ehttps://docs.python.org/3.9/library/functools.html)`[@cached_property](https://docs.python.org/3.9/library/functools.html)`\u003c/a\u003e but these use a global \u003ccode\u003elru_cache\u003c/code\u003e for everything. I want the cache to live and die with the instance.\u003c/p\u003e\n\u003cp\u003eI am still pretty new to python, but slowly getting the hang of it. That being said I came up with this:\n`class memoize:\u003cbr\u003e\ndef \u003cstrong\u003einit\u003c/strong\u003e(self, func):\u003cbr\u003e\nself.func = func\u003c/p\u003e","title":"Python Decorator to Memoize Instance Methods"},{"content":"So I have been doing a lot of Python and Django lately, but I miss Go. 
As an experiment I wanted to see how difficult it would be to directly call Golang functions from Python using my favourite build tool Bazel.\nI found this post, which explains how to build and connect the two manually:\nPython and Go : Part II - Extending Python With Go\nGreat now I am like 90% there!\nSo all you need is a Golang file main.go: package main import \u0026quot;C\u0026quot; import \u0026quot;fmt\u0026quot; //export hello func hello(inputC *C.char) *C.char { input := C.GoString(inputC) return C.CString(fmt.Sprintf(\u0026quot;Hello %s\u0026quot;, input)) } func main() {}\nThis just defines a method that accepts a C type and tags it to be exported.\nThe Python code looks like main.py: import ctypes so = ctypes.cdll.LoadLibrary('./_golib.so')``hello = so.hello hello.argtypes = [ctypes.c_char_p] hello.restype = ctypes.c_void_p``free = so.free free.argtypes = [ctypes.c_void_p]``ptr = hello('World'.encode('utf-8')) out = ctypes.string_at(ptr) free(ptr)``print(out.decode('utf-8'))\nThis code loads the Golang library, defines the method to call (and its free method that comes free with Golang lib), calls it, which prints out Hello World\nThe only special Bazel code is: go_library( name = \u0026quot;project_lib\u0026quot;, srcs = [\u0026quot;main.go\u0026quot;], **cgo = True,** importpath = \u0026quot;github.com/grahamjenson/bazel-python-to-golang\u0026quot;, )``go_binary( name = \u0026quot;project\u0026quot;, embed = [\u0026quot;:project_lib\u0026quot;], **linkmode=\u0026quot;c-shared\u0026quot;, out=\u0026quot;_golib.so\u0026quot;** )\nSetting the cgo and linkmode args, and writing the output file to a library name.\nI thought this post would be longer, but it turns out to be super easy!\nWellington\n","permalink":"https://maori.geek.nz/posts/2021/2021-06-20_calling-golang-functions-from-python-with-bazel/","summary":"\u003cp\u003eSo I have been doing a lot of Python and Django lately, but I miss Go. 
As an experiment I wanted to see how difficult it would be to directly call Golang functions from Python using my favourite build tool Bazel.\u003c/p\u003e\n\u003cp\u003eI found this post, which explains how to build and connect the two manually:\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"https://www.ardanlabs.com/blog/2020/07/extending-python-with-go.html\"\u003ePython and Go : Part II - Extending Python With Go\u003c/a\u003e\u003c/p\u003e\n\u003cp\u003eGreat now I am like 90% there!\u003c/p\u003e","title":"Calling Golang functions from Python with Bazel"},{"content":"I joined Coinbase March 28th 2016; my last day is April 2nd 2021. This is a quick write up about my experience on joining and working in the infrastructure team for 5 years at Coinbase, growing from 100 to over 1000 employees, from 3.5 million users to more than 40 million, and from trading $3 billion total to now often trading that in a single day.\nCoinbase in March 2016\nCoinbase in March 2021\nIn 2015, I got an email from a recruiter:\nI may have responded because of the personal nature of the email (mentioning my GER project), but TBH I have no idea why I responded. I emailed back that I was more into DevOps work which they also had positions for. So we set up some online interviews.\nI didn’t really need this job so I was super relaxed during the interviews, mostly I just had a good time solving interesting problems. One question was to implement the Mandelbrot set which I did in Ruby, and another was some data munging problem I did in Javascript. In one interview Brian Armstrong dropped by to ask a few questions, including if I was a Coinbase user. After saying “no” he sent me $15 of Bitcoin (he used to do that with everyone).\nI think because I had fun and demonstrated some skill with multiple languages they decided to fly me (and my wife) to SF for a week long interview called a work-trial.\nAt the work-trial I was given a project to setup a third-party code-coverage tool for Ruby. 
That quickly got derailed when I realized the tool wouldn’t scale to Coinbase’s large codebase, so I started looking around for other problems to solve. I found a way to speed up some test times, worked with an engineer on a visualization, and also tried to help out (i.e. flailed around) during a few DDOS incidents.\nHanging out in a cool SF office with a ton of talented engineers building high stakes applications was exhilarating. There is nothing like being in a room full of experts working hard to solve an immediate problem. I think my willingness to jump in, help and to find useful things to do when stuck is why I got an offer at the end of the week.Ultimately, I decided to join Coinbase for a few reasons. First, Rob Witoff (then director of infrastructure) laid out a vision of an organization I wanted to be a part of. Next, I saw Brian Armstrong \u0026amp; Fred Ehrsam as a good team; Brian seemed like a steady hand in rocky seas and Fred was so enthusiastic about crypto and the company, his optimism was infectious. Finally, every person I talked to, or worked with, was friendly and fully engaged with their work.\nI don’t think I could have gotten that kind of insight (and might not have joined) if not for the work-trial. Unfortunately, work-trials don’t scale to a large company so they are not part of the interview process anymore.\nReally the only thing that I wasn’t sure about was moving to the United States. New Zealand is home; family, friends, stability, safety\u0026hellip; But San Francisco isn’t that far away, it’s only a 12 hour flight. I could get on a flight in the evening and have breakfast at home the next day.I moved to the US in March 2016. It was a pretty stressful time. I started on the “infra \u0026amp; security” team that had 2 infra engineers. 
I liked the team and heavily relied on my new colleagues for advice about all the complexities of living in the US (SSN, banks, health insurance…) and for onboarding me into the company.\nWithin a month I was the only member left on the infra team. To be clear, I was not the first infra engineer, but for a short time I was the only one (with the exception of a wild Lian).\nBeing the only FTE infra engineer meant that I had a lot of say in building the new team. We interviewed a lot of great candidates and grew the team to 7 people in 18 months. I think helping build that team was probably my greatest impact and proudest achievement at Coinbase. They are all wonderful, talented people and I would work with any of them again.\nMy US visa ran out after 18 months in September 2017, so I moved back to New Zealand for a bit. Then to the office in the UK (where I could get a visa) for a little over a year. As soon as I got another visa, I went back to San Francisco in September 2019. By the time I got back to the US the team had grown from 7 to 30-ish. Then the pandemic. Now, after another year it’s 60-ish. That is crazy growth.\n#### What did I do there?\nOn my first day at Coinbase Rob and I sat in a room and wrote down what my first project was: “Decrease the time a developer takes to go from commit to production”. So I did that for 5 years.\nI started by improving the way AWS resources were provisioned with GeoEngineer, a Ruby DSL that maps to Terraform with additional validations and Coinbase-specific logic. The goal was to decrease the time it takes to provision/manage resources for projects. In 2018, we started migrating away from the Ruby DSL to a YAML format called GPS to better standardize. It has been written and spoken about here, here, here, and here.\nNext, I moved on to extending and building the code-review system Heimdall. Heimdall decides if a Git SHA in a repository was reviewed, allowing it to be deployed. 
I talked about Heimdall here and here, and it is discussed here (but it’s called Sauron).\nI was the primary maintainer/builder for Codeflow, the paved-road for engineers to deploy code. If Codeflow was down, nothing would get released, so it had to be reliable and secure. I talk about Codeflow here.\nWe were having reliability issues with existing deployers and developers wanted more options, so I built and open-sourced Odin and Fenrir. Odin deploys to Auto Scaling Groups and Fenrir to serverless Lambda. I have talked about these deployers here, here, here, here and here.\nMonorepo is the last big project I worked on at Coinbase. Codeflow requires one git repository per project, whereas a monorepo can have many projects in a repo. Currently the Coinbase monorepo has hundreds of projects implemented in Golang, Ruby, Python, and JavaScript. The monorepo has been written about here and here. A monorepo is a large multi-org effort involving many engineers adopting new technologies and working together. But I am most proud that I got the first commit to start with 4c0ffee.\nI also helped out with scaling applications, managing blockr.io, automated documentation, slack integrations, CI servers, removing legacy services, responding to incidents, docker build pipelines, security scanning, disaster recovery planning, mentoring, interviewing… 5 years is a long time.\n#### Why am I leaving?\nI want to stay in New Zealand. I would love to continue to work at Coinbase, the people are amazing, the problems are interesting. However, moving back to the US isn’t an option right now. The distance is not a 12 hour flight anymore, now it’s a 2 week stint in managed isolation with an infant. That is too far to be from family and friends. 
This may only be the case for another 6 months-ish, but that is too long to live in limbo.\nLeaving San Francisco, November 2020\n#### What is next\nI think I am going to enjoy New Zealand with my friends and family for a few months before looking for something else. Whatever I do next, I will probably do for the same reasons I joined Coinbase: working with cool people on hard/fun problems.\nThis post in no way encapsulates all the fun and memories I had at Coinbase, which are too numerous to count.\nTo all the engineers, designers, managers, recruiters, operations, staff, and folks I worked with: Cheers for the awesome 5 years :)\nWanaka, February 2021\n","permalink":"https://maori.geek.nz/posts/2021/2021-04-03_5-awesome-years-at-coinbase/","summary":"\u003cp\u003eI joined Coinbase March 28th 2016; my last day is April 2nd 2021. This is a quick write up about my experience on joining and working in the infrastructure team for 5 years at Coinbase, growing from 100 to over 1000 employees, from 3.5 million users to more than 40 million, and from trading $3 billion \u003cstrong\u003etotal\u003c/strong\u003e to now often trading that in a single day.\u003c/p\u003e\n\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2021/2021-04-03_5-awesome-years-at-coinbase/images/1.png#layoutTextWidth\"\u003e\nCoinbase in March 2016\u003c/p\u003e","title":"5 Awesome Years at Coinbase"},{"content":" Everyone has a different path to where they are today. At inflection points in my career I often think back to the jobs that have got me here. I just wanted to list all my paid jobs to reflect on that journey.\nCandy Mixer at Four Square (1995–96): A family friend owned a dairy (corner store for non-kiwis) and they needed someone to make the pre-packed $1–2 mixed candy bags. Paid $5 an hour and free candy.\nPicked Beans in a Field (2004): needed a job for beer money and crawling in the dirt in the middle of summer is definitely work. 
Paid per kg so usually under minimum wage, around $8 per hour.\nDish Washer in Gisborne Hospital (2004–05): Over the Christmas holidays the hospital needed someone to wash the evening meal dishes. It felt good to start off with a giant mess and clean it up over a few hours. Paid $9 per hour and got free dinner (but hospital food).\nHouse Mover for NZVanLines (2005): Moved people in and out of houses around the Gisborne area. What I learnt was clothes dryers are usually left behind, Americans tip really well, and movers play Tetris all day long. Paid $9 per hour, but sometimes got free pies and beer.\nHouse Painter (2005): painted a few houses in summer holidays. Sun up to sun down, hard work, wear sunscreen. $9 per hour.\nTurners and Growers Fruit and Vegetable Distribution Center (2005): I worked the evening shift, organizing boxes full of food to be picked up by retailers. The people there worked hard but always looked for fun in the tedium. I lifted heavy boxes 5–8 hours, 3 nights a week, for 8 months while mostly eating fruit, my healthiest job. $11.50 per hour.\nTutor/Marker at Massey University (2005–2012): in the courses where I got good marks I was allowed to be a lab tutor and/or marker. The best way I have found to make sure I know something is to stand up in front of 100 people and teach them. $15 per hour.\nConference Organizer (2011): A PhD student and a PostDoc approached me about hosting the New Zealand Computer Science Research Student Conference (NZCSRSC 2011). They told me “you won’t have to do much” so I said “ok”. After committing to be organizers, the PostDoc went back to Germany and the PhD student went back to Pakistan, leaving me alone to run the conference. The first thing I did was rope in more people to help, telling them “you won’t have to do much”. With a large and awesome team no one really does that much, so problem solved. One of the most rewarding experiences of my life. 
Not paid, still a job.\nMix\u0026amp;Mash 2011 \u0026amp; 2013 Winner: About $7,000 in prize money for expressing myself with visualization. Not a job, got paid.\nHashBang (2013): After I graduated my friend approached me to work at his Ruby on Rails consultancy. I didn’t know Ruby but wasn’t doing anything so agreed. I showed up and started writing tests, fixing bugs, working with amazing people, and solving customer problems. Was there only a short time, but learnt so much. Technically my first job as a programmer.\nLoyaltyNZ aka Flybuys (2013–2016): full stack engineer, later senior engineer building and maintaining a big Rails application. The engineering team was great, but it wasn’t a software company so engineering concerns were often low on the priority list. Got to work with a small team on a product that served most New Zealand households.\nLecturer (2014): I got to teach some classes at Massey University. This was very rewarding as I was able to meet some very impressive programmers that were just about to graduate and whose careers I still follow.\nCoinbase (2016–2021): I joined Coinbase in March 2016 on the infrastructure \u0026amp; security team in my first DevOpsy role. When I joined there were 3 infra engineers, when I left there were more than 60. My first project was to “decrease friction for a commit to get to production”, I ended up doing that for 5 years. The team was so awesome, the scale was immense, I learnt something everyday and I had such a blast.\nMostly I try to learn something from every job, even if it is bad. I also try to get different experiences at each job, trying to not repeat myself too much.\n","permalink":"https://maori.geek.nz/posts/2021/2021-03-16_every-job-i-have-had/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2021/2021-03-16_every-job-i-have-had/images/1.jpeg#layoutTextWidth\"\u003e\nEveryone has a different path to where they are today. 
At inflection points in my career I often think back to the jobs that have got me here. I just wanted to list all my paid jobs to reflect on that journey. \u003cstrong\u003e\u003cem\u003eCandy Mixer at Four Square (1995–96):\u003c/em\u003e\u003c/strong\u003e A family friend owned a dairy (corner store for non-kiwis) and they needed someone to make the pre-packed $1–2 mixed candy bags. Paid $5 an hour and free candy.\u003c/p\u003e","title":"Every Job I Have Had"},{"content":"How does go build compile the simplest Golang program? This post is here to answer that question.\nThe simplest go program (I can think of) is main.go:\npackage main\nfunc main() {}\nIf we run go build main.go it outputs an executable main that is 1.1 MB and does nothing. What did go build do to create such a useful binary?\ngo build has some args that are useful for seeing how it builds:\n**-work**: go build creates a temporary folder for work files. This arg will print out the location of that folder and not delete it after the build\n**-a**: Golang caches previously built packages. -a makes go build ignore the cache so our build will print all steps\n**-p 1**: This sets the concurrency to a single thread so the log output is linear\n**-x**: go build is a wrapper around other Golang tools like compile. -x outputs the commands and arguments that are sent to these tools\nRunning go build -work -a -p 1 -x main.go will output not only the main binary, but a lot of logs describing exactly what build did to create main.\nThe logs start with:\nWORK=/var/folders/rw/gtb29xf92fv23f0zqsg42s840000gn/T/go-build940616988\nThis is the work directory whose structure looks like:\n├── b001\n│   ├── _pkg_.a\n│   ├── exe\n│   ├── importcfg\n│   └── importcfg.link\n├── b002\n│   └── ...\n├── b003\n│   └── ...\n├── b004\n│   └── ...\n├── b006\n│   └── ...\n├── b007\n│   └── ...\n└── b008\n    └── ...\nWhat are these incrementing directory numbers?\ngo build defines an action graph of tasks that need to be completed. 
Each action in this graph gets its own sub-directory (defined in [NewObjdir](https://github.com/golang/go/blob/master/src/cmd/go/internal/work/action.go#L318)). The first node b001 in the graph is the root task to compile the main binary. Each dependent action has a higher number, the final being b008. (I don’t know where b005 went, I assume its ok)The first action to be executed is the leaf of the graph, b008: `mkdir -p $WORK/b008/\ncat \u0026gt;$WORK/b008/importcfg \u0026laquo; \u0026lsquo;EOF\u0026rsquo;\nimport config EOFcd /\u0026lt;..\u0026gt;/src/runtime/internal/sys /\u0026lt;..\u0026gt;/compile -o $WORK/b008/_pkg_.a -trimpath \u0026quot;$WORK/b008=\u0026gt;\u0026quot; -p runtime/internal/sys -std -+ -complete -buildid gEtYPexVP43wWYWCxFKi/gEtYPexVP43wWYWCxFKi -goversion go1.14.7 -D \u0026quot;\u0026quot; -importcfg $WORK/b008/importcfg -pack -c=16 ./arch.go ./arch_amd64.go ./intrinsics.go ./intrinsics_common.go ./stubs.go ./sys.go ./zgoarch_amd64.go ./zgoos_darwin.go ./zversion.go/\u0026lt;..\u0026gt;/buildid -w $WORK/b008/pkg.a\ncp $WORK/b008/pkg.a /\u0026lt;..\u0026gt;/Caches/go-build/01/01b\u0026hellip;60a-d`\nThe b008 action:\ncreates the action directory (all actions do this so I ignore this later on) creates the importcfg file to be used by the compile tool (it is empty) changes the directory to the [runtime/internal/sys](https://golang.org/pkg/runtime/internal/sys/) packages source folder. This package contains constants used by the runtime compile this package Use buildid to write (-w) metadata to the package and copy the package to the go-build cache (all packages are cached so I ignore this later on) Let’s break this down the arguments sent to the compile tool (also described in go tool compile --help):\n-o is the output file -trimpath this removes the prefix from the source file paths $WORK/b008=\u0026gt; (probably helps with debugging?) 
-p sets the package path used by import -std compiling standard library (not sure what this does) -+ compiling runtime (another mystery) -complete the compiler outputs a complete package (no C or assembly). -buildid adds build id to the metadata (as defined here) -goversion required version for compiled package -D the relative path for local imports is \u0026quot;\u0026quot; -importcfg import configuration file refers to other packages -pack create package archive (.a) instead of object file (.o) -c concurrency of the build finished with a list of files in the package\nMost of these arguments are the same for all _compile_ calls, so I ignore them later.\nThe output of **b008** is the file **$WORK/b008/_pkg_.a** for **runtime/internal/sys**.\nLet’s dive into buildid for a second.\nThe buildid is in the format \u0026lt;actionid\u0026gt;/\u0026lt;contentid\u0026gt;. It is used as an index to cache packages to improve go build performance. The \u0026lt;actionid\u0026gt; is the hash of the action (all calls, arguments, and input files). The \u0026lt;contentid\u0026gt; is a hash of the output .a file. For each action, go build can look in the cache for contents created by another action with the same \u0026lt;actionid\u0026gt;. This is implemented in buildid.go.\nThe buildid is stored as metadata in the file so that it does not need to be hashed every time to get the \u0026lt;contentid\u0026gt;. You can see this id with go tool buildid \u0026lt;file\u0026gt; (also works on binaries).\nIn the log of b008 above the buildID is being set by the compile tool as gEtYPexVP43wWYWCxFKi/gEtYPexVP43wWYWCxFKi. 
This is just a placeholder and is later overwritten with go tool buildid -w to the correct gEtYPexVP43wWYWCxFKi/b-rPboOuD0POrlJWPTEi before being cached.\nThe next action to be run is b007: `cat \u0026gt;$WORK/b007/importcfg \u0026lt;\u0026lt; \u0026#39;EOF\u0026#39;\n# import config\npackagefile runtime/internal/sys=$WORK/b008/_pkg_.a\nEOF\ncd /\u0026lt;..\u0026gt;/src/runtime/internal/math\n/\u0026lt;..\u0026gt;/compile\n-o $WORK/b007/_pkg_.a\n-p runtime/internal/math\n-importcfg $WORK/b007/importcfg\n\u0026hellip;\n./math.go`\nThis writes the importcfg, but this time it includes the line packagefile runtime/internal/sys=$WORK/b008/_pkg_.a. This means b007 depends on the output of b008. It then compiles the [runtime/internal/math](https://golang.org/pkg/runtime/internal/math/) package. If you inspect [math.go](https://golang.org/src/runtime/internal/math/math.go), it has import \u0026quot;runtime/internal/sys\u0026quot;, which was built by b008.\nThe output of **b007** is the file **$WORK/b007/_pkg_.a** for **runtime/internal/math**.\nThe next action is b006: `cat \u0026gt;$WORK/b006/go_asm.h \u0026lt;\u0026lt; \u0026#39;EOF\u0026#39;\nEOF\ncd /\u0026lt;..\u0026gt;/src/runtime/internal/atomic\n/\u0026lt;..\u0026gt;/asm\n-I $WORK/b006/\n-I /\u0026lt;..\u0026gt;/go/1.14.7/libexec/pkg/include\n-D GOOS_darwin\n-D GOARCH_amd64\n-gensymabis\n-o $WORK/b006/symabis\n./asm_amd64.s\n/\u0026lt;..\u0026gt;/asm -I $WORK/b006/ -I /\u0026lt;..\u0026gt;/go/1.14.7/libexec/pkg/include -D GOOS_darwin -D GOARCH_amd64 -o $WORK/b006/asm_amd64.o ./asm_amd64.s\ncat \u0026gt;$WORK/b006/importcfg \u0026lt;\u0026lt; \u0026#39;EOF\u0026#39;\n# import config\nEOF\n/\u0026lt;..\u0026gt;/compile\n-o $WORK/b006/_pkg_.a\n-p runtime/internal/atomic\n-symabis $WORK/b006/symabis\n-asmhdr $WORK/b006/go_asm.h\n-importcfg $WORK/b006/importcfg\n\u0026hellip;\n./atomic_amd64.go ./stubs.go\n/\u0026lt;..\u0026gt;/pack r $WORK/b006/_pkg_.a $WORK/b006/asm_amd64.o`\nHere is where we step out of the normal .go files and start dealing with lower level “Go 
assembly” .s files. b006:\nfirst makes the header file go_asm.h; goes to the [runtime/internal/atomic](https://golang.org/pkg/runtime/internal/atomic/) package (a bunch of low-level functions); runs the [go tool asm](https://golang.org/cmd/asm/) tool (described with go tool asm --help) to build the symabis “Symbol Application Binary Interfaces (ABI) file” and then the object file asm_amd64.o; uses compile to create the _pkg_.a file including the symabis file and the header with -asmhdr; and uses pack to add the asm_amd64.o object file to the _pkg_.a package archive.\nThe asm tool is called with the args:\n-I: include the action b006 folder and the include folder. The include folder has three files, asm_ppc64x.h, funcdata.h and textflag.h, all having low-level function definitions, e.g. FIXED_FRAME defines the size of the fixed part of a stack frame; -D: adds a predefined symbol; -gensymabis: flag to generate the symabis file; -o: the output file.\nThe output of **b006** is **$WORK/b006/_pkg_.a** for **runtime/internal/atomic**.\nNext is b004: `cd /\u0026lt;..\u0026gt;/src/internal/cpu\n/\u0026lt;..\u0026gt;/asm ... -o $WORK/b004/symabis ./cpu_x86.s\n/\u0026lt;..\u0026gt;/asm ... -o $WORK/b004/cpu_x86.o ./cpu_x86.s\n/\u0026lt;..\u0026gt;/compile ... -o $WORK/b004/_pkg_.a ./cpu.go ./cpu_amd64.go ./cpu_x86.go\n/\u0026lt;..\u0026gt;/pack r $WORK/b004/_pkg_.a $WORK/b004/cpu_x86.o`\nb004 is the same as b006 for the package [internal/cpu](https://golang.org/pkg/internal/cpu/). First we assemble the symabis and object files, then compile the go files and pack the .o files into _pkg_.a.\nThe output of **b004** is **$WORK/b004/_pkg_.a** for **internal/cpu**.\nThe next action is b003: `cat \u0026gt;$WORK/b003/go_asm.h \u0026lt;\u0026lt; \u0026#39;EOF\u0026#39;\nEOF\ncd /\u0026lt;..\u0026gt;/src/internal/bytealg\n/\u0026lt;..\u0026gt;/asm ... 
-o $WORK/b003/symabis ./compare_amd64.s ./count_amd64.s ./equal_amd64.s ./index_amd64.s ./indexbyte_amd64.s\ncat \u0026gt;$WORK/b003/importcfg \u0026lt;\u0026lt; \u0026#39;EOF\u0026#39;\n# import config\npackagefile internal/cpu=$WORK/b004/_pkg_.a\nEOF\n/\u0026lt;..\u0026gt;/compile \u0026hellip; -o $WORK/b003/_pkg_.a -p internal/bytealg ./bytealg.go ./compare_native.go ./count_native.go ./equal_generic.go ./equal_native.go ./index_amd64.go ./index_native.go ./indexbyte_native.go\n/\u0026lt;..\u0026gt;/asm ... -o $WORK/b003/compare_amd64.o ./compare_amd64.s\n/\u0026lt;..\u0026gt;/asm ... -o $WORK/b003/count_amd64.o ./count_amd64.s\n/\u0026lt;..\u0026gt;/asm ... -o $WORK/b003/equal_amd64.o ./equal_amd64.s\n/\u0026lt;..\u0026gt;/asm ... -o $WORK/b003/index_amd64.o ./index_amd64.s\n/\u0026lt;..\u0026gt;/asm ... -o $WORK/b003/indexbyte_amd64.o ./indexbyte_amd64.s\n/\u0026lt;..\u0026gt;/pack r $WORK/b003/_pkg_.a $WORK/b003/compare_amd64.o $WORK/b003/count_amd64.o $WORK/b003/equal_amd64.o $WORK/b003/index_amd64.o $WORK/b003/indexbyte_amd64.o`\nb003 is the same as the previous actions b004 and b006 for the package [internal/bytealg](https://golang.org/pkg/internal/bytealg/). 
The main complication with this package is that there are multiple .s files to create many .o object files that each need to be added to the _pkg_.a file.\nThe output of **b003** is **$WORK/b003/_pkg_.a** for **internal/bytealg**.\nThe penultimate action, b002: `cat \u0026gt;$WORK/b002/go_asm.h \u0026lt;\u0026lt; \u0026#39;EOF\u0026#39;\nEOF\ncd /\u0026lt;..\u0026gt;/src/runtime\n/\u0026lt;..\u0026gt;/asm\n\u0026hellip;\n-o $WORK/b002/symabis\n./asm.s ./asm_amd64.s ./duff_amd64.s ./memclr_amd64.s ./memmove_amd64.s ./preempt_amd64.s ./rt0_darwin_amd64.s ./sys_darwin_amd64.s\ncat \u0026gt;$WORK/b002/importcfg \u0026lt;\u0026lt; \u0026#39;EOF\u0026#39;\n# import config\npackagefile internal/bytealg=$WORK/b003/_pkg_.a\npackagefile internal/cpu=$WORK/b004/_pkg_.a\npackagefile runtime/internal/atomic=$WORK/b006/_pkg_.a\npackagefile runtime/internal/math=$WORK/b007/_pkg_.a\npackagefile runtime/internal/sys=$WORK/b008/_pkg_.a\nEOF\n/\u0026lt;..\u0026gt;/compile\n-o $WORK/b002/_pkg_.a\n\u0026hellip;\n-p runtime\n./alg.go ./atomic_pointer.go ./cgo.go ./cgocall.go ./cgocallback.go ./cgocheck.go ./chan.go ./checkptr.go ./compiler.go ./complex.go ./cpuflags.go ./cpuflags_amd64.go ./cpuprof.go ./cputicks.go ./debug.go ./debugcall.go ./debuglog.go ./debuglog_off.go ./defs_darwin_amd64.go ./env_posix.go ./error.go ./extern.go ./fastlog2.go ./fastlog2table.go ./float.go ./hash64.go ./heapdump.go ./iface.go ./lfstack.go ./lfstack_64bit.go ./lock_sema.go ./malloc.go ./map.go ./map_fast32.go ./map_fast64.go ./map_faststr.go ./mbarrier.go ./mbitmap.go ./mcache.go ./mcentral.go ./mem_darwin.go ./mfinal.go ./mfixalloc.go ./mgc.go ./mgcmark.go ./mgcscavenge.go ./mgcstack.go ./mgcsweep.go ./mgcsweepbuf.go ./mgcwork.go ./mheap.go ./mpagealloc.go ./mpagealloc_64bit.go ./mpagecache.go ./mpallocbits.go ./mprof.go ./mranges.go ./msan0.go ./msize.go ./mstats.go ./mwbbuf.go ./nbpipe_pipe.go ./netpoll.go ./netpoll_kqueue.go ./os_darwin.go ./os_nonopenbsd.go ./panic.go ./plugin.go ./preempt.go 
./preempt_nonwindows.go ./print.go ./proc.go ./profbuf.go ./proflabel.go ./race0.go ./rdebug.go ./relax_stub.go ./runtime.go ./runtime1.go ./runtime2.go ./rwmutex.go ./select.go ./sema.go ./signal_amd64.go ./signal_darwin.go ./signal_darwin_amd64.go ./signal_unix.go ./sigqueue.go ./sizeclasses.go ./slice.go ./softfloat64.go ./stack.go ./string.go ./stubs.go ./stubs_amd64.go ./stubs_nonlinux.go ./symtab.go ./sys_darwin.go ./sys_darwin_64.go ./sys_nonppc64x.go ./sys_x86.go ./time.go ./time_nofake.go ./timestub.go ./trace.go ./traceback.go ./type.go ./typekind.go ./utf8.go ./vdso_in_none.go ./write_err.go\n/\u0026lt;..\u0026gt;/asm \u0026hellip; -o $WORK/b002/asm.o ./asm.s\n/\u0026lt;..\u0026gt;/asm \u0026hellip; -o $WORK/b002/asm_amd64.o ./asm_amd64.s\n/\u0026lt;..\u0026gt;/asm \u0026hellip; -o $WORK/b002/duff_amd64.o ./duff_amd64.s\n/\u0026lt;..\u0026gt;/asm \u0026hellip; -o $WORK/b002/memclr_amd64.o ./memclr_amd64.s\n/\u0026lt;..\u0026gt;/asm \u0026hellip; -o $WORK/b002/memmove_amd64.o ./memmove_amd64.s\n/\u0026lt;..\u0026gt;/asm \u0026hellip; -o $WORK/b002/preempt_amd64.o ./preempt_amd64.s\n/\u0026lt;..\u0026gt;/asm \u0026hellip; -o $WORK/b002/rt0_darwin_amd64.o ./rt0_darwin_amd64.s\n/\u0026lt;..\u0026gt;/asm \u0026hellip; -o $WORK/b002/sys_darwin_amd64.o ./sys_darwin_amd64.s\n/\u0026lt;..\u0026gt;/pack r $WORK/b002/_pkg_.a $WORK/b002/asm.o $WORK/b002/asm_amd64.o $WORK/b002/duff_amd64.o $WORK/b002/memclr_amd64.o $WORK/b002/memmove_amd64.o $WORK/b002/preempt_amd64.o $WORK/b002/rt0_darwin_amd64.o $WORK/b002/sys_darwin_amd64.o`\nb002 is the reason for all actions seen so far. It is the [**runtime**](https://golang.org/pkg/runtime/) package containing all the operations needed for a go binary to run. 
For example, it contains [mgc.go](https://golang.org/src/runtime/mgc.go) the implementation of the garbage collection in Go (that also imports both internal/cpu from b004 and runtime/internal/atomic from b006).\nb002, although probably the most complex package in the core library, is built using the same pattern we have seen before; it just contains more files. It uses asm, compile and pack to build _pkg_.a.\nThe output of **b002** is **$WORK/b002/_pkg_.a** for **runtime**.\nThe final action, the one that pulls everything together, is b001: `cat \u0026gt;$WORK/b001/importcfg \u0026lt;\u0026lt; \u0026#39;EOF\u0026#39;\n# import config\npackagefile runtime=$WORK/b002/_pkg_.a\nEOF\ncd /\u0026lt;..\u0026gt;/main\n/\u0026lt;..\u0026gt;/compile ... -o $WORK/b001/_pkg_.a -p main ./main.go\ncat \u0026gt;$WORK/b001/importcfg.link \u0026lt;\u0026lt; \u0026#39;EOF\u0026#39;\npackagefile command-line-arguments=$WORK/b001/_pkg_.a\npackagefile runtime=$WORK/b002/_pkg_.a\npackagefile internal/bytealg=$WORK/b003/_pkg_.a\npackagefile internal/cpu=$WORK/b004/_pkg_.a\npackagefile runtime/internal/atomic=$WORK/b006/_pkg_.a\npackagefile runtime/internal/math=$WORK/b007/_pkg_.a\npackagefile runtime/internal/sys=$WORK/b008/_pkg_.a\nEOF\n/\u0026lt;..\u0026gt;/link -o $WORK/b001/exe/a.out -importcfg $WORK/b001/importcfg.link -buildmode=exe -buildid=yC-qrh2sY_qI0zh2-NE7/owNzOBTqPO00FkqK0_lF/HPXqvMz_4PvKsQzqGWgD/yC-qrh2sY_qI0zh2-NE7 -extld=clang $WORK/b001/_pkg_.a\nmv $WORK/b001/exe/a.out main`\nFirst it builds an importcfg that includes runtime built in b002 to then compile main.go to _pkg_.a. Then it creates importcfg.link, which includes all previous actions’ packages, plus command-line-arguments referencing the main package we built. 
It then uses [link](https://golang.org/cmd/link/) to create an executable file, and finally renames and moves the binary to main.\nlink has the new arguments:\n-buildmode: set to build an executable; -extld: reference to the external linker.\nFinally, we have the output we want; the output of **b001** is the **main** binary.\nSimilarities with Bazel\nThe building of an action graph in order to have efficient caching is the same idea the build tool Bazel uses for fast builds. Golang’s actionid and contentid map neatly to the action cache and the content-addressable store (CAS) Bazel uses in caching. Bazel is a product of Google, and so is Golang. It would make sense that they would have a similar philosophy of how to build software quickly and reliably.\nIn Bazel’s rules_go package you can see how it reimplements go build in its [builder](https://github.com/bazelbuild/rules_go/tree/master/go/tools/builders) code. This is a very clean implementation because the action graph, the folder management, and the caching are handled externally by Bazel.\nThe Next Steps go build does a lot to compile a program that does nothing! I didn’t even get into much specific detail about the tools (compile, asm) or their input and output files (.a, .o, .s). Also, we are still only compiling the most basic program. We could add complications like:\nimporting another package, e.g. using fmt to print Hello World adds another 23 actions to the action graph; having a go.mod file referencing external packages; setting GOOS and GOARCH to other architectures, e.g. compiling to WASM has entirely different actions and arguments.\nRunning go build and inspecting logs is a very top-down approach to learning how the Golang compiler works. 
It is a great starting point to dive into more resources like:\nIntroduction to the Go compiler; Go: Overview of the Compiler; Go at Google: Language Design in the Service of Software Engineering; source code like [**build.go**](https://github.com/golang/go/blob/master/src/cmd/go/internal/work/build.go), the definition of the go build command, or [**compile/main.go**](https://github.com/golang/go/blob/master/src/cmd/compile/main.go), the entry point to go tool compile.\nThere is a lot of information out there, so there is still lots to learn about compiling the simplest program.\n","permalink":"https://maori.geek.nz/posts/2020/2020-09-11_how-go-build-works/","summary":"\u003cp\u003eHow does \u003ccode\u003ego build\u003c/code\u003e compile the simplest Golang program? This post is here to answer that question.\u003c/p\u003e\n\u003cp\u003eThe simplest go program (I can think of) is \u003ccode\u003emain.go\u003c/code\u003e:\n\u003ccode\u003epackage main\nfunc main() {}\u003c/code\u003e\u003c/p\u003e\n\u003cp\u003eIf we run \u003ccode\u003ego build main.go\u003c/code\u003e it outputs an executable \u003ccode\u003emain\u003c/code\u003e that is 1.1MB \u003cstrong\u003eand does nothing\u003c/strong\u003e. What did \u003ccode\u003ego build\u003c/code\u003e do to create such a useful binary?\u003c/p\u003e\n\u003cp\u003e\u003ccode\u003ego build\u003c/code\u003e has some args that are useful for seeing how it builds:\u003c/p\u003e","title":"How “go build” Works"},{"content":"I am a huge fan of CodeBullet, an educational YouTuber that makes game-playing AI bots. That is because computer science education is more than just conveying complex ideas; it is also getting people excited about using those ideas to solve interesting problems, something CodeBullet does very well.\nTwo years ago CodeBullet showed a bot learning to play Asteroids. I thought, I wonder if I can do that… in Golang. The first thing was to build a desktop app in Golang, then it was to build an asteroids game. 
Finally, this post is about how I used the goNEAT implementation of NEAT to build an AI that can play my asteroids game.\nThe constraint to use Golang is artificial. If I were being practical I would use Python and boot up a demo of PyGame Asteroids and plug into the PyTorch NEAT implementation. But I want to use Golang, so here we are.\nYou can play the AI at https://grahamjenson.github.io/asteroids/\nNeuroEvolution of Augmenting Topologies (NEAT)\nNEAT is an algorithm that combines a few ideas like speciation and evolving neural network graphs into a genetic algorithm. I would prefer not to implement it myself (yet), since making these kinds of algorithms performant is a massive time sink. Good news though: on the official(?) list of NEAT implementations there are three written in Golang. One is not working, one has not been touched in 5 years, and the last is goNEAT, which looks both maintained and well implemented.\nThe biggest drawback to goNEAT is that it is not very usable or well documented. It does have some examples though, like XOR and pole balancing, which I can copy and adapt.\nThere was one minor issue that required me to vendor goNEAT; I had to remove its [_viper_](https://github.com/spf13/viper) dependency to make it WASM compatible.\nAsteroids I wrote the asteroids game with an AI bot in mind. The game logic and rendering can be separated so I can train the bot as a Golang binary (not WASM) to take advantage of native performance.\nTo implement the Asteroids testing I used the experiments setup in goNEAT:\nerr = experiment.Execute(\n    context,\n    start_genome,\n    AsteroidGenerationEvaluator{\n        OutputPath: out_dir_path,\n        PlayTimeInSeconds: 120,\n        FrameRatePerSecond: 15,\n    },\n)\nAsteroidGenerationEvaluator will simulate a 120 second long asteroids game (or until the AI dies) running at 15 frames a second. 
So a single game takes at most 1800 calls to update the game.\nTo work with the training framework in goNEAT, AsteroidGenerationEvaluator implements the method GenerationEvaluate:\nfunc (ex AsteroidGenerationEvaluator) GenerationEvaluate(\n    population *genetics.Population,\n    epoch *experiments.Generation,\n    context *neat.NeatContext,\n) (err error) {\n    // Calculate the fitness of all organisms in the population\n    return nil\n}\nThis calculates the fitness of the organisms for a population (generation) by getting each of them to play an Asteroids game to calculate their score.\nThis looks like:\nfor _, org := range population.Organisms {\n    net := org.Phenotype // Neural Network\n    game := \u0026amp;asteroids.Game{}\n    frames := PlayTimeInSeconds * FrameRatePerSecond\n    for f := 0; f \u0026lt; frames; f++ {\n        // calculate the inputs\n        inputs := FindInputs(game)\n        // send those inputs to the network\n        net.LoadSensors(inputs)\n        net.Activate() // run the network\n        for i, output := range net.ReadOutputs() {\n            // if a key output is pushed\n            switch i {\n            case 0:\n                key = keys.KEY_UP\n            case 1:\n                key = keys.KEY_SPACE\n            case 2:\n                key = keys.KEY_LEFT\n            case 3:\n                key = keys.KEY_RIGHT\n            }\n            if output \u0026gt; 0.5 { // output activated\n                pressedKeys[key] = true\n            }\n        }\n        // Update the GameState\n        game.Update(pressedKeys)\n        if game.GameOver() {\n            break\n        }\n    }\n    // Use game score as fitness function\n    // Fitness is normalized to between 0 and 1\n    org.Fitness = norm(game.Score)\n}\nThe steps are: loop over each organism; initialize a game; find the bot’s inputs; send the inputs to the organism’s neural network; get the network’s outputs and map them to “pressed” keys; update the game with the outputs; check if GameOver; after the game, assign the organism’s fitness as the game’s score.\nThe variables that need to be experimented with are the inputs and the fitness function. What the bot sees in the world and how we compare the organisms will be the main factors in finding a “good” bot.\nCoordinates of the Asteroids\nThe inputs are what the bot “sees”. 
The easiest input to find is the relative x,y coordinates of the asteroids to the ship. These coordinates center the ship at 0,0 and are rotated so that an asteroid directly in front of the ship lands on an axis. Since the inputs are a fixed length, I will just find the 5 closest asteroids and send them as inputs.\nAn important aspect of training is the randomness in the game. Two organisms should be compared with as little noise as possible, so we fix the randomness per-generation using rand:\ngame := \u0026amp;asteroids.Game{}\nrand.Seed(int64(generation)) // Force same game per generation\ndefer rand.Seed(time.Now().UnixNano())\ngame.Init()\nThe defer statement will reset the randomness to not mess with the NEAT algorithm.\nTraining this bot created something like:\nAfter 100 generations\nThis bot can fire at least. It also might be aiming, but that is probably just noise. After lots of training this bot peaked at about 20 points. That is the amount you get if you just hold down fire.\nWhiskers\nThere are a few problems with the previous x,y inputs:\nCoordinates don’t map to actions; e.g. if x=10 do I fire or turn left? An input is useless on its own; the AI must look at two inputs to get any use, which requires its own training. Not a high/low measurement; because of normalization an x=0.4 means the asteroid is left of the ship and x=0.6 means the asteroid is right.\nWhat did CodeBullet use? Whiskers. He drew 8 lines of sight and if one of them hit an asteroid it returned the distance. Like how a sea-lion uses its whiskers to see the world.\nCodeBullet’s whiskers\nTo do this, the ship gets 8 whiskers and uses SAT collision detection to find the closest asteroid’s distance. This makes a bot like:\n100 generations\nLive Longer and Prosper After some more training, the bots all seem to be really lazy. They just camp in the same location to aim and shoot at passing asteroids. 
That is boring.\nI think these camping bots exist because we don’t consider the speed at which a bot gets its points. I want the ship to get a high score AND get it quickly. So I am reducing the time a game runs and changing the fitness function to game.Score + survived_seconds. This trains bots like:\nWell, at least it is moving around. There is something still wrong with this bot. I think it is because the weighting between score and survived_seconds is wrong.\nI am going to do a little experiment and change the fitness function to only be survived_seconds:\nSurvival Bot\nThat looks cool, the bot dodged an asteroid. So, I think that survival is far more important than score, so with the fitness of score + (10*seconds):\nNow the bots are actively trying not to shoot asteroids! I think that is because when an asteroid is shot it splits in two and makes the game harder. So the bot is now trying to shoot as little as possible.\nI tested out a few different weights but nothing worked well. I think weighting the bots like score + (N * seconds) is the wrong approach. It just adds one more dimension of difficulty.\nSimulating a bot’s life With the ability to simulate my asteroids game I could calculate some numbers to help me decide what to do next. I can think of two basic strategies to try, “do nothing” and “spin and shoot”, and take 1000 random games to see their mean/min/max lifespan (in seconds):\nDo nothing bot: mean:**5.66** min:**0.5** max:**34.5**\nFire and spin bot: mean:**4.96** min:**0.3** max:**26.8**\nThese were surprising. There is a game that kills a bot in less than a second (that is not a fun game), and one that takes over 30 seconds. Also, my intuition that not shooting was a better strategy for survival turned out to be true.\nWhat we want is to remove all bots that do nothing, and only try to keep active bots. 
One way to do that is to force a minimum lifespan in the fitness function:\nif seconds \u0026lt; 6 {\n    return 0\n}\nreturn score\nNow, most bots that do nothing, or just spin and shoot, will have 0 fitness.\nAlso, further reducing the time a bot plays to 15 seconds should make getting a large score much more difficult. Now bots are trying to maximize their score in a small amount of time while surviving long enough to not just sit still.\nExponential inputs The inputs to the neural network are linear. An asteroid that is 100px away is 2x as important as one that is 200px away. I think this is intuitively incorrect. For example, a bear that is 100m away is much more than 2x as concerning as one that is 200m away.\nI am also going to change the inputs from the whiskers to grow exponentially with this function exp:\nThis function (from here) will take a number between 0 and 1 and return a number in the same range, e.g. for a=40:\nWith a=40 an asteroid that is 100px away is now 5x more important than one that is 200px away. A bot will now hopefully react appropriately to an incoming asteroid.\nGame Difficulty I think the game is too hard. To fix that I have:\nmade the asteroids split less (they were getting in between the whiskers); made the ship’s rotation and velocity faster; always set the training frame-rate to 60fps, so whiskers have more chance of hitting an asteroid on a turn.\nAfter this a trained bot looks like:\nDodging an asteroid\nRandomness and Precision I have come to the conclusion that I should eliminate all randomness during training. A bot should train on the exact same level for all generations. I think training is hard enough without making it even more difficult on the bot. Training a specific asteroids bot should be much faster than training a general one.\nI also adjusted the fitness algorithm to be: return score * exp(secondsPercent)\nsecondsPercent is the percentage of game time the bot is alive, and exp is the function defined above, e.g. 
with a max game time of 10 seconds, a bot that gets a score of 20 in 7 seconds will have a fitness of 6.3-ish.\nThe goal is to reward surviving on an exponential curve. With this we can remove the previous if statement, as surviving for 0 or 5 seconds is about the same, and not surviving the game is punished significantly.\n170 Generations resulted in a bot that can play pretty well\nThis strategy resulted in a bot that can play, and get scores into the 70s, sometimes. It also looks human-ish as it is not spinning around quickly or performing weirdly.\nPerformance This might be a good time to talk about the performance of using a goNEAT bot with WASM inside the browser. We can check how long it takes the AI to make a decision using Chrome’s dev tools:\nThis shows that the entire animation loop, including calling the AI, takes only 3.64ms, with only 0.12ms spent asking the bot to make a decision. This is encouraging as it means we can make these bots significantly more complicated before they become a performance issue on the front end.\nPartial Ordered Fitness Here is the problem: I want a bot to get a high score and get it quickly. These two goals form a partial order, a Pareto front of possible good solutions. It is difficult to build a fitness function that does not over-optimize on score (creating short-lived rapid-fire bots) or survival (a bot that waits for asteroids to come to it). I need a fitness function to focus on a single value.\nSo I made the game even easier with slower and fewer asteroids. Now, even simple bots can win given enough time. Then I can focus the fitness function on a single target:\nif !won {\n    return 0\n}\nreturn 1 - secondsPercent\nThis fails any bot that can’t win, so we only optimize on how long it takes to win.\nAfter NEAT trains a bunch of species that can win easy games, we increase the difficulty. 
This is the “boiling a frog” method of training a bot.\nAfter a few iterations of this we get a bot that plays human-ish:\nThen a bot that plays a little better:\nThen after more training, I finally have a bot that is starting to look superhuman:\nPerfection This project never ends. It is impossible to train the perfect bot; there is always one more thing to do, one more level of difficulty, or one more trick to teach. For example, my bot is not an ambi-turner; it is like Zoolander and can’t turn left.\nSo far I have been using the NEAT algorithm and goNEAT’s implementation as a black box. I haven’t really dived into it to see how it is working, or how to optimize the many parameters that go into training. I think the next steps will be to really understand, and probably reimplement, the NEAT algorithm. Or, like CodeBullet, try other games and other algorithms to go wide and not deep.\nYou can play some AI bots here\n","permalink":"https://maori.geek.nz/posts/2020/2020-08-29_learning-to-play-asteroids-in-golang-with-neat/","summary":"\u003cp\u003eI am a huge fan of \u003ca href=\"https://www.youtube.com/channel/UC0e3QhIYukixgh5VVpKHH9Q\"\u003eCodeBullet\u003c/a\u003e, an educational YouTuber that makes game-playing AI bots. That is because computer science education is more than just conveying complex ideas; it is also getting people excited about using those ideas to solve interesting problems, something CodeBullet does very well.\u003c/p\u003e\n\u003cp\u003eTwo years ago CodeBullet showed \u003ca href=\"https://www.youtube.com/watch?v=N1WRualRBOQ\"\u003ea bot learning to play Asteroids\u003c/a\u003e. I thought, I wonder if I can do that… in Golang. 
The first thing was to \u003ca href=\"https://maori.geek.nz/golang-desktop-app-with-webview-lorca-wasm-and-bazel-3283813bf89\"\u003ebuild a desktop app in Golang\u003c/a\u003e, then it was to \u003ca href=\"https://maori.geek.nz/making-asteroids-game-with-golang-lorca-webview-and-wasm-9a8bed30cf72\"\u003ebuild an asteroids game\u003c/a\u003e. Finally, this post is about how I used the \u003ca href=\"https://github.com/yaricom/goNEAT\"\u003egoNEAT\u003c/a\u003e implementation of \u003ca href=\"https://www.cs.ucf.edu/~kstanley/neat.html\"\u003eNEAT\u003c/a\u003e to build an AI that can play my asteroids game.\u003c/p\u003e","title":"Learning to play Asteroids in Golang with NEAT"},{"content":"In my past few posts [1][2] I have written about building a desktop application using Golang with Lorca/Webview to run a WebAssembly (WASM) binary. Now, I want to actually try to use these technologies in anger and produce a distributable desktop application.\nI have chosen to make the game Asteroids (code here). It is reasonably complicated, fun and lets me play and learn more about Golang’s [syscall/js](https://golang.org/pkg/syscall/js/) package and algorithms like Separating Axis Theorem (SAT) for collision detection. The goal is to use Golang to build a single binary that can be downloaded onto different platforms (macOS, Windows, Linux) to play a fun game.\nCode and releases are at https://github.com/grahamjenson/asteroids\nPLAY THE GAME HERE\nReimplementing Solar Example I first wanted to reimplement a simple example animation to make sure all the technologies play nice, and learn how to build a rendering loop in JS. 
I chose the Solar example from Mozilla; it draws the sun, earth and moon on a canvas, all orbiting around each other.\nThe first thing I did was abstract the Lorca and Webview code from my previous post into methods that take a common config:\nfunc main() {\n    config := \u0026amp;desktop.Config{\n        WasmBin: WASM_BIN,\n        Width: 300,\n        Height: 400,\n        Title: \u0026quot;Solar\u0026quot;,\n    }\n    desktop.CreateLorca(config)\n    //desktop.CreateWebview(config)\n}\nThis makes it easier to switch between the two frameworks.\nThe WASM_BIN variable is created using the [go_embed_data](https://github.com/bazelbuild/rules_go/blob/master/go/extras.rst#go_embed_data) rule (thanks Ed Schouten). This exposes the WASM binary as a []byte that can be given to the frontend:\ngo_embed_data(\n    name = \u0026quot;wasm_embed\u0026quot;,\n    src = \u0026quot;//games/solar/wasm\u0026quot;,\n    package = \u0026quot;main\u0026quot;,\n    string = False,\n    var = \u0026quot;WASM_BIN\u0026quot;,\n)\nThe actual Solar code is copied and converted from the example using the syscall/js package. 
The [gowasm-experiments](https://github.com/stdiopt/gowasm-experiments) (specifically “bouncy”) had some great examples to help.\nThe WASM app first gets the window and document, appends the canvas element, and gets the 2d context ctx to use for drawing:\nwindow := js.Global()\ndocument := window.Get(\u0026quot;document\u0026quot;)\ncanvas := document.Call(\u0026quot;createElement\u0026quot;, \u0026quot;canvas\u0026quot;)\ndocument.Get(\u0026quot;body\u0026quot;).Call(\u0026quot;appendChild\u0026quot;, canvas)\nctx := canvas.Call(\u0026quot;getContext\u0026quot;, \u0026quot;2d\u0026quot;)\nThen the sun, earth and moon images are loaded into \u0026lt;img\u0026gt; tags, e.g.:\nsun := document.Call(\u0026quot;createElement\u0026quot;, \u0026quot;img\u0026quot;)\nsun.Set(\u0026quot;src\u0026quot;, \u0026quot;https://mdn.mozillademos.org/files/1456/Canvas_sun.png\u0026quot;)\nThe core animation loop is created and started: `var loop js.Func\nloop = js.FuncOf(func(this js.Value, args []js.Value) interface{} {\nctx.Call(\u0026quot;clearRect\u0026quot;, 0, 0, 300, 300) // clear the Canvas\n// Draw Everything here\nwindow.Call(\u0026quot;requestAnimationFrame\u0026quot;, loop) // next frame\nreturn nil\n})\nwindow.Call(\u0026quot;requestAnimationFrame\u0026quot;, loop) // start the loop`\n[requestAnimationFrame](https://developer.mozilla.org/en-US/docs/Web/API/window/requestAnimationFrame) will call the loop function before repainting the browser window, syncing our animations to the browser’s frame rate (which is usually 60fps).\nThe loop js.Func first clears the canvas with clearRect, then calculates the image positions and draws them onto the canvas, e.g.:\nt := time.Now()\ns := float64(t.Second())\nms := float64(t.Nanosecond()) / 1000000.0\ntau := (2.0 * math.Pi)\nctx.Call(\u0026quot;rotate\u0026quot;, (tau/60.0)*s+(tau/60000.0)*ms)\nctx.Call(\u0026quot;translate\u0026quot;, 105, 0)\n
ctx.Call(\u0026quot;drawImage\u0026quot;, earth, -12, -12)\n...\nThis creates the application:\nThe Mozilla Solar example reimplemented in Golang rendered using Lorca\nCreating this in Golang and building with Bazel was a pretty nice development loop. The final binary is about 9MB, pretty large compared to what it does. But the amount of code was tiny for such a satisfying animation.\nCanvas Context Wrapper One of the annoying things I found while converting was calling the context with ctx.Call. If the method is misspelled it can crash the WASM app. To make this a bit safer and more Go-ish I wrote a quick wrapper around ctx:\ntype Context2D struct {\n    JS *js.Value\n}\nfunc NewContext2D(canvas js.Value) *Context2D {\n    ctx := canvas.Call(\u0026quot;getContext\u0026quot;, \u0026quot;2d\u0026quot;)\n    return \u0026amp;Context2D{JS: \u0026amp;ctx}\n}\nfunc (ctx *Context2D) ClearRect(x, y, width, height int) {\n    ctx.JS.Call(\u0026quot;clearRect\u0026quot;, x, y, width, height)\n}\n...\nNow the above animation code is much cleaner, looking like:\nctx.Rotate((tau/60.0)*s+(tau/60000.0)*ms)\nctx.Translate(105, 0)\nctx.DrawImage(earth, -12, -12)\nAsteroids\nTo build an asteroids game the Lorca, WebView, WASM and game loop code are basically the same as the Solar app. The parts that are different and interesting to me are the vector graphics engine, the collision detection, and the game logic.\n2D Vectors and Matrix Manipulation 8 years ago I wrote a small Ruby library called [pathby](https://github.com/grahamjenson/pathby) that created, manipulated and rendered Bezier curves. I learnt from writing this library about the different matrix transformations like rotation, translation, and scaling. These are the core of the vector engine.\nIn my Golang vector library I define:\ntype Vector []float64\ntype Matrix []Vector\ntype Polygon struct { Matrix }\nTo represent translation, rotation and scaling our Matrix needs to be 3x3. Since the matrix is 3x3 the Vectors must all be 3x1. 
Given we know the sizes of our matrices and vectors, their operations are pretty straightforward (if tedious) to implement and test. I didn’t focus on making their implementations super optimal though. Since “asteroids” was released in 1979, I figure I can trade some performance for cleaner code at the moment.\nThe Polygon is a 3xn matrix, a list of vectors where each vector is a point. It may require further attributes, so keeping it a struct is useful. This is the core struct used to render the game, since vectors and matrices are hidden from the end app.\nThe polygon has all the transformation methods; the difference is that they are all performed around the centroid of the polygon.\nThe RenderPolygon method can be used to render the polygon to the canvas: func RenderPolygon(ctx *canvas.Context2D, s *Polygon) { ctx.BeginPath() first := true var firstPoint vector2d.Vector for _, v := range s.Matrix { if first { ctx.MoveTo(v[0], v[1]) first = false firstPoint = v } else { ctx.LineTo(v[0], v[1]) } } ctx.LineTo(firstPoint[0], firstPoint[1]) //close ctx.Stroke() }\nThis method assumes the polygon is closed.\nSeparating Axis Theorem The most interesting part of this project was implementing the Separating Axis Theorem. The core idea is:\nTwo convex objects do not overlap if there exists an axis onto which the two objects’ projections do not overlap.\nMy intuitive understanding of this is:\nif you can place a light anywhere such that the objects’ shadows have a gap between them, then they are not connected.\nThis isn’t perfect but it helps me see how the algorithm is implemented. For example, checking every single location for the light is impractical; really you only need to light up the places where there might be a gap, parallel to the objects’ edges. If the light is parallel to the edge then the shadow is perpendicular (normal) to it.\nSo, we need to project the objects onto the normals of each edge, and look for any gaps between the projections. 
The algorithm looks like: func (p1 *Polygon) SAT(p2 *Polygon) bool { for _, e := range p1.Edges() { if !checkNormal(p1, p2, e.Normal) { return false } } for _, e := range p2.Edges() { if !checkNormal(p1, p2, e.Normal) { return false } } return true } func checkNormal(a, b *Polygon, normal Vector) bool { minA, maxA := a.flattenPointsOn(normal) minB, maxB := b.flattenPointsOn(normal) // Either |---|--|---| // 1. b---b a---a // a \u0026gt; b // 2. a---a b---b // b \u0026gt; a if minA \u0026gt; maxB || minB \u0026gt; maxA { return false } return true }\nThis algorithm returns whether two objects are colliding or not. We can get much more information than that. Since we know the projections onto each edge normal (minA, maxA, minB, maxB) we can also tell:\nwhether a is inside b, or vice versa,\nhow much they overlap,\nwhat vector needs to be added to a to stop it colliding with b.\nThat is, if the objects are colliding we can tell if: // Either |---|--|---| // a---b--a---b // B is greater than A // a---b--b---a // B is inside A // b---a--b---a // A is greater than B // b---a--a---b // A is inside B\nSo having the checkNormal function also return the vector of collision, the size of the projection overlap, and whether one object is inside the other will let us calculate these values for the two colliding objects. For example, if we wanted to have two objects feel “solid”, they must never overlap, so on a collision we just move the object by vector*overlap.\nI looked at many SAT implementations; jriecken/sat-js was really helpful.\nGame Logic So now we have a vector graphics library and collision detection. The final part is actually implementing asteroids. The game loop looks like: game.Update(dt, pressedButtons) ctx.ClearRect(0, 0, width, height) // clear canvas game.Render(ctx)\nJS does not handle multiple pressed buttons, and I want to be able to fire and turn left at the same time. 
So we implement pressedButtons by marking each button as true in a map when it goes down and false when it comes back up: pressedButtons := map[int]bool{} window.Call( \u0026quot;addEventListener\u0026quot;, \u0026quot;keyup\u0026quot;, js.FuncOf(func(this js.Value, args []js.Value) interface{} { e := args[0] e.Call(\u0026quot;preventDefault\u0026quot;) pressedButtons[e.Get(\u0026quot;keyCode\u0026quot;).Int()] = false return nil })) window.Call( \u0026quot;addEventListener\u0026quot;, \u0026quot;keydown\u0026quot;, js.FuncOf(func(this js.Value, args []js.Value) interface{} { e := args[0] e.Call(\u0026quot;preventDefault\u0026quot;) pressedButtons[e.Get(\u0026quot;keyCode\u0026quot;).Int()] = true return nil }))\ndt is calculated as the time since the last rendered frame. This keeps the game logic independent of the frame rate.\nThe core game struct looks like: type Game struct { asteroids []*Asteroid ship *Ship ... }\nThe ship and the asteroids have very similar properties; the main difference is that the ship reacts to user input.\nThe Ship looks like: type Ship struct { template *vector2d.Polygon // The raw Ship shape projection *vector2d.Polygon // Where it is in the scene // Location and rotation x, y float64 velocityX, velocityY, rotation float64 }\nThe template is the polygon of the ship centered around 0,0; this does not change throughout the game. 
The projection is the shape of the ship in relation to the game: the template translated and rotated into the correct position on the screen.\nThe ship’s Update method looks like: if pressedButtons[KEY_LEFT] { s.rotation += math.Pi * dt * 2 } if pressedButtons[KEY_RIGHT] { s.rotation += -math.Pi * dt * 2 } if pressedButtons[KEY_UP] { s.velocityX += dt * 60 * math.Sin(s.rotation) s.velocityY += dt * 60 * math.Cos(s.rotation) } // add friction s.velocityX *= dt * 60 * 0.90 s.velocityY *= dt * 60 * 0.90 s.x += s.velocityX s.y += s.velocityY // update projection s.projection = s.template.Clone().Translate(s.x, s.y).Rotate(s.rotation)\nThis takes pressedButtons and uses them to alter the rotation and velocity of the ship. Then we decrease the velocity with friction and update the coordinates. Finally, the ship is translated and rotated into the correct position, and stored in the projection.\nThe ship’s Render function simply sends its projection to the RenderPolygon method described above.\nThe asteroids, the ship and the ship’s bullets all share this similar structure: coordinates, velocities, rotation, template, and projection. The hardest part is adjusting all the numbers to make the game fun.\nEach loop Update also calculates the collisions between:\nbullets and asteroids: the asteroid is removed and two smaller asteroids are added. If the asteroid is too small, no new asteroids are added. Score is increased by 10.\nthe ship and asteroids: this is “game over” and we go back to the menu.\nthe asteroids: most other “asteroids” implementations do not calculate asteroids hitting one another. I wanted to really test out my SAT implementation by making the asteroids move out of each other’s way.\nYou win the game if the list of asteroids is empty. I have not been able to test this code yet, as it is a pretty hard game.\nDevelopment Lorca vs. 
Webview\nBoth Lorca (Chrome) and Webview (Safari) provide excellent development tooling, including the ability to analyze individual frames, and dissect the performance of canvas calls:\nChrome Performance Tools\nSafari tools for the webview\nBoth these tools show that Lorca and Webview render each frame in less than 3ms; there is no obvious performance difference between frameworks. This leaves around 14ms to spare, so not much need to optimize my vector or collision detection algorithms yet.\nOne difference between Lorca and Webview is that Lorca outputs fmt.Println\u0026rsquo;s to the terminal. This output is wrapped in JSON, so is hard to read, and it can be a performance problem. It is nice to see debug statements in the terminal like a normal app, so that I can pipe out JS logs and investigate after closing the window.\nDistribution To make it easier to distribute binaries for this game I needed to find a more elegant solution for picking between Lorca and Webview. Lorca will work on windows, linux, and is useful for development. Webview will only work on macOS, but has a nicer appearance.\nTo solve this I decided to use build constraints. 
These constraints let me expose a single function, CreateDesktopApp, whose implementation is selected based on the build GOOS, GOARCH and whether cgo is enabled.\nTo do this, first let’s set up the BUILD.bazel file for asteroids: # use lorca go_binary( name = \u0026quot;asteroids\u0026quot;, embed = [\u0026quot;:go_default_library\u0026quot;], visibility = [\u0026quot;//visibility:public\u0026quot;], goos = \u0026quot;darwin\u0026quot;, goarch = \u0026quot;amd64\u0026quot;, ) # use webview go_binary( name = \u0026quot;asteroids_darwin\u0026quot;, embed = [\u0026quot;:go_default_library\u0026quot;], visibility = [\u0026quot;//visibility:public\u0026quot;], goos = \u0026quot;darwin\u0026quot;, goarch = \u0026quot;amd64\u0026quot;, cgo = True, pure = \u0026quot;off\u0026quot;, ) # use lorca go_binary( name = \u0026quot;asteroids_windows\u0026quot;, embed = [\u0026quot;:go_default_library\u0026quot;], visibility = [\u0026quot;//visibility:public\u0026quot;], goos = \u0026quot;windows\u0026quot;, goarch = \u0026quot;amd64\u0026quot;, ) # use lorca go_binary( name = \u0026quot;asteroids_linux\u0026quot;, embed = [\u0026quot;:go_default_library\u0026quot;], visibility = [\u0026quot;//visibility:public\u0026quot;], goos = \u0026quot;linux\u0026quot;, goarch = \u0026quot;amd64\u0026quot;, )\nHere you can see that we have set up four binaries: darwin without cgo, darwin with cgo (and pure off), windows, and linux.\nThen we need to change the main function to look like: func main() { config := \u0026amp;desktop.Config{...} desktop.CreateDesktopApp(config) }\nThis will now always call a single function to select the framework. 
The desktop package now has two files each implementing CreateDesktopApp:\nwebview.go with the constraint // +build darwin,cgo\nlorca.go with the constraint // +build windows linux darwin,!cgo\n[gazelle](https://github.com/bazelbuild/bazel-gazelle) will set up the correct imports for the different platforms, and go build will select the correct file (and framework) to use.\nThese binaries are uploaded here.\nOne final problem that I am not tackling in this post is that these binaries are not signed or packaged in a way to make them easily runnable. I will leave that for another post.\nConclusion As a test to build and distribute a reasonably complicated Golang desktop application this has been a success. I have a desktop application, I have tested it on macOS and windows, and both work as expected.\nThere is still more work to do:\nget more customization (icons, menus…) that works across platforms\npackage the applications in an OS specific manner so they are trusted and execute easily\ndecrease the size of the binary, perhaps by using tinygo\nThe asteroids game is also fun.\nReferences I used a lot of resources to get this working, from using the WASM libraries in go, to rendering animations to canvas, to vector drawing tools, and the algorithms used in my implementation. 
Here are a few:\nhttps://jlongster.com/Making-Sprite-based-Games-with-Canvas https://codepen.io/anthonydugois/full/mewdyZ https://developer.mozilla.org/en-US/docs/Web/API/Canvas_API/Tutorial/Basic_animations https://stdiopt.github.io/gowasm-experiments/bouncy/ code at https://github.com/stdiopt/gowasm-experiments/blob/master/bouncy/main.go https://www.html5rocks.com/en/tutorials/speed/animations/ https://github.com/bugra/matrix/blob/master/matrix.go https://github.com/grahamjenson/pathby/blob/master/lib/transformations.rb https://www.mathsisfun.com/algebra/matrix-determinant.html https://www.mathsisfun.com/algebra/matrix-inverse-minors-cofactors-adjugate.html https://stackoverflow.com/questions/471962/how-do-i-efficiently-determine-if-a-polygon-is-convex-non-convex-or-complex https://bell0bytes.eu/centroid-convex/ https://github.com/jriecken/sat-js/blob/master/SAT.js ","permalink":"https://maori.geek.nz/posts/2020/2020-08-20_making-asteroids-game-with-golang-lorcawebview-and-wasm/","summary":"\u003cp\u003eIn my past few posts \u003ca href=\"https://maori.geek.nz/golang-desktop-app-webview-vs-lorca-vs-electron-a5e6b2869391\"\u003e[1]\u003c/a\u003e\u003ca href=\"https://maori.geek.nz/golang-desktop-app-with-webview-lorca-wasm-and-bazel-3283813bf89\"\u003e[2]\u003c/a\u003e I have written about building a desktop application using \u003ca href=\"https://golang.org/\"\u003eGolang\u003c/a\u003e with \u003ca href=\"https://github.com/zserge/lorca\"\u003eLorca\u003c/a\u003e/\u003ca href=\"https://github.com/webview/webview\"\u003eWebview\u003c/a\u003e to run a \u003ca href=\"https://webassembly.org/\"\u003eWebAssembly\u003c/a\u003e (WASM) binary. Now, I want to actually try use these technologies in anger and produce a distributable desktop application.\u003c/p\u003e\n\u003cp\u003eI have chosen to make the game Asteroids (\u003ca href=\"https://github.com/grahamjenson/asteroids\"\u003ecode here\u003c/a\u003e). 
It is reasonably complicated, fun and lets me play and learn more about Golangs \u003ccode\u003e[syscall/js](https://golang.org/pkg/syscall/js/)\u003c/code\u003e package and algorithms like \u003ca href=\"https://en.wikipedia.org/wiki/Hyperplane_separation_theorem#Use_in_collision_detection\"\u003eSeparating Axis Theorem (SAT)\u003c/a\u003e for collision detection. The goal is using Golang to build a single binary that can be downloaded onto different platforms (macOS, windows, linux) to play a fun game.\u003c/p\u003e","title":"Making Asteroids Game with Golang, Lorca/Webview and WASM"},{"content":"On my quest towards building a GoLang Desktop application I found some useful frameworks, Lorca and Webview (which I wrote about in my previous post). These frameworks create a window into which GoLang can inject HTML, CSS, and JavaScript to build the UI.\nBut I don’t want to write JavaScript(!) and deal with all the complexities that come with it like npm, webpack, typescript… Fortunately, I can just compile GoLang to WebAssembly (WASM) and use that in place of JavaScript. WASM is a binary format that can be executed natively in most modern browsers. My previous post showed how to build a WASM web-app with Bazel.\nAlso, using WASM in a desktop app (as opposed to a web app) dodges two of its main downsides:\nLarge WASM binaries (especially from GoLang): these take a long time to send over the network, increasing loading times. In a desktop app the binary isn’t transferred over the network, so there is less overhead.\nBrowser incompatibility: Some WebAssembly methods are unavailable in browsers and older browsers are not supported at all. In a desktop app the “browser” is controlled (Chrome for Lorca and Safari for webview), so there are no compatibility issues. 
Let’s get started and build the app.\nThe Main Desktop App First, I want to sketch out the main function for the app: func main() { // Create the Data URI of index.html url := fmt.Sprintf( \u0026quot;data:text/html,%s\u0026quot;, url.PathEscape(assets.INDEX_HTML) ) // Create Lorca UI ui, _ := lorca.New(url, \u0026quot;\u0026quot;, 600, 200) defer ui.Close() // Create a JS function that returns the WASM binary in base64 ui.Bind(\u0026quot;getWASM\u0026quot;, func() string { return assets.WASM_BIN }) // Initialize the wasm_exec.js script ui.Eval(assets.WASM_EXEC_JS) // Call the initial JS functions to load the WASM ui.Eval(assets.INIT_JS) \u0026lt;-ui.Done() }\nNote: the core difference with webview is using _Init_ instead of _Eval_\nThis app will start a Chrome window then:\nload the webpage with HTML assets.INDEX_HTML\ndefine a function getWASM that is a promise of assets.WASM_BIN, the WASM binary as a base64 encoded string\nload the JS GoLang WASM library assets.WASM_EXEC_JS copied from $(go env GOROOT)/misc/wasm/wasm_exec.js\nload the JS initialization script assets.INIT_JS\nThis is a pretty simple example Lorca app; it is basically the demo example. The main complication is the assets package and how it can be created. This is where Bazel comes in.\nEmbedding Files in GoLang Binary using Bazel Note: an easier way to do this was pointed out to me, the [_go_embed_data_](https://github.com/bazelbuild/rules_go/blob/fbbbfde2dff5072fe118b369a699d456ec756b0c/go/extras.rst#id3) rule; it is better to use that.\nThe assets package contains the HTML, JS and WASM files as string constants. 
There are GoLang tools like pkger or go-bindata to do this, but we can keep this simple with a Bazel rule in to_go_constant.bzl: def to_go_constant(name, package, constant, file, base64 = False): pkgStr = '\u0026lt;(echo \u0026quot;package %s\u0026quot;)' % package conStr = '\u0026lt;(echo \u0026quot;const %s string = \\\u0026quot;)\u0026rsquo; % constant\nsuffix = \u0026lsquo;\u0026lt;(echo \u0026ldquo;`\u0026rdquo;)\u0026rsquo;\ngenGo = \u0026lsquo;cat %s %s - %s \u0026rsquo; % (pkgStr, conStr, suffix)if base64: printContents = 'cat $(SRCS) | base64' else: printContents = 'cat $(SRCS)'native.genrule(\nname = name,\nsrcs = [file],\nouts = [name + \u0026ldquo;.go\u0026rdquo;],\ncmd = printContents + \u0026rsquo; | \u0026rsquo; + genGo + \u0026lsquo;\u0026gt; $@\u0026rsquo;\n)`\nThis rule uses cat to take a file and output a .go file with the package and const defined. If base64=True then the file is base64 encoded.\nAn example of this rule in action is adding the assets.INDEX_HTML built from the file index.html: load(\u0026quot;//:to_go_constant.bzl\u0026quot;, \u0026quot;to_go_constant\u0026quot;)``to_go_constant( name = \u0026quot;index\u0026quot;, constant = \u0026quot;INDEX_HTML\u0026quot;, file = \u0026quot;:index.html\u0026quot;, package = \u0026quot;assets\u0026quot;, )\nThis takes the index.html file: `\n` and creates the index.go file: package assets const INDEX_HTML string = `` index.go can be added to the go_library rule for the assets package: go_library( name = \u0026quot;go_default_library\u0026quot;, srcs = [ \u0026quot;assets.go\u0026quot;, \u0026quot;:index.go\u0026quot;, # keep ], importpath = \u0026quot;github.com/.../assets\u0026quot;, )\nNow the INDEX_HTML constant is in the assets package.\nIt is straight forward for most the other files, but a bit more complicated for the WASM binary data: to_go_constant( name = \u0026quot;wasmbin\u0026quot;, base64 = True, constant = \u0026quot;WASM_BIN\u0026quot;, file = 
\u0026quot;//project/wasm\u0026quot;, package = \u0026quot;assets\u0026quot;, )\nThe file //project/wasm refers to the go_binary rule that compiles the WASM (as described below). base64 is also True so that the binary data can be encoded as a constant.\nNote: one minor issue is the wasm_exec.js file: it has a few `’s in it that must be replaced.\nThe WASM part The WASM binary is the client-side GoLang application running in the browser. This example injects a Hello World \u0026lt;p\u0026gt; tag into the body: package main import ( \u0026quot;fmt\u0026quot; \u0026quot;syscall/js\u0026quot; ) func main() { fmt.Println(\u0026quot;Hello World\u0026quot;) document := js.Global().Get(\u0026quot;document\u0026quot;) p := document.Call(\u0026quot;createElement\u0026quot;, \u0026quot;p\u0026quot;) p.Set(\u0026quot;innerHTML\u0026quot;, \u0026quot;Hello World\u0026quot;) document.Get(\u0026quot;body\u0026quot;).Call(\u0026quot;appendChild\u0026quot;, p) }\nThe go_binary rule for this just needs goos=\u0026quot;js\u0026quot; and goarch=\u0026quot;wasm\u0026quot; for its output to be a WASM binary, e.g.: go_binary( name = \u0026quot;wasm\u0026quot;, embed = [\u0026quot;:go_default_library\u0026quot;], goarch = \u0026quot;wasm\u0026quot;, goos = \u0026quot;js\u0026quot;, visibility = [\u0026quot;//visibility:public\u0026quot;], )\nThe output of this rule is used above to create the WASM_BIN constant.\nInitializing the App assets.INIT_JS is built from init.js: loadWebASM = () =\u0026gt; { const go = new Go(); getWASM().then( (b64) =\u0026gt; { // Decode and convert to ArrayBuffer buf = Uint8Array.from(atob(b64), c =\u0026gt; c.charCodeAt(0)).buffer return WebAssembly.instantiate(buf, go.importObject) }).then((result) =\u0026gt; { go.run(result.instance); }).catch((err) =\u0026gt; { console.error(\u0026quot;loading wasm failed: \u0026quot; + err); }); } loadWebASM()\nThis code takes the WASM_BIN promised by getWASM, then decodes and converts 
it to an ArrayBuffer. This buffer is passed to the [WebAssembly.instantiate](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WebAssembly/instantiate) function that compiles and initializes the application to an instance, which is then go.run. This is the only JavaScript needed.\nPure GoLang Desktop App The above code is all that is needed to now write pure(-ish) GoLang desktop applications. This also makes it easy to convert other WASM apps to desktop applications, e.g. I converted the WASM app at https://github.com/olivewind/go-webassembly-canvas to a desktop application with minimal effort:\nConclusion There are many libraries and tools written in GoLang that could use a nice user interface, e.g. terraform and docker (just off the top of my head). A web app does not fit their requirements very well since they are editing local files and talking to local daemons. The alternative is then to reimplement or bind their API and integrations to another language which is complicated, error prone and a lot of work. A method of building pure GoLang desktop applications makes a lot of sense for these kinds of tools.\nThis approach can pretty quickly create a distributable, single executable, desktop application with minimal external dependencies. An app that can reuse many GoLang libraries and tools that have already been built. I suppose the worst part of this approach is that it throws away years of development on JavaScript tools like React and Vue. 
But that just means I have to write more Go to reimplement them, so it isn’t that bad :)\n","permalink":"https://maori.geek.nz/posts/2020/2020-08-11_golang-desktop-app-with-webviewlorca-wasm-and-bazel/","summary":"\u003cp\u003eOn my quest towards building a GoLang Desktop application I found some useful frameworks, \u003ca href=\"https://github.com/zserge/lorca\"\u003eLorca\u003c/a\u003e and \u003ca href=\"https://github.com/webview/webview\"\u003eWebview\u003c/a\u003e (which I wrote about in \u003ca href=\"https://maori.geek.nz/golang-desktop-app-webview-vs-lorca-vs-electron-a5e6b2869391\"\u003emy previous post\u003c/a\u003e). These frameworks create a window which GoLang can inject HTML, CSS, and JavaScript to build the UI.\u003c/p\u003e\n\u003cp\u003eBut I don’t want to write JavaScript(!) and deal with all the complexities that comes with it like npm, webpack, typescript… Fortunately, I can just compile GoLang to \u003ca href=\"https://developer.mozilla.org/en-US/docs/WebAssembly\"\u003eWebAssembly\u003c/a\u003e (WASM) and use that in place of JavaScript. WASM is a binary format that can be executed natively in \u003ca href=\"https://developer.mozilla.org/en-US/docs/WebAssembly\"\u003emost modern browsers\u003c/a\u003e. \u003cem\u003eMy\u003c/em\u003e \u003ca href=\"https://maori.geek.nz/a-web-app-using-bazel-golang-wasm-and-proto-c020914f4341\"\u003e\u003cem\u003eprevious post\u003c/em\u003e\u003c/a\u003e \u003cem\u003eshowed how to build a WASM web-app with the Bazel.\u003c/em\u003e\u003c/p\u003e","title":"GoLang Desktop App with webview/Lorca, WASM and Bazel"},{"content":"I want to build a local desktop Golang app, there are a few ways to do this:\nElectron: bundled Node.js and the Chromium browser to create a packaged local web-app. Usable with Golang frameworks like go-app or go-astilectron. Lorca: using the locally installed Chrome driving it using its dev-tools communication protocol. 
Webview: create a native window with webview and render the app inside it using CGo bindings.\nI have already written about building a simple electron app, so this post will go into how to build an app using Lorca and Webview, and then compare the three different options.\nLorca A simple Lorca app in Go looks like: func main() { // Create UI with data URI ui, _ := lorca.New(\u0026quot;data:text/html,\u0026quot;+url.PathEscape(\nHello Hello, world! `), \"\", 600, 200) defer ui.Close()`` // Create a GoLang function callable from JS ui.Bind(\"hello\", func() string { return \"World!\" })`` // Call above `hello` function then log to the JS console ui.Eval(\"hello().then( (x) =\u003e { console.log(x) })\")`` // Wait until UI window is closed \u003c-ui.Done() }` This is remarkably simple for the complexity it is hiding! The above opens a Chrome window, connects over a websocket to its dev-tools endpoint, sends the HTML to load, and provides the communication between Go and JS:\nWhat is even more cool is that you can call a JS function inside Chrome and get the output in Go(!): n := ui.Eval(\u0026quot;Math.random()\u0026quot;).Float() fmt.Println(n)\nUsing this library was so easy, so intuitive, so functional, that I was confused when it just worked. I thought there must be a catch, something complicated that I was missing. But no, it just worked.\nAn additional bonus is that you get the Chrome dev tools to help debug any issues or adjust the layout. Also, I love the use of JS promises to implement the async calls between Go and JS, given I have been writing about promises since 2014.\nThe big downside to Lorca is that because it uses Chrome, some application details (like the system menu, icon, title) cannot be customized. The tradeoff is then between application polish and a simple application. Depending on what you are trying to build this might be a deal-breaker, e.g. 
it would be fine if you are building an internal tool, but for an enterprise application this might not look great.\nWebview Webview is a library that helps build a web app directly on top of native components. The code to do this looks like: `func main() {\nw := webview.New(true)\ndefer w.Destroy() w.SetSize(600, 200, webview.HintNone) // Create a GoLang function callable from JS\nw.Bind(\u0026quot;hello\u0026quot;, func() string { return \u0026quot;World!\u0026quot; })\n// Create UI with data URI\nw.Navigate(`data:text/html,\n\u0026lt;!doctype html\u0026gt;\nHello Hello, world! `) w.Run()\n}`\nThis is a very similar API to Lorca, which I assume was based on webview. Unlike Lorca, though, the output looks a bit different:\nYou can see in the above screenshot that the webview application window has no drop shadow, has no border, and is initialized in the bottom left corner of the screen. This can be customized through the Window method that returns an unsafe.Pointer to the OS dependent window object ([NSWindow](https://developer.apple.com/documentation/appkit/nswindow) in macOS). This is where the difficulty begins.\nTo work with the Window object we must write a binding from Go to the native component. For example, if we wanted our window to start centered, we would call center on the NSWindow. 
So we need to write a binding in three files (adapted from gogoa):\n**ns_window.go** package main // #cgo CFLAGS: -x objective-c // #cgo LDFLAGS: -framework Cocoa //#include \u0026quot;ns_window.h\u0026quot; import \u0026quot;C\u0026quot; import \u0026quot;unsafe\u0026quot; type NSWindow struct { ptr unsafe.Pointer } func (self *NSWindow) Center() { C.Center(self.ptr) }\n**ns_window.h** #include \u0026lt;Cocoa/Cocoa.h\u0026gt; void Center(void *);\n**ns_window.m** #include \u0026quot;ns_window.h\u0026quot; void Center(void *self) { NSWindow *window = self; [window center]; }\nThen in the main() function we can center the window with: window := NSWindow{w.Window()} window.Center()\nUnlike Lorca, webview can be fully customized for our application. The problem is that it requires a bit of work.\nThere are a few other parts of webview that make working with it a bit difficult:\nIf using Bazel and gazelle, webview\u0026rsquo;s generated Build.bazel file is incorrect and clinkopts = [“-framework WebKit”] must be patched. Calling w.Init only works when w.Navigate is called, but then the w.Eval calls stop working. To set the title you could write a binding as described above, or you have to use the Dispatch method e.g. w.Dispatch(func() { w.SetTitle(\u0026quot;Title\u0026quot;) }). This is incorrect in the provided examples. I am not sure how much of this is webview and how much is NSWindow. More investigation and learning on my part should make it clearer why these things are happening.\nElectron My previous post was about building a simple Electron app that looks like:\nElectron is used in many large products like VSCode. This is probably because bundling everything into a single application makes portability much simpler and applications can be extensively customized. The downside of bundling the app with a browser and Node.js is that it makes the distribution very large.\nGetting Golang to play with Electron is also a bit difficult. 
There are frameworks that make this easier, like go-astilectron, but these are complicated and mostly feature incomplete. Another way might be to use Golang compiled to WASM, which I wrote about before, but this is also not a simple solution.\nThe benefits of Electron are that it is portable, customizable, and battle tested for application distribution. It is just a bit complicated with Golang.\nComparison I think the main comparison to be made is customizability vs. simplicity. Lorca is by far the simplest with very limited customizability, webview can be fully customized with some difficulty, and Electron is fully customizable but difficult to use with Golang.\nAlso the size of the bundle is very different between the frameworks: Lorca has an 8.7 MB binary, webview a 3.7 MB binary, and Electron a 157 MB bundle.\nThe debugging tools also vary: Lorca and Electron use the Chrome dev tools, whereas webview uses the Safari dev tools.\nConclusion: Both Lorca and webview work well with Golang, have small distribution sizes, and similar APIs. The main differences are whether the underlying renderer is native, and the debug tooling.\nElectron, I think, is probably too complicated to use with Golang.\nA potential workflow is to use Lorca during development and webview for distribution. Lorca provides familiar tooling for debugging and development, whereas webview provides the customizability for distribution. 
Lorca would also be a nice backup as a means of cross-compilation to other operating systems that webview does not support.\nNote: there are still more options like [_wails_](https://github.com/wailsapp/wails) or [_gotk_](https://github.com/gotk3/gotk3) that can provide other means to build/distribute apps.\n","permalink":"https://maori.geek.nz/posts/2020/2020-08-06_golang-desktop-app-webview-vs.-lorca-vs.-electron/","summary":"\u003cp\u003eI want to build a local desktop Golang app, there are a few ways to do this:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e\u003ca href=\"https://www.electronjs.org/\"\u003eElectron\u003c/a\u003e: bundled \u003ca href=\"https://nodejs.org/en/\"\u003eNode.js\u003c/a\u003e and the \u003ca href=\"https://www.chromium.org/\"\u003eChromium\u003c/a\u003e browser to create a packaged local web-app. Usable with Golang frameworks like \u003ca href=\"https://github.com/maxence-charriere/go-app\"\u003ego-app\u003c/a\u003e or \u003ca href=\"https://github.com/asticode/go-astilectron\"\u003ego-astilectron\u003c/a\u003e.\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://github.com/zserge/lorca\"\u003eLorca\u003c/a\u003e: using the locally installed Chrome driving it using its d\u003ca href=\"https://chromedevtools.github.io/devtools-protocol/\"\u003eev-tools communication protocol\u003c/a\u003e.\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://github.com/webview/webview\"\u003eWebview\u003c/a\u003e: create a native window with \u003ca href=\"https://developer.apple.com/documentation/webkit/webview\"\u003ewebview\u003c/a\u003e and render the app inside it using CGo bindings.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eI have already written about \u003ca href=\"https://maori.geek.nz/building-an-electron-app-with-bazel-d124ed550957\"\u003ebuilding a simple electron app\u003c/a\u003e, so this post will go into how to build an app using Lorca and Webview, and then compare the three different options.\u003c/p\u003e","title":"Golang Desktop 
App: Webview vs. Lorca vs. Electron"},{"content":"I want to build a desktop app with:\nElectron.js: A framework to build a desktop application using a Node.js and the Chromium browser Bazel: a build system to quickly build, test and run applications They seem like they might go well together, so let’s see. Note: we will be focusing on macOS only for simplicity.\nAll code is located at https://github.com/grahamjenson/bazel-electron\nThe Electron Application An electron app (in macOS) is a folder with the electron binaries (Node.js Chromium, and libraries downloaded from https://github.com/electron/electron/releases) and three application files located at: electron/Electron.app/Contents/Resources/app/ ├── package.json ├── main.js └── index.html\npackage.json can just be {\u0026quot;main\u0026quot;:\u0026quot;main.js\u0026quot;} to reference the main entry point for the Node.js app main.js. main.js must initialize the main Chromium window of the frontend application defined in index.html.\nA simple main.js starts with some definitions: const {app, BrowserWindow} = require('electron') let mainWindow = null\nThen set up an initialize function for the app: function initialize () { app.setName('Electron Simple App') app.on('ready', () =\u0026gt; { createWindow() }) app.on('window-all-closed', () =\u0026gt; { app.quit() }) app.on('activate', () =\u0026gt; { if (mainWindow === null) { createWindow() } }) }\nThen write the createWindow function: function createWindow () { const windowOptions = { width: 600, minWidth: 600, height: 500, title: app.getName() } mainWindow = new BrowserWindow(windowOptions) mainWindow.loadURL('file://' + __dirname + '/index.html') mainWindow.webContents.openDevTools() mainWindow.on('closed', () =\u0026gt; { mainWindow = null }) }\nFinally call the initialize() function.\nThe index.html is the entry point into your frontend application linked above with loadURL. 
This can be as simple as: `\nHello ` This electron app renders as:\nBazel bits I got the above working by just writing and copying files around in the electron folders, but I want Bazel to do that for me. I want a Bazel rule like: load(\u0026quot;:electron.bzl\u0026quot;, \u0026quot;electron_app\u0026quot;) electron_app( name = \u0026quot;simple-app\u0026quot;, app_name = \u0026quot;simple-app\u0026quot;, index_html = \u0026quot;:index.html\u0026quot;, main_js = \u0026quot;:main.js\u0026quot;, )\nSo that I can run bazel run :simple-app to build then start the electron app.\nThe first step is to download the electron binaries in the WORKSPACE file: http_file( name = \u0026quot;electron_release\u0026quot;, sha256 = \u0026quot;594326256...ca1f41ec\u0026quot;, urls = [\u0026quot;[https://github.com/electron/electron/releases/download/v8.4.1/electron-v8.4.1-darwin-x64.zip](https://github.com/electron/electron/releases/download/v8.4.1/electron-v8.4.1-darwin-x64.zip)\u0026quot;], )\nNote: I would like to use the _http_archive_ rule instead, but the app uses symlinked folders that confuse Bazel’s _glob_ function. 
To fix this, the rule unzips the file itself (which is not optimal, but it works).\nSo the rule will look like: electron_app = rule( implementation = electron_app_, executable = True, attrs = { \u0026quot;app_name\u0026quot;: attr.string(), \u0026quot;main_js\u0026quot;: attr.label(allow_single_file = True), \u0026quot;index_html\u0026quot;: attr.label(allow_single_file = True), \u0026quot;_electron_release\u0026quot;: attr.label( allow_single_file = True, default = Label(\u0026quot;@electron_release//file\u0026quot;), ), \u0026quot;_electron_bundle_tool\u0026quot;: attr.label( executable = True, cfg = \u0026quot;host\u0026quot;, allow_files = True, default = Label(\u0026quot;//:bundle\u0026quot;), ), \u0026quot;_electron_app_script_tpl\u0026quot;: attr.label( allow_single_file = True, default = Label(\u0026quot;//:run.sh.tpl\u0026quot;), ), }, outputs = { \u0026quot;apptar\u0026quot;: \u0026quot;%{name}.tar\u0026quot;, \u0026quot;run\u0026quot;: \u0026quot;%{name}.sh\u0026quot;, }, )\nThis takes the app_name, main.js and index.html from the user. 
It then uses\n_electron_release: downloaded release from github _electron_bundle_tool: a golang script to create the electron app _electron_app_script_tpl: the script used to run the application The _electron_bundle_toolgolang script bundle.go:\nUnzip the Electron release, add those files to tar copy package.json main.js and index.html add those files to tar write tar This looks like: func main() { outputFile := os.Args[1] name := os.Args[2] mainJS := os.Args[3] indexHTML := os.Args[4] electronZIP := os.Args[5] appName := name + \u0026quot;.app/\u0026quot;`` // Unzip rawFiles, _ := Unzip(electronZIP, \u0026quot;electronZIP\u0026quot;)`` tarFiles := map[string]string{}`` // Add Electron Files to tar for _, f := range rawFiles { zipPrefix := \u0026quot;electronZIP/Electron.app/\u0026quot; if strings.HasPrefix(f, zipPrefix) { tarFiles[f] = appName + strings.TrimPrefix(f, zipPrefix) } }`` // Add App files to tar appFolder := \u0026quot;Contents/Resources/app/\u0026quot; tarFiles[mainJS] = appName + appFolder + \u0026quot;main.js\u0026quot; tarFiles[indexHTML] = appName + appFolder + \u0026quot;index.html\u0026quot; ioutil.WriteFile(\u0026quot;package.json\u0026quot;, []byte(PACKAGE_JSON), 0644) tarFiles[\u0026quot;package.json\u0026quot;] = appName + appFolder + \u0026quot;package.json\u0026quot;``// Write Tar File writeTar(outputFile, tarFiles) }\nThe _electron_app_script_tpl is the script to run to open the electron app, which just un-tars the app, then opens it, i.e.: tar -xf {{app}} // Open app and wait for exit open -W {{name}}.app\nAll together\nWith this all setup bazel run :simple-app will start the application up. There is no live reload or other nice dev tools, so there is lots to improve about this workflow. The nice thing is that the built tar is easy to distribute as a completed app immediately.\nBazel to Electron Bazel is a pretty useful tool and electron is surprisingly simple to get a basic application built. 
Next steps would be adding tools like [go-app](https://github.com/maxence-charriere/go-app) to build applications in Golang, or maybe using [lorca](https://github.com/zserge/lorca) instead of electron to reduce the final tar size to something more easily distributable.\nAgain, all code is located at https://github.com/grahamjenson/bazel-electron\n","permalink":"https://maori.geek.nz/posts/2020/2020-07-31_building-an-electron-app-with-bazel/","summary":"\u003cp\u003eI want to build a desktop app with:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e\u003ca href=\"https://www.electronjs.org/\"\u003eElectron.js\u003c/a\u003e: A framework to build a desktop application using \u003ca href=\"https://nodejs.org/en/\"\u003eNode.js\u003c/a\u003e and the \u003ca href=\"https://www.chromium.org/\"\u003eChromium\u003c/a\u003e browser\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://bazel.build/\"\u003eBazel\u003c/a\u003e: a build system to quickly build, test and run applications\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThey seem like they might go well together, so let’s see. 
\u003cem\u003eNote: we will be focusing on\u003c/em\u003e \u003cstrong\u003e\u003cem\u003emacOS only\u003c/em\u003e\u003c/strong\u003e \u003cem\u003efor simplicity.\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eAll code is located at\u003c/em\u003e\u003c/strong\u003e \u003ca href=\"https://github.com/grahamjenson/bazel-electron\"\u003e\u003cstrong\u003e\u003cem\u003ehttps://github.com/grahamjenson/bazel-electron\u003c/em\u003e\u003c/strong\u003e\u003c/a\u003e\u003c/p\u003e\n\u003ch4 id=\"the-electron-application\"\u003eThe Electron Application\u003c/h4\u003e\n\u003cp\u003eAn electron app (on macOS) is a folder with the electron binaries (Node.js, Chromium, and libraries downloaded from \u003ca href=\"https://github.com/electron/electron/releases\"\u003ehttps://github.com/electron/electron/releases\u003c/a\u003e) and three application files located at:\n\u003ccode\u003eelectron/Electron.app/Contents/Resources/app/   ├── package.json   ├── main.js   └── index.html\u003c/code\u003e\u003c/p\u003e","title":"Building an Electron App with Bazel"},{"content":"\nI was looking for a domain name, got bored of not finding any I liked, so started looking for other, more specific domain names. I tried aaaaaa.com but this was taken, then aaaaaaa.com, then aaaaaaaa.com and so on. I was surprised by how many of these were registered.\nSo I began to wonder how many a’s it would take before there was a domain name that I could register. 
Then I expanded that to wonder about b and c …\nHere, I wrote a script: `#!/bin/bash\nwhoischeck() {\nFROM https://linuxconfig.org/check-domain-name-availability-with-bash-and-whois whois $1.com | egrep -q \u0026lsquo;^No match|^NOT FOUND|^Not fo|AVAILABLE|^No Data Fou|has not been regi|No entri\u0026rsquo;\nreturn $?\n}lettercheck() { A=$1 for ((i=1;i\u0026lt;=64;i++)) do if whoischeck $A; then echo \u0026quot;$A.com is available\u0026quot; return fi A=$A$1 done }for l in a b c d e f g h i j k l m n o p q r s t u v w x y z\ndo\nlettercheck $l\ndone`\nHere are the results, these are the smallest single letter .com domains you can register (as of 21/7/2020): aaaaaaaaaaaaaaaaaaaaaaaaaaa.com bbbbbbbbbbb.com ccccccccccccccccccc.com ddddddddddddddd.com eeeeeeeeeeeeeeeeee.com fffffffffffffffff.com gggggggggggggg.com hhhhhhhhhhhhhhhh.com iiiiiiiiiiiiiiiii.com jjjjjjjjjjjjjj.com kkkkkkkkkkkkkkkkk.com lllllllllllllllllllllll.com mmmmmmmmmmmmmmmm.com nnnnnnnnnnnnnnnn.com oooooooooooooooooo.com pppppppppppppp.com qqqqqqqqqqqqqqq.com rrrrrrrrrrrrrr.com ssssssssssssssssss.com ttttttttttttttttt.com uuuuuuuuuuuuuu.com vvvvvvvvvvvv.com wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww.com x*.com are all registered yyyyyyyyyyyyy.com zzzzzzzzzzzzzzz.com\nThe fact that you can’t register a single x*.com domain name is pretty obvious why. But why does Alibaba own wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww.com ?\n","permalink":"https://maori.geek.nz/posts/2020/2020-07-22_longest-single-letter.com-domain-name/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2020/2020-07-22_longest-single-letter.com-domain-name/images/1.png#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003cp\u003eI was looking for a domain name, got bored at not finding any I liked, so started looking for other, more specific domain names. 
I tried \u003ccode\u003eaaaaaa.com\u003c/code\u003e but this was taken, then \u003ccode\u003eaaaaaaa.com\u003c/code\u003e, then \u003ccode\u003eaaaaaaaa.com\u003c/code\u003e and so on. I was surprised by how many of these were registered.\u003c/p\u003e\n\u003cp\u003eSo I began to wonder how many \u003ccode\u003ea\u003c/code\u003e’s it would take before there was a domain name that I could register. Then I expanded that to wonder about \u003ccode\u003eb\u003c/code\u003e and \u003ccode\u003ec\u003c/code\u003e …\u003c/p\u003e","title":"Longest Single Letter .com Domain Name"},{"content":"Gary Bernhardt at PyCon 2014 talked about “The Birth and Death of Javascript”. He predicted, among other things, a giant exclusion zone around San Francisco, a world altering event in 2020, and that WASM would eventually take over the web. Let’s ignore the prior two prescient predictions and focus on the third, WASM taking over.\nSo, I want to build a simple web app by joining a bunch of new and fun technologies together:\nBazel: A build tool that simplifies complex pipelines Golang: A statically typed, compiled programming language that can compile to multiple platforms including WASM WASM: WebAssembly is a binary format to be executed in browsers Proto: An IDL for data structures and RPC endpoints TLDR: The code is here; _git clone_ it, run _bazel run :server_, then go to _localhost:7000_:\nGolang WASM application built with Bazel, using Proto IDL\nI want to build a web-app like ec2instances.info where I can search the specs and costs of AWS EC2 instances. I want to use Golang as both the frontend and backend language (no Javascript, yay!) by compiling it into WASM and using the go-app package. Communication will be defined using the Proto IDL so I can generate all the boilerplate code. Everything will be stitched together with Bazel. 
Let’s get started.\n#### Bootstrapping\nBazel and Golang are good friends, but to get them working together you need to set up [rules_go](https://github.com/bazelbuild/rules_go) for build logic, and [gazelle](https://github.com/bazelbuild/bazel-gazelle) for automation. For Protobuf, Bazel needs [com_google_protobuf](https://github.com/protocolbuffers/protobuf) and [rules_proto_grpc](https://github.com/rules-proto-grpc/rules_proto_grpc) to compile proto files.\nFor importing Go modules, rules_go does not use the go.mod file but its own go_repository rule. gazelle can convert go.mod into go_repository rules with bazel run //:gazelle -- update-repos -from_file=go.mod. The packages this project needs are:\n[github.com/lyft/protoc-gen-star](http://github.com/lyft/protoc-gen-star): for generating proto code [github.com/maxence-charriere/go-app](http://github.com/maxence-charriere/go-app): for building WASM sites in Golang The last external files needed are bootstrap.css to style the site, and the instances.json from ec2instances.info as a data-source.\nAll this code can be seen in the [WORKSPACE](https://github.com/grahamjenson/bazel-golang-wasm-proto/blob/master/WORKSPACE).\nCaveat: there are some libraries that don’t play nice with WASM and require patching. In Bazel this is pretty easy and you can see the required patches in the _third_party_ dir.\n#### Proto \u0026amp; Generation\nThe app’s API is described in api.proto: message Instance { string name = 1; ... 
string price = 6; }``message Instances { repeated Instance instances = 1; }``message SearchRequest { string query = 1; }``service Api { **rpc Search (SearchRequest) returns (Instances);** }\nThis describes the Search API which takes a query and returns a list of Instance models.\nHere is where gazelle is useful, it will generate the needed Bazel files like protos/[BUILD.bazel](https://github.com/grahamjenson/bazel-golang-wasm-proto/blob/master/protos/BUILD.bazel) : proto_library( name = \u0026quot;api_proto\u0026quot;, srcs = [\u0026quot;api.proto\u0026quot;], )``go_proto_library( name = \u0026quot;api_go_proto\u0026quot;, compilers = [ \u0026quot;[@io_bazel_rules_go](http://twitter.com/io_bazel_rules_go)//proto:go_grpc\u0026quot;, ], importpath = \u0026quot;.../bazel-golang-wasm-protoc/protos/api\u0026quot;, proto = \u0026quot;:api_proto\u0026quot;, )``go_library( name = \u0026quot;go_default_library\u0026quot;, embed = [\u0026quot;:api_go_proto\u0026quot;], importpath = \u0026quot;.../bazel-golang-wasm-protoc/protos/api\u0026quot;, )\nThis tells Bazel how to take the api.proto file and convert it to a go_library for this application. At the moment the only compiler is gRPC, but I want HTTP. So let’s write some custom proto compilers and add them.\nProto Server and Client Code Generation The Proto IDL is powerful because there are lots of tools to write custom compilers. Here I am using Lyft’s [protoc-gen-star](http://github.com/lyft/protoc-gen-star) because it provides lots of the features I need to parse proto files and generate code.\nI want HTTP server and client code generated from the proto. First lets write the server [tools/protoc-gen-server/main.go](https://github.com/grahamjenson/bazel-golang-wasm-proto/blob/master/tools/protoc-gen-server/main.go). 
The generated code’s template looks like: const serviceTpl = package {{ .Package.ProtoName }}\nimport (\u0026hellip;)\n{{ range .Services }}\nfunc Register{{ .Name }}HTTPMux(\nmux *http.ServeMux,\nsrv {{ .Name }}Server,\n) {\n{{ range .Methods}}\nmux.HandleFunc(\n\u0026ldquo;{{ $method }}\u0026rdquo;,\nfunc(w http.ResponseWriter, r *http.Request) {\nin := new({{ .Input.Name }})\ninJSON, _ := ioutil.ReadAll(r.Body)\ndefer r.Body.Close()\njson.Unmarshal(inJSON, in)\nret, _ := srv.{{ .Name }}(context.Background(), in)\nretJSON, _ := json.Marshal(ret)\nw.Write(retJSON)\n})\n{{ end }}\n}\n{{ end }}\n``\nThis template creates the RegisterApiHTTPMux method to register the API with an [http.ServeMux](https://golang.org/pkg/net/http/#ServeMux).\nA ModuleBase is needed to implement a few methods to render the code: type protoModule struct {...}``func (m *protoModule) **Name**() string { return \u0026quot;server\u0026quot; } func (m *protoModule) **InitContext**(c pgs.BuildContext) {...} func (m *protoModule) **Execute**( targets map[string]pgs.File, pkgs map[string]pgs.Package, ) []pgs.Artifact {...}\nThe main function of this generator passes our module in and calls **Render**(): func main() { pgs.Init().RegisterModule( \u0026amp;amp;protoModule{ModuleBase: \u0026amp;amp;pgs.ModuleBase{}}, ).RegisterPostProcessor( pgsgo.GoFmt(), ).**Render**() }\nThe client code is mostly the same except for the template: const serviceTpl = package {{ .Package.ProtoName }}\nimport (\u0026hellip;)\n{{ range .Services }}\n{{ range .Methods}}\nfunc Call{{ .Service.Name }}{{ .Name }}(\ninput {{ .Input.Name }},\n) (*{{ .Output.Name }}, error) {\nstr, _ := json.Marshal(input)\nreq, _ := http.NewRequest(\n\u0026ldquo;POST\u0026rdquo;,\n\u0026ldquo;{{ $method }}\u0026rdquo;, strings.NewReader(string(str)),\n)\nreq.Header.Add(\u0026ldquo;Content-Type\u0026rdquo;, \u0026ldquo;application/json\u0026rdquo;)\nclient := \u0026amp;http.Client{}\nresp, _ := client.Do(req)\ndefer resp.Body.Close()\nbody, _ 
:= ioutil.ReadAll(resp.Body)\ninstances := {{ .Output.Name }}{}\njson.Unmarshal(body, \u0026amp;instances)\nreturn \u0026amp;instances, nil\n}\n{{ end }}\n{{ end }}\n``\nThis creates the **CallApiSearch** method that can be used to call the HTTP server.\nThe go_proto_compiler rule tells Bazel that these are proto compilers: **go_proto_compiler**( name = \u0026quot;go_server\u0026quot;, options = [\u0026quot;plugins=server\u0026quot;], plugin = \u0026quot;//tools/protoc-gen-server\u0026quot;, suffix = \u0026quot;.pb.server.go\u0026quot;, ... )\nThen attach attach them to the compilers list to generate our new code: go_proto_library( name = \u0026quot;api_go_proto\u0026quot;, compilers = [ \u0026quot;[@io_bazel_rules_go](http://twitter.com/io_bazel_rules_go)//proto:go_grpc\u0026quot;, **\u0026quot;//tools:go_server\u0026quot;, # keep \u0026quot;//tools:go_client\u0026quot;, # keep** ], ... )\nAlthough this is a lot of work to define two methods, it is great to know that generating code from a proto file is not that difficult. This removes lots of the boilerplate while making it way easier to update and change the API.#### Frontend into WASM\nThe main goal of this project is to write frontend code in Golang. 
For this the go-app package is really useful as it contains lots of the primitives for building and rendering HTML.\nThe projects frontend code is broken into three models, a search bar as input, a table to render the results, and a manager for control.\nThe search bar looks like: type SearchBar struct { app.Compo manager *Manager searchString string }``func (p *SearchBar) SetManager(manager *Manager) { p.manager = manager }``func (p *SearchBar) Render() app.UI { return app.Div().Body( app.Input().Value(p.searchString).OnKeyup(p.OnInputChange), ) }``func (p *SearchBar) OnInputChange(src app.Value, e app.Event) { p.searchString = src.Get(\u0026quot;value\u0026quot;).String() p.Update() p.manager.UpdateInstances(p.searchString) }\nThis component renders an \u0026lt;input\u0026gt; box that calls OnInputChange when the value changes, which updates the manager with the query string.\nThe table component looks like: type InstanceTable struct { app.Compo manager *Manager instances []*api.Instance }``func (p *InstanceTable) SetManager(manager *Manager) { p.manager = manager }``func (p *InstanceTable) Render() app.UI { nodes := []app.Node{} for _, i := range p.instances { nodes = append(nodes, app.Tr().Body( app.Td().Body(app.Text(i.Name)), ... app.Td().Body(app.Text(i.Price)), )) }``return app.Table().Class(\u0026quot;table\u0026quot;).Body( app.Tr().Body( app.Th().Scope(\u0026quot;col\u0026quot;).Body(app.Text(\u0026quot;Name\u0026quot;)), ... 
app.Th().Scope(\u0026quot;col\u0026quot;).Body(app.Text(\u0026quot;Price\u0026quot;)), ), app.TBody().Body(nodes...), ) }\nThis component has a list of instances which are rendered into a table.\nThe final component is the manager: type Manager struct { app.Compo searchBar *SearchBar instanceTable *InstanceTable }``func (h *Manager) Render() app.UI { return app.Div().Body( app.Header().Body(app.Nav().Class(\u0026quot;navbar\u0026quot;).Body( h.searchBar, )), app.Div().Class(\u0026quot;container-fluid\u0026quot;).Body( h.instanceTable, ), ) }``func (h *Manager) Search(q string) []*api.Instance { instances, err := api.**CallApiSearch**(api.SearchRequest{ Query: q, }) if err != nil { return []*api.Instance{} } return instances.Instances }``func (h *Manager) UpdateInstances(q string) { instances := h.Search(q) h.instanceTable.instances = instances h.instanceTable.Update() }\nThe manger joins all the components together. It takes the search bar’s query string then sends it to the generated method **api.CallApiSearch** to get a list of instances, which it then updates the table component with.\nRunning WASM build is like executing a Go binary inside your browser, so we need a main function: func **main**() { manager := \u0026amp;amp;Manager{ searchBar: \u0026amp;amp;SearchBar{}, instanceTable: \u0026amp;amp;InstanceTable{}, }`` manager.searchBar.SetManager(manager) manager.instanceTable.SetManager(manager)`` app.Route(\u0026quot;/\u0026quot;, manager) app.Run() }\nThis initializes the components sets and starts the app running.\nTo build a WASM project in Bazel you just need to set goarch and goos: go_binary( name = \u0026quot;app.wasm\u0026quot;, embed = [\u0026quot;:go_default_library\u0026quot;], **goarch = \u0026quot;wasm\u0026quot;, goos = \u0026quot;js\u0026quot;,** )#### Server\nThe server for this application implements the Search API backend: type Server struct { instances []*api.Instance }``func (server *Server) **Search**( ctx context.Context, in 
*api.SearchRequest, ) (*api.Instances, error) { if server.instances == nil { server.parseInstances() }`` instances := []*api.Instance{} for _, instance := range server.instances { str, _ := json.Marshal(*instance) if strings.Contains(string(str), in.Query) { instances = append(instances, instance) } } return \u0026amp;amp;api.Instances{Instances: instances}, nil }``func (server *Server) **parseInstances**() { fileName := \u0026quot;external/com_github_ec2instances/file/instances.json\u0026quot; ec2Instances := []ec2Instance{} server.instances = []*api.Instance{}`` file, _ := ioutil.ReadFile(fileName) json.Unmarshal(file, \u0026amp;amp;ec2Instances)`` for _, e := range ec2Instances { server.instances = append(server.instances, \u0026amp;amp;api.Instance{ Name: e.PrettyName, ... Price: e.Pricing[\u0026quot;us-east-1\u0026quot;][\u0026quot;linux\u0026quot;].OnDemand, }) } }\nThe server parses the instances.json file to build a list of instances that the Search API looks for instances in.\nThe backend HTTP server looks like: `func main() {\nmux := http.NewServeMux() app := \u0026amp;amp;app.Handler{ Title: \u0026quot;EC2Instances\u0026quot;, Author: \u0026quot;Graham Jenson\u0026quot;, Styles: []string{\u0026quot;bootstrap.css\u0026quot;}, } mux.HandleFunc(\n\u0026ldquo;/app.wasm\u0026rdquo;,\nfunc(w http.ResponseWriter, r *http.Request) {\nhttp.ServeFile(w, r, \u0026ldquo;wasm/js_wasm_pure_stripped/app.wasm\u0026rdquo;)\n},\n)`` mux.HandleFunc(\n\u0026ldquo;/bootstrap.css\u0026rdquo;,\nfunc(w http.ResponseWriter, r *http.Request) {\nhttp.ServeFile(w, r, \u0026ldquo;external/\u0026hellip;/bootstrap.css\u0026rdquo;)\n},\n)\napi.RegisterApiHTTPMux(mux, \u0026amp;server.Server{})\nmux.Handle(\u0026quot;/\u0026quot;, app)`` log.Fatal(http.ListenAndServe(\u0026quot;:7000\u0026quot;, mux))\n}`\nThis function creates a Server a go-app and an http.ServeMux. 
It serves the app.wasm and bootstrap.css files, registers the server’s API, and lets the app handle all other routes.\nTo make sure all the files are available to this function, they are listed in the data attribute of the go_binary: go_binary( name = \u0026quot;server\u0026quot;, data = [ **\u0026quot;//wasm:app.wasm\u0026quot;,** \u0026quot;@com_github_bootstrap//file:bootstrap.css\u0026quot;, \u0026quot;@com_github_ec2instances//file:instances.json\u0026quot;, ], embed = [\u0026quot;:go_default_library\u0026quot;], ... )\nRunning with Bazel\nOnce all this is in place, to start the server you just need to run bazel run :server which will:\ndownload all the needed files (rules, go modules, bootstrap, instances.json) build and compile the proto files into Go code build the Golang binaries (both local and WASM) then package the backend binary with all the needed files and execute it to start the server on port 7000 The power of Bazel is that all of these actions are cached and on the next build it will only recompile exactly what is needed. It means that if you change a file like the proto, Bazel will only rebuild what is needed to update your application.\n#### Should you use WASM?\nI like it, and go-app is really cool, but compared to React or other JS frameworks it is still pretty light. I will say that, for me, using Bazel with Golang is a pleasure compared to the horrors I have faced in the JS world. However, JS libraries are more mature and have solved problems that the Go code has not even considered yet.\nAt the minimum, I recommend learning to use Bazel, Proto and WASM. Partly to keep a healthy comparison between tools, but also because after the initial learning curve you will find them surprisingly simple and powerful.\nHow big is the WASM file? For this application it is 15MB compared to the 100KB of JS on the ec2instances.info site. 
I could probably trim that down a bit by removing unused dependencies, like gRPC, but it looks like the minimum size is still around 8MB. This is probably the most limiting aspect of this workflow. go-app mitigates this by aggressively caching the WASM binary but this has its own issues.\nThere are many use-cases where the bundle size is not as important. Reliability and performance sometimes are higher priorities than load time. In these cases WASM might be a good alternative. Also, there are currently many web applications bundled to be run locally with tools like electron. These are good candidates for WASM, and might be my next project.\nWhat are the next steps? This was a pretty basic application. More time spent building these applications will show more limitations and also hopefully make them better. go-app is really cool but has some sharp edges.\nFinding or building a full fledged frontend Golang framework like React would be the next step towards Bernhardt’s original vision.\n","permalink":"https://maori.geek.nz/posts/2020/2020-04-01_web-app-using-bazel-golang-wasm-and-proto/","summary":"\u003cp\u003e\u003ca href=\"http://destroyallsoftware.com/\"\u003eGary Bernhardt\u003c/a\u003e at PyCon 2014 talked about \u003ca href=\"https://www.destroyallsoftware.com/talks/the-birth-and-death-of-javascript\"\u003e“The Birth and Death of Javascript”\u003c/a\u003e. He predicted, among other things, a giant exclusion zone around San Francisco, a world altering event in 2020, and that WASM would eventually take over the web. 
Let’s ignore the prior two prescient predictions and focus on the third, \u003cstrong\u003eWASM taking over\u003c/strong\u003e.\u003c/p\u003e\n\u003cp\u003eSo, I want to build a simple web app by joining a bunch of new and fun technologies together:\u003c/p\u003e","title":"A Web App Using Bazel Golang WASM and Proto"},{"content":"In Bazel you put files into Rules and get files out, e.g.: pkg_tar( name = \u0026quot;package\u0026quot;, extension = \u0026quot;tar.gz\u0026quot;, srcs = [\u0026quot;:file1\u0026quot;, \u0026quot;:file2\u0026quot;] )\nThe pkg_tar rule takes :file1 and :file2 and spits out the tarball :package.tar.gz, which you can pass to another rule as input. So you input files into rules that output files into other rules, that output files into other rules\u0026hellip; until eventually you get the file you want. This is a graph that can look like:\nPart of the github.com/sorbet/sorbet rule graph\nRules are made up of many actions. Actions are just scripts, usually written in Python or Bash (because most systems can execute them). Like rules, actions take files in and spit files out.\nThe output of an action should only depend on the explicitly stated inputs. That is, actions should be hermetic, isolated from all but explicit dependencies. This is Bazel’s broader philosophy and most of its design decisions are consequences of this.\nIf every action is hermetic then we can get a giant benefit: speed! We can run an action once, cache its output, and not run it again until its inputs change. Also, with explicit inputs and outputs Bazel can construct a graph of all actions to calculate efficient and parallel execution. Bazel does this in three phases:\nLoading Phase: Load all the rules. Analysis Phase: Calculate the action graph and hash the inputs to look them up in the cache and see what needs to be run. Execution Phase: Process the necessary actions. Separating the Execution and Analysis phases means we have to register the actions to let Bazel decide when to run them. 
For example, the action to create a tarball in [pkg_tar](https://github.com/bazelbuild/bazel/blob/master/tools/build_defs/pkg/pkg.bzl#L120) looks like: ctx.actions.run( executable = ctx.executable.build_tar, inputs = file_inputs + ctx.files.deps + [arg_file], arguments = [\u0026quot;--flagfile\u0026quot;, arg_file.path], outputs = [ctx.outputs.out], )\nThis code is in the python-esk language Starlark, that comes with specific limitations and different core libraries to encourage hermetic actions. A quick explanation of this code is:\n[ctx](https://docs.bazel.build/versions/master/skylark/lib/ctx.html) is the context of the rule. ctx.actions is how you register an action to a rule. ctx.actions.run is an action that calls a script. executable is a reference to the build_tar script to be run. inputs are all files needed to run this script. arguments are sent to the executable. outputs are all files this action generates. By explicitly specifying the inputs, outputs and using a hermetic script to run, Bazel can build large projects very fast.\nHermetic Tools But wait, what the hell is _build_tar_ executable above? Also those arguments won\u0026rsquo;t work with _tar_?\nbuild_tar is a python reimplementation of tar. Why would Bazel need to reimplement _tar_? tar is not hermetic. Try this out: $\u0026gt; tar -cz file1 | sha256sum d0f...44b $\u0026gt; tar -cz file1 | sha256sum ee2...777\ntar’s output changes as it attaches a created at date so depends on something not explicitly stated as an input. This can be worked around, but it depends on the version of tar on the system, an undeclared dependency.\nTo make this hermetic build_tar sets the date to the \u0026ldquo;implausibly old\u0026rdquo; time stamp 1970-01-01 00:00:00. This means we can\u0026rsquo;t (easily) use tar in an action, even if it has been a common tool for 40 years!.\nThis is where Bazel really starts to lose people. Many existing tools are not hermetic so Bazel can’t reliably cache their output. 
To fix this we need to reimplement these tools from scratch. For example…\nDown the Rabbit Hole with Docker Docker is a great tool. However, previously I have described how docker build isn’t hermetic even if the built images are identical. Containers are a reality of modern development so how are we meant to use Bazel to build them?\nUse [**rules_docker**](https://github.com/bazelbuild/rules_docker) the re-implementation of **docker build** in Bazel. Containers aren\u0026rsquo;t anything more than tarballs with a manifest. rules_docker contains rules and actions for squashing tarballs together to create new containers.\nTypically the first RUN command in a Dockerfile is apt-get update \u0026amp;amp;\u0026amp;amp; apt-get install ..., how are we going to _apt-get_ hermetically?.\nUse distroless and its re-implementation of apt-get/dpkg. apt-get update downloads a list of tarball deb packages, which can be selected and extracted with apt-get install. Distroless contains rules to smash deb tarballs into container tarballs using an unchanging Debian repository snapshot to make sure we always get the same versions.\nSome deb packages like ca-certificates have installer scripts, how are we going to deal with non-hermetic installer scripts?\nRe-implement them as Bazel rules. For example the ca-certificates install script is re-implemented with the [cacert](https://github.com/GoogleContainerTools/distroless/blob/master/cacerts/cacerts.bzl) rule and [extract.sh](https://github.com/GoogleContainerTools/distroless/blob/master/cacerts/extract.sh) script in distroless. This can be quite some work and require knowledge that is typically abstracted with apt-get.\nFalling down this rabbit hole causes us to throw away many existing tools and reimplement them hermetically.\nUp to you There are many more issues with Bazel (laid out here) but I think most stem from Bazel’s core philosophy of hermetic builds. 
The speed and reliability are undeniable, but so is the pain when you throw away a tool you like and have to reimplement it Bazel’s way.\nI liken Bazel to Haskell. Both have rigid, inflexible philosophies: Bazel with being hermetic and Haskell with being functionally pure. This can immediately turn people off as they have to throw away well-known tools like docker or for loops. The few who do stick around can come out of the fire with a deeper understanding of the tradeoffs they are making.\nAt the end of the day, if the benefits outweigh the downsides maybe Bazel is the tool for you. I won’t recommend it, like I wouldn’t recommend Haskell, because it is up to you if you will trade pain for speed.\n","permalink":"https://maori.geek.nz/posts/2019/2019-08-05_bazel-why-people-lovehate-it/","summary":"\u003cp\u003eIn Bazel you put \u003cstrong\u003efiles\u003c/strong\u003e into \u003cstrong\u003eRules\u003c/strong\u003e and get \u003cstrong\u003efiles\u003c/strong\u003e out, e.g.:\n\u003ccode\u003epkg_tar(   name = \u0026quot;package\u0026quot;,   extension = \u0026quot;tar.gz\u0026quot;,   srcs = [:file1, :file2]   )\u003c/code\u003e\u003c/p\u003e\n\u003cp\u003eThe \u003ccode\u003epkg_tar\u003c/code\u003e rule takes \u003ccode\u003e:file1\u003c/code\u003e and \u003ccode\u003e:file2\u003c/code\u003e and spits out the tarball \u003ccode\u003e:package.tar.gz,\u003c/code\u003e which you can pass to another rule as input. So you input files into rules that output files into other rules, that output files into other rules\u0026hellip; until eventually you get the file you want. 
This is a graph that can look like:\u003c/p\u003e","title":"Bazel: Why people love/hate it"},{"content":"For reasons I wanted git log to be parsed in Go and it looks like the easiest way to do that is to output it as JSON and parse into Go structs.\nFirst we need to define the format for the JSON (borrowed from here): var GITFORMAT string = \u0026ndash;pretty=format:{\n\u0026ldquo;commit\u0026rdquo;: \u0026ldquo;%H\u0026rdquo;,\n\u0026ldquo;parent\u0026rdquo;: \u0026ldquo;%P\u0026rdquo;,\n\u0026ldquo;refs\u0026rdquo;: \u0026ldquo;%D\u0026rdquo;,\n\u0026ldquo;subject\u0026rdquo;: \u0026ldquo;%s\u0026rdquo;,\n\u0026ldquo;author\u0026rdquo;: { \u0026ldquo;name\u0026rdquo;: \u0026ldquo;%aN\u0026rdquo;, \u0026ldquo;email\u0026rdquo;: \u0026ldquo;%aE\u0026rdquo;, \u0026ldquo;date\u0026rdquo;: \u0026ldquo;%ad\u0026rdquo; },\n\u0026ldquo;commiter\u0026rdquo;: { \u0026ldquo;name\u0026rdquo;: \u0026ldquo;%cN\u0026rdquo;, \u0026ldquo;email\u0026rdquo;: \u0026ldquo;%cE\u0026rdquo;, \u0026ldquo;date\u0026rdquo;: \u0026ldquo;%cd\u0026rdquo; }\n},``\nThen define the structs: type GitPerson struct { Name string json:\u0026ldquo;name\u0026rdquo; Email string json:\u0026ldquo;email\u0026rdquo; Date *time.Timejson:\u0026ldquo;date\u0026rdquo; }``type GitCommit struct { Commit stringjson:\u0026ldquo;commit\u0026rdquo; Parent stringjson:\u0026ldquo;parent\u0026rdquo; Refs stringjson:\u0026ldquo;refs\u0026rdquo; Subject stringjson:\u0026ldquo;subject\u0026rdquo;``` Author GitPerson json:\u0026quot;author\u0026quot;\nCommiter GitPerson json:\u0026quot;commiter\u0026quot;\n}`\nThe function will then run git log and Unmarshal the result: `func gitLog() ([]GitCommit, error) {\nargs := []string{\n\u0026ldquo;log\u0026rdquo;,\n\u0026ldquo;\u0026ndash;date=iso-strict\u0026rdquo;,\n\u0026ldquo;\u0026ndash;first-parent\u0026rdquo;,\nGITFORMAT,\n}`` cmd := exec.Command(\u0026ldquo;git\u0026rdquo;, args\u0026hellip;)\nout, _ := cmd.Output()\nlogOut := string(out)\nlogOut = logOut[:len(logOut)-1] 
// Remove the last \u0026ldquo;,\u0026rdquo;\nlogOut = fmt.Sprintf(\u0026quot;[%s]\u0026quot;, logOut) // Add []\ngitCommitList := []GitCommit{}\njson.Unmarshal([]byte(logOut), \u0026amp;gitCommitList)\nreturn gitCommitList, nil\n}`\n--date=iso-strict makes sure that the dates that are outputted are parsable by Go.\nHopefully if you find a reason to need git log in Go this might help :)\n","permalink":"https://maori.geek.nz/posts/2019/2019-07-23_git-log-as-json-in-go/","summary":"\u003cp\u003eFor reasons I wanted \u003ccode\u003egit log\u003c/code\u003e to be parsed in Go and it looks like the easiest way to do that is to output it as JSON and parse into Go structs.\u003c/p\u003e\n\u003cp\u003eFirst we need to define the format for the JSON (borrowed from \u003ca href=\"https://gist.github.com/varemenos/e95c2e098e657c7688fd\"\u003ehere\u003c/a\u003e):\n\u003ccode\u003evar GITFORMAT string = \u003c/code\u003e\u0026ndash;pretty=format:{\u003cbr\u003e\n\u0026ldquo;commit\u0026rdquo;: \u0026ldquo;%H\u0026rdquo;,\u003cbr\u003e\n\u0026ldquo;parent\u0026rdquo;: \u0026ldquo;%P\u0026rdquo;,\u003cbr\u003e\n\u0026ldquo;refs\u0026rdquo;: \u0026ldquo;%D\u0026rdquo;,\u003cbr\u003e\n\u0026ldquo;subject\u0026rdquo;: \u0026ldquo;%s\u0026rdquo;,\u003cbr\u003e\n\u0026ldquo;author\u0026rdquo;: { \u0026ldquo;name\u0026rdquo;: \u0026ldquo;%aN\u0026rdquo;, \u0026ldquo;email\u0026rdquo;: \u0026ldquo;%aE\u0026rdquo;, \u0026ldquo;date\u0026rdquo;: \u0026ldquo;%ad\u0026rdquo; },\u003cbr\u003e\n\u0026ldquo;commiter\u0026rdquo;: { \u0026ldquo;name\u0026rdquo;: \u0026ldquo;%cN\u0026rdquo;, \u0026ldquo;email\u0026rdquo;: \u0026ldquo;%cE\u0026rdquo;, \u0026ldquo;date\u0026rdquo;: \u0026ldquo;%cd\u0026rdquo; }\u003cbr\u003e\n},``\u003c/p\u003e","title":"Git Log as JSON in Go"},{"content":"Do you want to see something annoying? 
Create a Dockerfile like this: FROM ubuntu:18.04@sha256:9b1...b3c````RUN echo \u0026quot;Hello World\u0026quot;\nNow build it without cache twice: $\u0026gt; docker build --no-cache -q . sha256:4e0...20b $\u0026gt; docker build --no-cache -q . sha256:e28...3fa\nWhy are the SHA digests different? They should be exactly the same… right? The file system didn’t change, they are both built in the same environment.\nTo solve this mystery lets take a look inside each of these images: $\u0026gt; docker save sha256:4e0...20b \u0026gt; a.tar $\u0026gt; docker save sha256:e28...3fa \u0026gt; b.tar````$\u0026gt; mkdir a $\u0026gt; mkdir b````$\u0026gt; tar -xzvf a.tar -C a $\u0026gt; tar -xzvf b.tar -C b\nThe directories a and b now look like: ``# layer folders\n├── 478\u0026hellip;d90/\n│ ├── json\n│ ├── layer.tar\n│ └── VERSION\n├── 8ea\u0026hellip;b12/\n├── d2e\u0026hellip;fcd/\n├── eda\u0026hellip;695````# config file\n├── 4e0\u0026hellip;20b.json\nreference to config file and layers files └── manifest.json``\nConfig Docker’s JSON config file describes the environment that built the docker image and its history: { \u0026quot;architecture\u0026quot;: \u0026quot;amd64\u0026quot;, \u0026quot;config\u0026quot;: { ... }, \u0026quot;container\u0026quot;: \u0026quot;2e7...b3e\u0026quot;, \u0026quot;container_config\u0026quot;: { ... }, \u0026quot;created\u0026quot;: \u0026quot;2019-07-10T07:49:21.1663546Z\u0026quot;, \u0026quot;docker_version\u0026quot;: \u0026quot;18.09.2\u0026quot;, \u0026quot;history\u0026quot;: [ { \u0026quot;created\u0026quot;: \u0026quot;2019-06-18T22:51:33.33427803Z\u0026quot;, \u0026quot;created_by\u0026quot;: \u0026quot;/bin/sh -c #(nop) ADD file:4e6...098 in / \u0026quot; }, ... 
{ \u0026quot;created\u0026quot;: \u0026quot;2019-07-10T07:49:21.1663546Z\u0026quot;, \u0026quot;created_by\u0026quot;: \u0026quot;/bin/sh -c echo \\\u0026quot;Hello World\\\u0026quot;\u0026quot;, \u0026quot;empty_layer\u0026quot;: true } ], \u0026quot;os\u0026quot;: \u0026quot;linux\u0026quot;, \u0026quot;rootfs\u0026quot;: { \u0026quot;type\u0026quot;: \u0026quot;layers\u0026quot;, \u0026quot;diff_ids\u0026quot;: [ \u0026quot;sha256:ba9...51e\u0026quot;, \u0026quot;sha256:fbd...ad6\u0026quot;, \u0026quot;sha256:dda...d0b\u0026quot;, \u0026quot;sha256:75e...5ca\u0026quot; ] } }\nManifest The manifest.json file describes the location of the layers and config file: [ { \u0026quot;Config\u0026quot;: \u0026quot;4e0...20b.json\u0026quot;, \u0026quot;RepoTags\u0026quot;: null, \u0026quot;Layers\u0026quot;: [ \u0026quot;8ea...b12/layer.tar\u0026quot;, \u0026quot;d2e...fcd/layer.tar\u0026quot;, \u0026quot;478...d90/layer.tar\u0026quot;, \u0026quot;eda...695/layer.tar\u0026quot; ] } ]\nLayers Each layer has a json file (which looks like the config file), a VERSION file with the string 1.0 (probably the packaging version), and a layer.tar file containing the images files.\nThe a and b images have different layer folders even though the layer.tar\u0026rsquo;s are exactly the same: $\u0026gt; sha256sum a/eda...695/layer.tar 75e...5ca a/eda...695/layer.tar````$\u0026gt; sha256sum b/4ed...42d/layer.tar 75e...5ca b/4ed...42d/layer.tar\nThis layer.tar SHA is referenced inside the config file, and the location of the layer is in the manifest.\nThe Digestive System Lets SHA256 the config files: $\u0026gt; sha256sum a/4e0...20b.json 4e0...20b a/4e0...20b.json $\u0026gt; sha256sum b/e28...3fa.json e28...3fa b/e28...3fa.json\nSo this is where the digest comes from. 
So what is different between the a and b images? We can see with diff: $\u0026gt; diff a/4e0...20b.json b/e28...3fa.json\n27c27\n\u0026lt; \u0026quot;container\u0026quot;: \u0026quot;2e7...b3e\u0026quot;,\n---\n\u0026gt; \u0026quot;container\u0026quot;: \u0026quot;97a...49c\u0026quot;,\n54c54\n\u0026lt; \u0026quot;created\u0026quot;: \u0026quot;2019-07-10T07:49:21.1663546Z\u0026quot;,\n---\n\u0026gt; \u0026quot;created\u0026quot;: \u0026quot;2019-07-10T07:49:30.0860002Z\u0026quot;,\n79c79\n\u0026lt; \u0026quot;created\u0026quot;: \u0026quot;2019-07-10T07:49:21.1663546Z\u0026quot;,\n---\n\u0026gt; \u0026quot;created\u0026quot;: \u0026quot;2019-07-10T07:49:30.0860002Z\u0026quot;,\nThe created timestamps and the container key cause the digests to be different. This is annoying because even two identical docker images will have different digests if built milliseconds apart.\nBreaking the Config Digest Let\u0026rsquo;s see if we can remove the differences between these images and create a new digest. Can we create a valid docker image if we remove the container key (not sure we need this) and change all the dates to 1970-01-01T00:00:00Z in the config file?\nAlso do the names of the files matter? I want to call the config file config.json and rename the layer folders to 1,2,3,4. Both require updates to the references in the manifest.json file and the layer json files.\nNow the folder looks like: # layer folders ├── 1/ ├── 2/ ├── 3/ ├── 4/ ├── config.json └── manifest.json\nLet\u0026rsquo;s re-tar and docker load again: $\u0026gt; tar -cf new-a.tar -C a/ . $\u0026gt; docker load -i new-a.tar Loaded image ID: sha256:24b...975\nThe digest equals the SHA256 of the config.json file, so this all looks correct and it runs!\nDocker’s Digest The docker image’s digest is the SHA256 of its config file. It is different on different builds because of the timestamps (which can be replaced) and the container key (which can be ignored). 
The file names and folder structure don’t matter. The manifest file references the layers and config, but is not included in the digest. I can now understand and remove some of the apparent randomness in docker image digests. Ultimately what I want is to create a workflow where if a Dockerfile creates the same filesystem it has the same digest. Knowing how a digest is created and being able to manipulate it is the first step.\n","permalink":"https://maori.geek.nz/posts/2019/2019-07-11_how-to-digest-a-docker-image/","summary":"\u003cp\u003eDo you want to see something annoying? Create a \u003ccode\u003eDockerfile\u003c/code\u003e like this:\n\u003ccode\u003eFROM ubuntu:18.04@sha256:9b1...b3c````RUN echo \u0026quot;Hello World\u0026quot;\u003c/code\u003e\u003c/p\u003e\n\u003cp\u003eNow build it without cache twice:\n\u003ccode\u003e$\u0026gt; docker build --no-cache -q .   sha256:4e0...20b   $\u0026gt; docker build --no-cache -q .   sha256:e28...3fa\u003c/code\u003e\u003c/p\u003e\n\u003cp\u003eWhy are the SHA digests different? They should be exactly the same… right? The file system didn’t change, they are both built in the same environment.\u003c/p\u003e\n\u003cp\u003eTo solve this mystery lets take a look inside each of these images:\n\u003ccode\u003e$\u0026gt; docker save sha256:4e0...20b \u0026gt; a.tar   $\u0026gt; docker save sha256:e28...3fa \u0026gt; b.tar````$\u0026gt; mkdir a   $\u0026gt; mkdir b````$\u0026gt; tar -xzvf a.tar -C a   $\u0026gt; tar -xzvf b.tar -C b\u003c/code\u003e\u003c/p\u003e","title":"How to Digest a Docker Image"},{"content":"LearnLog 8:00am — CleanUp Last time I was able to get the basics of custom resources in Fenrir working. Today I want to finish:\nUploading a S3File with custom resource. Create Custom Resource S3ZipFile which will extract the ZIP file. Also clean up some of the code. Test out multi-account deploys with custom resources. 
Using the Bifrost standard to build deployers provides a framework for building deployers that can work across AWS accounts by simply creating an assumable role in an AWS account. This makes onboarding new accounts super easy. CloudFormation custom resources complicate this a bit because I want to keep this simple onboarding, so I don’t want to deploy a lambda.\n9:30am — Validations and Tests Cleaning up some validation code, fixed a really annoying issue where the coinbase-fenrir lambda was not logging because of missing IAM permissions. Writing tests as well.\n11:00am — Old Client Code After updating the validations, I have to read over some pretty old code and re-understand it. The Fenrir client code is not the prettiest as it is more an example implementation of how to interact with the Fenrir step function and is used largely as a means of sandbox testing.\nFenrir pulls the Lambda code zips out of its Docker container, to provide a hermetic(-ish) build environment, and then uploads them to S3. So I am just extending that for arbitrary files now.\n2:00pm — Boring Bits This is the boring and necessary bit of programming. Not learning anything, just making sure it is all well tested and will actually work with a variety of inputs.\nAfter getting the S3File resource working, it was actually not that much work to get the S3ZipFile resource to work as well. This resource will take an uploaded zip file and extract it to an S3 bucket with keys being prefixed by a value. To lower the amount of code, I am just using the S3File schema and validations for both. This might change later.\nThe cross account stuff will work from the Fenrir side. But I still have the question of how a CloudFormation stack in another account will be able to call a Lambda. It looks possible, but it may need some fun IAM execution policy. TBH I don’t know what permission I have given CloudFormation to call the lambda at the moment, but it works…\n5:30pm — Will have to change later Well shit. 
It is really annoying but I have a problem. I figured out how CloudFormation can call the Fenrir lambda: the assumed role I call CloudFormation with has lambda:* perms, which works if it is in the same account. In another account I would have to allow the assumed role to invoke the lambda. This is not a good idea as it allows the bastion account access to other bastion accounts via the Fenrir Lambda.\nI think I will have to go with the SNS solution from the AWS blog post.\nUsing an SNS topic will(?) provide a layer of strict validation before executing the Fenrir Lambda, which means no arbitrary input. The reason I don’t like this solution is that it feels like a Rube Goldberg machine, which AWS seems to encourage building.\nEnd Overall a productive day, but nothing much learned other than my own mistakes.\n","permalink":"https://maori.geek.nz/posts/2019/2019-06-21_cloudformation-s3file-and-s3zip-custom-resources/","summary":"\u003ch4 id=\"learnlog\"\u003eLearnLog\u003c/h4\u003e\n\u003ch4 id=\"800amcleanup\"\u003e8:00am — CleanUp\u003c/h4\u003e\n\u003cp\u003eLast time I was able to get the basics of custom resources in Fenrir working. Today I want to finish:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eUploading a S3File with custom resource.\u003c/li\u003e\n\u003cli\u003eCreate Custom Resource S3ZipFile which will extract the ZIP file.\u003c/li\u003e\n\u003cli\u003eAlso clean up some of the code.\u003c/li\u003e\n\u003cli\u003eTest out multi-account deploys with custom resources.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eUsing the Bifrost standard to build deployers provides a framework for building deployers that can work across AWS accounts by simply creating an assumable role in an AWS account. This makes onboarding new accounts super easy. 
CloudFormation custom resources complicate this a bit because I want to keep this simple onboarding, so I don’t want to deploy a lambda.\u003c/p\u003e","title":"CloudFormation S3File and S3Zip Custom Resources"},{"content":"Learnlog 9:50am — Fenrir CloudFormation I am taking today off from learning Bazel and instead I am looking at how to build a CloudFormation static site deployer for Fenrir.\nFenrir is an AWS SAM deployer that is basically sam deploy but in a Step Function. I want Fenrir to be able to deploy “full-stack” applications, including front end resources which can sit in S3 as static assets. This would make it much easier for Coinbase engineering to work with serverless, as currently they have to deploy front end code separately from the serverless APIs that power them.\nAs I understand it, CloudFormation does not allow you to upload S3 objects. So I will have to build a “custom resource” to upload and extract files to an S3 bucket.\nI have never built a custom CloudFormation resource before and I am anticipating a bunch of problems, mostly around static typing in both the [goformation](https://github.com/awslabs/goformation) library, and the JSON schema we use to validate Fenrir input.\nTo start with I am going to go have a look at what others have done and try to get a base knowledge of the moving parts.\n10:30am — Reading and Scheming Found a few useful blog posts\nhttps://developer.okta.com/blog/2018/07/31/use-aws-cloudformation-to-automate-static-site-deployment-with-s3 https://advancedweb.hu/2019/01/01/cf_s3_object/ Also the Golang Lambda SDK already has all the types and a useful function wrapper to make custom resources. It does look like this library tries to hide all the complicated CF logic.\nI am going to reduce my initial scope for today and not deal with file extraction, and just focus on getting a custom resource working. 
That will still probably be a full day\u0026rsquo;s work.\nI have started making changes to Step, our Step Function framework which Fenrir uses, and to Fenrir. I am still unsure of what changes goformation will need so I will start from the top down and wait to start getting errors back. Typical trial and error development.\n12:00pm — Fighting with Types Spent the last hour fighting types with Fenrir. I want the custom resource handler to be inside the Fenrir lambda. I like keeping related functionality together in Lambdas, rather than having lots of small Lambdas. I think that having lots of small lambdas is just trading off development costs for infra costs.\n12:30pm — GoLang Random Dict Order Bug Hit a super annoying bug involving an error that was not correctly being handled in my Fenrir test suite so I went down a rabbit hole.\nThe tests sometimes worked, which is infuriating. I eventually tracked it down to an error being swallowed rather than exposed, and because of Go\u0026rsquo;s random order when iterating over a map this error would fail tests only occasionally.\nOh well an hour is now gone, great time for food.\nLUNCH\n1:00pm — CloudFormation Custom Resources in Go Lambda\nThis is my new custom resource lambda handler, basically a copy/paste off the example in the README: func StaticSiteResources(awsc aws.Clients) cfn.CustomResourceLambdaFunction { return cfn.LambdaWrap(func(ctx context.Context, event cfn.Event) (physicalResourceID string, data map[string]interface{}, err error) { v, _ := event.ResourceProperties[\u0026quot;Echo\u0026quot;].(string) data = map[string]interface{}{ \u0026quot;Echo\u0026quot;: v, } return }) }\n1:30pm — Lambda Wrapping Had to refactor the LambdaWrap because it is impossible to stub the http client for tests\u0026hellip; how I miss Ruby sometimes.\nNow I am going to deploy this to my sandbox AWS account for testing.\nTo deploy I assume-role into my account then use the ./scripts/cf_bootstrap script to deploy. 
This builds Fenrir locally, uploads it to S3 and uses CloudFormation to update the resources.\nThen I created a simple test event { \u0026quot;ResponseURL\u0026quot;: \u0026quot;http://localhost\u0026quot;, \u0026quot;ResourceProperties\u0026quot;: {\u0026quot;Echo\u0026quot;: \u0026quot;asd\u0026quot;} }\nAnd directly invoked the function, which errored but logging was correct.\nNow comes the hard part, getting it working with Fenrir and goformation.\n3:00pm — goformation CustomResource Type goformation is hard because it pulls down the CloudFormation spec from AWS and then generates static Go structs for everything in it. Custom resources cannot be in the specification so there is no struct for them to be unmarshaled into.\nTo work around this I have injected into the unmarshalling a check to see if the type starts with Custom:: and then to select a generalizable CustomResource struct. This is pretty simple and might have problems later but for now it is working.\ngoformation also generates JSON schema, which I am ignoring. I will rely on the clients (Fenrir) to edit the schema they validate against. So Fenrir will need to build and inject the S3File schema into the larger AWS SAM schema.\n4:30pm — It Works, First Custom::Resource Deploy Custom CloudFormation resource working\nIt Worked!! Well it deploys without erroring. At the moment I have:\ngoformation marshalling and unmarshalling my custom resources. I will work with their maintainers to get my PR merged https://github.com/awslabs/goformation/pull/213 Step now allows for default handlers, so the lambda can be used outside of its Step Function https://github.com/coinbase/step/pull/43, e.g. for Custom CF resources. 
Fenrir now supports Custom::S3File type, though it doesn’t do anything yet https://github.com/coinbase/fenrir/pull/12 The schema for S3File is: var S3FileSchema = `{ \u0026quot;additionalProperties\u0026quot;: false, \u0026quot;properties\u0026quot;: { \u0026quot;Properties\u0026quot;: { \u0026quot;additionalProperties\u0026quot;: false, \u0026quot;properties\u0026quot;: { \u0026quot;Bucket\u0026quot;: { \u0026quot;type\u0026quot;: \u0026quot;string\u0026quot; }, \u0026quot;Key\u0026quot;: { \u0026quot;type\u0026quot;: \u0026quot;string\u0026quot; }, \u0026quot;Uri\u0026quot;: { \u0026quot;type\u0026quot;: \u0026quot;string\u0026quot; } }, \u0026quot;required\u0026quot;: [\u0026quot;Bucket\u0026quot;, \u0026quot;Key\u0026quot;, \u0026quot;Uri\u0026quot;], \u0026quot;type\u0026quot;: \u0026quot;object\u0026quot; }, \u0026quot;Type\u0026quot;: { \u0026quot;enum\u0026quot;: [ \u0026quot;Custom::S3File\u0026quot; ], \u0026quot;type\u0026quot;: \u0026quot;string\u0026quot; } }, \u0026quot;required\u0026quot;: [ \u0026quot;Type\u0026quot;, \u0026quot;Properties\u0026quot; ], \u0026quot;type\u0026quot;: \u0026quot;object\u0026quot; }\nAs you can see, I actually don’t allow ServiceToken which is required by CloudFormation. Fenrir replaces the ServiceToken with its own Lambda ARN: if res.Properties[\u0026quot;ServiceToken\u0026quot;] != nil { return resourceError(res, resourceName, \u0026quot;ServiceToken must be nil\u0026quot;) } res.Properties[\u0026quot;ServiceToken\u0026quot;] = lambdaArn\nThis means that the client doesn’t need to know where the Fenrir lambda is, so the template can be reused across accounts. 
It also makes S3File look more like other CloudFormation resources to developers.\nOne very ugly piece of code left is: `func JSONSchema() (string, error) {\n\u0026hellip;``` defs := newSchema[\u0026ldquo;definitions\u0026rdquo;].(map[string]interface{})\nprops := newSchema[\u0026ldquo;properties\u0026rdquo;].(map[string]interface{})\ndefs[\u0026ldquo;Custom::S3File\u0026rdquo;] = s3FileSchema```` res := props[\u0026ldquo;Resources\u0026rdquo;].(map[string]interface{})\npp := res[\u0026ldquo;patternProperties\u0026rdquo;].(map[string]interface{})\nreg := pp[\u0026quot;^[a-zA-Z0-9]+$\u0026quot;].(map[string]interface{})\nao := reg[\u0026ldquo;anyOf\u0026rdquo;].([]interface{})\nreg[\u0026ldquo;anyOf\u0026rdquo;] = append(ao, map[string]interface{}{\u0026quot;$ref\u0026quot;: \u0026ldquo;#/definitions/Custom::S3File\u0026rdquo;})\nnewSchemaStr, err := json.Marshal(newSchema)\nif err != nil {\nreturn \u0026ldquo;\u0026rdquo;, err\n}\nreturn string(newSchemaStr), nil\n}``\nThis code injects the S3File schema into the AWS SAM schema Fenrir uses to validate input. This is some pretty ugly golang. Exploring a map[string]interface{}becomes a ton of wrapping code, which in Ruby would be much cleaner.#### End\nThis has been a productive day. I started without ever having created a custom CloudFormation resource, and now I have created a custom S3File resource AND laid the ground work for allowing more custom resources in Fenrir.\nAlso, I am enjoying writing these learn logs (as long as I can show my work). 
It is a useful motivator and a good rubber duck.\n","permalink":"https://maori.geek.nz/posts/2019/2019-06-18_cloudformation-custom-resources-in-fenrir/","summary":"\u003ch4 id=\"learnlog\"\u003eLearnlog\u003c/h4\u003e\n\u003ch4 id=\"950amfenrir-cloudformation\"\u003e9:50am — Fenrir CloudFormation\u003c/h4\u003e\n\u003cp\u003eI am taking today off from learning Bazel and instead I am looking at how to build a CloudFormation static site deployer for Fenrir.\u003c/p\u003e\n\u003cp\u003eFenrir is an AWS SAM deployer that is basically \u003ccode\u003esam deploy\u003c/code\u003e but in a Step Function. I want Fenrir to be able to deploy “full-stack” applications, including front end resources which can sit in S3 as static assets. This would make it much easier for Coinbase engineering to work with serverless, as currently they have to deploy front end code separately from the serverless APIs that power them.\u003c/p\u003e","title":"CloudFormation Custom Resources in Fenrir"},{"content":"Learnlog 7:30am — What to do today In my previous learnlog I built a tree of docker images able to be changed and rebuilt.\nWhat I want to try to do today is write tests on the built containers so that if they don’t meet some constraints the build will fail and their dependents won’t be built.\nLet’s boot up the previous day’s project and start reading.\n8:30am — Yak Shaving Yak shaved by cleaning up some of the images by copying the build tree of the valid DockerHub images.\nNow we have the folders: src/containers/ubuntu-19.10 src/containers/buildpack-deps-19.10 src/containers/ruby-2.6.3\nwith dependencies looking like:\nAlso wrote a graph script that generates these ^ PNGs easily.\n9:00am — Layers of Tests There are three types of tests:\nTest for the rules you have written, e.g. in the bzl files. Test the source of a BUILD, e.g. unit tests. Test the generated output of the rule, e.g. testing the ruby version inside a docker container. 
How to write each kind of test is all mixed up together in the documentation, complicating the issue further.\nIt looks like there is a native.test_suite function for bzl tests. These are run with bazel test during the analysis phase. This is not what I am looking for because I need the docker containers to be built to test.\nIt seems that what I need is just a rule. To test the output I can just write a rule that depends on the outputted docker container.\n11:30am — Many Rules I ended up with a rule in docker.bzl: def _docker_exec_out_check_impl(ctx): name = ctx.attr.name image = ctx.attr.docker_image print(ctx.attr.docker_image.image_ref) check_output = ctx.outputs.check_output ctx.actions.run( executable = ctx.executable._docker_tool, inputs = [image.image_sha], arguments = [ check_output.path, \u0026quot;run\u0026quot;, \u0026quot;-t\u0026quot;, \u0026quot;--rm\u0026quot;, image_name + \u0026quot;:bazel\u0026quot;, \u0026quot;bash\u0026quot;, \u0026quot;-c\u0026quot;, \u0026quot;'\u0026quot; + ctx.attr.exec + \u0026quot;'\u0026quot;, ], outputs = [check_output], ) docker_exec_out_check = rule( implementation = _docker_exec_out_check_impl, attrs = { \u0026quot;docker_image\u0026quot;: attr.label( allow_single_file = True, mandatory = True, ), \u0026quot;exec\u0026quot;: attr.string(), \u0026quot;contains\u0026quot;: attr.string(), \u0026quot;_docker_tool\u0026quot;: attr.label( executable = True, cfg = \u0026quot;host\u0026quot;, allow_files = True, default = Label(\u0026quot;//rules:docker\u0026quot;), ), }, outputs = { \u0026quot;check_output\u0026quot;: \u0026quot;check_output\u0026quot; }, )\nI wanted to call the rule docker_exec_out_test but the _test suffix is restricted.\nThis is not great for a few reasons:\n\u0026quot;'\u0026quot; + ctx.attr.exec + \u0026quot;'\u0026quot; in the rule is auto-escaped so using bash -c is not practical. The rule is not in the path of its dependents, so if it fails it will not fail its dependent builds. 
12:44pm — More Complications Things are getting more complicated. Given this is a pretty simple thing to do, that usually is a signal I am doing it incorrectly.\nNow I have these rules: # docker_exec_out_check takes a build docker image and\ndef _docker_exec_out_check_impl(ctx):\nname = ctx.attr.name\nimage = ctx.attr.docker_image\nimage_name = image.image_name\ncheck_output = ctx.outputs.check_output\nmoar = ctx.actions.declare_file(name + \u0026quot;_check_output\u0026quot;)\nctx.actions.run(\nexecutable = ctx.executable._docker_tool,\ninputs = [image.image_sha],\narguments = [\nmoar.path,\n\u0026quot;run\u0026quot;,\n\u0026quot;-t\u0026quot;,\n\u0026quot;--rm\u0026quot;,\nimage_name + \u0026quot;:bazel\u0026quot;,\nctx.attr.exec,\n],\noutputs = [moar],\n)\nctx.actions.run( executable = ctx.executable._grep_tool, inputs = [moar], arguments = [ check_output.path, moar.path, \u0026#34;-e\u0026#34;, ctx.attr.contains ], outputs = [check_output], )\ndocker_exec_out_check = rule( implementation = _docker_exec_out_check_impl, attrs = { \u0026#34;docker_image\u0026#34;: attr.label( allow_single_file = True, mandatory = True, ), \u0026#34;exec\u0026#34;: attr.string(), \u0026#34;contains\u0026#34;: attr.string(), \u0026#34;_docker_tool\u0026#34;: attr.label( executable = True, cfg = \u0026#34;host\u0026#34;, allow_files = True, default = Label(\u0026#34;//rules:docker\u0026#34;), ), \u0026#34;_grep_tool\u0026#34;: attr.label( executable = True, cfg = \u0026#34;host\u0026#34;, allow_files = True, default = Label(\u0026#34;//rules:grep\u0026#34;), ), }, outputs = { \u0026#34;check_output\u0026#34;: \u0026#34;check_output\u0026#34; }, )\n`_grep_tool` is a wrapper around grep, so once again I can output the standard out to a file. This is not even 100% working, and is very fragile. I am thinking that relying on bash scripts to do the heavy lifting of validating and building the container is better. 
That is, I will leave Bazel as the glue and push the implementation down into bash more. This will also fix the dependent build problem.#### LUNCH#### 2:30 — One Rule to Rule them all The deferred execution model of Bazel is really nice for dependency management as it lets you define the rules without actually executing them. But this can be very confusing when you have simple steps. So, putting all the logic for building and testing a docker container is much easier in a single script. `docker.bash` is now: ``#!/bin/bash````set -e````NAME=$1 DOCKER_FILE=$2 DOCKER_FOLDER=$(dirname $DOCKER_FILE) OUTPUT_FILE=$3 TEST_COMMAND=$4 TEST_VALUE=$5````DOCKER_BUILD_OUT=$(docker build -q -t $NAME:bazel -f $DOCKER_FILE $DOCKER_FOLDER)````if [ -n \u0026#34;$TEST_COMMAND\u0026#34; ]; then DOCKER_TEST=$(docker run -t --rm $NAME:bazel bash -c \u0026#34;$TEST_COMMAND\u0026#34;) echo $DOCKER_TEST | grep -e \u0026#34;$TEST_VALUE\u0026#34; fi````echo $NAME@$DOCKER_BUILD_OUT \u0026gt; $OUTPUT_FILE`` With the `docker_build` rule: ``def _docker_build(ctx): froms = [f.image_digest for f in ctx.attr.froms]```` ctx.actions.run( executable = ctx.executable._docker_tool, inputs = ctx.files.dockerfile + froms, arguments = [ ctx.attr.name, ctx.file.dockerfile.path, ctx.outputs.image_digest.path, ctx.attr.test_command, ctx.attr.test_value, ], outputs = [ctx.outputs.image_digest], )```` return struct( image_digest = ctx.outputs.image_digest, )````docker_build = rule( implementation = _docker_build, attrs = { \u0026#34;dockerfile\u0026#34;: attr.label( allow_single_file = True, mandatory = True, ), \u0026#34;froms\u0026#34;: attr.label_list(), \u0026#34;test_command\u0026#34;: attr.string(), \u0026#34;test_value\u0026#34;: attr.string(), \u0026#34;_docker_tool\u0026#34;: attr.label( executable = True, cfg = \u0026#34;host\u0026#34;, allow_files = True, default = Label(\u0026#34;//rules:docker\u0026#34;), ), }, outputs = { \u0026#34;image_digest\u0026#34;: \u0026#34;image_digest\u0026#34; }, )`` 
This is SOOOOO much simpler than before and easier to use and debug. Now I want to try something really hard, build a ruby app with gems and put it in an image. I have created these files: ``/src/apps/test/BUILD /src/apps/test/server.rb /src/apps/test/Gemfile`` #### 4:30 — Distraction Spent some of my time reading the `[rules_ruby](https://github.com/hvardhanx/bazel-ruby) `then got distracted by other work. Todays code is available here [https://github.com/grahamjenson/bazel-docker-tree](https://github.com/grahamjenson/bazel-docker-tree) Again, not sure if this is a useful format. But like taking notes in class, it makes it easier to remember where I am at and where I want to go. Will probably continue with this. ","permalink":"https://maori.geek.nz/posts/2019/2019-06-17_bazel-docker-tests/","summary":"\u003ch4 id=\"learnlog\"\u003eLearnlog\u003c/h4\u003e\n\u003ch4 id=\"730amwhat-to-do-today\"\u003e7:30am — What to do today\u003c/h4\u003e\n\u003cp\u003eIn my previous \u003ca href=\"https://maori.geek.nz/bazel-docker-dependency-tree-learnlog-2e53bb3ece6c\"\u003elearnlog\u003c/a\u003e I built a tree of docker images able to be changed and rebuilt.\u003c/p\u003e\n\u003cp\u003eWhat I want to try do today is write a tests on the built containers so that if they don’t meet some constraints the build will fail and their dependents won’t be built.\u003c/p\u003e\n\u003cp\u003eLet’s boot up the previous days project and start reading.\u003c/p\u003e\n\u003ch4 id=\"830amyak-shaving\"\u003e8:30am — Yak Shaving\u003c/h4\u003e\n\u003cp\u003eYak shaved by cleaning up some of the images by copying the build tree of the valid DockerHub images.\u003c/p\u003e","title":"Bazel Docker Tests"},{"content":"Learnlog 9:45 am — What I want What I want to do is use Google’s build tool Bazel to build a tree of Dockerfiles that are FROM each other and use Bazels dependency engine to automatically rebuild docker containers if a container they depend on changes. 
This is currently a manual job at many companies, and as they grow this becomes more tedious and in need of automating.\nI already have Bazel installed from a previous failed spike using https://github.com/bazelbuild/rules_docker which are a bit overcomplicated and difficult to use for someone who is just learning (like me).\nNow it is time to read some docs.\n9:55 am—So many docs I have created an empty WORKSPACE and am trying to read the “getting started” docs. There are none (or few) for someone who is trying to create a new rule, this probably means either it is too complicated to get started with, or too easy to write docs for.\nBazel has lots of docs for everything though, but it does include some intimidating lines like\n“Before creating or modifying any rule, make sure you are familiar with the evaluation model. You must understand the three phases of execution and the differences between macros and rules.”\nTelling me that before I do anything, I better learn it all first. No, I learn by doing!\n10:30am — Marcos or Rules Macros are super simple rules that are immediately executed. Rules are loaded, analyzed, then executed only if necessary. So I want rules.\nI have set up a folder structure: WORKSPACE # emtpy rules/docker.bzl rules/BUILD # empty src/containers/ubuntu/Dockerfile # FROM ubuntu src/containers/ubuntu/BUILD rules/docker.bzl is: print(“here”) # want to see when loaded``def _docker_build(ctx): print(“Building “ + ctx.attr.name) # TODO``docker_build = rule( implementation = _docker_build, )\nsrc/containers/ubuntu/BUILD is load(“//rules:docker.bzl”, “docker_build”)``docker_build(name = “ubuntu”)\nRunning bazel build //src/containers/ubuntu prints the debug lines which is a good first step.\nNow it is time to see if I can get it to call docker build .\n11:30am — Bazel’s Weird attr Schema So I have some more stuff working. 
I am getting to know Bazel’s weird schema for attrs.\nI am missing something… although I call ctx.actions.run to execute docker and pass in a Dockerfile. When I run bazel build it doesn’t run.\ndocker.bzl is now: print(\u0026quot;here\u0026quot;)``def _docker_build(ctx): name = ctx.attr.name dockerfile = ctx.attr.dockerfile.files.to_list()[0].path print(\u0026quot;Building \u0026quot; + name) print(\u0026quot;Dockerfile \u0026quot; + dockerfile) a = ctx.actions.declare_file(name + “.dockerout”) ctx.actions.run( executable = \u0026quot;docker\u0026quot;, inputs = ctx.attr.dockerfile.files, arguments = [“build”, dockerfile], outputs = [a], ) print(a) return struct( docker_sha = “ubusntu”, )``docker_build = rule( implementation = _docker_build, attrs = { \u0026quot;dockerfile\u0026quot;: attr.label(allow_files = True), \u0026quot;deps\u0026quot;: attr.label_list(allow_files = True), }, )\nwith ubuntu/BUILD being: load(“//rules:docker.bzl”, “docker_build”)``docker_build( name = “ubuntu”, dockerfile = “Dockerfile”, deps = [“Dockerfile”] )\nOI am hoping to at least get an error back from trying to call docker build before lunch. I am missing something, like having to register the file explicitly as a dependency to make Bazel realize I want to build it.\nI am pretty sure I just need to understand what this means:\n“[The implementation] function does not run any external commands. Rather, it registers actions that will be used later during the execution phase”\n12:00pm — Outputs I needed to explicitly declare the outputs of the rule, otherwise why would it run 🤦‍♂\nAlso found this file dockerfile_build.bzl, which is pretty much exactly what I want, except it is a repositroy_rule (only usable in WORKSPACE). The goal at the moment is not to be 100% hermetic but to learn and build towards that. 
So cutting corners is fine for today.\nNow when I run bazel build //src/containers/ubuntu:ubuntu I get the wonderful error docker failed: error executing command docker build src/containers/ubuntu/Dockerfile\nNow Lunch\n1:00pm — Toolchains? After a Greek wrap at the Brighton street market, and reading a bit more from the docker_build.bzl file I think I need a toolchain.\n1:45pm — Don’t need a toolchain! Well that was 45 mins of wasted time.\nThe problem was that it was trying to execute docker build src/containers/ubuntu/Dockerfile, which doesn’t work because docker build expects a folder arg, not a Dockerfile 🤦‍♂. Using the folder instead, docker is now building. Just waiting for the docker build to finish.\n2:00 pm— Reading docs The most useful documentation page while writing rules is https://docs.bazel.build/versions/master/skylark/lib/ctx.html\n3:00pm — Understanding more So I have a bunch more stuff working now. I decided to use a simple script to execute docker instead of calling the docker command directly. This is because there is no (easy?) 
way to output the STDOUT from docker using ctx.actions.run to a file.\nI define the script using sh_binary in rules/BUILD : package(default_visibility = [“//visibility:public”])``sh_binary( name = “docker”, srcs = [“docker.bash”], )\nrules/docker.bash is: #!/bin/bash output=$1 shift echo “docker $@” docker $@ \u0026gt; $output\ndocker.bzl is now def _docker_build(ctx): name = ctx.attr.name folder = ctx.file.folder.path``ctx.actions.run( executable = ctx.executable._docker_tool, inputs = ctx.attr.folder.files, arguments = [ctx.outputs.dockerout.path, “build”, “-q”, “-t”, name, folder], outputs = [ctx.outputs.dockerout], )``return struct( docker_sha = “ubuntu”, )``docker_build = rule( implementation = _docker_build, attrs = { “folder”: attr.label( allow_single_file = True, mandatory = True, ), “_docker_tool”: attr.label( executable = True, cfg = “host”, allow_files = True, default = Label(“//rules:docker”), ), }, outputs = { “dockerout”: “%{name}.dockerout” }, )\nand ubuntu BUILD file is : load(“//rules:docker.bzl”, “docker_build”) docker_build(name = “ubuntu”, folder = “.”)\nI am having a bit of trouble with the location that docker build is being executed from. I cannot find a method that easily returns the folder of the build directory, so am passing in the folder manually for now.\nNow I am trying to make it rebuild when I change the Dockerfile\n3:30pm — Bazel Magic So Bazel is magic. 
I am still not sure what it is (or I am) doing, but I got the ubuntu image rebuilding when its Dockerfile is changed.\ndocker.bzl looks like this: def _docker_build(ctx): name = ctx.attr.name folder = ctx.file.dockerfile.dirname print([f.path for f in ctx.attr.dockerfile.files])``ctx.actions.run( executable = ctx.executable._docker_tool, inputs = ctx.files.dockerfile, arguments = [ ctx.outputs.imagesha.path, “build”, “-q”, “-t”, name, “-f”, ctx.file.dockerfile.path, folder ], outputs = [ctx.outputs.imagesha], )``return struct( image_sha = ctx.file.dockerfile, )``docker_build = rule( implementation = _docker_build, attrs = { “dockerfile”: attr.label( allow_single_file = True, mandatory = True, ), “_docker_tool”: attr.label( executable = True, cfg = “host”, allow_files = True, default = Label(“//rules:docker”), ), }, outputs = { “imagesha”: “sha” }, )\nwith the ubuntu BUILD looking as docker_build(name = “ubuntu”, dockerfile = “Dockerfile”)\nThe next challenge is the hardest part. I want to make a ruby-2.6 image FROM the ubuntu image we are creating that will rebuild if its base ubuntu changes. 
The Docker tree part.\n4:45 pm— Friday Drinks I created a few more files src/containers/ruby-2.6/Dockerfile that builds a ruby container FROM ubuntu:bazel and src/containers/ruby-2.6/BUILD: load(“//rules:docker.bzl”, “docker_build”)``docker_build( name = “ruby-2.6”, dockerfile = “Dockerfile”, froms = [“//src/containers/ubuntu”], )\nThe docker_build rule is now: docker_build = rule( implementation = _docker_build, attrs = { “dockerfile”: attr.label( allow_single_file = True, mandatory = True, ), **“froms”: attr.label_list(),** “_docker_tool”: attr.label( executable = True, cfg = “host”, allow_files = True, default = Label(“//rules:docker”), ), }, outputs = { “imagesha”: “sha” }, )\nThis has added froms which is a list of other Bazel docker builds that are bases for this container.\nI was hoping that just having the reference to its base ubuntu container would mean that when I changed the ubuntu Dockerfile it would rebuild the ruby container. This is not working, I am blocked. Now I am leaving to enjoy some Friday drinks.\n9:45pm — I had an idea Just putting a reference to a rule doesn’t automatically rebuild it. Maybe if you take the output of the rule and put it as input to the next rule that would work.\nThis worked! Here is the new _docker_build function: def _docker_build(ctx): name = ctx.attr.name folder = ctx.file.dockerfile.dirname`` froms = [f.image_sha for f in ctx.attr.froms]`` ctx.actions.run( executable = ctx.executable._docker_tool, inputs = ctx.files.dockerfile + froms, arguments = [ ctx.outputs.imagesha.path, “build”, “-q”, “-t”, name + “:bazel”, “-f”, ctx.file.dockerfile.path, folder ], outputs = [ctx.outputs.imagesha], )`` return struct( image_name = name, image_sha = ctx.outputs.imagesha, )\nThe change here is that I output the image_sha file and use that as input to its child containers docker build call (even though it is not used). 
Bazel then must detect a change in that file and then rebuild the container.\nThis has the nice benefit of only rebuilding dependent images when the image they depend on has actually been rebuilt. For example, adding a comment to the ubuntu Dockerfile will not change the sha, so the images built on top of it are not rebuilt.\nThis is the last feature necessary to build a tree of docker images, so this was a successful spike into Bazel.\n10:30pm — Pretty Graph As a final little project I wanted to display the dependency graph with: bazel query --noimplicit_deps 'deps(//src/containers/ruby-2.6)' --output graph \u0026gt; graph.in\nThis outputted a dot file which after some styling and dot -Tpng -Gdpi=300 -o graph.png graph.in resulted in:\nEnd The code is available at https://github.com/grahamjenson/bazel-docker-tree\nI am not sure this was a helpful blog, I just wanted to try something new and document my learning experience with Bazel. I think it has worked though I am not sure how useful it would be.\nI would like to spend more time with Bazel, trying to get pushing/caching working and integrating more with the existing rules. That is enough for today though :)\n","permalink":"https://maori.geek.nz/posts/2019/2019-06-15_bazel-docker-dependency-tree-learnlog/","summary":"\u003ch4 id=\"learnlog\"\u003eLearnlog\u003c/h4\u003e\n\u003ch4 id=\"945-amwhat-i-want\"\u003e9:45 am — What I want\u003c/h4\u003e\n\u003cp\u003eWhat I want to do is use Google’s build tool \u003ca href=\"https://bazel.build/\"\u003eBazel\u003c/a\u003e to build a tree of Dockerfiles that are \u003ccode\u003eFROM\u003c/code\u003e each other and use Bazels dependency engine to automatically rebuild docker containers if a container they depend on changes. 
This is currently a manual job at many companies, and as they grow this becomes more tedious and in need of automating.\u003c/p\u003e\n\u003cp\u003eI already have Bazel installed from a previous failed spike using \u003ca href=\"https://github.com/bazelbuild/rules_docker\"\u003ehttps://github.com/bazelbuild/rules_docker\u003c/a\u003e which are a bit overcomplicated and difficult to use for someone who is just learning (like me).\u003c/p\u003e","title":"Bazel Docker Dependency Tree: LearnLog"},{"content":"Serverless, specifically AWS Lambda, is awesome. It scales from 0 to near infinity, it costs next to nothing, and it integrates with almost everything. The trouble starts when going from one engineer deploying applications into one account, to lots of engineers deploying into many shared accounts. It’s hard to make sure applications follow the same good naming and security practices to stop everyone from stepping on each other’s toes.\nProviding a secure and pleasant experience for thousands of developers building and deploying hundreds of serverless applications to dozens of AWS accounts is the goal. To that end we developed and open sourced Fenrir, our AWS SAM deployer. This post is about how we use Fenrir to deploy serverless in a large organization.\nWhat the Framework (SAM, serverless…) Doesn’t Do Serverless frameworks typically include a CLI that can create/update AWS resources and deploy code. For example, both serverless deploy and sam deploy use AWS Cloud Formation (CF) to release code. These deploy commands are useful when getting started, and can easily be put into a CI/CD pipeline to accelerate application release.\nWhen more engineers start deploying serverless applications it is a good idea to ensure they:\nUse consistent naming: good naming (and tagging) of resources, like Lambda and API Gateway, will keep accounts clean and make obvious which resources belong to which projects. Follow recommended security practices: e.g. 
practice “least privilege” by giving Lambdas separate security groups and IAM roles. Create a reliable workflow: cleanly handle failure in a way that shows developers what happened, why it happened, and how to remedy. Record what is deployed: quickly answering what is currently deployed allows engineers to debug and understand the current state of the world. Our solution was to build a centralized deployer. This deployer provides clear boundaries to developers working in the same AWS account and blocks deployment unless common practices are followed. This removes the cognitive overhead of a lot of details and allow engineers to focus on their application code.\nFenrir Serverless Serverless Deployer Fenrir\nFenrir is our AWS SAM deployer; at its core is a reimplementation of the sam deploy command as an AWS Step Function, so it’s a serverless serverless (_serverless_²) deployer. sam deploy is an alias for a python script with two steps aws create-change-set and aws cloudformation execute-change-set.\nFenrir’s state machine replicates these steps with explicit state transitions, retries, and error handling:\nThe input to this state machine is a SAM template with some additional data like ProjectName, ConfigName and the AWS account to deploy to. The Fenrir state machine then performs the following steps:\nValidate: fills in defaults then validates the template is correct and all referenced resources are allowed to be used. Lock: creates a lock to make sure that only one deploy per project can go out at a time. CreateChangeSet and wait to Execute: create a change-set for a CF stack. Waits for the change-set to be validated and become available. ExecuteChangeSet and wait for Success: waits for the execution to finish. 
This state machine finishes in either a Success state, a FailureClean state where the release was unsuccessful but cleanup was successful, or a FailureDirty state that should never happen and will alert the team.\nFenrir (like our other open source deployer Odin) follows the Bifrost standard for building deployers at Coinbase. Bifrost adds multi-account support, security by default, visibility into deploys, and simple integration into our existing tools.\nWhat Fenrir Doesn’t Do Fenrir only supports a subset of AWS SAM. Limiting the template scope reduces the surface area for possible naming conflicts and security risks.\nThe supported resources are AWS::Serverless::Function, AWS::Serverless::Api, AWS::Serverless::LayerVersion, AWS::Serverless::SimpleTable. Each of these has limitations, for example the AWS::Serverless::Function resource’s limitations are:\nFunctionName is generated and cannot be defined. Role and VPCConfig.SecurityGroupIds if defined must refer to resources that have correct tags*. VPCConfig.SubnetIds must have the DeployWithFenrir tag equal to true. Events supported Types are:\nApi: It must have RestApiId that is a reference to a local API resource S3: Bucket must have correct tags* Kinesis: Stream must have correct tags* DynamoDB: Stream must have correct tags* SQS: Queue must have correct tags* Schedule CloudWatchEvent *: correct tags means ProjectName, ConfigName tags are correct.\nSNS is not on the list of supported events. As of writing, SNS does not support tags, making it difficult to validate that a Lambda is allowed to listen to an SNS topic. Finding ways to support such events and resources securely is a future goal of Fenrir.\nHello Fenrir A simple SAM template that works with Fenrir includes ProjectName and ConfigName, e.g. 
template.yml would look like: ProjectName: “coinbase/deploy-test” ConfigName: “development”``AWSTemplateFormatVersion: “2010–09–09” Transform: AWS::Serverless-2016–10–31 Resources: helloAPI: Type: AWS::Serverless::Api Properties: StageName: dev EndpointConfiguration: REGIONAL hello: Type: AWS::Serverless::Function Properties: CodeUri: . Role: lambda-role Handler: hello.lambda Runtime: go1.x Events: hi: Type: Api Properties: RestApiId: !Ref helloAPI Path: /hello Method: GET\nThe hello lambda code: package main import “github.com/aws/aws-lambda-go/lambda”``func main() { lambda.Start(func(_ interface{}) (interface{}, error) { return map[string]string{“body”: “Hello”}, nil }) }\nFenrir uses Docker to build and bundle code sent to AWS. The hello function requires /hello.zip to exist in the built docker container, e.g. the Dockerfile: FROM golang WORKDIR / RUN apt-get update \u0026amp;amp;\u0026amp;amp; apt-get install -y zip COPY . . RUN go get github.com/aws/aws-lambda-go/lambda RUN GOOS=linux GOARCH=amd64 go build -o hello.lambda . RUN zip hello.zip hello.lambda\nTo package and deploy the template using the Step Function you run fenrir package \u0026amp;amp;\u0026amp;amp; fenrir deploy:\npackage builds the Docker image then extracts the zip files deploy uploads the zip files and sends the template as input to the Fenrir Step Function Implementation Fenrir is implemented primarily using:\naws-sdk-go to interact with CloudFormation and other AWS resources step as the framework to build, test and deploy AWS Step Functions (Why Coinbase uses Step Functions) goformation to encode/decode CloudFormation and SAM resources as golang structs and validate them using JSON schema. goformation uses the AWS CloudFormation Resource Specification and SAM specification to generate code and JSON schema. Fenrir then uses these to encode, decode, modify and validate templates. 
This code generation makes it very easy for Fenrir to keep up to date with changes in SAM and release features quickly.\nFuture It’s hard to build tools that are scalable, secure, and easy to use. Fenrir gives our developers cutting edge tools with clear boundaries on how to use them. This is a huge win, but there is still lots of room for improvement by supporting more SAM resources, events and properties.\nSAM/Fenrir can’t deploy static websites to S3 behind CloudFront as CloudFormation doesn’t support uploading S3 Objects. A future Fenrir feature is to provide a custom CloudFormation resource that can upload files to S3 for static website hosting. This would make Fenrir a full-stack serverless² deployer.\nFinally, Fenrir is still in beta and we welcome any contributions or feature requests over on our GitHub repository.\nGood Reads AWS Lambda — how best to manage shared code and shared infrastructure How to set up multi-account AWS SAM deployments with GitLab Implementing safe AWS Lambda deployments with AWS CodeDeploy AWS Lambda — should you have few monolithic functions or many single-purposed functions? If you’re interested in helping us build a modern, scalable platform for the future of crypto markets, we’re hiring Infrastructure Engineers! This website may contain links to third-party websites or other content for information purposes only (“Third-Party Sites”). The Third-Party Sites are not under the control of Coinbase, Inc., and its affiliates (“Coinbase”), and Coinbase is not responsible for the content of any Third-Party Site, including without limitation any link contained in a Third-Party Site, or any changes or updates to a Third-Party Site. Coinbase is not responsible for webcasting or any other form of transmission received from any Third-Party Site. 
Coinbase is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement, approval or recommendation by Coinbase of the site or any association with its operators.\nUnless otherwise noted, all images provided herein are by Coinbase.\n","permalink":"https://maori.geek.nz/posts/2019/2019-04-11_introducing-fenrir-how-coinbase-is-scaling-serverless-applications/","summary":"\u003cp\u003eServerless, specifically AWS Lambda, is awesome. It scales from 0 to near infinity, it costs next to nothing, and it integrates with almost everything. The trouble starts when going from one engineer deploying applications into one account, to lots of engineers deploying into many shared accounts. It’s hard to make sure applications follow the same good naming and security practices to stop everyone from stepping on each other’s toes.\u003c/p\u003e\n\u003cp\u003eProviding a secure and pleasant experience for thousands of developers building and deploying hundreds of serverless applications to dozens of AWS accounts is the goal. To that end we developed and open sourced \u003ca href=\"https://github.com/coinbase/fenrir\"\u003eFenrir\u003c/a\u003e, our \u003ca href=\"https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html\"\u003eAWS SAM deployer\u003c/a\u003e. This post is about how we use Fenrir to deploy serverless in a large organization.\u003c/p\u003e","title":"Introducing Fenrir: How Coinbase is Scaling Serverless Applications"},{"content":"\nI wanted to parse an AWS SAM template file using Ruby. This format uses [intrinsic](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/intrinsic-function-reference.html) functions which can be YAML custom tags, e.g. [!Ref](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/intrinsic-function-reference-ref.html) to reference a parameter or resource.\nHow do I parse a YAML custom tag in ruby? 
There are two options YAML.add_domain_type or YAML.add_tag.\nDomain types are **** easier: require \u0026quot;yaml\u0026quot;``YAML.add_domain_type(\u0026quot;\u0026quot;, \u0026quot;Ref\u0026quot;) do |type, value| value.upcase end``out = YAML.safe_load(\u0026quot;test: !Ref value\u0026quot;)``puts out # {“test”=\u0026gt;”VALUE”} puts out.to_yaml # test: VALUE\nThe domain type tag replaces any !Ref tag with the return value of the given block. This can simply return a string or a more complex object.\nTags are a bit more complicated: `require \u0026ldquo;yaml\u0026rdquo;\nrequire \u0026ldquo;json\u0026rdquo;class Ref def init_with(coder) [@value](http://twitter.com/value) = coder.scalar end def to_s\n@value.upcase\nend\nendYAML.add_tag(\u0026quot;!Ref\u0026quot;, Ref)out = YAML.safe_load(\u0026ldquo;test: !Ref value\u0026rdquo;, [Ref])``puts out # {“test”=\u0026gt;#\u0026lt;Ref:0x00 @value=”value”\u0026gt;}\nputs out.to_yaml\ntest: !Ref value: value``puts out.to_json # {“test”:”VALUE”}` YAML.add_tag replaces !Ref with an instance of the tag class instantiated with init_with(coder) where coder is a Psych::Coder containing the typed value of the tag (in this case a scalar string).\nYou should always safely parse any user supplied YAML document with safe_load. 
To allow the YAML parser to instantiate the custom tag classes you have to whitelist them as the second argument to safe_load.\nFinish For simple implementation use add_domain_type, for more complicated solutions use add_tag.\nIf there is anything I missed please let me know :)\nPS If using the safe_yaml gem, you have to parse using YAML.safe_load(\u0026quot;test: !Ref value\u0026quot;, nil, {whitelisted_tags: [\u0026quot;!Ref\u0026quot;]})\n","permalink":"https://maori.geek.nz/posts/2019/2019-02-06_yaml-custom-tags-in-ruby/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2019/2019-02-06_yaml-custom-tags-in-ruby/images/1.jpeg#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003cp\u003eI wanted to parse an \u003ca href=\"https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md\"\u003eAWS SAM\u003c/a\u003e template file using Ruby. This format uses \u003ccode\u003e[intrinsic](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/intrinsic-function-reference.html)\u003c/code\u003e functions which can be YAML \u003ca href=\"http://blogs.perl.org/users/tinita/2018/01/introduction-to-yaml-schemas-and-tags.html\"\u003ecustom tags\u003c/a\u003e, e.g. \u003ccode\u003e[!Ref](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/intrinsic-function-reference-ref.html)\u003c/code\u003e to reference a parameter or resource.\u003c/p\u003e\n\u003cp\u003eHow do I parse a YAML custom tag in ruby? 
There are two options \u003ccode\u003eYAML.add_domain_type\u003c/code\u003e or \u003ccode\u003eYAML.add_tag\u003c/code\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDomain types\u003c/strong\u003e are **** easier:\n\u003ccode\u003erequire \u0026quot;yaml\u0026quot;``YAML.add_domain_type(\u0026quot;\u0026quot;, \u0026quot;Ref\u0026quot;) do |type, value|   value.upcase   end``out = YAML.safe_load(\u0026quot;test: !Ref value\u0026quot;)``puts out # {“test”=\u0026gt;”VALUE”}   puts out.to_yaml # test: VALUE\u003c/code\u003e\u003c/p\u003e","title":"YAML Custom Tags in Ruby"},{"content":"\nPre-Req: Must have Docker for Mac installed\nI wanted to play with AWS SAM to understand how it works and compare to other tech like serverless.\nInstall and Test SAM Use brew to install the aws-sam-cli: ``brew upgrade\nbrew update\nbrew tap aws/tap\nbrew install aws-sam-cli\nsam \u0026ndash;version\nSAM CLI, version 0.10.0``\nNow init a go SAM project: cd $GOPATH/src/github.com/grahamjenson/ sam init --runtime go --name hello-sam cd hello-sam\nThis creates a folder with: hello-sam/ ├── Makefile \u0026lt;-- Make to automate build ├── README.md \u0026lt;-- This instructions file ├── hello-world \u0026lt;-- Source code for a lambda │ ├── main.go \u0026lt;-- Lambda function code │ └── main_test.go \u0026lt;-- Unit tests └── template.yaml\nInstall the aws-lambda-go dependency and test: ``go get github.com/aws/aws-lambda-go\ngo test ./\u0026hellip;\nok github.com/grahamjenson/hello-sam/hello-world 1.041s``\nLocal Server Start a HTTP server: GOOS=linux GOARCH=amd64 go build -o hello-world/hello-world ./hello-world sam local start-api\nIn another terminal: ``curl http://127.0.0.1:3000/hello\nHello, 192.168.10.10``\nThis will pull down a docker container to execute the lambda in. 
This can be very slow.\nBuild \u0026amp; Deploy SAM build and deploy are basically wrappers around the CloudFormation package and deploy commands.\npackage uploads relative assets from template.yaml to S3 and outputs a new file, packaged.yaml, with those assets replaced by S3 references.\nFor this we need an S3 bucket: aws s3 mb s3://hello-sam\nNow we can use the CLI: sam package \\ --template-file template.yaml \\ --output-template-file packaged.yaml \\ --s3-bucket hello-sam\nSpecifically, sam package uploads hello-world/ to S3 and replaces its CodeUri parameter with an S3 URI in packaged.yaml.\nNow we can deploy: sam deploy \\ --template-file packaged.yaml \\ --stack-name hello-sam \\ --capabilities CAPABILITY_IAM\ndeploy wraps create-stack and update-stack and waits for them to complete.\nThis does not give you a chance to review what will be deployed. If you use the --no-execute-changeset argument then you get a change-set ARN to use with describe-change-set and execute-change-set.\nFirst impressions I do not like CloudFormation because of its limitations (especially when compared to terraform), but it is great for this use-case. Not having to define all the finicky little resources required for an API gateway in terraform means much less work.\nThe SAM CLI being used to create a local HTTP server for an API gateway is a bit clunky but works well enough. 
I hope they add more serverless resources like Step Functions to increase what can be tested locally.\nI am now looking at past projects I should migrate over to SAM (pwnbot being the first) and future projects where I can use more advanced features like canary deploys, nested applications and web-sockets.\n","permalink":"https://maori.geek.nz/posts/2018/2018-12-24_hello-sam-aws-golang-quickstart/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2018/2018-12-24_hello-sam-aws-golang-quickstart/images/1.png#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003cp\u003e\u003cem\u003ePre-Req: Must have\u003c/em\u003e \u003ca href=\"https://docs.docker.com/docker-for-mac/install/\"\u003e\u003cem\u003eDocker for Mac installed\u003c/em\u003e\u003c/a\u003e\u003c/p\u003e\n\u003cp\u003eI wanted to play with AWS SAM to understand how it works and compare to other tech like \u003ca href=\"https://serverless.com/\"\u003eserverless\u003c/a\u003e.\u003c/p\u003e\n\u003ch4 id=\"install-and-test-sam\"\u003eInstall and Test SAM\u003c/h4\u003e\n\u003cp\u003eUse \u003ccode\u003ebrew\u003c/code\u003e to install the \u003ccode\u003eaws-sam-cli\u003c/code\u003e:\n``brew upgrade\u003cbr\u003e\nbrew update\u003cbr\u003e\nbrew tap aws/tap\u003cbr\u003e\nbrew install aws-sam-cli\u003cbr\u003e\nsam \u0026ndash;version\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003eSAM CLI, version 0.10.0``\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cp\u003eNow init a \u003ca href=\"https://golang.org/\"\u003ego\u003c/a\u003e SAM project:\n\u003ccode\u003ecd $GOPATH/src/github.com/grahamjenson/   sam init --runtime go --name hello-sam    cd hello-sam\u003c/code\u003e\u003c/p\u003e\n\u003cp\u003eThis creates a folder with:\n\u003ccode\u003ehello-sam/   ├── Makefile                    \u0026lt;-- Make to automate build   ├── README.md                   \u0026lt;-- This instructions file   ├── hello-world                 \u0026lt;-- Source code for a lambda   │   ├── main.go              
   \u0026lt;-- Lambda function code   │   └── main_test.go            \u0026lt;-- Unit tests   └── template.yaml\u003c/code\u003e\u003c/p\u003e","title":"Hello SAM: AWS Golang Quickstart"},{"content":"AWS Step Functions are hosted state-machines defined according to the Amazon States Language. To execute a Step function you send it JSON data which is given to an initial state to process then pass the output to another state. States are processed until a success or failure state is reached.\nHow a state processes its input and selects the next state depends on its Type. For example, a Task state can use a Lambda function to process the input, and a Choice state can select which state to go to next based on its input.\nStep functions are awesome because they:\nExplicitly define the order of execution, including all conditional paths, in a simple to understand model. Perform common tasks, like calling Lambda functions, removing a ton of boilerplate code. Handle errors and retrying in response to failure increasing reliability without sacrificing understandability. 
Here is a small example where a state-machine calls out to a Lambda function and makes a choice based on its output: { \u0026quot;StartAt\u0026quot;: \u0026quot;CallLambda\u0026quot;, \u0026quot;States\u0026quot;: { \u0026quot;CallLambda\u0026quot;: { \u0026quot;Type\u0026quot;: \u0026quot;Task\u0026quot;, \u0026quot;Resource\u0026quot;: \u0026quot;\u0026lt;lambda_arn\u0026gt;\u0026quot;, \u0026quot;Next\u0026quot;: \u0026quot;Worked?\u0026quot;, \u0026quot;Retry\u0026quot;: [{ \u0026quot;ErrorEquals\u0026quot;: [\u0026quot;KnownError\u0026quot;] }], \u0026quot;Catch\u0026quot;: [{ \u0026quot;ErrorEquals\u0026quot;: [\u0026quot;States.ALL\u0026quot;], \u0026quot;Next\u0026quot;: \u0026quot;Failure\u0026quot; }] }, \u0026quot;Worked?\u0026quot;: { \u0026quot;Type\u0026quot;: \u0026quot;Choice\u0026quot;, \u0026quot;Choices\u0026quot;: [ { \u0026quot;Variable\u0026quot;: \u0026quot;$.Worked\u0026quot;, \u0026quot;BooleanEquals\u0026quot;: true, \u0026quot;Next\u0026quot;: \u0026quot;Success\u0026quot; } ], \u0026quot;Default\u0026quot;: \u0026quot;Failure\u0026quot; }, \u0026quot;Success\u0026quot;: { \u0026quot;Type\u0026quot;: \u0026quot;Succeed\u0026quot; }, \u0026quot;Failure\u0026quot;: { \u0026quot;Type\u0026quot;: \u0026quot;Fail\u0026quot; } } }\nThis state-machine looks like (generated with step dot --states \u0026lt;state_machine\u0026gt;):\nStartAt defines the initial state CallLambda that executes the lambda at \u0026lt;lambda_arn\u0026gt;. The lambda’s output is then sent to Worked?, which goes to Success if its $.Worked attribute is true; otherwise it goes to Failure. If CallLambda returns a KnownError, it will Retry. For other errors it will go to Failure, as States.ALL is a catch-all for any error.\nLambda code and Step functions are separated from one another in AWS and can be developed independently. This can make them difficult to test and validate, as a change in one can cause a bug in the other. 
To make it easier to develop and test Step functions and Lambda we built the Step framework.\nHere is an example of a state-machine using the Step framework: ``func StateMachine() *machine.StateMachine {\nstate_machine, _ := machine.FromJSON([]byte(`{ \u0026quot;StartAt\u0026quot;: \u0026quot;CallLambda\u0026quot;, \u0026quot;States\u0026quot;: { \u0026quot;CallLambda\u0026quot;: { \u0026quot;Type\u0026quot;: \u0026quot;TaskFn\u0026quot;, \u0026quot;Next\u0026quot;: \u0026quot;Worked?\u0026quot;, \u0026quot;Retry\u0026quot;: [{ \u0026quot;ErrorEquals\u0026quot;: [\u0026quot;KnownError\u0026quot;] }],\n\u0026quot;Catch\u0026quot;: [{\n\u0026quot;ErrorEquals\u0026quot;: [\u0026quot;States.ALL\u0026quot;],\n\u0026quot;Next\u0026quot;: \u0026quot;Failure\u0026quot;\n}] }, \u0026quot;Worked?\u0026quot;: { \u0026quot;Type\u0026quot;: \u0026quot;Choice\u0026quot;, \u0026quot;Choices\u0026quot;: [ { \u0026quot;Variable\u0026quot;: \u0026quot;$.Worked\u0026quot;, \u0026quot;BooleanEquals\u0026quot;: true, \u0026quot;Next\u0026quot;: \u0026quot;Success\u0026quot; } ], \u0026quot;Default\u0026quot;: \u0026quot;Failure\u0026quot; }, \u0026quot;Success\u0026quot;: { \u0026quot;Type\u0026quot;: \u0026quot;Succeed\u0026quot; }, \u0026quot;Failure\u0026quot;: { \u0026quot;Type\u0026quot;: \u0026quot;Fail\u0026quot; } }}`))\nstate_machine.SetResourceFunction(\u0026quot;CallLambda\u0026quot;, LambdaHandler)\nreturn state_machine\n}``\nThe type TaskFn is an extension of the spec to tell the Lambda which Task is calling it so it can route to the correct handler.\nLambdaHandler is the function that is called when the Task state CallLambda is reached: ``type Input struct {}\ntype Result struct {\nWorked bool\n}\nfunc LambdaHandler(_ context.Context, _ *Input) (Result, error) {\nreturn Result{true}, nil\n}``\nHandlers contain the logic. The path is controlled by the state-machine. 
State-machines can change the path based on the handler’s output, but a handler cannot decide what state to jump to.\nTesting With Step a state-machine can be executed by calling StateMachine().Execute(\u0026quot;{}\u0026quot;). This sends {} as an input into the machine and returns:\nThe final output. The “path” of the states that were visited. Errors encountered by the process. This is used by tests: ``func Test_Machine(t *testing.T) {\nexec, err := StateMachine().Execute(\u0026quot;{}\u0026quot;)\nassert.NoError(t, err)\nassert.Equal(t, `{\u0026quot;Worked\u0026quot;: true}`, exec.OutputJSON)\nassert.Equal(t, []string{\n\u0026quot;CallLambda\u0026quot;,\n\u0026quot;Worked?\u0026quot;,\n\u0026quot;Success\u0026quot;,\n}, exec.Path())\n}``\nFuzz tests are also very useful to help build reliable state-machines. The [gofuzz](https://github.com/google/gofuzz) library will randomly generate input to make sure no unhandled errors are returned: ``func Test_With_Fuzz(t *testing.T) {\nfor i := 0; i \u0026lt; 50; i++ {\nvar input Input\nfuzz.New().Fuzz(\u0026amp;input)\n_, err := StateMachine().Execute(input)\nif err != nil {\nassert.NotRegexp(t, \u0026quot;Panic\u0026quot;, err.Error())\n}\n// Other assertions like final states\n}\n}``\nDeploy The ultimate goal is to deploy the Step function and Lambda to AWS. For this we need an executable binary, let’s call it hello. hello executed without any arguments must start a Lambda with run.Lambda(StateMachine()). hello json should print the state-machine with run.JSON(StateMachine()).\nThe step binary can bootstrap (directly upload) hello to AWS. 
To install step: go get github.com/coinbase/step cd $GOPATH/src/github.com/coinbase/step go build \u0026amp;\u0026amp; go install\nThen build and bootstrap hello: ``# Build your code for the Lambda’s Linux environment\nGOOS=linux go build -o lambda\nzip lambda.zip lambda\n# export AWS creds using https://github.com/coinbase/assume-role\nassume-role account user\n# Use step to upload your code and state-machine to AWS\nstep bootstrap \\\n-lambda \u0026quot;hello-lambda\u0026quot; \\\n-step \u0026quot;hello-step-function\u0026quot; \\\n-states \u0026quot;$(hello json)\u0026quot;``\nStep does not create the Lambda/IAM/Step function resources; these must be created first with a tool like terraform or geoengineer.\nPractices Here are a few good practices to follow using Step:\nHandle All Errors: Every TaskFn should have a catch for States.ALL errors. This will ensure the state-machine ends in a proper state. Fail Quickly: The faster a state-machine fails the less cleanup is needed. Fail if unknown JSON parameters are sent, if referenced resources don’t exist, or if other pre-conditions are not met. Fuzz Input: As described above, using gofuzz can save you a lot of time as it highlights errors caused by invalid input. Comment: use the Comment attribute on states. The ultimate goal is to be able to fully understand the state-machine without looking at the code. Design defensively: Step functions should behave predictably, especially when failing. Alert if a Step function execution finishes in an unexpected state.\nDeployers and Bifrost\nBridges are safer than swimming — San Francisco Golden Gate Bridge, photographed by Graham Jenson\nWhile making deployers as Step functions, a set of conventions emerged which I am calling Bifrost. It is named after the mythical bridge because taking a bridge is easier (and safer) than swimming.\nDeployers, at their core, productionize developed assets. 
For example, starting a server, pushing code to a Lambda, or uploading a new version of a package or container. Given this is the step in the development process that shows your hard work to the world, it should be very reliable.\nBifrost helps to build reliable deployers. By grouping together common concepts, a deployer’s code can focus on its core functionality.\nThe core of all deployers is the bifrost.Release struct: ``type Release struct {\nAwsAccountID *string json:\u0026quot;aws_account_id,omitempty\u0026quot;\nAwsRegion *string json:\u0026quot;aws_region,omitempty\u0026quot;\nReleaseSHA256 string json:\u0026quot;-\u0026quot;\nUUID *string json:\u0026quot;uuid,omitempty\u0026quot;\nReleaseID *string json:\u0026quot;release_id,omitempty\u0026quot;\nProjectName *string json:\u0026quot;project_name,omitempty\u0026quot;\nConfigName *string json:\u0026quot;config_name,omitempty\u0026quot;\nBucket *string json:\u0026quot;bucket,omitempty\u0026quot;\nCreatedAt *time.Time json:\u0026quot;created_at,omitempty\u0026quot;\nTimeout *int json:\u0026quot;timeout,omitempty\u0026quot;\nError *ReleaseError json:\u0026quot;error,omitempty\u0026quot;\nSuccess *bool json:\u0026quot;success,omitempty\u0026quot;\n}``\nTo extend the release: ``type DeployerRelease struct {\nbifrost.Release\n\u0026hellip; // The attributes for your release\n} This model stores information needed to deploy, e.g. the list of services, paths to assets, SHA’s for validation. The release is:\nThe input and output for every state handler. This means each state has immediate access to all necessary information about the release. Not secure. The state history log is persisted forever, so be careful with what you put in it. Always validated. The Validate method on the release ensures everything is correct. This can be overridden, but should always call the original. The Machine How a deployers state-machine is organized will depend on the asset being deployed. 
However, the end state should always be one of:\n\u0026quot;Success\u0026quot;: { \u0026quot;Type\u0026quot;: \u0026quot;Succeed\u0026quot; }: Deploy succeeded and everything is good. \u0026quot;FailureClean\u0026quot;: { \u0026quot;Type\u0026quot;: \u0026quot;Fail\u0026quot; }: Failed to deploy, but successfully cleaned up so a retry can be attempted. \u0026quot;FailureDirty\u0026quot;: { \u0026quot;Type\u0026quot;: \u0026quot;Fail\u0026quot; }: Something went really bad, and you should alert someone to have a look. This means that a Step execution can fail in a Clean expected way, or a very bad and Dirty way. If a state-machine execution ends in a FailureDirty state (or any state not Success or FailureClean) then someone needs to be alerted.\nThe state-machine’s TaskFn handlers should be thin with fat models. Handlers should be very obvious in their implementation and push the complexities to the Release model (which as stated above is the input and output of each handler). The goal is to make it easy to understand the impact a handler will have on a state-machine.\nThe first handler in your state-machine should look like this: ``func Validate(awsc aws.Clients) DeployHandler {\nreturn func(ctx context.Context, release *models.Release) (*models.Release, error) {\n// Assign the release its SHA before anything alters it\nrelease.ReleaseSHA256 = to.SHA256Struct(release)\n// Extracts the region and account the Lambda is running in\n// This is used to set defaults for release attributes\nregion, account := to.AwsRegionAccountFromContext(ctx)\nrelease.SetDefaults(region, account, \u0026quot;coinbase-odin-\u0026quot;)\nif err := release.Validate(awsc.S3(nil, nil, nil)); err != nil {\nreturn nil, \u0026amp;errors.BadReleaseError{err.Error()}\n}\nreturn release, nil\n}\n}``\nThis function returns a handler that:\nCalculates the input release’s SHA. Sets the defaults of the release, including Region, Account and Bucket. Validates the input release SHA against one uploaded to S3. 
The reason this function returns a handler is so that the aws.Clients struct is persisted across calls. aws.Clients manages AWS clients where awsc.S3Client(nil, nil, nil) creates a S3 Client without assuming a role. This pattern is further described here.\nIt is neither secure nor practical to put all information into the release. For this and other functions we use S3. Each release has a bucket where:\n/\u0026lt;account_id\u0026gt;/\u0026lt;project_name\u0026gt;/\u0026lt;config_name\u0026gt; **** is the root dir release.RootDir(). /\u0026lt;root_dir\u0026gt;/\u0026lt;release_id\u0026gt; **** is the release dir release.ReleaseDir(). These directories are useful as an audit trail, sending signals like Halt to the step function or release instances, and asset storage for things like Lambda zip files.\nBifrost Example Project To assist in building Bifrost deployers we built a “paved path” archetypal implementation that is a basic EC2 deployer. Structure like: ./ ├── .circleci/ # example CI setup ├── aws/ │ ├── ec2/ # example EC2 client │ ├── mocks/ # mock AWS clients │ └── aws.go # setup for multi-account AWS clients ├── client/ │ └── client.go # example client code ├── deployer/ │ ├── fuzz_test.go # fuzz test example │ ├── integration_test.go # tests for the Deployer │ ├── machine.go # state-machine definition │ ├── handlers.go # handler functions for tasks │ └── release.go # bifrost release struct ├── releases/ │ └── release.json # example release ├── scripts/ │ └── bootstrap_depolyer # bootstraping script ├── bifrost.go # executable code ├── Gopkg.toml # Go dependencies └── Dockerfile # Build bifrost for deploy\nTo use this to start building your own deployer run: ``export ORG=\nexport DEPLOYER=\u0026lt;your_deployer\u0026gt;\ngit clone git@github.com:coinbase/bifrost.git $DEPLOYER\ncd $DEPLOYER\nscripts/rename``\nThis will correctly rename the folder and references in the files to your deployer creating an easy starting place.\nThe deployers state-machine looks 
like:\nBifrost EC2 Example\nIt validates the input, locks the release, validates resources exist, deploys, waits a bit, checks if the deploy is healthy, then succeeds if healthy, fails if an error, or waits to retry and check later. Although the exact details of this deployer are not obvious, the overall flow of the state-machine is understandable.\nOdin\nOdin deploys 12 Factor applications into Auto-Scaling groups. To demonstrate how Bifrost’s conventions impact Odin’s implementation, let’s look at how Odin works.\nOdin’s Release looks like: ``type Release struct {\nbifrost.Release\nServices map[string]*Service `json:\u0026quot;services,omitempty\u0026quot;`\nuserdata *string // Not serialized\nUserDataSHA256 *string `json:\u0026quot;user_data_sha256,omitempty\u0026quot;`\nHealthy *bool `json:\u0026quot;healthy,omitempty\u0026quot;`\n// ... ignored LifecycleHooks, Subnets, Image\n}``\nThis Release struct at the center of Odin contains:\nThe list of services to be deployed. userdata that might be sensitive so is not persisted, instead is uploaded to S3 and validated against a SHA UserDataSHA256. A Healthy check to see if all its services are also healthy. Odin’s state-machine looks like:\nThe “Happy path” is Validate, Lock, ValidateResources, Deploy, CheckHealthy, Healthy?, CleanUpSuccess, Success. As seen in the diagram at any point an error might occur and the state will retry, or catch the error and clean up. This is very similar to the Bifrost example with a few extra paths to recover from failure.\nBifrost Going Forward\nThe goal of this post was to give an introduction to Step, Step functions, and how to build a deployer with Bifrost. 
Our goal is to automate as many different deployers as we can to reduce toil, increase security and make processes easier to understand.\nFor more discussion on the above topics see Baking Bread with Step, Open sourcing Odin, and Hitchhiker’s Guide to AWS Step Functions.\nUnless otherwise indicated, all images provided herein are by Coinbase.\nThis website may contain links to third-party websites or other content for information purposes only (“Third-Party Sites”). The Third-Party Sites are not under the control of Coinbase, Inc., and its affiliates (“Coinbase”), and Coinbase is not responsible for the content of any Third-Party Site, including without limitation any link contained in a Third-Party Site, or any changes or updates to a Third-Party Site. Coinbase is not responsible for webcasting or any other form of transmission received from any Third-Party Site. Coinbase is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement, approval or recommendation by Coinbase of the site or any association with its operators.\n","permalink":"https://maori.geek.nz/posts/2018/2018-11-20_aws-step-functions-state-machines-bifrost-and-building-deployers/","summary":"\u003cp\u003e\u003ca href=\"https://docs.aws.amazon.com/step-functions/latest/dg/getting-started.html\"\u003e\u003cstrong\u003eAWS Step Functions\u003c/strong\u003e\u003c/a\u003e are hosted state-machines defined according to the \u003ca href=\"https://states-language.net/spec.html\"\u003eAmazon States Language\u003c/a\u003e. To \u003cstrong\u003eexecute\u003c/strong\u003e a Step function you send it JSON data which is given to an initial state to process then pass the output to another state. States are processed until a \u003cem\u003esuccess\u003c/em\u003e or \u003cem\u003efailure\u003c/em\u003e state is reached.\u003c/p\u003e\n\u003cp\u003eHow a state processes its input and selects the next state depends on its \u003ccode\u003eType\u003c/code\u003e. 
For example, a \u003ccode\u003eTask\u003c/code\u003e state can use a \u003ca href=\"https://aws.amazon.com/lambda/getting-started/\"\u003eLambda function\u003c/a\u003e to process the input, and a \u003ccode\u003eChoice\u003c/code\u003e state can select which state to go to next based on its input.\u003c/p\u003e","title":"AWS Step Functions, State Machines, Bifrost, and Building Deployers"},{"content":" AWS provides an API for everything! This includes the Pricing API to find out how much you can spend via their other APIs. This API can be difficult to use to answer questions like:\nHow much does an EC2 instance cost?\nThe difficulty comes because the API has a single endpoint [GetProducts](https://docs.aws.amazon.com/aws-cost-management/latest/APIReference/API_pricing_GetProducts.html) that returns all prices for every service and product. That is a ton of data that requires lots of filtering to get what you want. You can download all the EC2 price data at once from a file Amazon hosts here. This file is 510Mb!\nThe data can also be accessed in more hacky ways. For example, using this file that is only 124Kb. The file is used for their frontend so is in an annoying format (JSON object wrapped in JavaScript) and starts with the comment:\nThis file is intended for use only on aws.amazon.com. We do not guarantee its availability or accuracy.\nHow much does an EC2 on demand instance cost? Lets write a python script using the Pricing API to find the cost of an EC2 instance.\nTo start we need to install boto with pip install boto3. The script starts with imports and a client: import boto3, json client = boto3.client('pricing') # create the client\nTo call the pricing API we get_products with the ServiceCode='AmazonEC2' and some filters: response = client.get_products( ServiceCode='AmazonEC2', Filters=[ { 'Type': 'TERM_MATCH', 'Field': '\u0026lt;field\u0026gt;', 'Value': '\u0026lt;value\u0026gt;' }, ... 
], )\nThe filters we need are:\nOperating System is Linux: 'Field': 'operatingSystem', 'Value': 'Linux' Cost of running the instance: 'Field': 'operation', 'Value': 'RunInstances' and 'Field': 'capacitystatus', 'Value': 'Used' On a shared instance: 'Field': 'tenancy', 'Value': 'Shared' Of an instance type 'Field': 'instanceType', 'Value': '\u0026lt;instance_type\u0026gt;', e.g. r4.large In a region 'Field': 'location', 'Value': '\u0026lt;region\u0026gt;'. An annoying part of the pricing API is that it uses non-standard region IDs, e.g. use US East (N. Virginia) instead of us-east-1. The top level element is the PriceList: price_list = response[\u0026quot;PriceList\u0026quot;]\nBecause of the filters, this list should only contain one “unstructured” JSON object that can be parsed: price_item = json.loads(price_list[0])\nThis returns an object like: { \u0026quot;serviceCode\u0026quot;: \u0026quot;AmazonEC2\u0026quot;, \u0026quot;product\u0026quot;: { \u0026quot;productFamily\u0026quot;: \u0026quot;Compute Instance\u0026quot;, \u0026quot;sku\u0026quot;: \u0026quot;CGJXHFUSGE546RV6\u0026quot;, \u0026quot;attributes\u0026quot;: { \u0026quot;memory\u0026quot;: \u0026quot;15.25 GiB\u0026quot;, \u0026quot;vcpu\u0026quot;: \u0026quot;2\u0026quot;, ... } }, \u0026quot;terms\u0026quot;: { \u0026quot;OnDemand\u0026quot;: { \u0026quot;CGJXHFUSGE546RV6.JRTCKXETXF\u0026quot;: { ... \u0026quot;priceDimensions\u0026quot;: { \u0026quot;CGJXHFUSGE546RV6.JRTCKXETXF.6YS6EN2CT7\u0026quot;: { \u0026quot;unit\u0026quot;: \u0026quot;Hrs\u0026quot;, ... \u0026quot;pricePerUnit\u0026quot;: { \u0026quot;USD\u0026quot;: \u0026quot;0.1330000000\u0026quot; } } } } }, \u0026quot;Reserved\u0026quot;: { ... 
} } }\nTo get the price data we need to do some digging: ``terms = price_item[\u0026quot;terms\u0026quot;]\n# take the first on-demand term, then its first price dimension\nterm = next(iter(terms[\u0026quot;OnDemand\u0026quot;].values()))\nprice_dimension = next(iter(term[\u0026quot;priceDimensions\u0026quot;].values()))\nprice = price_dimension[\u0026quot;pricePerUnit\u0026quot;][\u0026quot;USD\u0026quot;]\nprint(price)\n# 0.1330000000``\nThat is how to get the price for an EC2 instance in AWS.\nMore links More answers are here, here and here.\n","permalink":"https://maori.geek.nz/posts/2018/2018-11-03_aws-api-to-get-ec2-instance-prices/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2018/2018-11-03_aws-api-to-get-ec2-instance-prices/images/1.jpeg#layoutTextWidth\"\u003e\nAWS provides an API for everything! This includes the \u003ca href=\"https://aws.amazon.com/blogs/aws/aws-price-list-api-update-new-query-and-metadata-functions/\"\u003e\u003cstrong\u003ePricing API\u003c/strong\u003e\u003c/a\u003e to find out how much you can spend via their other APIs. This API can be difficult to use to answer questions like:\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003eHow much does an EC2 instance cost?\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cp\u003eThe difficulty comes because the API has a single endpoint \u003ca href=\"https://docs.aws.amazon.com/aws-cost-management/latest/APIReference/API_pricing_GetProducts.html\"\u003e\u003ccode\u003eGetProducts\u003c/code\u003e\u003c/a\u003e that returns all prices for every service and product. That is a ton of data that requires lots of filtering to get what you want. You can download all the EC2 price data at once from a file Amazon hosts \u003ca href=\"https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.json\"\u003e\u003cstrong\u003ehere\u003c/strong\u003e\u003c/a\u003e. 
This file is 510Mb!\u003c/p\u003e","title":"AWS API to get EC2 Instance Prices"},{"content":"So your organization has many AWS accounts, but you have services (like monitoring, deploying, security) that require access to resources across many/all of those accounts.\nThere are a few options for these services:\nDeploy the service in each account: to add an account you need to recreate the entire service with all its resources, and then maintain all of that infrastructure in perpetuity. Deploy the service in one account with access keys to other accounts: each account requires a user with a policy and access key that is given to the service. This can quickly get out of hand as the number of keys explode, both in maintaining the services access to them and security concerns around rolling them. Deploy the service in one account that can assume roles into other accounts: the service requires a user, instance profile, or role that is trusted by roles in the other accounts. The service only needs to know the name of the role and the account ID to work. The latter option has the least surface area to secure, requires the least amount of maintenance, and is the easiest to scale with the number of accounts. This post briefly goes over how to manage the assumed roles in a service written in Go.\nTrusting Role A role allows a service to assume it with a policy like: { \u0026quot;Version\u0026quot;: \u0026quot;2012-10-17\u0026quot;, \u0026quot;Statement\u0026quot;: [{ \u0026quot;Effect\u0026quot;: \u0026quot;Allow\u0026quot;, \u0026quot;Principal\u0026quot;: { \u0026quot;AWS\u0026quot;: \u0026quot;arn:aws:iam::\u0026lt;account_id\u0026gt;:role/\u0026lt;remote_role_name\u0026gt;\u0026quot; }, \u0026quot;Action\u0026quot;: \u0026quot;sts:AssumeRole\u0026quot; }] }\nIf the role name is standardized, e.g. as the name of the service, then the only information needed to assume that role is the account_id. 
This can be very powerful: if the account ID is passed to the service as a parameter, it breaks the dependency of the service on the accounts it works in. This makes adding a new account effortless and easily scalable.\nAssuming Roles Assuming roles into multiple accounts without continuously re-authenticating means the service has to maintain sessions for each role/account.\nFor this it needs to store two pieces of information: 1. session.Session for the service role 2. aws.Config for each assumed role\nTo store these we need an implementation: type Clients struct { session *session.Session configs map[string]*aws.Config }\nTo create or retrieve a session (the pointer receiver is what lets the cached session persist between calls): func (c *Clients) Session() *session.Session { if c.session != nil { return c.session } sess := session.Must(session.NewSession()) c.session = sess return sess }\nThis is a pretty vanilla method. It is much more interesting to create, store, and retrieve configs: ``func (c *Clients) Config(\nregion *string,\naccount_id *string,\nrole *string) *aws.Config {\n// return no config for nil inputs\nif account_id == nil || region == nil || role == nil {\nreturn nil\n}\narn := fmt.Sprintf(\n\u0026quot;arn:aws:iam::%v:role/%v\u0026quot;,\n*account_id,\n*role,\n)\n// include region in cache key otherwise concurrency errors\nkey := fmt.Sprintf(\u0026quot;%v::%v\u0026quot;, *region, arn)\n// check for cached config\nif c.configs != nil \u0026amp;\u0026amp; c.configs[key] != nil {\nreturn c.configs[key]\n}\n// new creds\ncreds := stscreds.NewCredentials(c.Session(), arn)\n// new config\nconfig := aws.NewConfig().\nWithCredentials(creds).\nWithRegion(*region).\nWithMaxRetries(10)\nif c.configs == nil {\nc.configs = map[string]*aws.Config{}\n}\nc.configs[key] = config\nreturn config\n}``\nThis will cache a unique config for each role and region, and retrieve that configuration if already created.\nThe magic of this method is in stscreds.NewCredentials. 
It returns credentials that expire in 15 minutes, but will auto-refresh them when needed. This means that we can cache the config without having to worry about either session or config expiring the credentials.\nClients A method to create an S3 client looks like: func (c *Clients) S3( region *string, account_id *string, role *string) s3iface.S3API { return s3.New(c.Session(), c.Config(region, account_id, role)) }\nTo get a client for the service role we call the method with: c.S3(nil, nil, nil)\nOr to return a client for an assumed role (the parameters are *string, so literals are wrapped with aws.String): c.S3(aws.String(\u0026quot;nz-north-1\u0026quot;), aws.String(\u0026quot;0123456\u0026quot;), aws.String(\u0026quot;role-name\u0026quot;))\nHaving services assume roles into other accounts is a super powerful tool to scale an organization and team. Hope this helped.\n","permalink":"https://maori.geek.nz/posts/2018/2018-08-08_assuming-roles-in-aws-with-go/","summary":"\u003cp\u003eSo your organization has \u003ca href=\"https://engineering.coinbase.com/you-need-more-than-one-aws-account-aws-bastions-and-assume-role-23946c6dfde3\"\u003emany AWS accounts\u003c/a\u003e, but you have services (like monitoring, deploying, security) that require access to resources across many/all of those accounts.\u003c/p\u003e\n\u003cp\u003eThere are a few options for these services:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e\u003cstrong\u003eDeploy the service in each account\u003c/strong\u003e: to add an account you need to recreate the entire service with all its resources, and then maintain all of that infrastructure in perpetuity.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eDeploy the service in one account with access keys to other accounts\u003c/strong\u003e: each account requires a user with a policy and access key that is given to the service. 
This can quickly get out of hand as the number of keys explode, both in maintaining the services access to them and security concerns around rolling them.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eDeploy the service in one account that can assume roles into other accounts\u003c/strong\u003e: the service requires a user, instance profile, or role that is trusted by roles in the other accounts. The service only needs to know the name of the role and the account ID to work.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThe latter option has the least surface area to secure, requires the least amount of maintenance, and is the easiest to scale with the number of accounts. This post briefly goes over how to manage the assumed roles in a service written in Go.\u003c/p\u003e","title":"Assuming Roles in AWS with Go"},{"content":"I want an Error to be raised during JSON unmarshalling if an unknown field is found. This is useful if you are trying to be extra careful, for example double checking the client has not misspelled any inputs. However, I want to exclude some fields from this that might be used elsewhere.\nModel is the struct that we are unmarshalling. We must create a new type so that the unmarshalling function is not recursive (as explained below): type XModel Model\nXModelExceptions composes the XModel with the fields we do not want to error: type XModelExceptions struct { XModel Other *string // Other won't raise an error }\nNow we define the UnmarshalJSON method for Model. This function sends its input to unmarshal an XModelExceptions struct using a json Decoder with DisallowUnknownFields() turned on. This will return an error if a field is found that is not defined in XModelExceptions. 
``func (model *Model) UnmarshalJSON(data []byte) error {\nvar me XModelExceptions\ndec := json.NewDecoder(bytes.NewReader(data))\ndec.DisallowUnknownFields() // Force errors\nif err := dec.Decode(\u0026amp;me); err != nil {\nreturn err\n}\n*model = Model(me.XModel)\nreturn nil\n}``\nIt is important to note that if XModelExceptions were composed of Model, the unmarshalling function would be recursive and break. Hence the reason for XModel to exist.\nThis code can be seen working in our Odin Deployer here.\n","permalink":"https://maori.geek.nz/posts/2018/2018-07-24_golang-raise-error-if-unknown-field-in-json-with-exceptions/","summary":"\u003cp\u003eI want an Error to be raised during JSON unmarshalling if an unknown field is found. This is useful if you are trying to be extra careful, for example double checking the client has not misspelled any inputs. \u003cstrong\u003eHowever\u003c/strong\u003e, I want to exclude some fields from this that might be used elsewhere. \u003ccode\u003eModel\u003c/code\u003e is the struct that we are unmarshalling. We must create a new type so that the unmarshalling function is not recursive (as explained below):\n\u003ccode\u003etype XModel Model\u003c/code\u003e\u003c/p\u003e","title":"GoLang- Raise Error if Unknown Field in JSON (with exceptions)"},{"content":"In 2017, Puppet and DORA (DevOps research and assessment) published their annual State of DevOps Report that collates more than six years of survey data about the cultural and technical impacts of DevOps. 
Analyzing over 27,000 responses they found that high performing engineering organizations have:\n46x more frequent code deployments (on demand deployments) 440x faster lead time from commit to deploy (less than one hour) 96x faster mean time to recover from downtime (less than one hour) 5x lower change failure rate (0%-15%) That is, high performing teams:\ndon’t have to trade speed for stability or vice versa, because by building quality in, they get both.\nThis goes against the common wisdom that we break things if we move fast. Instead, if a team focuses on building quality automation into their workflows then stability follows.\nBy this standard Coinbase has a high performing engineering organization, in that we:\ndeploy hundreds of times per day across hundreds of projects. a feature can go from an idea, to code, to deployed into production in under an hour. failure rates are low, and are typically easily recoverable. This is possible because most of our change management and deployment processes are automated and our awesome engineers have adopted a DevOps culture.\nToday we are open sourcing a key part of that automation — our AWS deployer Odin. Odin takes a description of a project release and then safely and securely launches it into AWS using auto-scaling groups. The open-source Odin is a newer version of a closed Ruby version, and is still in alpha at Coinbase.\nIn this post we will describe the design of Odin, its features, and how such a deployer can help build a high performing engineering organization. At its core Odin is meant to be simple and straight forward to use, while enforcing good engineering and security standards. As such, Odin was built towards:\nEphemeral Blue/Green: create new services, wait for them to become healthy, delete old services; treating them as disposable and ephemeral. Declarative: describe what a successful release looks like, not how to deploy it. 
Scalable: can scale both vertically (larger instances) and horizontally (more instances). Secure: resources are verified to ensure that they cannot be used accidentally or maliciously. Gracefully Fail: handle failures to recover and roll back with no/minimal impact to users. Configuration Parity: minimize divergence between production, staging and development environments by keeping releases as similar as possible. No Configuration: once Odin is deployed it requires no further configuration. Multi Account: one deployer for all AWS accounts.\nTo satisfy the No Configuration and Multi Account requirements, Odin was implemented using native AWS technologies: an AWS Lambda Function and an AWS Step Function (using the [step](https://github.com/coinbase/step) framework) that deploys by assuming a role into an AWS account. This means that the only requirement for running Odin is an AWS account, and the only prerequisite to deploy into an account is an IAM role with permission to do so.\nOnce the Odin lambda, step function and role are in AWS, a release can be deployed using the odin executable. 
For example: odin deploy deploy-test-release.json Odin deploy (sped up)\nWhere the deploy-test-release.json file looks like: { \u0026quot;project_name\u0026quot;: \u0026quot;coinbase/deploy-test\u0026quot;, \u0026quot;config_name\u0026quot;: \u0026quot;development\u0026quot;, \u0026quot;subnets\u0026quot;: [\u0026quot;test_private_subnet_a\u0026quot;, \u0026quot;test_private_subnet_b\u0026quot;], \u0026quot;ami\u0026quot;: \u0026quot;ubuntu\u0026quot;, \u0026quot;user_data\u0026quot;: \u0026quot;{{USER_DATA_FILE}}\u0026quot;, \u0026quot;services\u0026quot;: { \u0026quot;web\u0026quot;: { \u0026quot;instance_type\u0026quot;: \u0026quot;t2.nano\u0026quot;, \u0026quot;security_groups\u0026quot;: [\u0026quot;ec2::coinbase/deploy-test::development\u0026quot;], \u0026quot;elbs\u0026quot;: [\u0026quot;coinbase-deploy-test-web-elb\u0026quot;], \u0026quot;profile\u0026quot;: \u0026quot;coinbase-deploy-test\u0026quot;, \u0026quot;target_groups\u0026quot;: [\u0026quot;coinbase-deploy-test-web-tg\u0026quot;] } } }\nThis Declaratively describes a project with one service, web, that is:\nDeployed onto an Ubuntu AMI Into 2 subnets With a security group and instance profile Attached to an ELB and target group To increase Configuration Parity all references to resources are tags instead of IDs, which can differ per environment.\nIf the user data key equals {{USER_DATA_FILE}} the Odin executable replaces the user data with the .userdata file contents, e.g. deploy-test-release.json.userdata: `#cloud-config\nrepo_update: true\nrepo_upgrade: all\npackages:\n- docker.io\nruncmd:\n- docker run -d nginx`\nThis will start the web service with an nginx http server, which will pass the ELB and target group health checks.\nThe Odin executable takes the deploy-test release file, attaches a few pieces of metadata like a release-id and created at date, and sends it to the Odin step function that:\nvalidates the sent release and all referenced resources. 
creates a new auto-scaling group for the web service that starts nginx. waits for all EC2 instances in the web ASG to pass their ASG, ELB, and target group health checks. This may take a few minutes. Once healthy, deletes ASGs from a previous release and terminates their instances. This is Ephemeral Blue/Green where old instances are deleted and new servers created. With this Coinbase can enforce our 30-day fleet age policy where we aim to have 98% of our instances under 30 days old.\nOdin is a state machine, so we can visually see the progress of the deploy using the AWS console:\nOdin’s state machine takes the original release object and passes it through each state adding and editing data until it reaches a success or failure state. The main Odin states are:\nValidate: validate the release is correct. Lock: grabs a lock so the same project-configuration cannot be deployed concurrently. ValidateResources: validate resources w.r.t. the project, configuration and service using them. Deploy: creates an ASG and other resources for each service. CheckHealthy: check to see if the new instances created are healthy w.r.t. their ASGs, ELBs and target groups. If instances are seen to be terminating, immediately halt the release. CleanUpSuccess: if the release was a success, then delete the old ASGs. CleanUpFailure: if the release failed, delete the new ASGs. ReleaseLockFailure: try to release the lock and fail. Understanding how each state can go wrong and how to respond allows Odin to Gracefully Fail. Once a failure occurs Odin will try to leave AWS clean by deleting created resources. Some common failures are:\nBadReleaseError: The sent release was invalid or a resource it referenced was invalid. LockExistsError: Another deploy is currently going out, or a previous deploy failed in an unknown way and requires manual cleanup. DeployError: Unable to create a resource. HaltError: Halt was detected or instances were found terminating. 
TimeoutError: The deploy took too long to become healthy. The default time Odin waits is 10 minutes, but the max time is 1 year (how long a step function can run). Once Odin has finished deploying it will end in one of these states:\nSuccess: the release was deployed. FailureClean: the release was unsuccessful, but cleanup was successful so AWS was left in a good state. FailureDirty: the release was unsuccessful, and cleanup failed so AWS was left in a bad state. This should never happen; if it does, you should alert and file a bug in GitHub. It is technically possible to end at any state if there is an error in Odin that cannot be recovered. If this happens, alert and file a bug in GitHub as it is definitely a bug.\nScale has been important in every aspect of Coinbase recently. Around December 2017 we became both the 40th largest website in the USA and the top iOS app, causing 20x more traffic than we received just a month before. The entire company had to quickly respond to this, especially our application engineers.\nFortunately, Odin was built with scale in mind and with only minor configuration changes applications could both increase their size and number of servers, as well as add auto-scaling rules to handle traffic spikes. For example, to scale the deploy-test web service we could: { ... \u0026quot;services\u0026quot;: { \u0026quot;web\u0026quot;: { ... \u0026quot;instance_type\u0026quot;: \u0026quot;c4.xlarge\u0026quot;, \u0026quot;autoscaling\u0026quot;: { \u0026quot;min_size\u0026quot;: 3, \u0026quot;max_size\u0026quot;: 5, \u0026quot;policies\u0026quot;: [ { \u0026quot;type\u0026quot;: \u0026quot;cpu_scale_up\u0026quot; }, { \u0026quot;type\u0026quot;: \u0026quot;cpu_scale_down\u0026quot; } ] } } } }\nWith these changes deploy-test can handle increased traffic and scale instances relative to CPU, so it is resilient to sudden traffic spikes.\nDeployers are critical pieces of infrastructure and must be Secure. 
This means ensuring only authorized users can deploy, limiting what resources they can use, and being able to see who did what and when.\nAuthentication is handled by good IAM policies like ensuring that only Odin can deploy and only selected users can call the Odin step function.\nAuthorization is through using tags on resources so only the correct project, configuration and service can use them. Also, by restricting use of S3 you can limit who can deploy what project.\nReplay and Man in the Middle attacks are protected against by validating the creation date is recent and comparing the release to one uploaded to S3.\nAuditing what happened and when is the easiest aspect of Odin. All executions of step functions and lambdas are written to logs by AWS which can be inspected. However, to make them searchable you should automate their export to another service like Kibana or Datadog.\nOdin @ Coinbase\nHaving our engineers manually manage multiple release bundles and deploy with an Odin executable is not a great user experience. Also, the release information and the code may be sensitive or mission critical, so to be safe and secure we would have to limit who can deploy to only “trusted” engineers. Bad UX and limitations slow down all engineers, introduce significant bottlenecks, increase deploy failures, and make us less secure.\nTo fix these issues we built Codeflow, an internal web application (not open source, yet…) that manages configurations and interacts with Odin. Codeflow tries to remove bottlenecks by letting all engineers deploy as long as the code and configurations have been reviewed. For example, here is how to deploy in Codeflow: Codeflow and Odin have separate concerns of what is deployed and how to deploy respectively. Together they automate and secure our deploy pipeline by enabling our engineers to deploy. 
By focusing on this kind of automation Coinbase avoids trading speed for stability, and instead we get both and move deliberately to fix things.\nIf you want to work with a high performing engineering organization, you should join Coinbase!\nThe links in this blog post are being provided as a convenience and for informational purposes only; they do not constitute an endorsement or an approval by Coinbase of any of the content or views expressed by or on any external site. Coinbase bears no responsibility for the accuracy, legality or content of the external site or for that of subsequent links. Contact the external site for answers to questions regarding its content.\n","permalink":"https://maori.geek.nz/posts/2018/2018-05-22_open-sourcing-coinbases-secure-deployment-pipeline/","summary":"\u003cp\u003eIn 2017, \u003ca href=\"https://puppet.com/\"\u003ePuppet\u003c/a\u003e and \u003ca href=\"https://devops-research.com/\"\u003eDORA\u003c/a\u003e (DevOps research and assessment) published their annual \u003ca href=\"https://puppet.com/resources/whitepaper/state-of-devops-report\"\u003eState of DevOps Report\u003c/a\u003e that collates more than six years of survey data about the cultural and technical impacts of DevOps. 
Analyzing over 27,000 responses they found that high performing engineering organizations have:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e46x more frequent code deployments (on demand deployments)\u003c/li\u003e\n\u003cli\u003e440x faster lead time from commit to deploy (less than one hour)\u003c/li\u003e\n\u003cli\u003e96x faster mean time to recover from downtime (less than one hour)\u003c/li\u003e\n\u003cli\u003e5x lower change failure rate (0%-15%)\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThat is, high performing teams:\u003c/p\u003e","title":"Open Sourcing Coinbase’s Secure Deployment Pipeline"},{"content":"When writing blog posts I have a few requirements:\nOffline: I travel and commute, which means that I am frequently without internet, which is a great time to write distraction free. Easy Publishing: I moved all my blogging to Medium because I spent far too much time doing things on the blog that were not writing. Syncing: I like to write on my laptop, but sometimes it is more convenient on my phone, like when I am waiting for a train. Formatting: I like Markdown, but Medium doesn’t use Markdown. This is my biggest issue with Medium. Typically I will write a post in Markdown then manually convert it to Medium, which is super annoying.\nPreviously I have used Evernote, which is great for offline use and syncing, but it has a different format to both Medium and Markdown, which makes it difficult to publish with. I have also just used a plain text editor, Sublime Text with a GFM Markdown plugin, then wrote content in Notes in iOS and moved it manually to my laptop. This is a lot of manual syncing, and once completed I still have to convert it to Medium’s formatting.\nRecently, I have been using the Medium iOS app and the website to directly write and edit. Of course this is the obvious solution but it is impossible to do offline (for now).\nThis post is my first attempt at using the highly recommended IA Writer, written on both the iOS and macOS versions. 
I would typically have just written a “Test Post”, but wanted to also write a bit of a review. Here are some tests:\nTest 1: Images\nScreen shot:\nCamera image:\nTest 2: Headings\nHeading 1 Heading 2 Heading 3 Test 3: quote and code\nInline code Block code\nThis is a quote — anonymous\nTest 4: linking\nGoogle\nTest 5: Syncing\nAfter installing IA Writer onto my MacBook, which syncs to my iCloud, the posts I had written on iOS had auto synced over. Very easy.\nTest 6: Publishing\nThis post will have no changes made to it from the posted version from IA Writer.\nReview iOS review: IA Writer does a lot to help edit markdown on iOS, including a customizable keyboard which means I can remove options that I would never use. Inserting images directly from the phone is great, as that was always the hardest aspect of converting to Medium. The markdown viewer is super clean; I might use it to read README.md files from GitHub.\nOSX review: After syncing images from the iPhone they needed to be resaved as they were not the correct orientation. This was a minimal effort though, just right-click, “Open in Preview”, then save.\nSyncing review: The posts sync quickly and auto update on one device if they have changed on another. If there is a conflict though, IA shows you the conflict and lets you choose which one to keep. There is no smart merging like Evernote has, which would be an awesome feature in the future.\nPublishing review: This post was published without any changes being made in Medium. The biggest problem with publishing is that it can create a draft, but not update it. I would like to be able to overwrite a draft, but that could be dangerous.\nConclusion IA Writer is a bit expensive, but unlike other editors it is not a subscription model. So, if you are going to use it for a year it is already cheaper than most competitors. 
I am going to continue to use IA Writer as I am enjoying its clean interface and easy workflow. Hopefully they continue to add features to improve the already great experience.\n","permalink":"https://maori.geek.nz/posts/2018/2018-05-15_selecting-an-editor/","summary":"\u003cp\u003eWhen writing blog posts I have a few requirements:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e\u003cstrong\u003eOffline\u003c/strong\u003e: I travel and commute, which means that I am frequently without internet, which is a great time to write distraction free.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eEasy Publishing\u003c/strong\u003e: I moved all my blogging to Medium because I spent far too much time doing things on the blog that were not writing.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eSyncing\u003c/strong\u003e: I like to write on my laptop, but sometimes it is more convenient on my phone, like when I am waiting for a train.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eFormatting\u003c/strong\u003e: I like Markdown, but Medium doesn’t use Markdown. This is my biggest issue with Medium. Typically I will write a post in Markdown then manually convert it to Medium, which is super annoying.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003ePreviously I have used \u003cstrong\u003eEvernote\u003c/strong\u003e, which is great for offline use and syncing, but it has a different format to both Medium and Markdown, which makes it difficult to publish with.\u003c/p\u003e\n\u003cp\u003eI have also just used a plain text editor, \u003cstrong\u003eSublime Text\u003c/strong\u003e with a GFM Markdown plugin, then wrote content in Notes in iOS and moved it manually to my laptop. This is a lot of manual syncing, and once completed I still have to convert it to Medium’s formatting.\u003c/p\u003e","title":"Selecting an Editor"},{"content":" AWS Bread, recipe below\nAs developers we are pretty good at writing fast code because we put a lot of emphasis on that skill (especially in job interviews). 
Where we have a little more trouble is writing slow code, processes that don’t take milliseconds or seconds to run (e.g. a web request), but take minutes, hours, or days (e.g. data backup and migration) to complete.\nAs a process takes longer to complete, some qualities become much more important:\nReliability: understanding error conditions, implementing good retry logic to mitigate failure, in the case of unrecoverable failure ensuring we end in a good state. Visibility: seeing progress to ensure it is working correctly, inspecting state of the running process, when it fails it reports why and where it failed so we can mitigate in the future. Understandability: using abstractions to reason about the whole process without needing to understand details, knowing the potential paths of the process to know if it is acting correctly. In this post I am going to explore some of the difficulties of writing reliable slow code by trying to bake some bread. I am also introducing [step](https://github.com/coinbase/step), a new Go framework that uses AWS Step Functions with Lambda to write reliable slow code.\nBaking Bread Baking bread takes time; if you rush it you will fail, so I often follow this recipe:\nmix 140g flour, 1/2 cup water, yeast wait 12 hours add 140g flour, 1/2 cup milk, salt knead until the dough is cohesive wait 2 hours bake at 250C until golden brown (about 20 mins) take out, cool, eat The whole process takes about a day. Most of that time is spent waiting for an external process: the yeast to leaven. Also, variability means that we have to keep checking to see if the bread has finished baking. The entire process is slow and tedious, so let’s automate it.\nCode We could bake bread by writing a straightforward script: bread = Bread.new bread.mix({flour, water, yeast}) sleep 12.hours bread.mix({flour, water, salt}) until bread.cohesive? { bread.knead() } sleep 2.hours bread.bake(250) until bread.golden? 
{ sleep 60 } bread.remove.cool.eat\nThis code is easy to understand and with a few well placed logging statements it would be easy to see its progress. So it is understandable with good visibility, but is it very reliable?\nIf the process dies while it is baking, we could have some big problems, e.g. the oven is left on with no one watching it. It is difficult to add timeout logic, so if the oven breaks, the bread would never bake and we would starve in an infinite loop. Also, with no retry logic or any error handling, a small interruption like a phone call could make us forget what state we are in and have to restart the entire process again. This code is pretty fragile.\nJobs Let’s improve the script by using a job running framework like sidekiq or cron: class Job1 def run bread = Bread.new bread.mix({flour, water, yeast}) Job2.schedule(bread).in(12.hours) end end\nclass Job2 def run(bread) bread.add(more_stuff) bread.mix({flour, water, salt}) until bread.cohesive? { bread.knead() } Job3.schedule(bread).in(2.hours) end end\nclass Job3 def run(bread) bread.bake(250) until bread.golden? { sleep 60 } bread.remove.cool.eat end end\nSerializing the state of the bread after each job would achieve the following:\nAdd visibility to the process as now you can see the input and output of each job. Let the framework schedule where each job is run, so physical machine (oven) failure or replacement is not an issue. Allow the framework to retry any job as it can be rerun with its persisted input. The trade-off is that it is now harder to understand the whole process. To find out how the process got to Job(N), you must know Job(N-1)’s code and state, because that is where the structure is defined. This way of starting a job is like old school GOTO statements, where the programmer has full control of their program’s structure to their own detriment. 
As Dijkstra put it:\n“our intellectual powers are rather geared to master static relations and that our powers to visualize processes evolving in time are relatively poorly developed”\nThat is to say, in the future as you become a better baker and try more complex bread recipes with parallel tasks and conditional paths, you may not be able to understand how the overall process works.\nA framework that provides “static relations” and gives holistic understanding would describe how each job relates to the whole, e.g.: Job1: Next Job2 Job2: Next Job3 Job3: End true\nThis makes the structure of the process and all the paths that it can take explicit. That kind of framework looks a lot like a state machine…\nAWS Step Functions and State Machines A state machine would be a great framework to write slow code in, especially if it is highly available, has good visibility, and good tooling. While looking for such a framework I came across AWS Step Functions, which are hosted state machines defined in JSON that can call external code in Lambda functions. This “serverless” choice of framework has some great advantages:\nThe state machine JSON is described in a well-defined specification. Step Functions can run for a year (that is very slow). The entire history of a process, including all visited states, inputs, outputs and errors is accessible via the execution history API. Billions of state transitions are run each day, so the underlying framework is incredibly reliable. Retry, error handling, timeouts are all defined in the state machine, strongly separating structure and implementation. 
A (simplified) state machine that bakes bread looks like: { \u0026quot;StartAt\u0026quot;: \u0026quot;InitMix\u0026quot;, \u0026quot;States\u0026quot;: { \u0026quot;InitMix\u0026quot;: { \u0026quot;Next\u0026quot;: \u0026quot;WaitForLeaven\u0026quot; }, \u0026quot;WaitForLeaven\u0026quot;: { \u0026quot;Seconds\u0026quot;: 43200, \u0026quot;Next\u0026quot;: \u0026quot;Mix\u0026amp;Knead\u0026quot; }, \u0026quot;Mix\u0026amp;Knead\u0026quot;: { \u0026quot;Next\u0026quot;: \u0026quot;WaitForRise\u0026quot; }, \u0026quot;WaitForRise\u0026quot;: { \u0026quot;Seconds\u0026quot;: 7200, \u0026quot;Next\u0026quot;: \u0026quot;Bake\u0026quot; }, \u0026quot;Bake\u0026quot;: { \u0026quot;Next\u0026quot;: \u0026quot;WaitForGolden\u0026quot; }, \u0026quot;WaitForGolden\u0026quot;: { \u0026quot;Seconds\u0026quot;: 300, \u0026quot;Next\u0026quot;: \u0026quot;Golden?\u0026quot; }, \u0026quot;Golden?\u0026quot;: { \u0026quot;Choices\u0026quot;: [ { \u0026quot;Variable\u0026quot;: \u0026quot;$.golden\u0026quot;, \u0026quot;BooleanEquals\u0026quot;: true, \u0026quot;Next\u0026quot;: \u0026quot;RemoveCoolEat\u0026quot; } ], \u0026quot;Default\u0026quot;: \u0026quot;Bake\u0026quot; }, \u0026quot;RemoveCoolEat\u0026quot;: { \u0026quot;End\u0026quot;: true } } }\nAWS renders this state machine to look like:\nBaking Bread State Machine\nWe can clearly see all possible paths for the process to take, and we just need to implement the code for InitMix, Mix\u0026amp;Knead and Bake. However, this might be difficult because tools to build, test and deploy AWS Step Functions are absent. Until now…\nstep the AWS Step Function Framework By simplifying some aspects of a Step Function, like only having a single Lambda, and building a set of tools for local testing, our new framework [step](https://github.com/coinbase/step), written in Go, can be used to develop, test and deploy Step Functions. 
The three core components of step:\nLibrary: tools for building and deploying Step Functions in Go. Implementation: of the AWS State Machine specification to test entire executions. Deployer: to deploy Lambdas and Step Functions securely. The code for the above state machine to bake bread looks like: type Bread struct {…}\nfunc InitMix(_ context.Context, _ interface{}) (*Bread, error) {\nbread := \u0026amp;Bread{Flour, Water, Yeast}\nbread.Mix()\nreturn bread, nil\n}\nfunc MixAndKnead(_ context.Context, bread *Bread) (*Bread, error) {\nbread.MixIn(Flour, Water, Salt)\nbread.Knead()\nreturn bread, nil\n}\nfunc Bake(_ context.Context, bread *Bread) (*Bread, error) {\nbread.Bake(250)\nreturn bread, nil\n}\nCombining these functions with the above state machine and testing them together looks like: import \u0026quot;github.com/coinbase/step/machine\u0026quot;\nsm := machine.FromJSON(\u0026lt;state machine JSON above\u0026gt;)\n// Attach Functions to States\nsm.SetResourceFunction(\u0026quot;InitMix\u0026quot;, InitMix)\nsm.SetResourceFunction(\u0026quot;Mix\u0026amp;Knead\u0026quot;, MixAndKnead)\nsm.SetResourceFunction(\u0026quot;Bake\u0026quot;, Bake)\nbread, err := sm.Execute(nil)\n…\nassert.Equal(t, []string{ \u0026quot;InitMix\u0026quot;, \u0026quot;WaitForLeaven\u0026quot;, \u0026quot;Mix\u0026amp;Knead\u0026quot;, \u0026quot;WaitForRise\u0026quot;, \u0026quot;Bake\u0026quot;, \u0026quot;WaitForGolden\u0026quot;, \u0026quot;Golden?\u0026quot;, \u0026quot;RemoveCoolEat\u0026quot;, }, sm.ExecutionPath())\nbread.Eat()\nThis is a high level view of how to use step; to see the nitty-gritty have a look at the [step-hello-world](https://github.com/coinbase/step-hello-world) repo, or the [step](https://github.com/coinbase/step) code.\nDeploying Deployers With Deployers Once a state machine has been built, step provides a way to deploy it to AWS. step-deployer is a Step Function that can deploy Step Functions. This makes it a “recursive deployer” because it can deploy itself. 
step-deployer\u0026rsquo;s state machine looks like:\nstep-deployer state machine\nThe core states of the step-deployer are:\nValidate: validate the sent release bundle Lock: grab a lock on the deployed project ValidateResources: ensure the resources exist and are correct for this project Deploy: update the Step Function and lambda, then release the lock ReleaseLockFailure: try to release the lock and fail The end states are:\nSuccess: everything deployed correctly FailureClean: something went wrong but recovered to a good state FailureDirty: something went wrong and left in a bad state. To deploy using step we can use step as a command line tool: step deploy -lambda \u0026lt;lambda name\u0026gt; \\ -step \u0026lt;step-fn-name\u0026gt; \\ -states \u0026lt;state-machine-json\u0026gt;\nFor example, to deploy the step-deployer we:\ngo build . # Build\u0026amp;Install step in your operating system\ngo install\n# Build step for linux lambda\nGOOS=linux go build -o lambda\nzip lambda.zip lambda\nstep deploy -lambda \u0026quot;coinbase-step-deployer\u0026quot; -step \u0026quot;coinbase-step-deployer\u0026quot; -states \u0026quot;$(step json)\u0026quot;\nThis will execute the step-deployer Step Function to deploy the step-deployer function. That is the fun part of building recursive deployers.\nSlow and Fast Code\nWriting slow code requires putting more emphasis on different code qualities. Those qualities fit well with using a state machine, like an AWS Step Function. step is a library used to make developing, testing, and deploying Step Functions easier. 
Also, homemade bread is delicious and you should really try the above recipe.\n","permalink":"https://maori.geek.nz/posts/2018/2018-04-12_slowly-baking-bread-with-aws-step-functions/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2018/2018-04-12_slowly-baking-bread-with-aws-step-functions/images/1.jpeg#layoutTextWidth\"\u003e\nAWS Bread, recipe below\u003c/p\u003e\n\u003cp\u003eAs developers we are pretty good at writing \u003cstrong\u003efast code\u003c/strong\u003e because we put a lot of emphasis on that skill (especially in job interviews). Where we have a little more trouble is writing \u003cstrong\u003eslow code,\u003c/strong\u003e processes that don’t take milliseconds or seconds to run (e.g. a web request), but take minutes, hours, or days (e.g. data backup and migration) to complete.\u003c/p\u003e\n\u003cp\u003eAs a process takes longer to complete, some qualities become much more important:\u003c/p\u003e","title":"Slowly Baking Bread with AWS Step Functions"},{"content":"Building security into your company culture is necessary but challenging. The first line of defense against most attacks are aware and vigilant people with an attitude of “see something, say something”.\nattr. Tony Webster\nTraining employees on what is abnormal behavior and who to talk to when there is a potential problem can save your company from a lot of pain. For example, alerting security when a co-worker makes an unusual request because one of their accounts has been compromised by a hacker.\nLock your Laptops One way Coinbase has improved the awareness of security in our organization is gamifying locking laptops using PwnBot. Unlocked and unattended laptops are open targets to be compromised. Anyone can gain access, install malicious software, copy credentials and other sensitive information, or just change a background image. 
If the attacker is prepared, they could do all of the above in seconds using a small programmable USB stick, like a MalDuino, to automate their actions.\nPwnBot is a Slack bot that you call on someone else’s unlocked computer with /pwn @\u0026lt;your_name\u0026gt; awarding a point to the “pwner” and recording the “pwnie”.\nShane pwning Jenson\nEveryone at the company can check the score board with /pwn to see who the most vigilant and careless employees are.\nShane checking the score board\nThis game is unreasonably fun and good at encouraging people to lock their computers. After releasing PwnBot at Coinbase, the game was taken way too seriously and finding an unlocked computer immediately became difficult. I have seen people run across the entire office to lock their computer before someone notices.\nNew employees are introduced to PwnBot along with the other security tools and processes at Coinbase, and if they were not paying attention to the security training then they will get pwned very quickly. Security culture is a part of our company from day one because new employees are on the same front line with everyone else.\nLinks You can install PwnBot to your Slack team with:\nOr you can use the open-source PwnBot code to deploy your own bot.\nListen to Philip Martin (Coinbase head of security) discussing security @ Coinbase on Software Engineering Daily podcast\nGraham Jenson’s talk about Coinbase and Security without Friction @ KiwiRuby\n","permalink":"https://maori.geek.nz/posts/2017/2017-12-28_gamifying-security-culture-with-pwnbot/","summary":"\u003cp\u003eBuilding security into your company culture is necessary but challenging. 
The first line of defense against most attacks are aware and vigilant people with an attitude of “see something, say something”.\u003c/p\u003e\n\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2017/2017-12-28_gamifying-security-culture-with-pwnbot/images/1.jpeg#layoutTextWidth\"\u003e\n\u003cem\u003eattr.\u003c/em\u003e \u003ca href=\"https://commons.wikimedia.org/wiki/File:If_you_see_something,_say_something_-_Grand_Central_Terminal_NYC_%2823185180069%29.jpg\"\u003e\u003cem\u003eTony Webster\u003c/em\u003e\u003c/a\u003e\u003c/p\u003e\n\u003cp\u003eTraining employees on what is abnormal behavior and who to talk to when there is a potential problem can save your company from a lot of pain. For \u003ca href=\"https://blog.coinbase.com/on-phone-numbers-and-identity-423db8577e58\"\u003eexample\u003c/a\u003e, alerting security when a co-worker makes an unusual request because one of their accounts has been compromised by a hacker.\u003c/p\u003e","title":"Gamifying Security Culture with PwnBot"},{"content":"At Coinbase we use a GitHub-flow-ish workflow to collaborate on code and develop features:\nCreate a branch Create/Edit/Delete/Rename files on branch Create a Pull Request (PR) merging the branch into master Get code reviewed and make requested changes Merge the PR However, we are still having discussions around the final step. How should we merge pull requests?\nThe options are:\nMerge the pull request, git merge Manually squash the branch with git rebase or git reset, force push, then merge Use the “Squash and Merge” function in GitHub, basically git merge --squash branch The core differences between these methods are how much friction each adds for developers, and what historical information is left behind.\nHow much friction does each method add to the developer’s workflow? 
No method should make a developer’s job unreasonably hard or annoying, otherwise people will find workarounds or just not follow any standards.\nUsing git blame, git show, and other code archaeology tools to see the context of a change can help a developer understand why a bug exists and how to fix it. The historic record of what happened is stored in both git and GitHub. What crumbs of information are left, where they exist, and how accurate they are can make the difference between an hour’s work and a week’s.\nExperimental Repository Below is a bash function that creates a repository to experiment with:\ncreate_example_repository() {\n  # To ensure the same commit shas on all repositories\n  export DATE=\u0026quot;2017-01-01T01:01:01\u0026quot;\n  export GIT_COMMITTER_DATE=$DATE\n  # Initializing the repository with a README\n  git init\n  echo \u0026quot;# Repository\u0026quot; \u0026gt; README.md\n  echo \u0026quot;This is an example repository\\n\u0026quot; \u0026gt;\u0026gt; README.md\n  git add README.md\n  git commit -m 'first commit' --date=$DATE\n  # Creating an example branch or PR\n  git checkout -b new-branch\n  echo \u0026quot;## Branch\u0026quot; \u0026gt;\u0026gt; README.md\n  git add README.md\n  git commit -m 'branch' --date=$DATE\n  # This branch has multiple commits\n  echo \u0026quot;this is an example branch\u0026quot; \u0026gt;\u0026gt; README.md\n  git add README.md\n  git commit -m 'fix tests' --date=$DATE\n  git checkout master\n  # To simulate distributed work, add another commit to master\n  echo \u0026quot;# Repository\u0026quot; \u0026gt; README.md\n  echo \u0026quot;This is an example repository\u0026quot; \u0026gt;\u0026gt; README.md\n  echo \u0026quot;That many people work on\\n\u0026quot; \u0026gt;\u0026gt; README.md\n  git add README.md\n  git commit -m 'commit' --date=$DATE\n}\nThis will repeatedly create the exact same repository, including the commit SHAs (they will differ per person because of the committer email).\nWith 
git log --graph --oneline --all this repository shows:\nThe example repository’s git log\nMerge\ncreate_example_repository\ngit merge --no-commit new-branch\ngit commit -m 'typical merging' --date=$DATE\nA simple merge of the example repository\nThis method leaves in the git history a record of exactly what happened. This is not useful though, as every mistake or misstep in the merged branch remains. This in turn makes git blame and git show less useful.\nHowever, this is the easiest method to use. GitHub directly supports merging, and it requires no extra work, so most projects start out using this method. As the codebase grows and more people work on it, running git blame and seeing fix tests on half the lines will become annoying.\nManual Squash and Merge\ncreate_example_repository\ngit checkout new-branch\ngit reset $(git merge-base master new-branch)\ngit add README.md\ngit commit -m 'squashed' --date=$DATE\ngit checkout master\ngit merge --no-commit new-branch\ngit commit -m 'squashed merge' --date=$DATE\nManual squash and merge of the example repository\nThis is the most difficult method to use as GitHub doesn’t directly support squashing a branch, so you have to become comfortable with git rebase and/or git reset. Also, force pushing the branch to GitHub can wipe out useful information like comments, making it difficult to look back and see why decisions were made.\nThe git history is very clean though, and git blame and git show give accurate and concise branch and merge information.\nThis method is preferred by people who are comfortable with git and want a clean history. To reduce friction you have to use scripts, and this can make a difficult situation even more confusing. 
If we want non-technical teams (like design, legal, and compliance) to contribute to our workflows, having such a technical strategy can add difficulty.\nSquash and Merge\ncreate_example_repository\ngit merge --squash new-branch\ngit commit -m 'squashing done' --date=$DATE\nSquash and Merge of the example repository\nIn the [git merge](https://git-scm.com/docs/git-merge) man page it says that --squash creates a state “as if a real merge happened (except for the merge information)”. This is because it does not squash the branch, and it also doesn’t create a merge commit. The “Squash and Merge” function from GitHub replicates this, which is why it doesn’t squash or merge.\nNot having all the information makes the git history difficult to read, and it is difficult to find where a commit came from. Information like when code was written (not merged) and who wrote the code (not who merged the PR) is impossible to find without also going to GitHub. This can make tools designed for git difficult to integrate, as they must also retrieve information from GitHub.\nGitHub information, like the pull request title and number, is added to the commit message. Also, the git history is super clean: if this is the only method used, it looks like there are no branches. So it is very easy to use and creates a super clean (if inaccurate) history.\nSummary Merge: History messy but accurate. Very Easy. Manual Squash then Merge: History is clean. Very Difficult. Squash and Merge: Clean but inaccurate history, fragmented between git and GitHub. Easy. My personal bias is towards the simple merge, because I am comfortable with git history dumpster diving for information. But I typically work on projects with few developers, and I am too lazy to manually squash every branch.\nOn larger projects, where we are trying to get buy-in from other teams, I would lean towards squash-and-merge as some inaccuracy is a good exchange for ease of use and clean history.\nHow do you merge? 
","permalink":"https://maori.geek.nz/posts/2017/2017-12-26_githubs-squash-and-merge-doesnt-squash-and-doesnt-merge-tradeoffs-with-merging/","summary":"\u003cp\u003eAt Coinbase we use a \u003ca href=\"https://help.github.com/articles/github-flow/\"\u003eGitHub-flow\u003c/a\u003e-ish workflow to collaborate on code and develop features:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eCreate a branch\u003c/li\u003e\n\u003cli\u003eCreate/Edit/Delete/Rename files on branch\u003c/li\u003e\n\u003cli\u003eCreate a Pull Request (PR) merging the branch into \u003ccode\u003emaster\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003eGet code reviewed and make requested changes\u003c/li\u003e\n\u003cli\u003eMerge the PR\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eHowever, we are still having discussions around the final step. \u003cstrong\u003eHow should we merge pull requests\u003c/strong\u003e?\u003c/p\u003e\n\u003cp\u003eThe options are:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eMerge the pull request, \u003ccode\u003egit merge\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003eManually squash the branch with \u003ccode\u003egit rebase\u003c/code\u003e or \u003ccode\u003egit reset\u003c/code\u003e , force push, then merge\u003c/li\u003e\n\u003cli\u003eUse the “Squash and Merge” function in GitHub, basically \u003ccode\u003egit --squash merge branch\u003c/code\u003e\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThe core differences between these methods is how much \u003cstrong\u003efriction\u003c/strong\u003e it is for developers to use, and what \u003cstrong\u003ehistorical\u003c/strong\u003e information is left behind.\u003c/p\u003e","title":"GitHub’s “Squash and Merge” doesn’t Squash and doesn’t Merge! Trade-offs with Merging"},{"content":" Bastion at Castel Sant’Angelo\nYou need more than one AWS account. 
This is to isolate production resources, manage limits (especially API rate limiting), handle costs, simplify compliance and security concerns, and restrict user access.\nHowever, managing multiple AWS accounts can be difficult. To help with this AWS has built many useful features like organizations, consolidated billing, VPC peering, and descriptive IAM policies. Using these features effectively to structure and connect AWS accounts, a variety of patterns have been developed. One such pattern used by Coinbase is the AWS Bastion account.\nAWS Bastion A bastion account stores only IAM resources, providing a central, isolated account. Users in the bastion account can access the resources in other accounts by assuming IAM roles into those accounts. These roles are set up to trust the bastion account to manage who is allowed to assume them and under what conditions they can be assumed, e.g. using temporary credentials with MFA.\nHere is a basic example of how to set up a bastion account with the id 098765432109 and a “production” account with the id 123456789012.\nBastion account users assume read role into production and development accounts\nIn the production account create a role called read with the trust relationship: { \u0026quot;Statement\u0026quot;: [ { \u0026quot;Effect\u0026quot;: \u0026quot;Allow\u0026quot;, \u0026quot;Principal\u0026quot;: { \u0026quot;AWS\u0026quot;: \u0026quot;arn:aws:iam::098765432109:root\u0026quot; }, \u0026quot;Action\u0026quot;: \u0026quot;sts:AssumeRole\u0026quot;, \u0026quot;Condition\u0026quot;: { \u0026quot;Bool\u0026quot;: { \u0026quot;aws:SecureTransport\u0026quot;: \u0026quot;true\u0026quot;, \u0026quot;aws:MultiFactorAuthPresent\u0026quot;: \u0026quot;true\u0026quot; }, \u0026quot;NumericLessThan\u0026quot;: { \u0026quot;aws:MultiFactorAuthAge\u0026quot;: \u0026quot;43200\u0026quot; } } } ] }\nThe conditions aws:MultiFactorAuthPresent and aws:MultiFactorAuthAge force the use of a recent (within 12 hours) MFA token. 
This policy can also include other conditions like IpAddress to limit the IP addresses the role can be assumed from.\nIn the bastion account, create an IAM group called assume-read with the policy: { \u0026quot;Statement\u0026quot;: [ { \u0026quot;Effect\u0026quot;: \u0026quot;Allow\u0026quot;, \u0026quot;Action\u0026quot;: [ \u0026quot;sts:AssumeRole\u0026quot; ], \u0026quot;Resource\u0026quot;: [ \u0026quot;arn:aws:iam::123456789012:role/read\u0026quot; ], \u0026quot;Condition\u0026quot;: { \u0026quot;Bool\u0026quot;: { \u0026quot;aws:MultiFactorAuthPresent\u0026quot;: \u0026quot;true\u0026quot;, \u0026quot;aws:SecureTransport\u0026quot;: \u0026quot;true\u0026quot; }, \u0026quot;NumericLessThan\u0026quot;: { \u0026quot;aws:MultiFactorAuthAge\u0026quot;: \u0026quot;43200\u0026quot; } } } ] }\nBastion users that belong to the assume-read group can now assume the read role into the production account. You can test this by selecting “Switch Role” in the AWS management console, as shown below:\nassume-role Switching roles is pretty easy in the AWS management console, but the command line is a different story. Assuming a role with the AWS CLI requires a few steps. First, an API call with the MFA token is made to the bastion account to create temporary credentials. Then, an API call using the new credentials is made to the other account to assume the role. This can get tedious quickly as it has to be redone every time the temporary credentials expire.\nAt Coinbase we try to use the CLI as much as possible because it makes tasks easier to automate across our many AWS accounts. To simplify assuming roles through our bastion account using temporary credentials and MFA, we built and open-sourced the [assume-role](https://github.com/coinbase/assume-role) tool. 
assume-role can be used like this:\nThis GIF uses a prompt to make sure you are on the right account in the correct role:\n# AWS ACCOUNT NAME\nfunction aws_account_info {\n  [ \u0026quot;$AWS_ACCOUNT_NAME\u0026quot; ] \u0026amp;\u0026amp; [ \u0026quot;$AWS_ACCOUNT_ROLE\u0026quot; ] \u0026amp;\u0026amp; echo \u0026quot;%F{blue}aws:(%f%F{red}$AWS_ACCOUNT_NAME:$AWS_ACCOUNT_ROLE%f%F{blue})%F$reset_color\u0026quot;\n}\n# )ofni_tnuocca_swa($ is $(aws_account_info) backwards\nPROMPT=`echo $PROMPT | rev | sed 's/ / )ofni_tnuocca_swa($ /' | rev`\nassume-role can be installed with brew:\nbrew tap coinbase/assume-role\nbrew install assume-role\nor curl:\ncurl https://raw.githubusercontent.com/coinbase/assume-role/master/install-assume-role | bash\nPull requests and issues that make assume-role better and more secure are always welcome.\nSummary\nAWS is moving in a direction where you must have multiple accounts and this is increasing the surface area for security issues as access to them must be managed. Using an AWS bastion account to manage access and forcing users to have temporary credentials with MFA is recommended but can be difficult to use. 
Tools like [assume-role](https://github.com/coinbase/assume-role) can make it easier to use the bastion while being secure.\nThanks to Jack, Shane, Luke, Yuliya, Rob for help in crafting this post.\nLinks [assume-role](https://github.com/coinbase/assume-role) Github repository More on Bastion Accounts AWS support on Authenticating with MFA on the CLI Python and Bash assume role scripts with profiles Ruby Gem assuming roles AWS CLI assume role issues ","permalink":"https://maori.geek.nz/posts/2017/2017-10-17_you-need-more-than-one-aws-account-aws-bastions-and-assumerole/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2017/2017-10-17_you-need-more-than-one-aws-account-aws-bastions-and-assumerole/images/1.jpeg#layoutTextWidth\"\u003e\nBastion at Castel Sant’Angelo\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eYou need more than one AWS account.\u003c/strong\u003e This is to isolate production resources, manage limits (especially API rate limiting), handle costs, simplify compliance and \u003ca href=\"https://aws.amazon.com/answers/account-management/aws-multi-account-security-strategy/\"\u003esecurity\u003c/a\u003e concerns, and restrict user access.\u003c/p\u003e\n\u003cp\u003eHowever, managing multiple AWS accounts can be difficult. To help with this AWS has built many useful features like \u003ca href=\"https://aws.amazon.com/organizations/\"\u003eorganizations\u003c/a\u003e, \u003ca href=\"https://aws.amazon.com/answers/account-management/aws-multi-account-billing-strategy/\"\u003econsolidated billing\u003c/a\u003e, \u003ca href=\"http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-peering.html\"\u003eVPC peering\u003c/a\u003e, and \u003ca href=\"http://docs.aws.amazon.com/IAM/latest/UserGuide/tutorial_cross-account-with-roles.html\"\u003edescriptive IAM policies\u003c/a\u003e. Using these features effectively to structure and connect AWS accounts, a variety of patterns have been developed. 
One such pattern used by Coinbase is the \u003cstrong\u003eAWS Bastion\u003c/strong\u003e account.\u003c/p\u003e","title":"You need more than one AWS account: AWS bastions and assume-role"},{"content":" I am pretty sure that I am human. I am also pretty sure that my wife, my friends, my colleagues, and other people I have met in real life are also human.\nI am less sure about disembodied entities on the internet, like you. If you are human, then welcome to this post and I would like to hear what you think about it in the comments. Before you post anything though, I need you to prove to me that you are human because I don’t want to read comments posted by evil-spammy-bots.\nWhy? I know asking you to prove that you are human is weird, but it is necessary. Without some kind of proof that you are human, every service on the internet built for human communication would quickly be taken over by bots. This is because bots make hundreds of millions of dollars per year through advertisements, scams, and malicious content (e.g. viruses) that annoy/harm humans.\n#spambot\nEmail is a good example of a service where bots send the majority of all messages as it doesn’t require any proof that the sender is human. Bots do this so well because they are very fast at posting messages and this makes them cheap to run. Compared to bots, we humans are very slow. Our time and attention are an expensive resource. Forcing a bot to be as slow as a human is the reason for the proof. A bot (like this one) sending 100 million emails a day would take just over 3 years to send the same number of emails if it took 1 second per email to prove it was human.\nI want you to prove to me that you are human so I know it costs you about as much time to post a comment as it takes me to read it.\nHow? It is too difficult and expensive to prove that you are human directly to me. 
Instead, I am going to trust Medium to do it, and Medium in turn trusts Google, Facebook, or Twitter (if you logged in with one of these services). This trust is necessary because the difficulty and scale of proving you are human are so huge that the task has fallen to a small group of large companies. These companies are trusted by an increasing number of services to stop spam on the internet. So how are these companies solving this difficult task?\nThey used to just have you write down letters from an image with distorted characters.\nn752Grr is proof I am human\nThis was used because humans have far superior visual abilities to bots. At least we used to. Since around 2010 improvements to image reading algorithms have helped bots solve these kinds of problems as fast as humans. Add to that the decrease in outsourced labour costs (to about $0.35 to read 1000 images) and ingenious workarounds (like tricking people into reading images to view streaming content), and reading distorted images can no longer prove you are human.\nThese days companies like Facebook, Google, and Twitter have different ways to decide your humanity.\nFacebook has the strictest policy, which states you have to be a real human and use your real name.\nFacebook has “developed sophisticated systems to help block automated programs (or “bots”)”. However, these systems sometimes discriminate against other cultures and the disenfranchised seeking anonymity from authoritarian regimes.\nGoogle uses its reCAPTCHA service (that is free for everyone) which asks you to click a button or group some images.\nBot solving reCAPTCHA\nThis service tracks your mouse movements, IP address, cookies, and other information to try to determine if you are human. Google says “You don’t have to verify your identity, to verify your humanity.”, but the system uses IP addresses that could identify you via other Google services e.g. 
Gmail, YouTube, AdWords…\nTwitter allows bots as long as they behave like good humans and don’t spam other users with unwanted messages. This policy has meant that now 9% to 15% of all Twitter users are bots, many of which are bad like @sunneversets100, an amplifier-bot of pro-Kremlin messages that in a 288 day period posted over 200k tweets (research borrowed from here). This bad-bot tweeted 5 times every 10 minutes for 288 days which is about 140 times more than an average user tweets.\nTypical Twitter user\nTwitter says it has programs to delete bots, but online tools that try to detect bots on Twitter show little decrease, e.g. this tool estimates that 55% (20 million) of Donald Trump’s followers are bots and 18% (16 million) of Barack Obama’s followers are too.\nToday, we have companies we are entrusting with securing our ever growing communication over the internet. They are trying to prove we are human using “sophisticated programs” whose inner workings are hidden from me and you. These programs often require personal information, can sometimes discriminate against groups of people, and have dubious success rates. Well, now I am less sure you are human. Especially if you logged in with Twitter.\nFuture The rise in propaganda, fake news, ransomware, and spam spread by bots is increasing the need for a good “proof of human”. The companies that we trust to determine our humanity are fighting an uphill battle as the constant improvement of AI is making bots closer and closer to human. Soon we might live in a dystopian future where to participate in human communication you will be required to prove, by unknown methods, that you are human to a few large multi-national corporations. Wait, are we there already?\nAre you human?\nProof of Work replacing Proof of Human If we want to stop our dystopian present where a few companies control proof of human for most of the internet, we can look at how similar problems have been solved by decentralized systems, e.g. 
distributed ledgers like Bitcoin. These systems are networks constructed of mutually distrusting parties that must work together towards a similar goal. To stop evil spammy bots from consuming their networks, these distributed ledgers have to employ multiple strategies. One such strategy is proof of work.\nA game is a “voluntary attempt to overcome unnecessary obstacles”\nBernard Suits\nProof of work is a game we force computers to play to slow them down. Bitcoin’s proof is to find a string that, when added to a message, makes the SHA-256 hash of the combined string start with a certain number of 0s. For example, (from the Bitcoin wiki) if you wanted to find a proof of work for “Hello, world!” and the target was to find four 0’s, we could start by adding integers to the end of the string like:\n“Hello, world!0” =\u0026gt; 1312af178c253...\n“Hello, world!1” =\u0026gt; e9afc424b79e4...\n“Hello, world!2” =\u0026gt; ae37343a357a8...\n…\n“Hello, world!4248” =\u0026gt; 6e110d98b388e...\n“Hello, world!4249” =\u0026gt; c004190b822f1...\n“Hello, world!4250” =\u0026gt; 0000c3af42fc3...\nThis would take 4251 tries (the work) to find the answer (the proof). Finding this proof took time but everyone can easily validate it. This game also comes with a difficulty setting where we can make it harder for computers to play by asking them to find more 0’s.\nInstead of asking humans to perform a slow task like reading an image, we could ask computers to play our proof of work game. This would slow bots down to a more human speed, increasing their cost. Moreover, the difficulty setting becomes an adjustable lever we can use as computers become faster, or as bots launch focused attacks (like during an election). Another benefit is that, unlike proof of human, AI research and human labour costs don’t affect proof of work’s efficacy.\nThe costs to find a proof are distributed across the network’s users based on how much they use/abuse it. 
As a human, you might pay only half a second more to post a comment, barely noticeable to you. Whereas a bot trying to post the same comment a million times would be waiting for nearly six days.\nProof of work has a few downsides though: it can lead to bad UX for users as they wait for their computers to find a proof, and it also wastes power as the computer plays its game. Both of these can be mitigated by only requiring proofs from users that act like bots, or changing the difficulty for different actions, e.g. having a high difficulty for signup and a low difficulty for commenting.\nUsing proof of work instead of proof of human to secure communication is not perfect. Selecting the right proof and building a system that punishes bots but not humans is a difficult challenge. I think proof of work is ultimately better than proof of human because its workings can be understood rather than hidden, its adjustable difficulty allows for finer control, and it can be implemented without needing to trust a few large companies.\nSecuring a Blockchain with Proof of Human Let’s invert the above scenario: could we build a blockchain that uses proof of human instead of proof of work?\nThis entire post came out of a joke where I wanted to try to build a blockchain that used the distorted images proof (or another proof of human) instead of proof of work. After some thought, I am not sure it is possible and here is why.\nAll proof of human methods (I have seen) require a trusted central party to generate the problem and keep some piece of information secret, which is then used to validate the answer. 
It is not decentralized to have a trusted party.\nFor a decentralized proof of human method to secure a blockchain it should satisfy the following constraints:\n(1) It should be easier for a human to solve than a computer; a computer can solve it, it just has to cost them more than a human to solve\n(2) It should be easy for any party’s computer to validate\n(3) It should have adjustable difficulty; as computers get better at being human, we real humans can increase the difficulty\n(4) The answer to a problem cannot be known by anyone in advance; otherwise someone would have an advantage\nReading distorted images has 2/4 of these properties, as whoever generated the distorted image knows the answer (4) and is the only party who can validate it (2).\nWeakening some of the above constraints might work. For example, we could have humans validate the answer using crowd computing, with a problem like “find a unique picture of a fluffy cat”. This problem satisfies (1), (3), and (4), until computers become better judges of “fluffy”.\nA fluffy cat proving I am human\n“Proof of Human via Fluffy Cat” is a distributed reverse Turing test where the answers are validated by having many humans “vote” on the picture that is most likely taken by a human. The winner after a time limit is then selected to be human. The two problems with this solution are that it assumes humans are good at recognizing other humans, and that making sure only humans “vote” requires another proof of human. Ultimately, this would just become a decentralized reddit where users are rewarded for looking at cat pictures all day.\nBesides the technical aspects of this idea, another concern is the morality of having a blockchain powered by proof of human. The problem is that this could create a dystopian future where people are paid to repeatedly prove that they are human in blockchain mines. 
It is concerning that we are nearly there as right now people are earning $0.35 to prove they are human 1000 times by reading distorted images.\nPost script Most of the ideas in this post have already been explored by multiple other sources: proof of human is just another name for CAPTCHA, proof of work to protect APIs has been around since the 90s and is actively used in implementations like Hashcash. I think using “proof of human” to secure a blockchain might be an original idea, even if it is a bad one. I am still interested to find out if it is possible. So, if you have any ideas about how to use a “proof of human” to implement a blockchain, leave a comment below… but only if you are human.\nFurther Reading Twelve ways to spot a bot\nAl Jazeera blocking comments, now relying on Facebook and Twitter\nMachines learning to flag toxic comments\nThe Economics of Spam (2012)\nWhy can’t Twitter kill bots\nSymantec 2017 Security Report\nLife 3.0 equates the size of a consciousness to the speed at which it can make decisions. That is, being slow might be a necessary part of being conscious and human.\n","permalink":"https://maori.geek.nz/posts/2017/2017-09-13_proof-of-human/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2017/2017-09-13_proof-of-human/images/1.jpeg#layoutTextWidth\"\u003e\nI am pretty sure that I am human. I am also pretty sure that my wife, my friends, my colleagues, and other people I have met in real life are also human.\u003c/p\u003e\n\u003cp\u003eI am less sure about disembodied entities on the internet, like \u003cstrong\u003eyou\u003c/strong\u003e. If you are human, then welcome to this post and I would like to hear what you think about it in the comments. 
Before you post anything though, I need you to prove to me that you are human because I don’t want to read comments posted by evil-spammy-bots.\u003c/p\u003e","title":"Proof of Human"},{"content":"A Domain Specific Language (DSL) is a specialized way to clearly describe a problem domain. Ruby is a great language for creating DSLs because it lets developers decide how the language looks and is used. For example:\nRSpec for writing tests:\nFactoryGirl for mocking objects:\nGeoEngineer for defining cloud resources:\nThis post will briefly describe how to use Ruby to create DSLs like the examples above.\nBuilding Blocks Ruby DSLs typically use functions that take a block given as an argument to build an instance of a domain object:\nCalling obj.instance_exec(\u0026amp;block) means that self in the block is equal to obj. In this example that means self is the instance of the newly created turtle.\nWe can build more of the problem domain by adding methods to the base object:\nThis is similar to the way that RSpec creates its describe and it methods. The DSL then builds a tree of domain objects, and once defined it can be used to solve the domain’s problem.\nRemoving Self An annoying “gotcha” in the above example is that to assign values you must call self. This is because calling name = (rather than self.name =) creates a local variable called name scoped to the block instead of assigning to the self object. self is also really ugly, given the goal of a Ruby DSL is to hide the \u0026ldquo;cruft\u0026rdquo; that is not part of the domain.\nTo remove self we must use methods rather than =:\nThis is starting to look a bit nicer.\nThe Infinite DSL Some DSLs like FactoryGirl and GeoEngineer have user defined properties that cannot be defined ahead of time. Doing this dynamically requires overriding the method_missing function:\nThis is the general case of assigning variables so that now any variable can be assigned. 
This has one significant downside: the turtle will now never raise a NoMethodError, so be careful.\nLazy Evaluation In many DSLs there are attributes that are expensive to calculate and not always needed. In GeoEngineer attributes can be lazily evaluated and then cached, so they are only ever executed once. This is done by assigning the attribute as a Proc and executing that Proc only when it is being retrieved:\nTurtles All the Way Down Some DSLs need to be recursive, like a family tree of turtles. This can be accomplished by also handling blocks in method_missing to then create a domain object:\nThis DSL is now capable of building complex structures of turtles, to solve the many philosophical questions they pose.\nThe End Language affects the way someone thinks about a problem domain and how they search for a solution. The closer the language you use is to the problem, the fewer hurdles you have to leap to find a solution. Building a DSL in Ruby that can represent a problem domain can help you quickly define, describe and solve problems. I like Turtles.\n","permalink":"https://maori.geek.nz/posts/2017/2017-02-17_turtles-all-the-way-down-building-simple-and-powerful-ruby-dsls/","summary":"\u003cp\u003eA Domain Specific Language (DSL) is a specialized way to clearly describe a problem domain. Ruby is a great language for creating DSLs because it lets developers decide how the language looks and is used. 
For example:\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"https://github.com/rspec/rspec\"\u003e\u003cstrong\u003eRSpec\u003c/strong\u003e\u003c/a\u003e for writing tests:\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"https://github.com/thoughtbot/factory_girl\"\u003e\u003cstrong\u003eFactoryGirl\u003c/strong\u003e\u003c/a\u003e for mocking objects:\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"https://github.com/coinbase/geoengineer\"\u003e\u003cstrong\u003eGeoEngineer\u003c/strong\u003e\u003c/a\u003e for defining cloud resources:\u003c/p\u003e\n\u003cp\u003eThis post will briefly describe how to use Ruby to create DSLs like the examples above.\u003c/p\u003e\n\u003ch3 id=\"building-blocks\"\u003eBuilding Blocks\u003c/h3\u003e\n\u003cp\u003eRuby DSLs typically use functions that take a block given as an argument to build an instance of an domain object:\u003c/p\u003e","title":"Turtles All The Way Down: Building Simple and Powerful Ruby DSLs"},{"content":" Zion National Park’s amazing geology\nThe goal of the infrastructure team at Coinbase is to provide self-service tooling to our engineers to empower them to rapidly develop, monitor, and optimize services with low risk. With this mission in mind, we are currently in the process of building a workflow for creating and managing our codified infrastructure resources that looks like:\nPull Request: an engineer submits a pull request to a repository with a new codified resource they want. Validation: the new resource is automatically validated and follows our company standards for naming, tagging, and security. Plan and Review: a plan describing the actions needed to be taken to apply a change is presented alongside the code change to be reviewed by an infrastructure team member. Merge then Apply: if the plan is good, then the pull request can be merged and automatically applied to the cloud. This workflow manages our codified infrastructure the same way we manage our code with GitHub flow; i.e. 
open a pull request, ensure the change is valid with tests, merge the change into the master branch, then apply the changes to the necessary environments. The main idea of this workflow is to improve collaboration between the infrastructure and engineering teams. This will also speed up the development and deployment of resources, and make sure we deliver what is actually needed.\nTo decrease the learning curve, we want to standardize on a single tool to codify our resources. In the past, we have used a mixture of tools like CloudFormation, our open source tool Demeter, and Terraform. After looking at a number of tools, we found Terraform provides the most features our workflow requires: description of existing resources, easy definition and planning, support for a variety of resources. However, using Terraform was difficult for a variety of reasons: lack of custom validations, coarse abstractions creating a lot of copy/paste code, and difficulties managing state.\nTo reuse as much of Terraform’s functionality as possible, we decided to build a thin wrapper around it that fits our desired workflow better. The tool we built is GeoEngineer (Geo for short): it provides a Ruby DSL (similar to Terraform’s) to codify resources, and a command line tool geo to plan and execute changes. This post describes how we use Geo at Coinbase to support this workflow that treats our infrastructure resources like code.\nPull Request The most difficult requirement for implementing our workflow is that any engineer at Coinbase should be able to submit a pull request that codifies a new resource or changes an already codified resource. This means the workflow requires a short learning curve for engineers who might not be familiar with the details of AWS or Terraform. GeoEngineer provides a familiar programming environment (a Ruby DSL) which has branching, functions, and variables, allowing engineers to abstract away details with reusable templates, helper functions, and projects. 
For example, we use templates to describe resources in patterns like our internal_elb template that codifies an Elastic Load Balancer (ELB) for internal use, a security group for the ELB, and a security group for EC2 instances attached to the ELB: project.from_template('internal_elb', 'api', { listeners: [{ in: 443, out: 8080 }] })\nThe DSL also supports helper functions to define smaller patterns inside resources, e.g. the function all_egress_everywhere creates a typical egress for a security group: def all_egress_everywhere egress { from_port 0 to_port 0 protocol '-1' cidr_blocks ['0.0.0.0/0'] } end``project.resource('aws_security_group', 'ec2_default') { all_egress_everywhere }\nYou may have noticed above that resources are defined on a project which Geo uses to group related resources together, e.g. a project definition: # ./projects/coinbase/foo.rb project = project('coinbase', 'foo') { environments 'staging', 'production' tags { ProjectName 'coinbase/foo' slack_channel 'foo' monitor 'true' } }\nAt Coinbase, we have one project per file and organize the files into organization folders (e.g. projects/\u0026lt;org\u0026gt;/\u0026lt;name\u0026gt;.rb) to make it easy to find where a resource should be codified. Projects can be applied to many environments; typically a project is developed in the ‘staging’ environment then applied to ‘production’ when it is ready. Optional project tags are applied to all sub resources to make identifying resources for accounting, alerting and debugging very easy.\nThe abstractions that GeoEngineer provides are mainly to shorten the learning curve, and they also have the benefit of removing large portions of copy and paste code. We are seeing about 80% fewer lines of Geo code vs. the generated Terraform.\nWithout abstraction, the Geo DSL is still very similar to Terraform, e.g. 
# Terraform Security Group resource \u0026quot;aws_security_group\u0026quot; \u0026quot;allow_all\u0026quot; { name = \u0026quot;allow_all\u0026quot; ingress { from_port = 0 to_port = 0 protocol = \u0026quot;-1\u0026quot; cidr_blocks = [\u0026quot;0.0.0.0/0\u0026quot;] } }``# GeoEngineer Security Group project.resource(\u0026quot;aws_security_group\u0026quot;, \u0026quot;allow_all\u0026quot;) { name \u0026quot;allow_all\u0026quot; ingress { from_port 0 to_port 0 protocol \u0026quot;-1\u0026quot; cidr_blocks [\u0026quot;0.0.0.0/0\u0026quot;] } }\nThis is so we can reuse Terraform’s great documentation with examples and easily keep up with its quickly increasing feature set.\nValidation Treating codified resources like code means we need tests to:\nEnsure the validity of the code and enforce standards, e.g. code style, security, and tagging. Provide helpful feedback if a proposed change does not satisfy some validations. Avoid any massive failure like accidentally deleting all resources, or as Google calls it “Automation: Enabling Failure at Scale”. GeoEngineer has many inbuilt validations, but it also allows custom validations to ensure that resources are correct for your particular organization. At Coinbase, security is our highest priority. As a result, we are constantly implementing security standards as Geo validations. However, each organization will have their own priorities and standards for resources. For example, at Coinbase we require all resources be tagged with the name of their project: class GeoEngineer::Resource validate -\u0026gt; { validate_has_tag(:ProjectName) if support_tags? } end\nIf a resource does not contain a ProjectName tag , then Geo will raise an error: $ geo plan ERROR: ProjectName attribute on subresource \u0026quot;tag\u0026quot; is nil for resource \u0026quot;aws_security_group.ec2_foo\u0026quot; Total Errors 1\nThe canonical AWS way of accomplishing a similar outcome would be with AWS Config. 
However, Geo will fail much earlier before any resources are created, and its validations are defined with much more control and less complexity.\nWe have also added validations on the geo CLI, e.g. we can ensure that geo apply is only ever run on the master branch: require 'git' class GeoEngineer::Environment before :apply, -\u0026gt; { g = Git.open('.') throw 'Not on master!' if g.lib.branch_current != 'master' } end\nGeoEngineer will run these validations on every plan and apply command. It will output the errors it finds and will not execute unless there are 0 errors. This is part of our team’s ‘low-risk’ mission, to provide a safety net to experiment, refactor, and learn with the knowledge that nothing bad can happen.\nPlan and Review To ensure changes in the codified resources accurately reflect what they will actually change, we built a tool called mars. It receives GitHub webhooks and returns the result of the corresponding geo plan as a comment on pull requests:\nGitHub comment made by our mars bot\nThis plan and the corresponding code changes are then reviewed using our consensus-based review system. When a developer adds a comment that indicates a positive review of the pull request, another bot called sauron will then allow this pull request to be merged:\nMerge then Apply We block merging to master using GitHub’s branch protection until the pull request has generated a plan, passed all validations, and been positively reviewed by an infrastructure team member.\nGitHub setting to make sure code changes are reviewed and valid\nThese checks are to ensure that there are no surprises when the changes are applied. If a merged pull request has mistakes or causes a failure, we go back to step 1 by creating a new pull request to fix the issue as well as adding a new validation to make sure the issue doesn’t occur in the future. 
Better validations improve our workflow and increase our trust merged code will not contain any errors.\nIn the future, we want to implement a service like mars that would automatically run geo apply on merges to the master branch and add the changes to the cloud. This is a significant step towards automation as it removes the last direct interaction between an engineer and the cloud. There are many challenges with automatically applying changes, including security of the workflow and ensuring the changes will not lead to any kind of significant failure. This service would help realize the ‘self-service’ mission of our infrastructure team.\nWhat’s Next? Treating our codified resources like code has opened up a number of possible future projects. One such project is to add semantics to GeoEngineer resources, e.g. security group and an Elastic Load Balancer understand what they are and their relation to one another. We hope these semantics can help provide better validations to ensure that resources behave as expected.\nAnother project would be improving Geo’s graph implementation. Currently, it takes the resources and abstractions (like project) and visualizes their relationships: (Anonymized) Graph of related projects and resources generated by geo graph\nAn improved graph would be useful for management and security by making it easier to see what is happening (or could happen) in the cloud.\nGeoEngineer’s plan command currently presents the plan directly from Terraform. This could be improved significantly, especially for resources like security groups and IAM policies where one small change can result in very large plans. Better presented plans could highlight the actual changes being requested and make it easier to review for the engineers.\nAnother goal of the team is contributing features from GeoEngineer back to Terraform. 
Ultimately we would like to use pure Terraform in this workflow, and Geo is trying to stay thin enough so that this could be accomplished easily in the future.\nThis workflow was built to treat our “infrastructure as code” with the same recommended coding practices of our other code. We hope that this will make it easier to develop and maintain our cloud in the long run.\nFinally, all contributions to GeoEngineer, like feature requests, bug reports, and testing, are always appreciated.\nResources GeoEngineer Source GeoEngineer Documentation Short video about GeoEngineer at AWS re:Invent https://www.youtube.com/watch?v=Pp12ElEgKGI ","permalink":"https://maori.geek.nz/posts/2017/2017-01-10_treating-infrastructure-like-code/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2017/2017-01-10_treating-infrastructure-like-code/images/1.jpeg#layoutTextWidth\"\u003e\nZion National Park’s amazing geology\u003c/p\u003e\n\u003cp\u003eThe goal of the infrastructure team at Coinbase is to provide self-service tooling to our engineers to empower them to rapidly develop, monitor, and optimize services with low risk. 
With this mission in mind, we are currently in the process of building a workflow for creating and managing our \u003ca href=\"https://en.wikipedia.org/wiki/Infrastructure_as_Code\"\u003ecodified infrastructure resources\u003c/a\u003e that looks like:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e\u003cstrong\u003ePull Request\u003c/strong\u003e: an engineer submits a pull request to a repository with a new codified resource they want.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eValidation\u003c/strong\u003e: the new resource is automatically validated and follows our company standards for naming, tagging, and security.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003ePlan and Review\u003c/strong\u003e: a plan describing the actions needed to be taken to apply a change is presented alongside the code change to be reviewed by an infrastructure team member.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eMerge then Apply\u003c/strong\u003e: if the plan is good, then the pull request can be merged and automatically applied to the cloud.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThis workflow manages our codified infrastructure the same way we manage our code with \u003ca href=\"https://guides.github.com/introduction/flow/\"\u003e\u003cem\u003eGitHub flow\u003c/em\u003e\u003c/a\u003e; i.e. open a pull request, ensure the change is valid with tests, merge the change into the master branch, then apply the changes to the necessary environments. The main idea of this workflow is to improve collaboration between the infrastructure and engineering teams. This will also speed up the development and deployment of resources, and make sure we deliver what is actually needed.\u003c/p\u003e","title":"Treating Infrastructure Like Code"},{"content":"While running or driving I enjoy listening to podcasts as they are a great way to passively learn and keep up with various technologies and trends, as well as be entertained. 
This post is a list of the podcasts I enjoy (or have enjoyed) listening to.\nProgramming DevChat podcasts Ruby Rogues: Ruby and programming focused panel podcast. I find the earlier episodes with the original crew were the best and worth going back for a listen.\nJavaScript Jabber: JavaScript and programming podcast.\n@thoughtbot podcasts Giant Robots smashing into other Giant Robots: Earlier episodes were focused on Ruby programming, now it is more focused on product development @thoughtbot.\nThe Bike Shed: Programming @thoughtbot podcast about technologies that they are interested in or using.\nOpen source podcasts The Changelog: About Open Source Software with a great segment on background and how to make open source sustainable\nDevOps After taking my first DevOps role, I wanted to find the best podcasts in the domain to catch up on community and technology. Here are a few that I have found and recently enjoyed listening to:\nArrested DevOps\nThe Cloudcast\nDevOps Cafe\nThe Ship Show\nAWS podcast\nTo Be Continuous: Podcast with the founder of CircleCI discussing development, teams, technology, product and all other aspects of running a company.\nThe Food Fight Show\nGeneral Technology InfoQ Podcast\nSoftware Defined Talk\nMisc This American Life: Moving, funny, insightful look at people, places, and ideas in America.\n99% Invisible: A look at the design of everything\nDan Carlin’s Hardcore History: history recounted from a “fan” of history and not a historian. 
The Mongol invasion, WW1 and longer themes like Germanic people are explored in an always enlightening and entertaining way.\nWelcome to Night Vale: A surreal, theatre of the absurd, creepy and ingenious story of a radio host in a desert town called Night Vale.\nHello Internet: A chat show between two of my favourite educational YouTubers, CGP Grey and Brady Haran.\nFinally If you have any podcasts that you think I should listen to, please comment :)\n","permalink":"https://maori.geek.nz/posts/2016/2016-11-26_podcasts/","summary":"\u003cp\u003eWhile running or driving I enjoy listening to podcasts as they are a great way to passively learn and keep up with various technologies and trends, as well as be entertained. This post is a list of the podcasts I enjoy (or have enjoyed) listening to.\u003c/p\u003e\n\u003ch4 id=\"programming\"\u003eProgramming\u003c/h4\u003e\n\u003ch4 id=\"devchat-podcasts\"\u003eDevChat podcasts\u003c/h4\u003e\n\u003cp\u003e\u003ca href=\"https://devchat.tv/ruby-rogues\"\u003e\u003cstrong\u003eRuby Rogues\u003c/strong\u003e\u003c/a\u003e: Ruby and programming focused panel podcast. I find the earlier episodes with the original crew were the best and worth going back for a listen.\u003c/p\u003e","title":"Podcasts"},{"content":"Alcoa Keystone Charles Duhigg in his book Power of Habit discussed how Paul O’Neil, the CEO of Alcoa (Aluminum Company of America), was able to increase his company’s value by 27 billion by focusing on a goal unrelated to the main company objectives, no workplace injuries. This is a keystone metric (habit), a broad goal for the entire company, not directly related to the company’s main goal but impactful and actionable. O’Neil chose safety as the single most important aspect because:\nMost organizations say “our human beings are the most important asset”, but in most places there is no proof that it is really true. 
It is just something you say […] Its Alcoa’s objective that people who work for Alcoa will never be hurt at work […] Safety is not a priority, it is a precondition! [cite]\n“We can’t afford it” and “it is too difficult, it will stop people from being able to do their work” were reasons people used as to why Alcoa could never reach this goal. Persisting, O’Neil began to have executive meetings on every workplace death of an employee, because he saw it as a personal responsibility to identify and fix anything that could lead to an employee being hurt. He also told his executive and management staff that it was their responsibility if an injury occurred under their watch. This led to everyone in Alcoa taking the goal seriously, which in turn opened up communication channels for any employee to quickly escalate safety issues to higher management.\nTrying to reach this keystone metric resulted in many benefits for the company; the injury rate dropped from 1.86 to 0.125 per year, the new communication channels were used for increasing productivity and visibility from the ground up to the executives, and bad managers were quickly identified and removed if they could not follow the safety procedures and guidelines. Everyone at the company was working towards the same goal, and succeeding.\nDevOps Keystone One of the reasons why I joined the DevOps team at Coinbase was the “30 day project”, a keystone metric to never have a server older than 30 days. I thought this was an ambitious (maybe impossible) goal, but I wanted to be part of the team with that kind of vision.\nThe age of the servers Coinbase runs on does not directly impact the performance of the company or team, and other teams might say “we can’t afford it” or “it is too difficult, and we would end up not doing more important work”. However, we have seen many benefits from trying to reach the 30 day goal:\nEverything is Redeployable: we have a process around redeploying and/or upgrading every server we have running. 
No Shadow Infrastructure: we are always looking for servers that are unknown and finding out what is their purpose, if they are necessary, and ensuring we understand how to manage and upgrade them. Reduce Code Rot: by rebuilding and redeploying services we ensure that all dependencies still resolve and are up to date. Revisiting Decisions: past decisions about architecture, storage, deployments are reevaluated to see if we can redo them with better outcomes. Reduce Security Response Time: if a security CVE, like OpenSSL’s heartbleed, is released we know with certainty that we can quickly upgrade and redeploy our entire infrastructure. Sharing Knowledge: the high frequency of deploys necessitates that multiple people will need to learn to redeploy a service. This means that the process will become more repeatable, less painful, safer, and better documented with every iteration. As a result of the hard work we put into the 30 day project we were able to organize an event called “Scorched Earth” where we rebuilt the base operating system (AMI) and every Docker container, then redeployed every server in our infrastructure in under 24 hours. We succeeded with 30 mins to spare, meaning that all of Coinbase was running on servers less than a day old! 
We did this with 0 downtime and 0 errors, while the rest of the organization worked without disruption.\nIn the future we will continue to maintain our 30 day keystone metric not because it is a priority, but because it is a precondition of our team.\n","permalink":"https://maori.geek.nz/posts/2016/2016-10-07_keystone-metrics-in-devops-the-30-day-project-coinbase/","summary":"\u003ch4 id=\"alcoa-keystone\"\u003eAlcoa Keystone\u003c/h4\u003e\n\u003cp\u003eCharles Duhigg in his book \u003ca href=\"http://amzn.to/2cXb0Oa\"\u003ePower of Habit\u003c/a\u003e discussed how \u003ca href=\"https://en.wikipedia.org/wiki/Paul_H._O%27Neill\"\u003ePaul O’Neil\u003c/a\u003e, the CEO of Alcoa (Aluminum Company of America), was able to increase his company’s value by 27 billion by focusing on a goal unrelated to the main company objectives, \u003cstrong\u003eno workplace injuries\u003c/strong\u003e. This is a \u003cem\u003ekeystone metric (habit)\u003c/em\u003e, a broad goal for the entire company, not directly related to the company’s main goal but impactful and actionable. O’Neil chose safety as the single most important aspect because:\u003c/p\u003e","title":"Keystone Metrics in DevOps: The 30 Day Project @ Coinbase"},{"content":" A bad way to implement a currency exchange would be to have the customer pass their order to an employee who then manually checks the customer balances, decides whether the order is valid, then writes down the order in a big order book (a la the old New York Stock Exchange). This method would be slow, prone to human error, not scalable, and more customers would require more employees. A better way would be to give the tools to the customer (with appropriate validations and security) to create the orders themselves, making the exchange more efficient and less expensive to run.\nA bad way to work as a DevOps team would be to pass a request (e.g. 
deploy an application, change some infrastructure, or fix a security issue) to a DevOps engineer who would then go and fulfill it (a la how many companies work). This method would be slow, prone to human error, not scalable, and a bigger company would require more DevOps engineers. A better way would be to give the tools to the engineers (with appropriate validations and security) to perform the operations themselves, making the entire company more efficient.\nAt Coinbase we have written about how we build and deploy our infrastructure securely and how this has increased our developer productivity. Our team philosophy is to build self-service infrastructure tools for our engineers, enabling them to accomplish all their tasks in a safe and secure way.\nIf this sounds like a DevOps team you would like to be a part of, we are now recruiting DevOps and Software Engineers here.\n","permalink":"https://maori.geek.nz/posts/2016/2016-09-29_selfservice-devops-coinbase/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2016/2016-09-29_selfservice-devops-coinbase/images/1.jpeg#layoutTextWidth\"\u003e\nA bad way to implement a currency exchange would be to have the customer pass their order to an employee who then manually checks the customer balances, decides whether the order is valid, then writes down the order in a big order book (a la the old New York Stock Exchange). This method would be slow, prone to human error, not scalable, and more customers would require more employees. A better way would be to give the tools to the customer (with appropriate validations and security) to create the orders themselves, making the exchange more efficient and less expensive to run.\u003c/p\u003e","title":"Self-Service DevOps @ Coinbase"},{"content":"\nName a technology that is more useful, more educational, more interesting, and more overpriced than an ultrasound machine. 
You can look inside of living things without the need for powerful magnets or radioactivity and it is basically made from a speaker and microphone outputting to a screen.\nWhy doesn’t every high school biology classroom have an ultrasound to show how muscles work, and hearts beat? Why don’t doctors have them immediately handy like a stethoscope or thermometer? Why can I not get one just because I am interested in how my injuries are healing? Probably because “a £20,000 [$30,000USD] scanner is generally classed as low cost.”\nAfter I spent $200 on a doctor’s visit because of an injured foot, where they used a cabinet-sized ultrasound machine that looked like a 1950s TV, I wondered how much it would cost to purchase an ultrasound for myself. After finding that a “cheap” ultrasound is still $8000, I just couldn’t reconcile the cost with the technology and the simplicity and usefulness of such a tool. So I decided to do a little research.\nTransducers A transducer is a ceramic with two pieces of metal on either side; if you squeeze the transducer it creates a current, and if you put a current into the metal it moves the transducer. An ultrasound machine requires a dozen or more MHz frequency transducers formed into an array to get the required resolution. The preferred ceramic for a MHz transducer seems to be lead zirconate titanate or PZT.\nWhen a current is applied to the transducer it will expand; reversing the current will make it contract; doing this repeatedly will create a wave with a wavelength of about twice its thickness. Given the speed of sound in ceramics is about 3200m/s, to produce a 5MHz wave, the wavelength will be 0.64mm so the thickness of the PZT will need to be around 0.32mm. That is pretty thin.\nThis precise manufacturing of the transducer array is the probable reason for the cost of ultrasound machines as each transducer has such precise requirements and an ultrasound requires many of them. 
Individually these transducers cost quite a bit, and if you need 20 of them for an ultrasound then the cost increases quickly.\nThe cheapest MHz PZT transducers I have found are $12 each for 10MHz PZT discs. This would make a 20 transducer ultrasound still cost north of $200, and making something more interesting like this 40x40 PZT array for 3D imaging would cost about $19,200 for just the transducer array.\nA way to reduce the price could be to make my own transducers. So can I manufacture my own transducer array? Maybe, and here are some resources which could be used to make homemade transducers:\nhere is a cool video of a person making their own transducer out of barium titanate, here is a group of people trying to bring printing of integrated circuits to the home. here is a Ph.D. thesis on creating thin PZT films here is a way to make a “low cost, high density PZT phased array” here is a description of how to make a flexible PZT array here is a 64x 35MHz array used for imaging in high resolution One way to manufacture a PZT array is to use the dice-and-fill method described here, which creates a pixel array of transducers. Or I could build individual transducers by putting PZT into a medium like ethanol and then spraying it onto some metal (probably silver) so that when the ethanol evaporates it leaves a thin film of PZT on the silver. Either way, these require specialist equipment and it is probably easier to contact a lab or manufacturer with the expertise to build the array, but where is the fun in that?\nHardware and Software A computer that can run a MHz frequency transducer is easy to find and cheap these days, e.g. a Raspberry Pi’s GPIO pins can run that frequency. Screens can be very cheap, or just output directly to a phone or another screen. 
So I don’t think that the ultrasound’s computation and display hardware will be very expensive.\nThe software may be very expensive, but this is probably because ultrasound machines are each running closed source custom software. If there were a standard hardware platform to make ultrasound software for, with standards and APIs, I am sure this would reduce the cost for each machine significantly. There are efforts like this open architecture for ultrasound control, but I am unsure of how much adoption they have.\nMost of the cost of an ultrasound machine’s hardware and software probably comes from medical certification. Quoting from the Hacker News user jes:\nSome specifications that must be met include ISO 13485, ISO 14971, IEC 60601 3rd Edition, IEC 62304, and probably ten more that I have forgotten about, such as RoHS, WEE, radiated emissions, etc. […] If you’re measuring the length of a fetal femur and translating the measured length to an estimated gestational age, you don’t wish to be wrong\nCreating any device for medical purposes can be incredibly expensive, but this ignores all the other uses that ultrasounds can have in education, imaging, sports training and just for fun. The catch-22 is that ultrasounds will never be used for these activities if they are difficult to use and expensive, and they will never become cheaper unless the ultrasound market becomes larger than just health.\nWho is doing something about expensive ultrasound machines Newcastle University is working on a $40–50 ultrasound (http://www.ncl.ac.uk/eee/research/coms2ip/sensors/ultrasound-imaging/) which has gotten some media attention. They reduce the cost by only having one transducer and moving it to generate the image.\nButterfly Network Inc is currently trying to create a medical imaging device, which is basically ultrasound on a single chip, as cheap as a stethoscope. 
They have raised over $100 million and hopefully it will create some amazing technology so that everyone can purchase an affordable ultrasound machine.\nLumify from Philips is a handheld ultrasound that plugs into a smartphone or tablet and looks awesome. However, it is not available to all consumers.\nechOpen and its fork Murgen are open source projects to make an ultrasound machine and dev kit. These kinds of projects could really help bring down the price of ultrasounds by making the technology available to developers and engineers; go check them out.\nMore links for those interested Basic principle of medical ultrasonic probes\nPhased Array Ultrasonics\nPocket Ultrasound\nHow does medical ultrasound imaging work?\nHow to use an ultrasound machine\nMobile Ultrasound Device with video\nPrinciples of Ultrasound\nResponse (http://liesandstartuppr.blogspot.fr/2016/12/why-are-medical-ultrasound-systems-so.html) from Nicolas Felix (https://medium.com/u/601fa6bb7e49) to this post describing in more detail the reasons for the expense of ultrasound machines\nHow do they test ultrasound machines? You use a phantom suspended in a jelly to simulate the human body, like the ballistics jelly they always use on MythBusters.\n","permalink":"https://maori.geek.nz/posts/2016/2016-02-22_why-are-ultrasound-machines-so-expensive/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2016/2016-02-22_why-are-ultrasound-machines-so-expensive/images/1.jpeg#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003cp\u003eName a technology that is more useful, more educational, more interesting, and more overpriced than an ultrasound machine. You can look inside of living things without the need for powerful magnets or radioactivity and it is basically made from a speaker and microphone outputting to a screen.\u003c/p\u003e\n\u003cp\u003eWhy doesn’t every high school biology classroom have an ultrasound to show how muscles work, and hearts beat? 
Why don’t doctors have them immediately handy like a stethoscope or thermometer? Why can I not get one just because I am interested in how my injuries are healing? Probably because \u003ca href=\"http://www.ncl.ac.uk/eee/about/news/item/low-cost-design-makes-ultrasound-imaging-affordable-to-the-world-copy\"\u003e“a £20,000 [$30,000USD] scanner is generally classed as low cost.”\u003c/a\u003e\u003c/p\u003e","title":"Why are Ultrasound Machines So Expensive?"},{"content":"Strands published a great vision paper about The Big promise of recommender systems. In this post I break down the paper into a few useful requirements. Recommender systems should:\nbe easy to integrate and easy to remove. Vendor lock-in is a negative so having a minimal impact on the client’s system is a big plus compete with internal marketing teams. Generated recommendations will compete for space and resources with internal marketing departments as their core goals are identical. So reuse the marketing metrics and methods in recommender systems to measure and drive real business value first collect enough data to avoid the cold start problem. Don’t release the recommender systems functions till there is enough data to provide good recommendations scale to the business’s needs, number of users and number of items be a hybrid of approaches as these create a robust solution that solves many problems of exclusive algorithms be a balance between algorithms and UX. Focusing entirely on algorithms or UX will not create value, it must be a mixture of both use implicit instead of explicit feedback. Explicit ratings or reviews can be manipulated, whereas implicit signals (e.g. if they actually bought or watched the item) are more difficult to manipulate and will provide better data differentiate their products as recommender systems have become commoditized. If a system is difficult to evaluate and looks similar to other systems it will fail have contextual awareness of where the recommendations are being shown. 
For example, recommendations on mobile are different to web and different to email Check out my Good Enough Recommender (GER) to see how it can be used to implement the requirements in this list and check out the List of Recommender Systems and see how they implement these requirements\n","permalink":"https://maori.geek.nz/posts/2015/2015-12-21_9-things-that-recommender-systems-should-do/","summary":"\u003cp\u003e\u003ca href=\"http://recommender.strands.com/\"\u003eStrands\u003c/a\u003e published a great vision paper about \u003ca href=\"http://www.aaai.org/ojs/index.php/aimagazine/article/viewFile/2360/2232\"\u003e\u003cstrong\u003eThe Big promise of recommender systems\u003c/strong\u003e\u003c/a\u003e. In this post I break down the paper into a few useful requirements; Recommender systems should:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003ebe easy to integrate and \u003cem\u003eeasy to remove\u003c/em\u003e. Vendor lock-in is a negative so having a minimal impact on the clients system is a big plus\u003c/li\u003e\n\u003cli\u003ecomplete with internal marketing teams. Generated recommendations will compete for space and resources with internal marketing departments as their core goals are identical. So reuse the marketing metrics and methods in recommender systems to measure and drive real business value\u003c/li\u003e\n\u003cli\u003efirst collect enough data to avoid the cold start problem. Don’t release the recommender systems functions till there is enough data to provide good recommendations\u003c/li\u003e\n\u003cli\u003escale to the businesses needs, number of users and number of items\u003c/li\u003e\n\u003cli\u003ebe a hybrid of approach as these create a robust solution that solves many problems of exclusive algorithms\u003c/li\u003e\n\u003cli\u003ebe a balance between algorithms and UX. Focusing entirely on algorithms or UX will not create value, it must be a mixture of both\u003c/li\u003e\n\u003cli\u003euse implicit instead of explicit feedback. 
Explicit ratings or reviews can be manipulated, however implicit (e.g. if they actually bought or watched the item) are more difficult to manipulate and will provide better data\u003c/li\u003e\n\u003cli\u003edifferentiate their products as recommender systems have become commoditized. If a system if difficult to evaluate and looks similar to other systems it will fail\u003c/li\u003e\n\u003cli\u003ehave contextual awareness of where the recommendations are being shown. For example, recommendations on mobile are different to web and different to email\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003e\u003cem\u003eCheck out my\u003c/em\u003e \u003ca href=\"https://github.com/grahamjenson/ger\"\u003e\u003cem\u003eGood Enough Recommender (GER)\u003c/em\u003e\u003c/a\u003e \u003cem\u003eto see how it can be used to implement the requirements in this list and check out the\u003c/em\u003e \u003ca href=\"https://github.com/grahamjenson/list_of_recommender_systems\"\u003e\u003cem\u003eList of Recommender Systems\u003c/em\u003e\u003c/a\u003e \u003cem\u003eand see how they implement these requirements\u003c/em\u003e\u003c/p\u003e","title":"9 Things that Recommender Systems Should Do"},{"content":"\nOver the last 6 months my wife and I have been working and travelling around Europe using Brighton U.K. as a base. Before we left we looked at a few different ways of getting a roof over our head while travelling, our ideal solution would be low hassle, low responsibility, low liability, and easy to set up.\nThe original idea was to rent an apartment in Brighton but this can be very expensive, difficult to find and organise from overseas, and requires signing contracts, putting up bonds and other legal arrangements. In addition to these problems, the local law also puts a stupid paradox in the way of any traveller renting an apartment; To rent a British apartment you need British bank account and to get a British bank account you need a British address. 
From the HSBC site\nYou may also be required to show bank statements if you want to rent a property. This is not always as straightforward as you might think, as many banks require you to provide proof of UK address to open an account.\nWhile looking at options I remembered AirBnB which came across my radar on the Giant Robots podcast, which made it really sound like a good solution to our housing problem. The benefits of using AirBnB over just about any other product are:\nThere are thousands of places in Brighton to select from Paying through AirBnB means that there are no awkward conversations or disagreements about money with your hosts (so no worries about tips, taxes or scams) Paying in NZD means no conversion fees or converting between currencies Different daily, weekly and monthly rates makes it cheaper the longer you stay So, after using AirBnB for our 6 month long trip, I decided to do a quick write up over some of the things I have learnt using the service.\nSome places are hotels, most places are homes Some places we stayed at were clearly just businesses using AirBnB as a way to book guests. These rooms are easily spotted because the owners usually have multiple listed properties or many listed rooms in the same property. The relationship to the owner is just like if you stayed at a hotel, very professional and direct.\nMostly though, the places we stayed at were just homes of people looking to make a little extra income from a spare or unused room. So when you show up to their house you are walking into where they live, and the relationship is completely different where you are more of a guest or a flatmate rather than a customer.\nBook early, the host decides the speed of the process One weekend we rented a car and drove to Cornwall where I decided not book a place to stay because I didn’t know which city we would end up in that night. 
What I didn’t know was that it was a school holiday, on a long weekend, and Cornwall is where people go on school holidays and/or long weekends.\nAt about lunch time I had a better idea where we would end up, so I found the only (cheap) AirBnB place nearby and requested to stay there the night. As we neared the location and the sun was setting, the request had yet to be accepted so I sent a few messages. After some time, we decided to get some dinner and I was checking my phone every few minutes hoping for a reply. At about 9pm, when stress levels were high because no other places had accommodation, I finally got a response. It was an apology for the delay explaining that their phone ran out of battery while they were on a long walk.\nThe lesson I learnt was to request a booking with plenty of time to spare, as it takes 24 hours for a request to expire. Also, once the guest has requested a booking there is no way to cancel it without calling AirBnB customer support.\nMost AirBnB places don’t serve breakfast but are more DIY Despite its name, very few of the places we stayed at actually provided breakfast, but most provided coffee, tea, cooking facilities, a fridge, a cupboard for food, space for clothes, and washing facilities. Some provided soap, shampoo, and other amenities like mini kitchens and kettles.\nAll the places we stayed provided some cooking facilities, though one had a microwave and no oven, and another had no oven but a microwave.\nNo reviews may be worse than bad reviews When you stay with someone at their house and they are warm, friendly and offer you tea in the morning, it can be difficult to leave a review which says something like “the room was noisy because it is next to a busy street” or “the room is a bit small” because neither is really the fault of the host, and they were very nice people. 
It is hard to leave a bad review, especially for a nice host.\nThis makes the review system a little difficult to navigate: if a place has loads of good reviews it is probably really good. If it has average reviews, the host is probably really nice and the place is bad, or it actually is just average. If there are bad reviews don’t stay there. But, if it has no reviews it could either be new or it was difficult for guests to leave bad reviews for a nice host.\nSometimes you get what you pay for We once booked a cheap place to stay for one night; it was in the perfect location but it had 0 reviews. After organising how to check into the place, we got this interesting message from the host:\n“Don’t tell my flatmates, they don’t like the whole AirBnB thing, if they ask tell them we are friends”.\nThis was a massive warning signal, but it was too close to the date we needed to stay there and there were no other places to stay.\nThe host turned out to be a student, and what we had rented ended up being a room in a University hostel. The student had gone away for the week and instead of wasting a week of rent on an empty room, put it on AirBnB. The room was cheap, it was unclean (it was a student’s room) and it had a single pillow for two people. Fortunately the flatmates didn’t speak English so we didn’t have to lie to them.\nThe place was cheap in the centre of a very expensive town, and we probably got what we paid for.\nAs a guest you have no insurance, AirBnB only provides the host with insurance Guests need travel insurance according to the AirBnB FAQ. If the host cancels unreasonably on you, or doesn’t show, or steals your stuff and locks you out, AirBnB is not liable. 
The worst you can do is leave a bad review.\nThis is in contrast to the host’s guarantee of $600,000 insurance with AirBnB, which applies if the guest does damage.\nI can see why AirBnB does this: there are far fewer hosts than guests, and if they offered guests some kind of insurance their call centres would always be flooded with unhappy guests trying to claim for all sorts of minor things. It is important to know that as a guest, AirBnB is not liable for anything when you use their service.\nControversies with AirBnB AirBnB is truly disruptive in that it takes what used to be difficult, finding and paying for a roof over your head in a strange city, and makes it so easy for both hosts and guests that it is hurting the traditional models of accommodation. These traditional models include protections for renters, like not being able to be evicted at a minute’s notice, which AirBnB does not have.\nThe problems with AirBnB have come to a head in San Francisco where some people think that the skyrocketing cost of living is in part fuelled by people listing their houses on AirBnB, and that locals are being illegally forced out of their homes by landlords whose intention is to rent their houses on AirBnB. Here are some interesting articles about this situation:\nWhy are people so pissed off about Airbnb? I Have Read Prop F, and It is Worse Than You Think A Response to Above I think AirBnB needs to offer some guest assurances, and also “short term rentals” may need some kind of regulation to make sure that no-one is getting screwed or screwing other people. 
But the idea that the majority of AirBnB hosts are big-bad-landlords just doesn’t fit with my experiences where most hosts were just nice people using an empty room for additional income to help with their rent, to pay for renovations to their homes, or to help them save to go on their own trips around the world.\nIf the only answer to “disruptive technologies” like Netflix, Uber or AirBnB, is to regulate them out of business, or make it so hard for them to operate they don’t even bother e.g. internet tax or asset seizure, then this hurts everyone and doesn’t fix the reason why everyone abandoned the old models and used the new services instead.\nIf AirBnB didn’t exist, or it was made so difficult or unprofitable that way fewer hosts were around, when we showed up in Brighton we would have been homeless. You go and see if there is another service where you can rent a fully furnished apartment for 6 months, in Brighton, with no downpayment, paying in your local currency, and you get a cup of tea when you arrive.\nTo Sum Up AirBnB is an awesome service, and our trip was made easier, less stressful, cheaper and more pleasant because of AirBnB. If you are travelling for any amount of time I highly recommend the service.\n","permalink":"https://maori.geek.nz/posts/2015/2015-11-13_what-we-learned-living-in-airbnbs-for-6-months/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2015/2015-11-13_what-we-learned-living-in-airbnbs-for-6-months/images/1.png#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003cp\u003eOver the last 6 months my wife and I have been working and travelling around Europe using Brighton U.K. as a base. 
Before we left we looked at a few different ways of getting a roof over our head while travelling, our ideal solution would be low hassle, low responsibility, low liability, and easy to set up.\u003c/p\u003e\n\u003cp\u003eThe original idea was to rent an apartment in Brighton but this can be very expensive, difficult to find and organise from overseas, and requires signing contracts, putting up bonds and other legal arrangements. In addition to these problems, the local law also puts a stupid paradox in the way of any traveller renting an apartment; \u003cstrong\u003eTo rent a British apartment you need British bank account and to get a British bank account you need a British address.\u003c/strong\u003e From the \u003ca href=\"https://financialplanning.hsbc.co.uk/article/89/things-to-consider-before-you-move-to-uk?HBEU_dyn_lnk=Planning_MovingToUK_UsefulArticles_Link1\"\u003eHSBC site\u003c/a\u003e\u003c/p\u003e","title":"What We Learned Living in AirBnB’s for 6 months"},{"content":"\nHapiGER, the Happy (Good Enough) Recommender System, has a new version!\nThis version:\nis Simpler: with a smaller, better defined API is More Configurable: can better define how to generate recommendations giving more control over quality and speed has Added Functionality: can now generate similar things recommendations to find items that are similar to other items Links:\nHapiGER HapiGER on npm HapiGER source GER Source List of Recommender Systems ","permalink":"https://maori.geek.nz/posts/2015/2015-10-13_new-version-of-hapiger-recommender-system/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2015/2015-10-13_new-version-of-hapiger-recommender-system/images/1.png#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"http://www.hapiger.com/\"\u003eHapiGER\u003c/a\u003e, the Happy (Good Enough) Recommender System, has a new version!\u003c/p\u003e\n\u003cp\u003eThis version:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eis 
\u003cstrong\u003eSimpler\u003c/strong\u003e: with a smaller, better defined API\u003c/li\u003e\n\u003cli\u003eis \u003cstrong\u003eMore Configurable\u003c/strong\u003e: can better define how to generate recommendations giving more control over quality and speed\u003c/li\u003e\n\u003cli\u003ehas \u003cstrong\u003eAdded Functionality\u003c/strong\u003e: can now generate \u003cem\u003esimilar things\u003c/em\u003e recommendations to find items that are similar to other items\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eLinks:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e\u003ca href=\"http://www.hapiger.com/\"\u003eHapiGER\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://www.npmjs.com/package/hapiger\"\u003eHapiGER on npm\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://github.com/grahamjenson/hapiger\"\u003eHapiGER source\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://github.com/grahamjenson/ger\"\u003eGER Source\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"https://github.com/grahamjenson/list_of_recommender_systems\"\u003eList of Recommender Systems\u003c/a\u003e\u003c/li\u003e\n\u003c/ol\u003e","title":"New Version of HapiGER Recommender System"},{"content":"Why should you learn new things?\nBecause you can’t predict the future, and knowing is better than not knowing.\nWhat you learn doesn’t need to be applicable, you don’t have to put it on your CV, it doesn’t need to be measured and quantified, you don’t need to tell anyone.\nLearn anything; learn a toy, a gimmick, something too new or too old, something that is popular or something that is niche. It doesn’t matter if it is not supported, not serious or not professional. Learning is an exercise, so the worst that can happen is you get faster at learning the next time.\nThere are loud voices that cry “new is bad” and call programmers who want to learn “magpie developers” (as magpies like shiny new things), these voices are the dodo developers. 
A dodo developer will try to convince you that learning a new skill is not necessary because “we didn’t need it yesterday, so we won’t need it tomorrow”. That argument might work for a little bit, but soon the dodo will get caught on the ground when they should have learnt to fly.\n","permalink":"https://maori.geek.nz/posts/2015/2015-10-08_dodo-developer-doesnt-want-to-learn-new-things/","summary":"\u003cp\u003eWhy should you learn new things?\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e\u003cem\u003eBecause you can’t predict the future, and knowing is better than not knowing.\u003c/em\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWhat you learn doesn’t need to be applicable, you don’t have to put it on your CV, it doesn’t need to be measured and quantified, you don’t need to tell anyone.\u003c/p\u003e\n\u003cp\u003eLearn anything; learn a toy, a gimmick, something too new or too old, something that is popular or something that is niche. It doesn’t matter if it is not supported, not serious or not professional. Learning is an exercise, so the worst that can happen is you get faster at learning the next time.\u003c/p\u003e","title":"A Dodo Developer Doesn’t Want To Learn New Things"},{"content":"In this post I will briefly describe three optimisations that GER (Good Enough Recommendations) uses to improve recommendations. These optimisations could be useful for other recommender systems as they have increased the quality of GER’s recommendations.\nSeparated Data The two fundamental steps in recommender systems like GER are:\nfind a neighbourhood of similar people recommend things that those people like The similarity between people can vary widely from one set of items to another. For example, my friend and I have very similar tastes in action movies, but we don’t like the same drama movies. If all the data was combined, our similarity would be very weak. 
However, if we separated the data based on movie genre we would be very similar in the “action” genre, and dissimilar in the “drama” genre, so now I can get some good “action movie” recommendations.\nGER allows users to separate their data into namespaces, which are just buckets of isolated data. This way GER can find more similar people and provide more targeted recommendations, with the added bonus that it also improves performance. However, deciding where to split the data can be difficult and is domain specific, e.g. how would you split restaurant recommendations? By price, by location, by mean portion size?\nRecent Data is More Relevant If a user looked at SLR cameras last week, and is looking at TVs today, would they be more receptive to recommendations for TVs or cameras? Probably TVs, because what they are doing now is more important to them. This is an example of a broader fact: data becomes less relevant for generating recommendations over time. The rate at which data becomes less relevant changes based on domain. For example, recommending to a user an iPhone they liked 5 years ago is not a good idea, but recommending to a user a movie they liked 5 years ago is better because movie tastes change more slowly than technology does.\nChanging the measure of similarity between people to take into account how old the data is makes older data less relevant. This way the recommendations can adjust to a user’s tastes and whims over time.\nTo implement this in GER the weight of an action is calculated with respect to the time that action occurred. This is done using the event_decay_rate configuration, where the final weight is calculated as initial_weight * event_decay_rate ^ (- days since event). So, if the rate is 1.05, an event that occurred 14 days ago will have about half the effect it would have if it occurred today (1.05 ^ -14 ≈ 0.505).\nNo Recommendations Can Be Better Than Bad Recommendations Bad recommendations presented to the user can be worse than no recommendations. 
A user who sees bad recommendations might unsubscribe from an email, or not come back to the website. A recommender system that returns some kind of heuristic about how good it thinks the recommendations are, can help the client decide whether they should present the recommendations to the user. If the recommendations are bad, then the client might not send an email out, or just send a generic “top 10 items” instead, which may not cause the user to respond negatively.\nGER uses a minimum_history_required field which stops any recommendations being generated unless there is a selected amount of minimal data about a user. And when generating recommendations GER returns a confidence heuristic calculated using the amount of data available about the user, how many similar users were found, and the weights of the recommendations calculated. This way the client can see the confidence and decide whether to present the calculated recommendations.\nConclusion These are probably not novel optimisations, but they have been useful in improving the quality of GER’s recommendations. If you have any other optimisations, or comments about GER’s please comment below :)\n","permalink":"https://maori.geek.nz/posts/2015/2015-08-20_3-optimisations-that-improve-recommendations/","summary":"\u003cp\u003eIn this post I will briefly describe three optimisations that \u003ca href=\"https://github.com/grahamjenson/ger\"\u003eGER (Good Enough Recommendations)\u003c/a\u003e uses to improve recommendations. 
These optimisations could be useful for other recommender systems as they have increased the quality of GER’s recommendations.\u003c/p\u003e\n\u003ch4 id=\"separated-data\"\u003eSeparated Data\u003c/h4\u003e\n\u003cp\u003eThe two fundamental steps in recommender systems like GER are:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003efind a neighbourhood of similar people\u003c/li\u003e\n\u003cli\u003erecommend things that those people like\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThe similarity between people can vary widely from one set of items to another. For example, my friend and I have very similar tastes in action movies, but we don’t like the same drama movies. If all the data was combined, our similarity would be very weak. However, if we separated the data based on movie genre we would be very similar in the “action” genre, and dissimilar in the “drama” genre, now I can get some good “action movie” recommendations.\u003c/p\u003e","title":"3 Optimisations that Improve Recommendations"},{"content":"I want to run an application on my CoreOS clusters that uses hostnames to communicate between machines. This is a problem, because out of the box CoreOS machines cannot resolve hostnames of other machines in the cluster. So, I wrote a small fleet service that manages the /etc/hosts files on all the machines so they can correctly resolve each others hostnames. 
In this post I will briefly describe that service.\nThe Hosts Service The hosts.service: # hosts.service [Unit] Description=Hosts Manager After=etcd2.service``[Service] EnvironmentFile=/etc/environment Restart=always``ExecStartPre=-/usr/bin/etcdctl mkdir /hosts``ExecStart=/bin/sh -c 'while true; do etcdctl watch --recursive /hosts; \\ sleep 1;\\ echo \u0026quot;127.0.0.1 localhost\u0026quot; \u0026gt; /etc/hosts; \\ for i in $(etcdctl ls /hosts); do \\ echo $(etcdctl get $i) $(echo $i | cut -c 8-); \\ done \u0026gt;\u0026gt; /etc/hosts; \\ done'``ExecStartPost=/usr/bin/etcdctl set /hosts/%H $PRIVATEIP ExecStopPost=/usr/bin/etcdctl rm /hosts/%H``[X-Fleet] Global=true\nThis service uses etcdctl so must start after etcd2 in the /etc/environment file the PRIVATEIP is defined as the address the hostname should be resolved to. This can be set in the cloud-config with:`write_files: path: /etc/environment\ncontent: |\nPRIVATEIP=$private_ipv4` Restart=always to always restart this service Before the service starts (ExecStartPre=-) it will create the etcd directory hosts (etcdctl mkdir /hosts). Note: the =- means that if the directory creation fails (if it already exists) it will not stop the service ExecStart= will start an infinite loop (while true) that waits for changes in the hosts folder (etcdctl watch — recursive /hosts). When a change happens it will: wait (sleep 1) in case there is a burst of changes all at once overwrite the exists /etc/hosts file with localhost information loop over all entries in the /hosts directory (for i in $(etcdctl ls /hosts)) and append the value of the entry (etcdctl get $i) and the path (minus the /hosts bit) (echo $i | cut -c 8-) to the /etc/hosts file After the service has started (ExecStartPost) it will then register its hostname (%H) and address ($PRIVATEIP) in the /hosts directory. This change will be detected by the service, which will rewrite the /etc/hosts file with itself in it. 
When the service stops it will remove itself from the /hosts directory, which will cause other machines to update their /etc/hosts file. Global=true is there so that this service will automatically run on machines added to the cluster. This means that the /etc/hosts file on all machines in the cluster will always be up-to-date.\nFor AWS users you may want to add the lines:\nExecStartPost=/bin/sh -c \u0026quot;etcdctl set /hosts/$(hostname | awk -F. '{print $1}') $PRIVATEIP\u0026quot; ExecStopPost=/bin/sh -c \u0026quot;etcdctl rm /hosts/$(hostname | awk -F. '{print $1}') $PRIVATEIP\u0026quot;\nThis is because calling hostname may return a host looking like node1.us-east-1.compute.internal, and some applications require only node1.\nFurther Reading\n[CoreOS Essentials](http://amzn.to/1NW3xGG) ","permalink":"https://maori.geek.nz/posts/2015/2015-07-11_coreos-fleet-service-to-manage-etchosts/","summary":"\u003cp\u003eI want to run an application on my CoreOS clusters that uses hostnames to communicate between machines. This is a problem, because out of the box CoreOS machines cannot resolve hostnames of other machines in the cluster. So, I wrote a small \u003ca href=\"https://github.com/coreos/fleet\"\u003efleet\u003c/a\u003e service that manages the /etc/hosts files on all the machines so they can correctly resolve each other’s hostnames. 
In this post I will briefly describe that service.\u003c/p\u003e\n\u003ch4 id=\"the-hosts-service\"\u003eThe Hosts Service\u003c/h4\u003e\n\u003cp\u003eThe hosts.service:\n\u003ccode\u003e# hosts.service   [Unit]   Description=Hosts Manager   After=etcd2.service``[Service]   EnvironmentFile=/etc/environment   Restart=always``ExecStartPre=-/usr/bin/etcdctl mkdir /hosts``ExecStart=/bin/sh -c 'while true; do etcdctl watch --recursive /hosts; \\   sleep 1;\\   echo \u0026quot;127.0.0.1 localhost\u0026quot; \u0026gt; /etc/hosts; \\   for i in $(etcdctl ls /hosts); do \\   echo $(etcdctl get $i) $(echo $i | cut -c 8-); \\   done \u0026gt;\u0026gt; /etc/hosts; \\   done'``ExecStartPost=/usr/bin/etcdctl set /hosts/%H $PRIVATEIP   ExecStopPost=/usr/bin/etcdctl rm /hosts/%H``[X-Fleet]   Global=true\u003c/code\u003e\u003c/p\u003e","title":"CoreOS: Fleet Service to Manage /etc/hosts"},{"content":"Separating an application into many small independently developed and deployed microservices that communicate over a thin layer (like http) has many benefits (see Fowler’s article). However, one of the main drawbacks of this architecture is the difficulty automating end-to-end tests for the application.\nNewman’s Building Microservices asks two questions when end-to-end testing a microservices application:\nWhich versions of the services should we test? Where are the tests written, to not to duplicate the effort for each service? His solution is to have an external end-to-end test suite that can be run against many configurations of microservice versions. In this post, I present an implementation of Newman’s end-to-end microservices testing solution that uses the tool pmux and the continuous integration service TravisCI.\nEnd-to-End Testing MicroServices when [end-to-end tests] pass, you feel good: you have a high degree of confidence that the code being tested will work in production. 
Newman Building Microservices\nIf the benefit of end-to-end testing an application is confidence, then what are we confident in when testing a microservices application? We are confident all the different microservice versions in the system (which I call a configuration) work together correctly. The problem is that with many different configurations of those microservice versions, testing only one doesn’t give us confidence the others work.\nFor example, let’s look at the development of a basic microservices system where:\nservice Av1 (service A version 1) and Bv1 are in production (this configuration is denoted as {Av1, Bv1}) {Av1, Bv2} are in staging the team developing Av2 are testing it with Bv2 in the configuration {Av2, Bv2} the team developing Bv3 are testing it with Av1 in the configuration {Av1, Bv3} In the near future of this system, any of the 6 possible configurations ({Av1, Bv1}, {Av1, Bv2}, {Av1, Bv3}, {Av2, Bv1}, {Av2, Bv2}, {Av2, Bv3}) of this application could be in production, but only 4 have ever been deployed and tested together ({Av1, Bv1}, {Av1, Bv2}, {Av1, Bv3}, {Av2, Bv2}). Now, imagine that Av2 is fast tracked to fix a critical bug. First, it would be deployed into staging where it would work because it has been tested with Bv2. If Av2 were then deployed to production in the configuration {Av2, Bv1}, there could be problems as that configuration has not been tested before.\nThis is an exponential problem. If there are three services and each service has three versions, then there are 27 (3³) combinations of services; four services with three versions is 81 (3⁴) combinations. A real world application may contain many services each with many versions, which can lead to thousands of potential configurations that could be tested.\nIt is not necessary to test every potential configuration of a microservice application. 
However, to be confident that the application works, you have to end-to-end test more than one.\nA Basic Microservice Application To demonstrate the tools of end-to-end testing microservice applications, I will use a “Hello World” microservices application where:\nService A runs on port 8081 and returns a greeting generated by service A to a subject retrieved from service B Service B runs on port 8082 and returns a subject Service Av1 looks like: var http = require('http');``http.createServer(function (req, res) { var greeting = \u0026quot;Hello\u0026quot; http.get(\u0026quot;http://localhost:8082\u0026quot;, function(reply){ var who = \u0026quot;\u0026quot; reply.on('data', function(data) { who += data }) reply.on('end', function() { res.end(greeting + \u0026quot; \u0026quot; + who) }) }) }).listen(8081);\nSo in Av1 the greeting is “Hello” and in Av2 the greeting changes to “Go Away”.\nService Bv1 looks like: var http = require('http');``http.createServer(function (req, res) { res.end('World'); }).listen(8082);\nSo in Bv1 “World” is returned, where in Bv2 it changes to “Alice”, and in Bv3 changes to “Bob”.\nYou can check out the code at:\nService A Service B Each of the service versions is tagged with v1, v2, v3.\nTo start these services you will need node.js installed, then just run node service.js.\nBasic End-to-End Tests with Mocha and Chai\nThe microservice tests are written in node using the test runner Mocha and assertions library Chai. 
I have previously written about using these, if you are unfamiliar with them.\nThese microservices tests are not in the same repository as either of the services, they are in their own repository here).\nThere is only one test that calls service A to make sure the returned value is valid: var chai = require('chai') var expect = chai.expect``var bluebird = require('bluebird') bluebird.Promise.longStackTraces(); var needle = bluebird.promisifyAll(require('needle'))``var valid_responses = [ \u0026quot;Hello World\u0026quot;, \u0026quot;Hello Alice\u0026quot;, \u0026quot;Go Away Alice\u0026quot;, \u0026quot;Go Away Bob\u0026quot; ]``describe('service A', function(){ it('should return a valid response', function(){ return needle.getAsync(\u0026quot;http://localhost:8081\u0026quot;) .spread( function(res, body){ expect(valid_responses).to.contain(body.toString()) }) }) })\nThis test uses the bluebird promises library and needle to simplify the http request to service A and the chai expect function to make sure the response is valid.\nIn the test I define only four valid responses \u0026ldquo;Hello World\u0026rdquo;, \u0026ldquo;Hello Alice\u0026rdquo;, \u0026ldquo;Go Away Alice\u0026rdquo; and \u0026ldquo;Go Away Bob\u0026rdquo;. Given the different service versions, only the configurations {Av1,Bv1}, {Av1,Bv2}, {Av2,Bv2}, {Av2,Bv3} are valid.\npmux\nTo automatically test a microservice application, we need to start it with the versions of services we want. 
For this task we use pmux, which takes a node script defining the commands necessary to initialise and start a microservice application and executes them inside of a tmux session.\nThe pmux file microservice_configuration.js is used to setup our microservices application: var microservices_directory = \u0026quot;services_dir\u0026quot; var Arepo=\u0026quot;https://github.com/grahamjenson/microservice_A\u0026quot; var Brepo=\u0026quot;https://github.com/grahamjenson/microservice_B\u0026quot;``var Aversion = process.env.SERVICE_A_VERSION var Bversion = process.env.SERVICE_B_VERSION``var configuration = { \u0026quot;name\u0026quot;: \u0026quot;microservices\u0026quot;, \u0026quot;pre_commands\u0026quot;: [ \u0026quot;rm -rf \u0026quot; + microservices_directory, \u0026quot;mkdir \u0026quot; + microservices_directory ], \u0026quot;windows\u0026quot;: { \u0026quot;serviceA\u0026quot;: { \u0026quot;commands\u0026quot;: [ \u0026quot;git clone \u0026quot; + Arepo + \u0026quot; -b \u0026quot; + Aversion, \u0026quot;cd microservice_A\u0026quot;, \u0026quot;node service.js\u0026quot; ], \u0026quot;dir\u0026quot; : microservices_directory }, \u0026quot;serviceB\u0026quot;: { \u0026quot;commands\u0026quot;: [ \u0026quot;git clone \u0026quot; + Brepo + \u0026quot; -b \u0026quot; + Bversion, \u0026quot;cd microservice_B\u0026quot;, \u0026quot;node service.js\u0026quot; ], \u0026quot;dir\u0026quot; : microservices_directory } } }``module.exports = configuration\nThis file will:\nexecute the pre_commands list by deleting then making a services directory create two tmux windows in the service directory, one for each service each window will then use git clone to fetch a version of the service which is specified by the environment variables SERVICE_A_VERSION and SERVICE_B_VERSION each service window will then cd into their service directory and start the service with node service.js If we wanted to test the configuration {Av1, Bv1} we would\nfirst install pmux with npm install -g 
pmux; set the versions of the services to test with export SERVICE_A_VERSION=v1 SERVICE_B_VERSION=v1; start the tmux session with pmux microservice_configuration.js; finally run the tests with mocha.\nNote: you can attach to the tmux session using tmux attach -t microservices\nTravisCI\nTravisCI is a continuous integration and testing service. It has nice features like being able to execute simultaneous test runs using various environments. TravisCI is also free for open source projects, and it integrates automatically with github to run your tests on every git push. The way to tell TravisCI to run your tests is by using a .travis.yml file. The .travis.yml file for our microservices tests is:\nlanguage: node_js\nnode_js:\n  - \u0026quot;0.12\u0026quot;\nenv:\n  - SERVICE_A_VERSION=v1 SERVICE_B_VERSION=v1\n  - SERVICE_A_VERSION=v1 SERVICE_B_VERSION=v2\n  - SERVICE_A_VERSION=v1 SERVICE_B_VERSION=v3\n  - SERVICE_A_VERSION=v2 SERVICE_B_VERSION=v1\n  - SERVICE_A_VERSION=v2 SERVICE_B_VERSION=v2\n  - SERVICE_A_VERSION=v2 SERVICE_B_VERSION=v3\ninstall:\n  - sudo apt-get update\n  - sudo apt-get install -y git-core\n  - sudo apt-get install -y tmux\n  - npm install -g pmux\n  - npm install\nscript:\n  - pmux microservice_configuration.js -v\n  - sleep 2\n  - mocha\nIn this file: node 0.12 is defined as the language; the configurations to test are defined by the env key; the required tools (git, tmux and pmux) are installed in the install key; how to run the tests is described in the script key, where the pmux configuration is started, it sleeps for two seconds for the services to start, then we run the tests with mocha.\nAfter a git push to the repository TravisCI will trigger the test suite to run, and the output will look like this:\nThis shows us exactly what we needed to know: which microservice configurations pass the tests and which fail. 
Now we can make sure that the failing configurations never make it to production.\nConclusions\nMany developers see microservices as the direction that large-scale web development is moving, so exploring ways to test and validate these applications is very important. Using pmux and TravisCI to execute end-to-end tests on the microservices applications I am helping to write gives me confidence they are working, and I hope this method can do the same for you.\nReferences\nFowler \u0026amp; Lewis Microservices article Newman Building Microservices ","permalink":"https://maori.geek.nz/posts/2015/2015-06-09_testing-microservices-with-pmux-and-travisci/","summary":"\u003cp\u003eSeparating an application into many small independently developed and deployed \u003cstrong\u003emicroservices\u003c/strong\u003e that communicate over a thin layer (like http) has many benefits (see \u003ca href=\"http://martinfowler.com/articles/microservices.html\"\u003eFowler’s article\u003c/a\u003e). However, one of the main drawbacks of this architecture is the difficulty automating \u003cstrong\u003eend-to-end\u003c/strong\u003e tests for the application.\u003c/p\u003e\n\u003cp\u003eNewman’s \u003ca href=\"http://www.amazon.com/gp/product/1491950358/ref=as_li_qf_sp_asin_il_tl?ie=UTF8\u0026amp;amp;camp=1789\u0026amp;amp;creative=9325\u0026amp;amp;creativeASIN=1491950358\u0026amp;amp;linkCode=as2\u0026amp;amp;tag=maor01-20\u0026amp;amp;linkId=F4GKBWUK22ZTVWB6\"\u003eBuilding Microservices\u003c/a\u003e asks two questions when end-to-end testing a microservices application:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eWhich versions of the services should we test?\u003c/li\u003e\n\u003cli\u003eWhere are the tests written, to not to duplicate the effort for each service?\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eHis solution is to have an external end-to-end test suite that can be run against many configurations of microservice versions. 
In this post, I present an implementation of Newman’s end-to-end microservices testing solution that uses the tool \u003ca href=\"https://github.com/LoyaltyNZ/pmux\"\u003epmux\u003c/a\u003e and the continuous integration service \u003ca href=\"https://travis-ci.org\"\u003eTravisCI\u003c/a\u003e.\u003c/p\u003e","title":"Testing Microservices with pmux and TravisCI"},{"content":"\nI want you to think how deeply dysfunctional it is for you to be afraid of what you created — Uncle Bob Martin Architecture the Lost Years\nWhen I sit down to some code that I haven’t touched in a while, I hesitate. This is weird; I wrote it, I created it, so why am I afraid of changing it? Because, the code works now and if I change something it might break.\nThe code doesn’t have to be important, widely used, or complicated to make me afraid; I only need to think one thing:\nif something breaks the tests won’t tell me!\nThe tests are my harness and if I don’t trust them, nothing is stopping me from falling. Documentation, manual testing, or trusting the person who wrote the code (which is usually me) are not cures for this fear. The only thing that settles my nerves is knowing that if I break something, my test suite will yell at me.\nFor me to trust the tests they must fail when the code breaks and pass when the code works. The best way to make sure that the test suite satisfies these requirements is to follow the red, green process:\nWrite a test, make it fail (red) Write the code, make it pass (green) Making the test fail first ensures that if the code doesn’t work the test will fail. Writing the code second ensures that the code is what makes the test pass. 
Trusting my test suite removes my fear, and makes me feel slightly less dysfunctional.\nStop fear, write tests.\n","permalink":"https://maori.geek.nz/posts/2015/2015-06-03_dysfunctional-fear-of-code/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2015/2015-06-03_dysfunctional-fear-of-code/images/1.jpeg#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003cblockquote\u003e\n\u003cp\u003eI want you to think how deeply dysfunctional it is for you to be afraid of what you created — \u003cstrong\u003eUncle Bob Martin\u003c/strong\u003e \u003ca href=\"https://www.youtube.com/watch?v=WpkDN78P884\"\u003eArchitecture the Lost Years\u003c/a\u003e\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cp\u003eWhen I sit down to some code that I haven’t touched in a while, I hesitate. This is weird; I wrote it, I created it, so why am I afraid of changing it? Because, the code works \u003cem\u003enow\u003c/em\u003e and if I change something it might break.\u003c/p\u003e","title":"The Dysfunctional Fear of Code"},{"content":"I previously wrote an introduction to the Tessel here where I described a program that blinked out tweets in morse code. For my second project with the Tessel I wanted to use some of the modules that came with it to create a basic security device/camera to take pictures and alert me when something bad is happening.\nModules The Tessel has 4 different ports to plug modules into, each labeled A, B, C and D. 
The four modules I decided to use for this project were:\nCamera Module in port A Light and Sound in port B Accelerometer in port C Climate in port D The accelerometer is used to check if someone is tampering with the camera, and also to detect earthquakes, which are concerningly frequent in New Zealand.\nThe light and sound detector can be used to detect sudden changes in light or noise level.\nThe climate detector can detect if the temperature goes above a threshold, for example if the house is on fire.\nThe camera is used to take pictures of what is happening and save them for later inspection. Originally I wanted to tweet them, but I had some problems doing this with the twitter API.\nCode Full Code is available here\nI have divided each of the modules into their own sections of code to make it easier to describe. The code which all of the modules require is the tessel library, and the q library for promises (my usual choice of bluebird does not work on the Tessel): tessel = require('tessel') q = require 'q'\nAdditionally, I found it useful to reimplement bluebird’s promisify function, which turns a call to a callback-taking function into a promise: promisify = (fn, thisArg = null) -\u0026gt; defer = q.defer() callback = (err, data) -\u0026gt; return defer.reject(err) if err defer.resolve(data) fn.apply(thisArg, [callback]) defer.promise\nAccelerometer To use a module you only have to require its model number from npm, and then call use on the port it is plugged into. For example, the accelerometer’s model number is accel-mma84 and it is plugged into port C, so to access the module I only need to call require('accel-mma84').use(tessel.port['C']). This is so easy!\nThe core function I need to create is called is_moving and returns true if the Tessel is moving. It uses the get_xyz function, which returns a promise for the acceleration of the module as an array of [x,y,z] values. Then it waits for a second, takes another measurement, and returns true if the summed difference between the two measurements is above a sensitivity level. 
accel = require('accel-mma84').use(tessel.port['C']); get_xyz = -\u0026gt; promisify(accel.getAcceleration, accel)``is_moving = (sensitivity = 0.005) -\u0026gt; xyz1 = null xyz2 = null get_xyz() .then( (xyz) -\u0026gt; xyz1 = xyz q.delay(1000).then(-\u0026gt; get_xyz()) ) .then( (xyz) -\u0026gt; xyz2 = xyz diffx = Math.abs(xyz2[0] - xyz1[0]) diffy = Math.abs(xyz2[1] - xyz1[1]) diffz = Math.abs(xyz2[2] - xyz1[2]) if (diffx + diffy + diffz) \u0026gt; sensitivity true else false )\nAmbient Sound and Light\nSimilar to accelerometer, the is_light_sound_changing function measures the light level and sound level twice, a second apart, and if there are differences above a sensitivity it returns true. ambient = require('ambient-attx4').use(tessel.port['B']);``get_light_and_sound = -\u0026gt; q.all([promisify(ambient.getLightLevel, ambient),promisify(ambient.getSoundLevel, ambient)])``is_light_sound_changing = (light_sensitivity=0.001, sound_sensitivity=0.005) -\u0026gt; l1 = null l2 = null s1 = null s2 = null get_light_and_sound() .then((light_and_sound) -\u0026gt; l1 = light_and_sound[0] s1 = light_and_sound[1] q.delay(1000).then(-\u0026gt; get_light_and_sound()) ) .then((light_and_sound) -\u0026gt; l2 = light_and_sound[0] s2 = light_and_sound[1] diffl = Math.abs(l1 - l2) diffs = Math.abs(s1 - s2) console.log 'diff', diffl, diffs if diffl \u0026gt; light_sensitivity || diffs \u0026gt; sound_sensitivity true else false )\nClimate\nThe is_burning function measures the temperature and if it is above 30 degrees (which is warm for New Zealand, probably cool for Australia) it returns true. 
climate = require('climate-si7020').use(tessel.port['D'])``get_temp = -\u0026gt; promisify(climate.readTemperature, climate)``is_burning = (max_temp=30) -\u0026gt; get_temp() .then( (temp) -\u0026gt; console.log temp if temp \u0026gt; max_temp true else false )\nCamera\nThe turn_camera_on function is used to initialise the camera, and then the take_a_picture function takes a picture and saves it to the computer that the Tessel is plugged in to. camera = require('camera-vc0706').use(tessel.port['A']);``turn_camera_on = -\u0026gt; console.log \u0026quot;Turn Camera On\u0026quot; camera_ready_defer = q.defer() camera.on('ready', -\u0026gt; console.log \u0026quot;Camera On\u0026quot; camera_ready_defer.resolve(true) ) return camera_ready_defer.promise``take_a_picutre = -\u0026gt; promisify(camera.takePicture, camera) .then( (image) -\u0026gt; console.log \u0026quot;Image Taken:\u0026quot;, image name = \u0026quot;picture-#{Date.now()}.jpg\u0026quot; console.log('Picture saving as', name, '...'); process.sendfile(name, image); console.log('done.'); )\nThe Main Loop\nFirst the camera is turned on, then every 10 seconds it checks whether the light or sound is changing, it is too hot, and whether the Tessel is moving. Then, if any of these are true it logs it and takes a picture. turn_camera_on() .then( -\u0026gt; setInterval( -\u0026gt; q.all([is_light_sound_changing(), is_burning(), is_moving()]) .spread( (lsc, burning, moving) -\u0026gt; if lsc or burning or moving console.log \u0026quot;Light or Sound Changing\u0026quot; if lsc console.log \u0026quot;Burning\u0026quot; if burning console.log \u0026quot;Moving\u0026quot; if moving take_a_picutre() else console.log \u0026quot;Nothing is happening\u0026quot; ) , 10000) )\nExecuting it\nTo execute this code first we have to build the javascript code from the coffeescript. coffee -c tessel_security_camera.coffee\nNow we can run it, and give it a directory to put the taken pictures into. 
tessel run tessel_security_camera.js \\ --upload-dir=/pictures/folder\nConclusion\nThe Tessel modules are one of its massive advantages over other embedded development platforms. As you can see above, it is incredibly easy to integrate and use these modules. I think that if you want to quickly make some simple, embedded thing that will just work with the least amount of hassle, then you should definitely consider using the Tessel.\n","permalink":"https://maori.geek.nz/posts/2015/2015-04-29_embeeded-javascript-on-the-tessel-building-a-modular-security-camera/","summary":"\u003cp\u003eI previously wrote an introduction to the \u003ca href=\"https://tessel.io/\"\u003eTessel\u003c/a\u003e \u003ca href=\"http://www.maori.geek.nz/post/testing_out_the_tessel_with_project_1_twitter_to_morse_code\"\u003ehere\u003c/a\u003e where I described a program that blinked out tweets in morse code. For my second project with the Tessel I wanted to use some of the modules that came with it to create a basic security device/camera to take pictures and alert me when something bad is happening.\u003c/p\u003e\n\u003ch3 id=\"modules\"\u003eModules\u003c/h3\u003e\n\u003cp\u003eThe Tessel has 4 different ports to plug modules into, each labeled A, B, C and D. The four modules I decided to use for this project were:\u003c/p\u003e","title":"Embeeded Javascript on the Tessel; Building a Modular Security Camera"},{"content":"I borrowed a Tessel from a friend @leighghunt to play with some embedded node.js. My first project is to listen to a twitter stream and blink it out as Morse code on the Tessel’s LEDs.\nGetting Started npm install -g tessel\nInitialising the project: npm init\nI like CoffeeScript, so to test the Tessel out I adapted the demo code from here to a file called twitter_morse_code.coffee: tessel = require('tessel')``led1 = tessel.led[0].output(1) led2 = tessel.led[1].output(0)``setInterval( -\u0026gt; console.log(\u0026quot;I'm blinking! 
(Press CTRL + C to stop)\u0026quot;) led1.toggle() led2.toggle() , 100)\nThis code will rapidly toggle two LEDs on the board. To run it just plug in the tessel: coffee -c twitter_morse_code.coffee tessel run twitter_morse_code.js\nSoon the Tessel will support CoffeeScript\nWiFi\nTo connect to twitter we can use the onboard tessel WiFi. It can be connected with: tessel wifi -n [network name] -p [password] -s [security type*]\nThe Tessel remembers these settings and will try to connect on startup\nThe WiFi can be checked with:\ntessel wifi -l\nTwitter to Morse Code\nThe code is available at https://github.com/grahamjenson/tessel_twitter_to_morse\nThe Dependencies:\nmorsecode q (bluebird doesn’t work) node-twitter ( twitter doesn’t work) I changed the code in the file twitter_morse_code.coffee tessel = require('tessel') keys = require './twitter_keys.json' MorseCode = require(\u0026quot;morsecode\u0026quot;); morseConverter = new MorseCode(); Twitter = require('node-twitter'); q = require('q')\nMorse code methods: led1 = tessel.led[0] led2 = tessel.led[1]``toggle = (led, time) -\u0026gt; led.write(true) q.delay(time) .then( -\u0026gt; led.write(false) )``dot = -\u0026gt; console.log 'dot' toggle(led1, 100)``dash = -\u0026gt; console.log 'dash' toggle(led2, 200)``morse_blink = (message, text) -\u0026gt; console.log \u0026quot;BLINKING\u0026quot;, text promise = q.fcall( -\u0026gt; ) for char in message if char == '.' promise = promise.then( -\u0026gt; dot()) else if char == '_' promise = promise.then( -\u0026gt; dash())``promise\nThe dot and dash methods return a promise to turn on and off one of the tessel LEDs.\nThe morse_blink function uses javascript promises like a stack of asynchronous events, where it takes a string made of dots . 
and dashes _ and returns a promise to blink the message out.\nTwitter stream: twitterStreamClient = new Twitter.StreamClient(keys.key, keys.secret, keys.akey, keys.asecret);``promise = q.fcall( -\u0026gt; )``twitterStreamClient.on('tweet', (tweet) -\u0026gt; text = tweet.text morse = morseConverter.translate(tweet.text) console.log text console.log morse promise = promise.then( -\u0026gt; morse_blink(morse, text)) );``twitterStreamClient.start(['grahamjenson']);\nAs above, the promises are chained to make sure that we wait until the previous message is finished before we start blinking a new one.\nProblems with undefined vs. null console.log(typeof undefined) console.log(typeof null)\nin node 0.12 returns undefined object\nbut on the tessel it returns: undefined undefined\nThis means that lines like this: if (typeof uri === 'undefined') throw new Error('undefined is not a valid uri or options object.')\nfrom the request package dependency of node-twitter will break. Delete that line and it will work :)\nRunning\nAfter executing: coffee -c twitter_morse_code.coffee tessel run twitter_morse_code.js\nThe tessel will listen for any tweet with grahamjenson in it and then blink it out as Morse on the tessel. The output will look like: test grahamjenson _ . . . . _ _ _ . . _ . . _ . . . . . _ _ _ . _ _ _ . _ . . . . _ _ _ _ . BLINKING test grahamjenson dash dot dot dot dot dash dash dash ...\nConclusion\nThe Tessel is a great little device to play with embedded Node. It is simple to get started and easy to debug. 
If you want to have Javascript interact with the world or you want to teach someone Javascript in a more practical way, I recommend the Tessel.\n*Also: Tim Pietrusky: Nerd Disco Talk *\n","permalink":"https://maori.geek.nz/posts/2015/2015-03-21_embedded-javascript-on-the-tessel-twitter-to-morse-code/","summary":"\u003cp\u003eI borrowed a \u003ca href=\"https://tessel.io/\"\u003eTessel\u003c/a\u003e from a friend \u003ca href=\"https://twitter.com/leighghunt\"\u003e@leighghunt\u003c/a\u003e to play with some embedded node.js. My first project is to listen to a twitter stream and blink it out as \u003ca href=\"http://en.wikipedia.org/wiki/Morse_code\"\u003eMorse code\u003c/a\u003e on the Tessel’s LEDs.\u003c/p\u003e\n\u003ch3 id=\"getting-started\"\u003eGetting Started\u003c/h3\u003e\n\u003cp\u003e\u003ccode\u003enpm install -g tessel\u003c/code\u003e\u003c/p\u003e\n\u003cp\u003eInitialising the project:\n\u003ccode\u003enpm init\u003c/code\u003e\u003c/p\u003e\n\u003cp\u003eI like CoffeeScript, so to test the Tessel out I adapted the demo code from \u003ca href=\"http://start.tessel.io/blinky\"\u003ehere\u003c/a\u003e to a file called twitter_morse_code.coffee:\n\u003ccode\u003etessel = require('tessel')``led1 = tessel.led[0].output(1)   led2 = tessel.led[1].output(0)``setInterval( -\u0026gt;   console.log(\u0026quot;I'm blinking! (Press CTRL + C to stop)\u0026quot;)   led1.toggle()   led2.toggle()   , 100)\u003c/code\u003e\u003c/p\u003e","title":"Embedded Javascript on the Tessel; Twitter to Morse Code"},{"content":"NOTE: I am also maintaining this post on github. This way I can accept pull requests for changes and additions :)\nRecommender systems (or recommendation engines) are useful and interesting pieces of software. I wanted to compare other recommender systems to mine (HapiGER) but couldn’t find a decent list of them, so I decided to create one. 
In this post I will list the recommender systems that I have come across with links and some basic information about them. I intend on keeping this list up-to-date, so comment below if I am missing one or tweet me @grahamjenson.\nSoftware as a Service Recommender Systems SaaS Recommender systems face many development challenges, including handling multi-tenancy, storing and processing massive amounts of data, and softer concerns like keeping a client’s sensitive data safe on remote servers.\nThe benefits of using a SaaS recommender system are that you can pay for value with a low overhead rather than making a large upfront investment, they generally have a clear integration path for you to use, and they provide continual development and improvement while you use them.\nThe SaaS recommender systems I have found are:\nRcmmndr, which I first came across as a Heroku add-on. It is based on Hadoop but seems to be abandoned. Mortar Recommendation Engine is a kind of do-it-yourself recommender system: it provides instructions for creating a recommender system using their PaaS Mortar and MongoDB. Peerius is a closed, product and e-commerce focused system for live and email recommendations. It is active and seems very interesting, although little information about the actual product and how it works is available. Strands is a closed, product and e-commerce focused system. I think it works by including tracking scripts (a la Google Analytics) and recommendation widgets on the website. What I really like about Strands is their publishing of case studies (e.g. Wireless Emporium) and white papers like The Big promise of recommender systems. Although these do not discuss the exact solutions provided, they give a good overview of their vision and goals of providing recommendations. SLI Systems Recommender is a closed recommender system focused on e-commerce, search and mobile. 
Google Cloud Prediction API: Google’s offering of a cloud-computed prediction API. Using Hadoop on Google Cloud: an example use of Google Cloud with benchmarks from a recommender system. ParallelDots: a tool to relate published content. Amazon Machine Learning: a machine learning platform to model data and create predictions. Azure ML: a machine learning platform to model data and create predictions. Gravity R\u0026amp;D is a company built by some of the winners from the 2009 Netflix prize. They offer a solution that provides targeted, customized recommendations to users of websites. They have some pretty big clients including DailyMotion and a technology page which describes their architecture, algorithms, and a list of publications (suggested by Martin Vetes). GraphFlow provides a user event analytics and recommendations API, with integration into the WooCommerce (http://www.woothemes.com/woocommerce/) WordPress store plugin. Open Source Recommender Systems Most of the non-SaaS recommender systems that I came across were open-source. This may have been because recommender systems are more tailored to clients, so they are not easily made into a product.\nThe open-source recommender systems I found are:\nPredictionIO is built on Apache Spark, Apache HBase and Spray. It is a machine learning server that can be used to create a recommender system. The source can be located on github and it looks very active. Racoon Recommendation Engine is an open source Node.js based collaborative filter that uses Redis as a store. It is effectively abandoned. HapiGER is an open source Node.js collaborative filtering engine, which can use in-memory, PostgreSQL or rethinkdb event stores. Reasonably active development (when I have time :) EasyRec: Java and REST based recommendations. Abandoned. Mahout: Hadoop/linear algebra based data mining. Seldon is a Java based prediction engine built on technologies like Apache Spark. It provides a demo movie recommendations application here. 
LensKit is a Java based research recommender system designed for small-to-medium scale. Oryx v2: a large scale architecture for machine learning and prediction (suggested by Lorand). Non-SaaS Product Recommender Systems Not very many Non-SaaS Non-OpenSource recommender systems seem to exist. Below is a list:\nDato is a company that provides a Python package and servers for business machine learning including many predictive algorithms for recommendations. They also integrate with Apache Spark and have great blog posts like Why is building custom recommender systems hard? Does it have to be?. Their customers include Pandora and StumbleUpon, so it must be a good product. Academic Recommender Systems Recommender systems are a very active area of research in academia, though few of the generated systems make it out of the lab. Here are a few I have found that did:\nDuine Framework: a Java based recommendation system that has been abandoned. MyMediaLite: a C# based in-memory recommender system that has been abandoned. Bonus: List of Recommender System Dissertations, a useful list to keep up with the current state of recommendation systems in academia. LibRec: a Java based recommendations engine with loads of implemented algorithms (suggested by Saúl Vargas). RankSys: a Java recommendation system for novelty and diversity (created by Saúl Vargas). Benchmarking Recommender Systems It is very difficult to benchmark recommender systems, not only because getting good datasets is hard, but also because different methods and algorithms have different advantages and disadvantages that are difficult to expose.\nHere is a list of some benchmarking tools:\nTagRec: Tag Recommender Benchmarking Framework. RiVaL: an open source toolkit for recommender system evaluation. Some results are posted here. 
Media Recommendation Applications In addition to generic recommender systems, I decided to add a list of applications where recommendations are a core offering, specifically in the domain of media recommendations:\nYeah, Nah: a movie recommendations site based on GER (source). Jinni: a movie recommendations site. Gyde: streaming media recommendations. TasteKid: movies, books, and music recommendations (sent to me by thelinuxlich). Gnoosic: music recommendations based on bands (sent to me by thelinuxlich). Pandora: music recommendations based on likes and dislikes of songs. More If you have any more recommender systems or a better way to sort them, comment down below. I want to try to compile a more complete list to make comparison easier for everyone.\nDon’t forget to check out my recommendation engine HapiGER\n","permalink":"https://maori.geek.nz/posts/2015/2015-03-16_list-of-recommender-systems/","summary":"\u003cp\u003e\u003cstrong\u003eNOTE: I am also maintaining this\u003c/strong\u003e \u003ca href=\"https://github.com/grahamjenson/list_of_recommender_systems\"\u003e\u003cstrong\u003epost on github\u003c/strong\u003e\u003c/a\u003e\u003cstrong\u003e. This way I can accept pull requests for changes and additions :)\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eRecommender systems (or recommendation engines) are useful and interesting pieces of software. I wanted to compare other recommender systems to mine (\u003ca href=\"http://www.hapiger.com/\"\u003eHapiGER\u003c/a\u003e) but couldn’t find a decent list of them, so I decided to create one. In this post I will list the recommender systems that I have come across with links and some basic information about them. \u003cem\u003eI intend on keeping this list up-to-date, so comment below if I am missing one or tweet me @grahamjenson.\u003c/em\u003e\u003c/p\u003e","title":"List of Recommender Systems"},{"content":"In my previous post about using Vagrant to run Docker containers, I recommended not to use Boot2Docker. 
Since then my issues with Boot2Docker have been resolved and now it is my preferred way to use Docker in OSX.\nHere is a quick tutorial on how to set up Postgres, Elasticsearch, and Redis as Docker containers using Boot2Docker on Mac OS X.\nBoot2Docker for OSX If you are using OSX and want to use Docker, then Boot2Docker is the recommended tool. It works by using a virtual machine to host Docker and letting the OSX command line ‘remotely’ call it with the docker command.\nTo install Boot2Docker (requires homebrew): brew install boot2docker\nTo initialise and start the virtual machine: boot2docker init boot2docker up\nFinally, to link the docker command line interface to the Docker host: export DOCKER_IP=$(boot2docker ip) export DOCKER_HOST=$(boot2docker socket)\nCall docker ps to test that docker is correctly linked; it will return successfully if it is working. If the Docker host is not running, or the DOCKER_HOST environment variable is not set, the error will look something like: Get http:///var/run/docker.sock/v1.12/containers/json: dial unix /var/run/docker.sock: no such file or directory\nPostgreSQL Running Postgres as a Docker container is pretty easy, simply execute: docker run -it -p 5432:5432 postgres\nThe -it flags link the standard input and output of the container to the OSX terminal, making it much easier to debug and letting you kill a container with Ctrl-C.\nYou can use psql to check PostgreSQL is working: psql -h $DOCKER_IP -U postgres\nElasticsearch I don’t like to run Elasticsearch on my local machine because I don’t like the JVM reminding me to update. 
Instead I now use Docker: docker run -it -p 9200:9200 dockerfile/elasticsearch\nYou can use Elasticsearch’s HTTP interface to see if it is running: curl $DOCKER_IP:9200\nRedis Running Redis is also pretty easy: docker run -it -p 6379:6379 dockerfile/redis\nYou can use the redis-cli to check it is running: redis-cli -h $DOCKER_IP\nPersistence These containers are not set up to persist their data, so if you kill or restart them you will lose everything. To properly set them up refer to their documentation on the Docker registry:\npostgres dockerfile/elasticsearch dockerfile/redis Further Reading The Docker Book\nThe Future of Docker\n","permalink":"https://maori.geek.nz/posts/2015/2015-02-14_boot-2-docker-how-to-set-up-postgres-elasticsearch-and-redis-on-mac-os-x/","summary":"\u003cp\u003eIn my \u003ca href=\"http://www.maori.geek.nz/post/vagrant_with_docker_how_to_set_up_postgres_elasticsearch_and_redis_on_mac_os_x\"\u003eprevious post\u003c/a\u003e about using \u003ca href=\"http://www.vagrantup.com/\"\u003eVagrant\u003c/a\u003e to run \u003ca href=\"http://www.docker.com/\"\u003eDocker\u003c/a\u003e containers, I recommended not to use \u003ca href=\"http://boot2docker.io/\"\u003eBoot2Docker\u003c/a\u003e. Since then my issues with Boot2Docker have been resolved and now it is my preferred way to use Docker in OSX.\u003c/p\u003e\n\u003cp\u003eHere is a quick tutorial on how to set up \u003ca href=\"http://www.postgresql.org/\"\u003ePostgres\u003c/a\u003e, \u003ca href=\"http://www.elasticsearch.org/\"\u003eElasticsearch\u003c/a\u003e, and \u003ca href=\"http://redis.io/\"\u003eRedis\u003c/a\u003e as Docker containers using Boot2Docker on Mac OS X.\u003c/p\u003e\n\u003ch3 id=\"boot2docker-for-osx\"\u003eBoot2Docker for OSX\u003c/h3\u003e\n\u003cp\u003eIf you are using OSX and want to use Docker, then \u003ca href=\"http://boot2docker.io/\"\u003eBoot2Docker\u003c/a\u003e is the recommended tool. 
It works by using a virtual machine to host Docker and letting the OSX command line ‘remotely’ call it with the docker command.\u003c/p\u003e","title":"Boot 2 Docker: How to set up Postgres, Elasticsearch and Redis on Mac OS X"},{"content":"I am proud to announce the beta version of HapiGER an open-source, easy to use, easy to integrate recommendations engine. It is built using the Good Enough Recommendations (GER) engine and the Hapi.js framework.\nIn this post I will describe how you can use HapiGER to generate recommendations for your users.\nInstall HapiGER Install with npm npm install -g hapiger\nStart HapiGER By default it will start with an in-memory event store (events are not persisted) hapiger\nThere are also PostgreSQL and RethinkDB event stores for persistence and scaling\nGive an Action Weight Set the view action to have weight 1: curl -X POST 'http://localhost:3456/default/actions' -d'{ \u0026quot;name\u0026quot;: \u0026quot;view\u0026quot;, \u0026quot;weight\u0026quot;: 1 }'\nCreate some Events Alice views Harry Potter curl -X POST 'http://localhost:3456/default/events' -d '{ \u0026quot;person\u0026quot;:\u0026quot;Alice\u0026quot;, \u0026quot;action\u0026quot;: \u0026quot;view\u0026quot;, \u0026quot;thing\u0026quot;:\u0026quot;Harry Potter\u0026quot; }'\nThen, Bob also views Harry Potter (now Bob has similar viewing habits to Alice) curl -X POST 'http://localhost:3456/default/events' -d '{ \u0026quot;person\u0026quot;:\u0026quot;Bob\u0026quot;, \u0026quot;action\u0026quot;: \u0026quot;view\u0026quot;, \u0026quot;thing\u0026quot;:\u0026quot;Harry Potter\u0026quot; }'\nBob then buys The Hobbit curl -X POST 'http://localhost:3456/default/events' -d '{ \u0026quot;person\u0026quot;:\u0026quot;Bob\u0026quot;, \u0026quot;action\u0026quot;: \u0026quot;buy\u0026quot;, \u0026quot;thing\u0026quot;:\u0026quot;The Hobbit\u0026quot; }'\nGet Recommendations What books should Alice buy? 
curl -X GET \u0026quot;http://localhost:3456/default/recommendations?person=Alice\u0026amp;action=buy\u0026quot;\n{ \u0026quot;recommendations\u0026quot;:[ { \u0026quot;thing\u0026quot;:\u0026quot;The Hobbit\u0026quot;, \u0026quot;weight\u0026quot;:0.22119921692859512, \u0026quot;people\u0026quot;:[ \u0026quot;Bob\u0026quot; ], \u0026quot;last_actioned_at\u0026quot;:\u0026quot;2015-02-05T05:56:42.862Z\u0026quot; } ], \u0026quot;confidence\u0026quot;:0.00019020140391302825, \u0026quot;similar_people\u0026quot;:{ \u0026quot;Bob\u0026quot;:1 } }\nAlice should buy The Hobbit as it was recommended by Bob with a weight of about 0.2.\nThe confidence of these recommendations is pretty low because there are not many events in the system.\nHow HapiGER Works (the Quick Version) The HapiGER API calculates recommendations for Alice to buy by:\nFinding people that are like Alice by looking at her past events\nCalculating the similarities between Alice and those people\nLooking at the recent things that those similar people buy\nWeighting those things using the similarity of the people\nIf you would like to read more about how HapiGER works, here is the long version.\nEvent Stores The “in-memory” event store is the default; it will not scale well or persist events, so it is not recommended for production.\nThe recommended event store is PostgreSQL, which can be used with: hapiger --es pg --esoptions '{ \u0026quot;connection\u0026quot;:\u0026quot;postgres://localhost/hapiger\u0026quot; }'\nOptions are passed to knex.\nHapiGER also supports a RethinkDB event store: hapiger --es rethinkdb --esoptions '{ \u0026quot;host\u0026quot;:\u0026quot;127.0.0.1\u0026quot;, \u0026quot;port\u0026quot;: 28015, \u0026quot;db\u0026quot;:\u0026quot;hapiger\u0026quot; }'\nOptions are passed to rethinkdbdash.\nCompacting the Event Store The event store needs to be regularly maintained by removing old, outdated, or superfluous events; this is called compacting. 
This can be done either synchronously or asynchronously (it can take a while): curl -X POST 'http://localhost:3456/default/compact'\ncurl -X POST 'http://localhost:3456/default/compact_async'\nNamespaces Namespaces are used to separate events for different applications or categories of things. The default namespace is default, but you can create a namespace with: curl -X POST 'http://localhost:3456/namespace' -d'{ \u0026quot;namespace\u0026quot;: \u0026quot;new_ns\u0026quot; }'\nTo delete a namespace (and all its events!): curl -X DELETE 'http://localhost:3456/namespace/new_ns'\nConfiguration of HapiGER There are many configuration variables for HapiGER to tune the generated recommendations; these can be viewed with hapiger --help. The impact of each of these options is described in the long version of how HapiGER works.\nClients Node.js client ger-client ","permalink":"https://maori.geek.nz/posts/2015/2015-02-07_hapiger-recommendations-made-easy/","summary":"\u003cp\u003eI am proud to announce the beta version of \u003ca href=\"http://www.hapiger.com\"\u003eHapiGER\u003c/a\u003e an open-source, easy to use, easy to integrate recommendations engine. 
It is built using the \u003ca href=\"https://github.com/grahamjenson/ger\"\u003eGood Enough Recommendations (GER)\u003c/a\u003e engine and the \u003ca href=\"http://hapijs.com\"\u003eHapi.js\u003c/a\u003e framework.\u003c/p\u003e\n\u003cp\u003eIn this post I will describe how you can use HapiGER to generate recommendations for your users.\u003c/p\u003e\n\u003ch4 id=\"install-hapiger\"\u003eInstall HapiGER\u003c/h4\u003e\n\u003cp\u003eInstall with npm\n\u003ccode\u003enpm install -g hapiger\u003c/code\u003e\u003c/p\u003e\n\u003ch4 id=\"start-hapiger\"\u003eStart HapiGER\u003c/h4\u003e\n\u003cp\u003eBy default it will start with an in-memory event store (events are not persisted)\n\u003ccode\u003ehapiger\u003c/code\u003e\u003c/p\u003e","title":"HapiGER: Recommendations Made Easy"},{"content":"accepts_nested_attributes_for is a really powerful method in Rails because it allows a model to alter related models through itself. However, it has a pretty big gotcha.\nAn example using accepts_nested_attributes_for is an Alias model which belongs_to a User model, i.e. #user.rb class User \u0026lt; ActiveRecord::Base end\n#alias.rb class Alias \u0026lt; ActiveRecord::Base belongs_to :user accepts_nested_attributes_for :user end\nThis allows the Alias model to change the user by passing a hash key user_attributes i.e. `Alias.first.user.name\n\u0026ldquo;Alice\u0026rdquo;\nAlias.first.update_attributes(\n{\n:user_attributes =\u0026gt; {\n:id =\u0026gt; 1,\n:name =\u0026gt; \u0026ldquo;Bob\u0026rdquo;\n}\n})``Alias.first.user.name\n\u0026ldquo;Bob\u0026rdquo;`\nThe gotcha is that if you do not pass the :id key in the attributes hash, a new user record is created instead of the existing one being updated, i.e. 
`Alias.first.user_id\n1\nAlias.first.update_attributes(\n{\n:user_attributes =\u0026gt; {\n:name =\u0026gt; \u0026ldquo;Bob\u0026rdquo;\n}\n})``Alias.first.user_id\n2`\nThis behaviour is documented, it is just not what I would have expected.\nTo \u0026ldquo;fix\u0026rdquo; this (if you do not want to pass the id every time) you can set the :update_only flag to true, i.e. #alias.rb class Alias \u0026lt; ActiveRecord::Base belongs_to :user accepts_nested_attributes_for :user, :update_only =\u0026gt; true end\n","permalink":"https://maori.geek.nz/posts/2015/2015-01-29_acceptsnestedattributesfor-is-creating-new-records-gotcha/","summary":"\u003cp\u003eaccepts_nested_attributes_for is a really powerful method in Rails because it allows a model to alter related models through itself. However, it has a pretty big \u003cem\u003egotcha\u003c/em\u003e.\u003c/p\u003e\n\u003cp\u003eAn example using accepts_nested_attributes_for is where a user model which belongs_to an alias model, i.e.\n\u003ccode\u003e#user.rb   class User \u0026lt; ActiveRecord::Base   end``#alias.rb   class Alias \u0026lt; ActiveRecord::Base   belongs_to :user   accepts_nested_attributes_for :user   end\u003c/code\u003e\u003c/p\u003e\n\u003cp\u003eThis allows the Alias model to change the user by passing a hash key user_attributes i.e.\n`Alias.first.user.name\u003c/p\u003e","title":"accepts_nested_attributes_for is Creating New Records; Gotcha!"},{"content":"Docker is great fun when you start building things by plugging useful containers together. Recently I have been playing with Consul and trying to plug things together to make a truly horizontally scalable web application architecture. Consul is a Service Discovery and Configuration application, made by HashiCorp the people who brought us Vagrant.\nPreviously I experimented using Consul by using SRV records (described here) to create a scalable architecture, but I found this approach a little complicated, and I am all about simple. 
Then I found Consul Template which links to Consul to update configurations and restart applications when services come up or go down.\nIn this post I will describe how to use Docker to plug together Consul, Consul Template, Registrator and Nginx into a truly scalable architecture that I am calling DR CoN. Once all plugged together, DR CoN lets you add and remove services from the architecture without having to rewrite any configuration or restart any services, and everything just works!\nDocker Docker is an API wrapper around LXC (Linux containers) so will only run on Linux. Since I am on OSX (as many of you probably are) I have written a post about how to get Docker running in OSX using boot2docker. This is briefly described below: brew install boot2docker boot2docker init boot2docker up\nThis will start a lightweight Linux virtual machine running a Docker daemon. To attach to the daemon you can run: export DOCKER_IP=$(boot2docker ip) export DOCKER_HOST=$(boot2docker socket)\nYou can test Docker is correctly installed using: docker ps\nBuild a very simple Web Service with Docker To test the DR CoN architecture we will need a service. For this, let\u0026rsquo;s create the simplest service that I know how (further described here). Create a file called Dockerfile with the contents: FROM python:3 EXPOSE 80 CMD [\u0026quot;python\u0026quot;, \u0026quot;-m\u0026quot;, \u0026quot;http.server\u0026quot;, \u0026quot;80\u0026quot;]\nIn the same directory as this file execute: docker build -t python/server .\nThis will build the Docker image and call it python/server, which can be run with: docker run -it -p 8000:80 python/server\nTo test that it is running we can call the service with curl: curl $DOCKER_IP:8000\nConsul Consul is best described as a service that has a DNS and an HTTP API. It also has many other features like health checking services, clustering across multiple machines and acting as a key-value store. 
To run Consul in a Docker container execute: docker run -it -h node \\ -p 8500:8500 \\ -p 8600:53/udp \\ progrium/consul \\ -server \\ -bootstrap \\ -advertise $DOCKER_IP \\ -log-level debug\nIf you browse to $DOCKER_IP:8500 there is a dashboard to see the services that are registered in Consul.\nTo register a service in Consul’s web API we can use curl: curl -XPUT \\ $DOCKER_IP:8500/v1/agent/service/register \\ -d '{ \u0026quot;ID\u0026quot;: \u0026quot;simple_instance_1\u0026quot;, \u0026quot;Name\u0026quot;:\u0026quot;simple\u0026quot;, \u0026quot;Port\u0026quot;: 8000, \u0026quot;tags\u0026quot;: [\u0026quot;tag\u0026quot;] }'\nThen we can query Consuls DNS API for the service using dig: dig @$DOCKER_IP -p 8600 simple.service.consul``; \u0026lt;\u0026lt;\u0026gt;\u0026gt; DiG 9.8.3-P1 \u0026lt;\u0026lt;\u0026gt;\u0026gt; simple.service.consul ;; global options: +cmd ;; Got answer: ;; -\u0026gt;\u0026gt;HEADER\u0026lt;\u0026lt;- opcode: QUERY, status: NOERROR, id: 39614 ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0``;; QUESTION SECTION: ;simple.service.consul. IN A``;; ANSWER SECTION: simple.service.consul. 0 IN A 192.168.59.103``;; Query time: 1 msec ;; SERVER: 192.168.59.103#53(192.168.59.103) ;; WHEN: Mon Jan 12 15:35:01 2015 ;; MSG SIZE rcvd: 76\nHold on, there is a problem, where is the port of the service? Unfortunately DNS A records do not return the port of a service, to get that we must check SRV records: dig @$DOCKER_IP -p 8600 SRV simple.service.consul``; \u0026lt;\u0026lt;\u0026gt;\u0026gt; DiG 9.8.3-P1 \u0026lt;\u0026lt;\u0026gt;\u0026gt; SRV simple.service.consul ;; global options: +cmd ;; Got answer: ;; -\u0026gt;\u0026gt;HEADER\u0026lt;\u0026lt;- opcode: QUERY, status: NOERROR, id: 3613 ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1``;; QUESTION SECTION: ;simple.service.consul. IN SRV``;; ANSWER SECTION: simple.service.consul. 
0 IN SRV 1 1 8000 node.node.dc1.consul.``;; ADDITIONAL SECTION: node.node.dc1.consul. 0 IN A 192.168.59.103``;; Query time: 1 msec ;; SERVER: 192.168.59.103#53(192.168.59.103) ;; WHEN: Mon Jan 12 15:36:54 2015 ;; MSG SIZE rcvd: 136\nSRV records are difficult to use because they are not supported by many technologies.\nThe container srv-router can be used with Consul and nginx to route incoming calls to the correct services, as described here. However there is an easier way than that to use nginx to route to services.\nRegistrator\nRegistrator takes environment variables defined when a Docker container is started and automatically registers it with Consul. For example: docker run -it \\ -v /var/run/docker.sock:/tmp/docker.sock \\ -h $DOCKER_IP progrium/registrator \\ consul://$DOCKER_IP:8500\nStarting a service with: docker run -it \\ -e \u0026quot;SERVICE_NAME=simple\u0026quot; \\ -p 8000:80 python/server\nWill automatically add the service to Consul, and stopping it will remove it. This is the first part to plugin to DR CoN as it will mean no more having to manually register services with Consul.\nConsul Template\nConsul Template uses Consul to update files and execute commands when it detects the services in Consul have changed.\nFor example, it can rewrite an nginx.conf file to include all the routing information of the services then reload the nginx configuration to load-balance many similar services or provide a single end-point to multiple services.\nI modified the Docker container from https://github.com/bellycard/docker-loadbalancer for this example `FROM nginx:1.7#Install Curl RUN apt-get update -qq \u0026amp;amp;\u0026amp;amp; apt-get -y install curl#Download and Install Consul Template\nENV CT_URL http://bit.ly/15uhv24\nRUN curl -L $CT_URL | \\\ntar -C /usr/local/bin \u0026ndash;strip-components 1 -zxf -#Setup Consul Template Files RUN mkdir /etc/consul-templates ENV CT_FILE /etc/consul-templates/nginx.conf#Setup Nginx File\nENV NX_FILE 
/etc/nginx/conf.d/app.conf#Default Variables ENV CONSUL consul:8500 ENV SERVICE consul-8500# Command will\n1. Write Consul Template File 2. Start Nginx 3. Start Consul Template``CMD echo \u0026ldquo;upstream app { \\n\\ least_conn; \\n\\\n{{range service \u0026quot;$SERVICE\u0026quot;}} \\n\\\nserver {{.Address}}:{{.Port}}; \\n\\\n{{else}}server 127.0.0.1:65535;{{end}} \\n\\\n} \\n\\\nserver { \\n\\\nlisten 80 default_server; \\n\\\nlocation / { \\n\\\nproxy_pass http://app; \\n\\\n} \\n\\\n}\u0026rdquo; \u0026gt; $CT_FILE; \\\n/usr/sbin/nginx -c /etc/nginx/nginx.conf \\\n\u0026amp; CONSUL_TEMPLATE_LOG=debug consul-template \\\n-consul=$CONSUL \\\n-template \u0026ldquo;$CT_FILE:$NX_FILE:/usr/sbin/nginx -s reload\u0026rdquo;;`\nThe repository for this file is here.\nNOTE: the \\n\\ adds a new line and escapes the newline for Docker multiline command\nThis Docker container will run both Consul Template and nginx, and when the services change it will rewrite the nginx app.conf file, then reload nginx.\nThis container can be built with: docker build -t drcon .\nand run with: docker run -it \\ -e \u0026quot;CONSUL=$DOCKER_IP:8500\u0026quot; \\ -e \u0026quot;SERVICE=simple\u0026quot; \\ -p 80:80 drcon\nSERVICE is query used to select which services to include from Consul. 
So this DR CoN container will now load balance across all services named simple.\nAll Together\nLet\u0026rsquo;s now plug everything together!\nRun Consul docker run -it -h node -p 8500:8500 -p 53:53/udp progrium/consul -server -bootstrap -advertise $DOCKER_IP\nRun Registrator docker run -it -v /var/run/docker.sock:/tmp/docker.sock -h $DOCKER_IP progrium/registrator consul://$DOCKER_IP:8500\nRun DR CoN docker run -it -e \u0026quot;CONSUL=$DOCKER_IP:8500\u0026quot; -e \u0026quot;SERVICE=simple\u0026quot; -p 80:80 drcon\nCalling the service: curl $DOCKER_IP:80\ncurl: (52) Empty reply from server\nNow start a service named simple docker run -it -e \u0026quot;SERVICE_NAME=simple\u0026quot; -p 8000:80 python/server\nThis will cause:\nRegistrator to register the service with Consul\nConsul Template to rewrite the nginx app.conf file and then reload the configuration\nNow curl $DOCKER_IP:80 will be routed successfully to the service.\nIf we then start another simple service on a different port with: docker run -it -e \u0026quot;SERVICE_NAME=simple\u0026quot; -p 8001:80 python/server\nRequests will now be load balanced across the two services.\nA fun thing to do is to run while true; do curl $DOCKER_IP:80; sleep 1; done while killing and starting simple services and see that this all happens so fast no requests get dropped.\nConclusion\nArchitectures like DR CoN are much easier to describe, distribute and implement using Docker and are impossible without good tools like Consul. Plugging things together and playing with Docker\u0026rsquo;s ever more powerful tools is fun and useful. Now I can create a horizontally scalable architecture and have everything just work.\nFurther Reading\nThe Docker Book\n","permalink":"https://maori.geek.nz/posts/2015/2015-01-21_scalable-architecture-dr-con-docker-registrator-consul-consul-template-and-nginx/","summary":"\u003cp\u003eDocker is great fun when you start building things by plugging useful containers together. 
Recently I have been playing with \u003ca href=\"https://www.consul.io/\"\u003eConsul\u003c/a\u003e and trying to plug things together to make a truly horizontally scalable web application architecture. Consul is a \u003cstrong\u003eService Discovery and Configuration\u003c/strong\u003e application, made by \u003ca href=\"https://hashicorp.com/\"\u003eHashiCorp\u003c/a\u003e the people who brought us \u003ca href=\"http://www.maori.geek.nz/post/vagrant_with_docker_how_to_set_up_postgres_elasticsearch_and_redis_on_mac_os_x\"\u003eVagrant\u003c/a\u003e.\u003c/p\u003e\n\u003cp\u003ePreviously I experimented using Consul by using SRV records (\u003ca href=\"http://www.maori.geek.nz/post/docker_web_services_with_consul\"\u003edescribed here\u003c/a\u003e) to create a scalable architecture, but I found this approach a little complicated, and I am all about simple. Then I found \u003ca href=\"https://hashicorp.com/blog/introducing-consul-template.html\"\u003eConsul Template\u003c/a\u003e which links to Consul to update configurations and restart application when services come up or go down.\u003c/p\u003e","title":"Scalable Architecture DR CoN: Docker, Registrator, Consul, Consul Template and Nginx"},{"content":"I am not sure where the boundary between Data and Big Data exists, but I must be getting close. As the amount of data I deal with grows, my tools and processes have had to significantly adapt to many new challenges. This caused me to start to think about the qualities of working with Big Data as compared to physical laws.\nIn this post I will explore the challenges of working with Big Data using some analogies with scale, malleability and gravity. As with all analogies, these are not exact and just exploring a different way of thinking about a problem\nScale “Are the physical laws symmetrical under a change of scale? Suppose we build a certain piece of apparatus, and then build another apparatus five times bigger in every part, will it work exactly the same way? 
The answer is, in this case, no!” — Richard Feynman Symmetry in Physical Laws\nThe first and most obvious property of Big Data is its size. As the amount of data increases, the processing time and complexity of the tools and processes used increase at an accelerating rate. In this way, Big Data is not symmetric under scale.\nFor example, tools that analyse 1GB of data could load it entirely into RAM. However, a tool that analyses data 1,000 times larger (1TB) will be more than 1,000 times slower because the tool will have to use the HDD, increasing complexity and reducing speed and/or accuracy. A tool that analyses data 1,000,000 times bigger (1PB) will have to use network storage, increasing complexity even more.\nThis asymmetry of data’s scale exists because of the limits in two dimensions of processing power: getting bigger processors (vertical scale) and getting more processors (horizontal scale).\nVertical scale is the performance of an individual computer. To scale vertically you can buy a bigger, faster computer to process more data. The limits on vertical scale occur once you have the biggest, fastest computer, but it is not big or fast enough to process the data. This is a big problem because the amount of data is growing faster than even Moore’s law (which roughly states that computer power doubles every 24 months) can keep up with.\nAn interesting implication of the accelerated rate of data growth is:\nin the future, we will need more hardware resources just to make the same decision!\nHorizontal scale is the performance of a group of computers. To scale horizontally you purchase more computers and add them to a cluster to process your data. The limit of horizontal scaling is the bandwidth between the computers, and the way in which the data processing is coordinated and distributed. The limit along the horizontal scale occurs when the bandwidth is saturated. 
At this point, any additional computers will not be able to receive or process data fast enough.\nVertical and horizontal scale cause asymmetries in Big Data processing. Understanding where the limits are in relation to your problem is necessary to avoid running headlong into a large wall.\nMalleability “The deployment of millions cannot be improvised” — Moltke the first German commander during WWI\nData resists change as its size increases. For example, a simple task like fixing a spelling mistake that occurs in a 10th of the data would be easily accomplished in small data-sets. As the data-set grows, questions must be answered, such as: what if the process fails before it finishes? Or what if someone tries to use the data while it is being altered?\nAs Moltke says above, the bigger the numbers the more you must plan. Changing the shape of large amounts of data must be well thought out or you risk fracturing its internal structure.\nGravity it is all driven by a need to answer a question, what is the question? — Big Data, Big Innovation Evan Stubbs\n1. Collect data to answer questions\n2. Answers raise more questions\n3. Go to 1\nData attracts data like a gravitational force. The price of successfully using your data to answer questions will be the requirement to collect more data. This is not a problem in and of itself, but in how it makes the problems of scale and malleability worse. Given that data attracts data, and the more data you have the more problems you have to overcome, Good Luck!\nConclusion I am reasonably new to the field of Big Data and I am still working things out. 
Thinking about a problem from a different perspective often helps me ground the problems I am facing, and look for solutions from other places.\n","permalink":"https://maori.geek.nz/posts/2015/2015-01-20_bigdata-quantity-has-a-quality-all-its-own/","summary":"\u003cp\u003eI am not sure where the boundary between \u003cstrong\u003eData\u003c/strong\u003e and \u003cstrong\u003eBig Data\u003c/strong\u003e exists, but I must be getting close. As the amount of data I deal with grows, my tools and processes have had to significantly adapt to many new challenges. This caused me to start to think about the qualities of working with \u003cstrong\u003eBig Data\u003c/strong\u003e as compared to physical laws.\u003c/p\u003e\n\u003cp\u003eIn this post I will explore the challenges of working with \u003cstrong\u003eBig Data\u003c/strong\u003e using some analogies with \u003cstrong\u003escale\u003c/strong\u003e, \u003cstrong\u003emalleability\u003c/strong\u003e and \u003cstrong\u003egravity\u003c/strong\u003e. 
\u003cem\u003eAs with all analogies, these are not exact and just exploring a different way of thinking about a problem\u003c/em\u003e\u003c/p\u003e","title":"Big-Data: Quantity has a Quality All Its Own"},{"content":" Yeah, Nah: New Zealand slang for yes, or possibly no e.g.\nYeah, Nah is a movie recommendation application built with the Good Enough Recommendation engine (GER), using themoviedb.org’s API for movie information, Hapi.js as its web framework, and Angular.js for front end code.\nSource here\n","permalink":"https://maori.geek.nz/posts/2014/2014-12-15_yeah-nah-movie-recommender-service/","summary":"\u003cblockquote\u003e\n\u003cp\u003eYeah, Nah: New Zealand slang for yes, or possibly no \u003ca href=\"http://www.sayyeahnah.org.nz/\"\u003ee.g.\u003c/a\u003e\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cp\u003e\u003ca href=\"http://yeahnah.maori.geek.nz/\"\u003e\u003cstrong\u003eYeah, Nah\u003c/strong\u003e\u003c/a\u003e is a movie recommendation application built with the \u003ca href=\"https://github.com/grahamjenson/ger\"\u003eGood Enough Recommendation engine (GER)\u003c/a\u003e, using \u003ca href=\"https://www.themoviedb.org/documentation/api\"\u003ethemoviedb.org’s API\u003c/a\u003e for movie information, \u003ca href=\"http://hapijs.com/\"\u003eHapi.js\u003c/a\u003e as its web framework, and \u003ca href=\"https://angularjs.org/\"\u003eAngular.js\u003c/a\u003e for front end code.\u003c/p\u003e\n\u003cp\u003eSource \u003ca href=\"https://github.com/grahamjenson/yeahnah\"\u003ehere\u003c/a\u003e\u003c/p\u003e","title":"Yeah, Nah: Movie Recommender Service"},{"content":"GER (Good Enough Recommendations) is a recommendations engine that could directly add value and increase user engagement for many existing applications. GER is an open source npm module that you could download and start using right now. 
However, you probably want to know how GER works and how to get good recommendations out of it.\nIn this post I describe GER’s core model, its practical features and its limitations to help you use GER to get good enough recommendations.\nThe GER Model I am sorry for the formality, but formal models are the easiest way to remove ambiguity and describe precisely what is going on.\nThe core sets of GER are:\nP: people\nT: things\nA: actions\nEvents are elements of the Cartesian product of these sets, i.e. P × A × T. For example, when bob likes the hobbit movie, this is represented with the event \u0026lt;bob, like, hobbit\u0026gt;.\nThe history of any given person is the set of all the things they have actioned, in the form \u0026lt;action,thing\u0026gt;, i.e. a subset of A × T. The function H takes a person and returns their history. For example the history for bob after he liked the hobbit would be H(bob) = {\u0026lt;like, hobbit\u0026gt;}.\nThe Jaccard similarity metric, defined as the function J, is used to calculate the similarity between people using their histories. 
That is, the similarity between two people p1, p2 is the Jaccard similarity of their two histories: J(p1,p2) = |H(p1) ∩ H(p2)| / |H(p1) ∪ H(p2)|.\nFor example, suppose that bob liked the hobbit and hated the x-men, whereas alice only hated the x-men.\nH(alice) = {\u0026lt;hate, x-men\u0026gt;}\nH(bob) = {\u0026lt;like, hobbit\u0026gt;, \u0026lt;hate, x-men\u0026gt;}\nH(bob) ∩ H(alice) = {\u0026lt;hate, x-men\u0026gt;} with cardinality 1\nH(bob) ∪ H(alice) = {\u0026lt;like, hobbit\u0026gt;, \u0026lt;hate, x-men\u0026gt;} with cardinality 2\nThe similarity between bob and alice is therefore J(bob,alice) = 1/2\nThe Jaccard distance (1 − J) is a proper metric, so the similarity comes with many useful properties like symmetry, where J(bob, alice) = J(alice,bob).\nIt is also useful to define similarity:\nTwo people are said to be similar if they have a non-zero Jaccard similarity\nRecommendations Recommendations are a set of weighted things which are calculated for a person p and action a using the function R(p,a). The weight of a thing t is the sum of the similarities between the person p and all people who have \u0026lt;a, t\u0026gt; in their history. One additional constraint on R is that it only returns non-zero weighted recommendations.\nFor example, given that:\nbob likes the x-men but hates harry potter\nalice hates harry potter, and likes the x-men and avengers\ncarl likes the x-men, the avengers and batman\nWhat movie recommendations should bob like, i.e. R(bob,like)? We can calculate that:\nJ(bob,bob) = 1\nJ(bob,alice) = 2/3\nJ(bob,carl) = 1/4\nbob has three potential recommendations to like: x-men, avengers and batman. 
For each of these we can calculate the weight with which bob will like them:\nx-men is J(bob,bob) + J(bob,alice) + J(bob,carl) = 1 + 2/3 + 1/4 ≈ 1.92\navengers is J(bob,alice) + J(bob,carl) = 2/3 + 1/4 ≈ 0.92\nbatman is J(bob,carl) = 0.25\nTherefore, the recommendations for bob to like are R(bob,like) = {\u0026lt;x-men, 1.92\u0026gt;, \u0026lt;avengers, 0.92\u0026gt;, \u0026lt;batman, 0.25\u0026gt;}. Even though bob has seen x-men it has been included in the recommendations because he does like it. This would make sense if the recommendations were for something that could be consumed multiple times, like food or music.\nPractical Changes to the Model The above model is simple and wouldn’t be able to deal with some of the real world requirements and limitations. Therefore, some additional features and necessary limitations have been applied to the model to make it practical.\nAdditional Features In the simple model each action is treated equally when measuring a person’s similarity to another; this is not the case in reality. If two people liked the same thing they may be more similar than if they hated the same thing. By weighting each action, finding the Jaccard similarity per action, and then combining the results with respect to each action’s weight, the similarity function can more accurately represent reality.\nWhen an event occurs is a very important concept ignored in the simple model. If a person liked the hobbit today and x-men last year, they are probably more receptive to recommendations like the hobbit. To handle this, every event has an attached date of when it most recently occurred and:\nThe most recent events (defined using a variable for a number of days) are weighted higher than past events, done by calculating multiple Jaccard similarities with a weighted mean. Note: this may break the symmetry of our similarity function, further mathematicians are required.\nRecommending something that a person has already actioned (e.g. bought) could be undesirable. 
By providing a list of the actions to filter recommendations, selected recommendations can be removed if they occurred in a person’s history. For example, it makes sense to filter hate actions to stop recommending things they clearly don’t want. However, they could potentially still receive recommendations for things they may have already liked, because every year they might like to re-watch movies.\nLimitations When dealing with large sets of data, practical limitations are necessary to ensure performance. Here is the list of limitations imposed on the above model and features.\nModel Limitations The first limitation is to not generate recommendations for a person that has under a minimum amount of history. For example, if a person has only liked one movie, their generated recommendations will probably be random. In this case GER returns no recommendations and lets the client handle this situation.\nThe most expensive aspect of GER is finding and calculating similarity between people. This is especially expensive for any person who has a large history and every person they are similar to. Given that a person with a large history is similar to many people, only a few such people can significantly decrease the performance of the entire engine. To ensure this is not the case, a few limitations were put in place:\nLimit the number of similar people to find, while attempting to find the most similar people for a user’s recent activity\nLimit the size of the history when calculating similarities\nFinding and weighting every potential recommendation from all similar people may also be expensive and returning every recommendation is likely superfluous. 
For this the limitations in place are:\nOnly recommend the most recent events from the similar users Only return a number of the best recommendations Limiting the number of similar people, the length of their history, and the number of recommendations to find, all have different performance and accuracy impacts per data-set. Finding the best values for these is a learning process through trial and error.\nAn important aspect to note about these limits is that they may create the potential for abuse and malicious manipulation of the recommendations. A way to see this is by considering a person who hates all movies, but only likes one. The implications of such a user are:\nThey will be similar to all people who have hated anything Due to limiting history size, they may have a much higher similarity than they would otherwise have had Every person would include in their potential recommendations the movie the malicious person likes Therefore, a person who profits from manipulating recommendations of other users may attempt to manipulate the system this way.\nData-set Compacting Limitations Given the above description, it is clear that some events will never be used. For example, if the events are old or belong to a user who has a long history they will not be used in any calculations. These events just loiter, take up space and slow calculations down. Identifying these events with some basic heuristics and removing them can dramatically speed up performance and decrease the size of the data-set. I call these compacting algorithms.\nCurrently there are two main compacting algorithms:\nLimit the number of events per person, per action, e.g. ensure bob has a maximum of 1000 hates. Limit the number of events per thing, per action, e.g. ensure that the hobbit only has 1000 hates. These compacting algorithms delete the oldest events first as newer events carry more practical importance. 
They also solve the problem stated above about the malicious user who hates everything, as their history will be reduced and they will be similar to fewer people.\nLike the other limitations, the numbers associated with the compacting limitations are data-set specific, and can probably be best found through trial and error.\nThe Algorithmic Description The API for recommendations follows the core model and accepts a person and an action and returns a list of weighted things by following these steps:\nFind similar people to person by looking at their history (limiting the number of returned similar people) Calculate the similarities from person to the list of people (limiting the amount of history) Find a list of the most recent things the similar people have actioned (limiting the number returned) Calculate the weights of things using the similarity of the people (filtering based on filter actions and returning the highest weighted) Technology GER is implemented in Coffee-Script on top of Node.js (here are my reasons for using Coffee-Script).\nA core abstraction is the Event Store Manager (ESM), which implements the persistence and similarity calculation. Currently there is an in-memory ESM and a PostgreSQL ESM. There is also a RethinkDB ESM in the works being implemented by the awesome linuxlich.\nHelp Now that you know how GER works you can help out. Please consider using and testing it out. If you are able to contribute, consider creating an ESM for your favourite database. 
The links are:\nNPM package: https://www.npmjs.org/package/ger GitHub repo: https://github.com/grahamjenson/ger I am open to suggestions and improvements (especially if they are in the form of a pull-request or fork!)\nThe overall goal is a recommendations engine that will be good enough for most users.\n","permalink":"https://maori.geek.nz/posts/2014/2014-12-03_gers-anatomy-how-to-generate-good-enough-recommendations/","summary":"\u003cp\u003e\u003ca href=\"https://github.com/grahamjenson/ger\"\u003e\u003cstrong\u003eGER\u003c/strong\u003e (Good Enough Recommendations)\u003c/a\u003e is a recommendations engine that could directly add value and increase user engagement for many existing applications. GER is an open source \u003ca href=\"http://npmjs.org/package/ger\"\u003enpm module\u003c/a\u003e that you could download and start using \u003cstrong\u003eright now\u003c/strong\u003e. However, you probably want to know how GER works and how to use it to get good recommendations out of it.\u003c/p\u003e\n\u003cp\u003eIn this post I describe GER’s core model, its practical features and its limitations to help you use GER to get good enough recommendations.\u003c/p\u003e","title":"GER’s Anatomy: How to Generate Good Enough Recommendations"},{"content":"Search boxes are useful so they are everywhere! Yet they are treated with disdain, as secondary class elements, hidden away in the corners and pushed to the side of other ‘more important’ content. Only put there as a last resort for frustrated users who cannot navigate to their destination. As if to try and make users feel like they failed when they have to use the search box.\nIf you treat it your search box this badly it will never provide you its full potential value. The search box gives your users a way to tell you, in their own words, what they want and what they expect you to have. 
You do not need to guess these things, just read what they typed, no further analysis required.\nBe nice to your search box, be proud of it, put it center stage. Make it useful and a key part of your application, not the backup plan for users that failed to click the right links. Do this and every time a user arrives they will fill out a single question survey that tells you exactly what they want.\nTips on UX Highlight your search box, put it top and center. Don’t hide it in a corner and style the box white on white like many sites do. Facebook, Youtube, Amazon all have their search boxes center top and highlighted. Auto-complete is very useful and not only to reduce a user’s typing. It also provides hints and reduces the user’s spelling mistakes. Try typing a common misspelling like “vaccum” into Amazon and count how many hints it provides that you spelled it wrong before you actually search. Tips on Analytics Google Analytics provides a tool to measure and report on search queries. It can be a very useful guide to analyzing user behaviour. If you log both the search query and the number of results you return, you can look at the most common queries, but also find queries that return 0 results. These can be seen as the things that your users want that you don’t have. Satisfying the most common query with 0 results is a very quick win for any application. Related Links I got an email from the guys over at swiftype who look to have a really nice looking search and analytics tool that covers the points I made above. I have not used this tool (so this is not an endorsement), but I would love to hear from anyone who has used swiftype and hear their opinion (maybe in the comments down below).\n","permalink":"https://maori.geek.nz/posts/2014/2014-11-10_search-box-is-a-single-question-survey-what-do-you-want/","summary":"\u003cp\u003eSearch boxes are \u003cem\u003euseful\u003c/em\u003e so they are \u003cstrong\u003eeverywhere\u003c/strong\u003e! 
Yet they are treated with disdain, as secondary class elements, hidden away in the corners and pushed to the side of other \u003cem\u003e‘more important’\u003c/em\u003e content. Only put there as a last resort for frustrated users who cannot navigate to their destination. As if to try and make users feel like they failed when they have to use the search box.\u003c/p\u003e\n\u003cp\u003eIf you treat it your search box this badly it will never provide you its full potential value. The search box gives your users a way to tell you, \u003cstrong\u003ein their own words\u003c/strong\u003e, what they want and what they \u003cstrong\u003eexpect you to have\u003c/strong\u003e. You do not need to guess these things, just read what they typed \u003cstrong\u003eno further analysis required\u003c/strong\u003e.\u003c/p\u003e","title":"A Search Box is a Single Question Survey; “What Do You Want?”"},{"content":"These are some of the tools I use to test my Node.js code:\nMocha is a testing framework for describing and running tests Chai is an assertion library Sinon is a mocking and stubbing library In this post I will give a brief introduction to each of these, with some basic examples and tips.\nMocha Mocha.js is a test running framework. Install Mocha with npm install -g mocha. 
Run mocha to execute all the javascript test files in the test directory.\nSome example Mocha tests are: var assert = require(\u0026quot;assert\u0026quot;)``describe('mocha', function(){ it('should fail when throwing an error', function(){ throw \u0026quot;FAIL\u0026quot; })``it('should fail when asserting false', function(){ assert(false) })``it('should pass when finishing without error', function(){ })``it('should pass when asserting true', function(){ assert(true) })``it('should be pending with no function') })\ndescribe is used to organise your tests it takes a test’s name and function (no function makes the test pending) Running mocha for these tests will output: mocha 1) should fail when throwing an error 2) should fail when asserting false ✓ should pass when finishing without error ✓ should pass when asserting true - should be pending with no function``2 passing (5ms) 1 pending 2 failing``1) mocha should fail when throwing an error: Error: the string \u0026quot;FAIL\u0026quot; was thrown, throw an Error :) at \u0026lt;STACKTRACE\u0026gt;``2) mocha should fail when asserting false: AssertionError: false == true at \u0026lt;STACKTRACE\u0026gt;\nChai\nChai is an assertion library which can be used with Mocha to write readable tests. 
Chai can be installed with npm install chai and it is used like: var chai = require('chai') chai.should()``describe('chai', function(){ it('should fail when asserting false', function(){ false.should.equal(true) })``it('should pass when testing type', function(){ \u0026quot;string\u0026quot;.should.be.a(\u0026quot;string\u0026quot;) })``it('should pass when testing include', function(){ [1,2,3].should.include(2) }) })\nCalling chai.should() adds the should object to the Object prototype which lets you write tests that read like English with Chai\u0026rsquo;s many testing methods.\nNote: I prefer the should style of testing, but Chai also supports the expect style.\nSinon\nSinon is a mocking and stubbing library that can be used to replace functions on an object with fake, inspectable functions. Sinon can be installed with npm install sinon and used like: var chai = require('chai') chai.should()``var sinon = require('sinon')``var Test = { do: function(thing){ return \u0026quot;no\u0026quot;} }``describe('sinon', function(){ it('should stub a method', function(){ Test.do(\u0026quot;thing\u0026quot;).should.equal(\u0026quot;no\u0026quot;)``sinon.stub(Test, \u0026quot;do\u0026quot;, function(){return \u0026quot;yes\u0026quot;}) Test.do(\u0026quot;thing\u0026quot;).should.equal(\u0026quot;yes\u0026quot;)``Test.do.restore() Test.do(\u0026quot;thing\u0026quot;).should.equal(\u0026quot;no\u0026quot;) })``it('should validate if a function is called', function(){ sinon.stub(Test, \u0026quot;do\u0026quot;, function(){return \u0026quot;yes\u0026quot;})``Test.do.calledOnce.should.be.false``Test.do(\u0026quot;thing\u0026quot;).should.equal(\u0026quot;yes\u0026quot;) Test.do.calledOnce.should.be.true``Test.do.restore() })``it('should validate a functions parameters', function(){ sinon.stub(Test, \u0026quot;do\u0026quot;, function(thing){ thing.should.equal(\u0026quot;thing\u0026quot;) return \u0026quot;yes\u0026quot; 
})``Test.do(\u0026quot;thing\u0026quot;).should.equal(\u0026quot;yes\u0026quot;) Test.do.calledOnce.should.be.true``Test.do.restore() }) })\nSinon\u0026rsquo;s stub method takes an object and the name of the function to stub, and replaces that function with a \u0026lsquo;spy\u0026rsquo;. You can assert a spy has been called with calledOnce, change the output the spy returns, and test the arguments that are passed to it.\nrestore should be called on the spy at the end of a test to restore the old function back to the object, otherwise it will stay stubbed for the next test.\nPromises\nIf you return a promise in a mocha test it will fail if that promise is rejected. For example, using the q promise library: var chai = require('chai') chai.should()``var q = require('q')``describe('promises', function(){ it('should pass if returning resolved promise', function(){ var d = q.defer() d.resolve(\u0026quot;PASS\u0026quot;) return d.promise })``it('should fail if returning failed promise', function(){ var d = q.defer() d.reject(\u0026quot;FAIL\u0026quot;) return d.promise }) })\nThis makes testing async code much easier.\nDefault settings\nMocha has a rich set of options to customise your running tests.\nInstead of adding many arguments each time you run mocha you can create the file test/mocha.opts which contains your default options.\nIn mocha.opts can be specified a common test helper, common code that each test requires. Adding \u0026ndash;require test/test_helper to test/mocha.opts will require test/test_helper.js file before each test. In this helper we could, for example, move the common chai requirements: var chai = require('chai') chai.should()\nYou can also specify different test compilers in mocha.opts. I typically write code in CoffeeScript (for reasons described here). 
Mocha supports this choice, and by adding \u0026ndash;compilers coffee:coffee-script/register to test/mocha.opts it will execute all the tests written in CoffeeScript.\nFurther Reading\nMy previous post on Testing promises in Node.js with Mocha, Chai and Sinon.\nNode.js in Action\nNode.js the Right Way\n","permalink":"https://maori.geek.nz/posts/2014/2014-10-30_testing-javascript-with-mocha-chai-and-sinon/","summary":"\u003cp\u003eThese are some of the tools I use to test my Node.js code:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003e\u003ca href=\"http://mochajs.org/\"\u003e\u003cstrong\u003eMocha\u003c/strong\u003e\u003c/a\u003e is a testing framework for describing and running tests\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"http://chaijs.com/\"\u003e\u003cstrong\u003eChai\u003c/strong\u003e\u003c/a\u003e is an assertion library\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"http://sinonjs.org/\"\u003e\u003cstrong\u003eSinon\u003c/strong\u003e\u003c/a\u003e is a mocking and stubbing library\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eIn this post I will give a brief introduction to each of these, with some basic examples and tips.\u003c/p\u003e\n\u003ch3 id=\"mocha\"\u003eMocha\u003c/h3\u003e\n\u003cp\u003e\u003ca href=\"http://mochajs.org/\"\u003eMocha.js\u003c/a\u003e is a test running framework. Install Mocha with npm install -g mocha. Run mocha to execute all the javascript test files in the test directory.\u003c/p\u003e","title":"Testing Javascript with Mocha, Chai, and Sinon"},{"content":"Do you need a solution to scale your web services both vertically and horizontally, with load balancing and health checking? 
Consul with Docker and Nginx can help!\nIn this post I will describe how to use Consul (a service registry) and Nginx (with srv-router) running in Docker containers to load balance across multiple services.\nNote: For OSX users boot2docker is required\nConsul Consul is a service registry that uses the DNS protocol to return a list of healthy services, in a random order (for load balancing).\nConsul can be run as a docker container with: docker run -it -h node \\ -p 8500:8500 \\ -p 8600:53/udp \\ progrium/consul \\ -server \\ -bootstrap \\ -advertise $DOCKER_IP\nThis runs Consul using the progrium/consul image, mapping to the ports 8500 to the web interface, and 8600 to the DNS interface. node is the given hostname, and it is being started as a server in bootstrap mode.\nFinally, DOCKER_IP is the address of the docker host and is used for advertising. If you are using boot2docker export DOCKER_IP=boot2docker ip will assign it correctly.\nA service can be registered with HTTP: curl -XPUT \\ $DOCKER_IP:8500/v1/agent/service/register \\ -d '{ \u0026quot;ID\u0026quot;: \u0026quot;simple_instance_1\u0026quot;, \u0026quot;Name\u0026quot;:\u0026quot;simple\u0026quot;, \u0026quot;Port\u0026quot;: 8000, \u0026quot;tags\u0026quot;: [\u0026quot;tag\u0026quot;] }'\nThis service can be discovered using the DNS query tool dig: dig @$DOCKER_IP -p 8600 \\ tag.simple.service.consul\nNginx with srv-router To have multiple instances of a service running on a single machine they have to be exposed on different ports. DNS A records only contain an IP address, where SRV records contain the port of the service.\nsrv-router uses\nSRV records to correctly redirect an incoming request to the address and port. 
srv-router is a Docker container that runs an Nginx server modified with a Lua script to route requests using SRV records.\nTo run srv-router: docker run -it -p 80:80 \\ --net host \\ -e \u0026quot;NS_IP=$DOCKER_IP\u0026quot; \\ -e \u0026quot;NS_PORT=8600\u0026quot; \\ -e \u0026quot;TARGET=simple.service.consul\u0026quot; \\ -e \u0026quot;DOMAINS=$DOCKER_IP\u0026quot; \\ vlipco/srv-router\nThis will start the srv-router on port 80.\n--net host gives the container the same network interface as the Docker host to give it access to all other containers. The NS_IP and NS_PORT point towards the Consul server.\nThe srv-router when called will query Consul for home.simple.service.consul, then route to the address and port that is returned. This is the tags namespace in Consul, so each Consul service must have the home tag.\nStart to Finish Using a simple Docker web-service described here, let’s set up a “server” that load-balances across multiple services using Consul and srv-router.\nFirst, let’s run two servers on ports 8001 and 8002: docker run -it -p 8001:80 python/server docker run -it -p 8002:80 python/server\nNow let’s start Consul: docker run -it -h node -p 8500:8500 -p 8600:53/udp progrium/consul -server -bootstrap -advertise $DOCKER_IP\nNow srv-router: docker run -it -p 80:80 --net host -e \u0026quot;NS_IP=$DOCKER_IP\u0026quot; -e \u0026quot;NS_PORT=8600\u0026quot; -e \u0026quot;TARGET=simple.service.consul\u0026quot; -e \u0026quot;DOMAINS=$DOCKER_IP\u0026quot; vlipco/srv-router\nFinally, let’s register the services: curl -XPUT \\ $DOCKER_IP:8500/v1/agent/service/register \\ -d '{ \u0026quot;ID\u0026quot;: \u0026quot;simple_instance_1\u0026quot;, \u0026quot;Name\u0026quot;:\u0026quot;simple\u0026quot;, \u0026quot;Port\u0026quot;: 8001, \u0026quot;tags\u0026quot;: [\u0026quot;home\u0026quot;] }'``curl -XPUT \\ $DOCKER_IP:8500/v1/agent/service/register \\ -d '{ \u0026quot;ID\u0026quot;: \u0026quot;simple_instance_2\u0026quot;, 
\u0026quot;Name\u0026quot;:\u0026quot;simple\u0026quot;, \u0026quot;Port\u0026quot;: 8002, \u0026quot;tags\u0026quot;: [\u0026quot;home\u0026quot;] }'\nLets check Consul has the services with: dig @$DOCKER_IP -p 8600 \\ home.simple.service.consul SRV\nFinally, lets call the services with curl $DOCKER_IP.\nWhat should happen is:\nsrv-router gets the request and asks Consul for the SRV records for home.simple.service.consul Consul returns in a random order the two SRV records for simple_instance_1 on port 8001 and simple_instance_2 on port 8002. srv-router then routes the request to one of those services then the service handles the request and returns a response Future Things and Stuff In this post I did not describe how Consul can do health checking of the services, or how to streamline registering services with registrator. I will definitely go over these aspects in further posts.\nUsing Docker to contain and distribute services and Consul with srv-router to load-balance across them could greatly reduce growing pains of services. In conclusion, Docker is awesome.\nOther Resources Consul HTTP API\nThe Docker Book\nprogrium/docker-consul\nprogrium/registrator\nThe Future of Docker\n","permalink":"https://maori.geek.nz/posts/2014/2014-09-29_docker-web-services-with-consul/","summary":"\u003cp\u003eDo you need a solution to scale your web services both vertically and horizontally, with load balancing and health checking? 
\u003ca href=\"http://www.consul.io/\"\u003eConsul\u003c/a\u003e with \u003ca href=\"https://www.docker.com/\"\u003eDocker\u003c/a\u003e and \u003ca href=\"http://nginx.org/\"\u003eNginx\u003c/a\u003e can help!\u003c/p\u003e\n\u003cp\u003eIn this post I will describe how to use \u003ca href=\"http://www.consul.io/\"\u003eConsul\u003c/a\u003e (a service registry) and \u003ca href=\"http://nginx.org/\"\u003eNginx\u003c/a\u003e (with \u003ca href=\"https://github.com/vlipco/srv-router\"\u003esrv-router\u003c/a\u003e) running in \u003ca href=\"https://www.docker.com/\"\u003eDocker\u003c/a\u003e containers to load balance across multiple services.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eNote: For OSX users\u003c/em\u003e \u003ca href=\"http://boot2docker.io/\"\u003e\u003cem\u003eboot2docker\u003c/em\u003e\u003c/a\u003e \u003cem\u003eis required\u003c/em\u003e\u003c/p\u003e\n\u003ch3 id=\"consul\"\u003eConsul\u003c/h3\u003e\n\u003cp\u003eConsul is a service registry that uses the DNS protocol to return a list of healthy services, in a random order (for load balancing).\u003c/p\u003e","title":"Docker Web Services with Consul"},{"content":"A common change to make to a Rails application is to extract an attribute from a model into a one-to-many relationship. This change can be made without causing a large amount of downtime, even if there a significant amount of records needing to be changed.\nIn this post, I will describe how to change a model to replace an attribute with a one-to-many relationship while minimising downtime and emphasising continuous deployment.\nThe problem We have a Person model with a name attribute and have been asked to change it to allow for a Person to have multiple Aliases, including their name. There are about a million people in the database, and minimising the downtime of the system is a high priority.\nThe naive solution is to change the system in a single deployment with a migration to create Alias and remove name. 
Such a deployment would take a long time, and could be very dangerous as ensuring such a large change to the system doesn’t break anything can be difficult.\nA safer solution is to break down the change into multiple steps and alter the system over many deployments. Below are the steps I have used in the past to solve such problems.\nStep 1: Before Validate, Create The first step is to create the Alias migration and model: class CreateAliases \u0026lt; ActiveRecord::Migration def change create_table :aliases do |t| t.string :name t.integer :person_id end end end``class Alias \u0026lt; ActiveRecord::Base attr_accessible :name belongs_to :person end\nThen edit the Person model to create an Alias without “hooking it up” to the main functionality: class Person \u0026lt; ActiveRecord::Base attr_accessible :name has_many :aliases, :autosave =\u0026gt; true before_validation :upsert_alias`` def upsert_alias person_alias = aliases.first || aliases.build person_alias.name = self.name end ...\nThis will allow the system to create all the Aliases in the background without causing an outage with: Person.find_each do |person| person.save end\nStep 2: Delegate and Drop After step 1 we know that every person has exactly one Alias because any new Person is created with an Alias and all existing people have had an Alias attached.\nDelegating the name functions from Person to the Alias will allow the new model to start being used while also having most of the old code keep working. class Person \u0026lt; ActiveRecord::Base attr_accessible :name has_many :aliases, :autosave =\u0026gt; true`` delegate :name, \u0026quot;name=\u0026quot;, \u0026quot;name_changed?\u0026quot;, to: :first_alias`` def first_alias aliases.first || aliases.build end ...\nOnce delegation is working name can be removed from Person. This may cause some pain as any code like Person.where(:name =\u0026gt; \u0026lsquo;bob\u0026rsquo;) will break. 
It may also cause problems for other entities in your organisation (e.g. Data Warehouse) which may depend on database structure.\nStep 3: Explicit Build Altering the first_alias function to not build an alias if one doesn\u0026rsquo;t exist, e.g. def first_alias aliases.first end\nwill require functionality that creates people to explicitly create aliases. The previous assumption, that an Alias will be created when a Person\u0026rsquo;s name is accessed, no longer holds. Now the alias will have to be created explicitly for person.name to not break. Basically, anywhere a Person is created an alias must be added.\nStep 4: Remove Delegation The final stage is to remove the delegation from the Person model, i.e. class Person \u0026lt; ActiveRecord::Base has_many :aliases, :autosave =\u0026gt; true ...\nThis is the most painful step, but can be accomplished slowly. As you maintain the code, or write new functions, just make sure to use person.first_alias.name (or however you want to access the model) instead of person.name.\nOnce the delegation has been removed you are done.\nConclusion These steps may not work for your problem, as every application is unique in its own way (to paraphrase Tolstoy). However, when I come across similar problems I inevitably use a solution like the one described above as it gives me a lot of room to safely and slowly move my application’s structure to how I need it.\nReferences Ruby Rogues episode Extreme Deployment has a great discussion about similar problems.\nContinuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation\n","permalink":"https://maori.geek.nz/posts/2014/2014-09-24_replacing-an-attribute-with-a-onetomany-relationship-in-rails/","summary":"\u003cp\u003eA common change to make to a Rails application is to extract an attribute from a model into a one-to-many relationship. 
This change can be made without causing a large amount of downtime, even if there is a significant amount of records needing to be changed.\u003c/p\u003e\n\u003cp\u003eIn this post, I will describe how to change a model to replace an attribute with a one-to-many relationship while minimising downtime and emphasising continuous deployment.\u003c/p\u003e","title":"Replacing an Attribute with a One-to-Many Relationship in Rails"},{"content":"Docker is a great tool to test out new application architectures. To make sure that my architectures routed the right calls to the right places, I needed to have a simple containerised web service that just logged when it was called.\nIn this post, I am going to describe how to create a simple web service with Docker and Python; it is even small enough to fit in a tweet. This post will be brief.\nBoot2Docker for OSX If you are using OSX, then Boot2Docker is the tool for you. It works by using a virtual machine to host Docker and letting the OSX command line ‘remotely’ call it with the docker command.\nTo install Boot2Docker: brew install boot2docker\nTo initialise and start the virtual machine: boot2docker init boot2docker up\nFinally, to link docker to the Docker host: export DOCKER_IP=$(boot2docker ip) export DOCKER_HOST=$(boot2docker socket)\nCall docker ps to test that docker is correctly linked; this will return successfully if it is working. If the Docker host is not running, or the DOCKER_HOST environment variable is not set, the error will look something like: Get [http:///var/run/docker.sock/v1.12/containers/json:](http:///var/run/docker.sock/v1.12/containers/json:) dial unix /var/run/docker.sock: no such file or directory\nSmall Web Service Let’s build the service. Create a file called Dockerfile that contains: FROM python:3 EXPOSE 80 CMD [\u0026quot;python\u0026quot;, \u0026quot;-m\u0026quot;, \u0026quot;http.server\u0026quot;, \u0026quot;80\u0026quot;]\nA new image is defined from the python version 3 image. 
When the container is run it will run Python’s http.server module, which starts an HTTP server on the exposed port 80.\nBuild the Dockerfile with: docker build -t python/server .\nThe image is built with the tag python/server and can be run with: docker run -i -t -p 8000:80 python/server\nThis maps the container’s port 80 to the host’s port 8000 (-p 8000:80).\n-i keeps STDIN open and pipes it to the container. So if you Ctrl+C the container will be killed, making services easier to manage.\n-t gives the container a pseudo-tty, which basically pipes the container’s output to STDOUT so you can see when the server has been called.\nUse curl $DOCKER_IP:8000 to call the service and see the container log your call.\nConclusion I am having a blast mucking around with Docker.\nFurther Reading The Docker Book\nVagrant: Up and Running\n","permalink":"https://maori.geek.nz/posts/2014/2014-09-08_smallest-docker-web-service-that-could/","summary":"\u003cp\u003e\u003ca href=\"https://www.docker.com/\"\u003eDocker\u003c/a\u003e is a great tool to test out new application architectures. To make sure that my architectures routed the right calls to the right places, I needed to have a simple containerised web service that just logged when it was called.\u003c/p\u003e\n\u003cp\u003eIn this post, I am going to describe how to create a simple web service with Docker and Python, it is even small enough to fit in a \u003ca href=\"https://twitter.com/GrahamJenson/status/508428481454034944\"\u003etweet\u003c/a\u003e. This post will be brief.\u003c/p\u003e","title":"The Smallest Docker Web Service That Could"},{"content":"When implementing the Good Enough Recommendations (GER) engine, a core requirement was to let users insert large amounts of data quickly in order to bootstrap the recommendations engine. 
Additionally, this bootstrapping should be available over HTTP, as this will become the primary channel for interaction with GER.\nPostGres (which GER uses) has the COPY command that is “optimised for loading large numbers of rows” in various formats, and npm has the package pg-copy-streams that passes Node.js streams to COPY. This would work well with the Hapi.js web application framework which can turn an uploaded file into a Node.js stream without having to hold the entire file in memory or create a temporary file on disk.\nIn this post I will describe how to upload a file and directly insert its data into PostGres using Node.js streams with Hapi.js and pg-copy-streams.\nStreaming Data Directly into PostGres with Bootstrap GER implements a function bootstrap that takes a csv_stream, inserts its data into PostGres, and returns a Q promise for when it is finished (or errored). bootstrap can be broken down into two parts, getting the connection and inserting the data.\nGetting the Connection from Knex GER uses the query builder Knex to communicate and manage connections to the database. The first action for the bootstrap function is to get a connection to the database from Knex: runner = new knex.client.Runner(knex.client) runner.ensureConnection() .then( (connection) =\u0026gt; runner.connection = connection #Use connection ) .finally( -\u0026gt; runner.cleanupConnection())\nAs with Upsert, this is a bit of a round-about way of dealing with connections. Knex is great, but working around its edges of functionality can be difficult.\nInserting the Data with pg-copy-streams The query that is used to insert the data is: query = \u0026quot;COPY events (person, action, thing) FROM STDIN CSV\u0026quot;\nThis query copies data into the table events, filling the columns person, action and thing from the standard input (STDIN) stream in comma-separated values (CSV) format. 
For example, if the streams data were: bob, views, product1 alice, buys, product2\nthe query would insert two events one for each row.\nUsing the from function in pg-copy-streams (copyFrom = require(‘pg-copy-streams’).from) the query is wrapped and sent to the PostGres connection: copy = connection.query(copyFrom(query))\nThe returned copy stream is a writable stream where the input csv_stream is piped: csv_stream.pipe(copy)\nTo notify the caller that bootstrap has finished or errored, a Q defer is created which listens to the streams end and error events: deferred = q.defer()``csv_stream.pipe(copy) .on('end', -\u0026gt; deferred.resolve()) .on('error', (err) -\u0026gt; deferred.reject(err))\nAll together this data insertion looks like: query = \u0026quot;COPY events (person, action, thing) FROM STDIN CSV\u0026quot; copy = connection.query(copyFrom(query)); deferred = q.defer() csv_stream.pipe(copy) .on('end', -\u0026gt; deferred.resolve()) .on('error', (err) -\u0026gt; deferred.reject(err)) return deferred.promise\nHapi.js and Streams By using a Hapi.js server hooked up to GER’s bootstrap function, a file can be uploaded and streamed directly into PostGres. As described in the Hapi.js docs. a route can be setup to output a Node.js stream.\nTo implement this a Hapi.js server must be created with: Hapi = require('hapi') server = new Hapi.Server('localhost', 8000)\nA route that takes a file upload and turns it into a Node.js stream is added to the Hapi.js server with: server.route method: 'POST' path: 'event/bootstrap' config: payload: maxBytes: 209715200 output:'stream' parse: true handler: (request, reply) -\u0026gt; #do things...\nThe handler option is the function that handles the request. It can access the uploaded file stream in the requests payload, e.g. request.payload[“events”]. This stream is passed to GER’s bootstrap function, i.e. 
handler: (request, reply) =\u0026gt; ger.bootstrap(request.payload[\u0026quot;events\u0026quot;]) .then( -\u0026gt; reply({finished: true})) .fail((err) -\u0026gt; reply({error: err}).code(500))\nThe final part is to start the Hapi.js server with server.start()\nTesting the Server To test the server and the route, curl can be used to upload a file, e.g. curl -i -F events=@data.csv http://localhost:8000/event/bootstrap\ncurl can also take a standard stream and upload that, e.g. head data.csv | curl -i -F events=@- http://localhost:8000/event/bootstrap\nI would just like to take the time to examine how awesome this is. head creates a standard stream and pipes it to curl, which turns it into an HTTP multipart request; Hapi.js turns that request into a Node.js stream, which is then piped into PostGres as a standard stream by GER’s bootstrap function. That is just cool!\nPerformance Metrics I wrote a small mocha test that compared inserting 10,000 events into GER one event at a time with inserting 10,000 events using the bootstrap function.\nThe results were:\n0.7297ms per event when each event was inserted one at a time\n0.0696ms per event for events using bootstrap\nThat is roughly a 10 times performance improvement when inserting events.\nThe difference is even more pronounced once the overhead of HTTP is added, as each individual insert requires its own HTTP request, whereas one uploaded file is only one request.\nFurther Reading Substack’s Stream Handbook\nImage from RLA-Inque\n","permalink":"https://maori.geek.nz/posts/2014/2014-08-28_streaming-directly-into-postgres-with-hapi.js-and-pgcopystream/","summary":"\u003cp\u003eWhen implementing the \u003ca href=\"http://maori.geek.nz/post/good_enough_recomendations_with_ger\"\u003eGood Enough Recommendations\u003c/a\u003e (GER) engine, a core requirement was to let users insert large amounts of data quickly in order to 
bootstrap the recommendations engine. Additionally, this bootstrapping should be available over HTTP, as this will become the primary channel for interaction with GER.\u003c/p\u003e\n\u003cp\u003ePostGres (which GER uses) has the \u003ca href=\"http://www.postgresql.org/docs/9.3/static/sql-copy.html\"\u003eCOPY\u003c/a\u003e command that is \u003cem\u003e“optimised for loading large numbers of rows”\u003c/em\u003e in various formats, and npm has the package \u003ca href=\"https://www.npmjs.org/package/pg-copy-streams\"\u003epg-copy-streams\u003c/a\u003e that pass Node.js streams to COPY. This would work well with the \u003ca href=\"http://hapijs.com/\"\u003eHapi.js\u003c/a\u003e web application framework which can turn an uploaded file into a Node.js stream without having to hold the entire file in memory or create a temporary file on disk.\u003c/p\u003e","title":"Streaming directly into Postgres with Hapi.js and pg-copy-stream"},{"content":"While developing the Good Enough Recommendations (GER) engine, I needed to Upsert a record in Postgres. Upsert is a function that updates a record if it exists, or inserts the record if it doesn’t. However, Postgres doesn’t come with upsert out-of-the-box, so I had to find out how best to implement it.\nIn this post, I will describe two methods to upsert records in Postgres, multi-query and single-statement. Then I will describe how I compared them to select a method for GER to use.\nNote: this post uses CoffeeScript, and Q promises\nGER’s Upsert Action GER is a recommendations engine implemented in Node.js using Knex.js to query Postgres. GER’s API has the method set_action_weight(action,weight) which will:\nupdate the actions weight OR insert the action with that weight if it does not exist. To implement this function GER needed to Upsert the action record. 
Given that upsert is not available in Postgres, I found two possible workarounds:\nimplement it with multiple queries to Postgres in the application\ncombine the queries into a single statement to be sent to Postgres\nThe Multi-Query Approach The most straightforward approach to implementing upsert is directly in the application by calling Postgres multiple times.\nTo implement this the function set_action_weight(action, weight) first checks to see whether the action already exists using count: @knex.select('*').from('actions') .where(action: action).count() .then( (count) =\u0026gt; ...\nIf the action does not exist (i.e. the count is 0) then insert the record, otherwise update it: if count == 0 @knex('actions') .insert({action: action, weight: weight}) else @knex('actions').where(action: action) .update({weight: weight})\nThis code can introduce a race condition: if the same action is added twice at the same time, both calls can see a count of 0 and both try to insert the record. Given actions must be unique in GER, this would cause a unique key violation error (whose code is 23505) that must be handled: .catch( (error) -\u0026gt; if error.code != '23505' throw error )\nThe main problems with this method are:\nAlthough the code is reasonably straightforward, it requires ugly exception handling\nEach time this method is called, it will call the database multiple times, potentially impacting performance\nThe Single-Statement Approach Upsert can also be implemented in a single Postgres statement, as described here.\nKnex can be used to build query strings with the toString function, whose output can be combined into a single statement.\nFirst the insert statement is built: insert = @knex(\u0026quot;actions\u0026quot;) .insert({action: action, weight: weight}) .toString()\nTo work around this bug the values must be replaced with select: insert = insert.replace(/\\svalues\\s\\(/,\u0026quot; select \u0026quot;)[..-2]\nThe update query can then be built: update = @knex(\u0026quot;actions\u0026quot;) 
.where(action: action) .update({weight: weight}) .toString()\nThe single statement can then be constructed to first lock the table, removing the possibility of a race condition. Then some fancy SQL can be used to first try to update the record, and if no rows are updated it will then try to insert: query = \u0026quot;BEGIN; LOCK TABLE actions IN SHARE ROW EXCLUSIVE MODE; WITH upsert AS (#{update} RETURNING *) #{insert} WHERE NOT EXISTS (SELECT * FROM upsert); COMMIT;\u0026quot;\nKnex can then send this statement with: @knex.raw(query)\nThis single-statement method for upsert executes entirely inside Postgres, and removes the possibility of a race condition. However, it adds some reasonably complex SQL that is Postgres specific and might be difficult to maintain.\nComparison of Methods To properly compare these methods, the actual gain in performance should be measured. This is because optimisation without metrics is not optimisation. So, I ran this test against them: ger = new GER() start_time = new Date().getTime() promises = [] for x in [1..1000] promises.push ger.set_action_weight('buy', 1) q.all(promises) .then(-\u0026gt; end_time = new Date().getTime() time = end_time - start_time per_time = time/1000 console.log \u0026quot;#{per_time}ms\u0026quot; )\nThis test executes set_action_weight 1000 times concurrently, waits for them all to complete, and calculates the average time.\nFor the multi-query method it took on average 1.16ms per call, and for the single statement it took 0.99ms.\nThis shows that the single-statement method takes roughly 15% less time per call than multi-query.\nConclusion I understand that the comparison experiment is not a realistic test of the two methods, but it does show that the single-statement method is faster by more than just a little bit. This is why the single-statement method is used in GER, which will hopefully help keep it performant as its use scales.\nUpsert is a common pattern that can be used across applications. 
So it is a handy tool to wield, especially if performance is a requirement.\nReferences/Thanks The Good Enough Recommendations (GER) engine.\nThanks Autaux for the image.\nThe Art of Web\u0026rsquo;s SQL: A basic UPSERT in PostgreSQL. This is an excellent article about upsert in Postgres.\nSeven Databases in Seven Weeks: I should probably read this book.\n","permalink":"https://maori.geek.nz/posts/2014/2014-08-04_postgres-upsert-update-or-insert-in-ger-using-knex.js/","summary":"\u003cp\u003eWhile developing the \u003cstrong\u003eG\u003c/strong\u003eood \u003cstrong\u003eE\u003c/strong\u003enough \u003cstrong\u003eR\u003c/strong\u003eecommendations (\u003ca href=\"http://maori.geek.nz/post/good_enough_recomendations_with_ger\"\u003eGER\u003c/a\u003e) engine, I needed to \u003ca href=\"http://en.wikipedia.org/wiki/Merge_%28SQL%29\"\u003e\u003cstrong\u003eUpsert\u003c/strong\u003e\u003c/a\u003e a record in Postgres. Upsert is a function that updates a record if it exists, or inserts the record if it doesn’t. However, Postgres doesn’t come with upsert out-of-the-box, so I had to find out how best to implement it.\u003c/p\u003e\n\u003cp\u003eIn this post, I will describe two methods to upsert records in Postgres, multi-query and single-statement. Then I will describe how I compared them to select a method for GER to use.\u003c/p\u003e","title":"Postgres Upsert (Update or Insert) in GER using Knex.js"},{"content":"Recommendation engines could be beneficial for many applications as they can directly add value and lead to greater engagement for users. However, there is significant overhead in implementing a custom solution and many off-the-shelf engines have overcomplicated APIs, or try to be infinitely scalable which is not needed by most applications.\nIn this post I introduce the Good Enough Recommendation (GER) engine. 
GER (pronounced like this) is built to be easily usable through a simple API, as well as being reasonably fast and scalable, to let developers focus on their applications and not a recommendation engine.\nGood Enough is All You Need (Right Now) A recommendation engine is a feature (not a product) — Why You Should Not Build a Recommendation Engine\nWhen developing a product, the recommendation engine is a secondary consideration (unless the product is a recommendation engine). Building a custom engine is a difficult and time consuming challenge, and many existing engines are complex to setup and get running. These problems make many developers choose not to use a recommendation engine, even if there could benefit their application.\nGER’s goal is to let developers easily integrate a recommendation engine that is satisfactory for their product, and not overly complex to get up and running. As your product grows and becomes successful, if your recommendations need finer configuration or greater scale, then other solutions like Apache Mahout can be used. But right now, if you want to add a recommendation engine, then GER will be good enough.\nGER and its API GER is a collaborative filtering calculator. Events go into GER and predictions about future events come out.\nIts API includes only four ‘types’: person:String, a thing:String, and an action:String that has a weight:Integer.\nAn event is a person performing an action on/to a thing, e.g. “ann” “buys” “product_1”: event(person, action, thing)\nEach action has a weight (defaulting to 1) which determines how important it is to GER’s predictions, e.g. buying is more important than viewing. The weight of an action can be altered with: set_action_weight(action, weight)\nGER can return an ordered list of recommended things for a person to action, e.g. 
recommend things for “ann” to “buy”: reccommendations_for_person(person, action)\nGER calculates these recommendations by:\nfinding a list of similar people to the person\nthen finding things those similar people have actioned\neach thing is then scored and sorted based on the number and similarity of the people who have actioned it.\nGER contains no business rules, has limited configuration, and requires almost no setup. The benefit is that it is fast and easy to understand, sacrificing the infinite scalability and endless configurations of other engines.\nTechnology I plan to write more posts about the inner workings and development of GER, as it was really challenging and fun to create. Here is a brief description of GER:\nGER is implemented in Node.js using CoffeeScript with Q promises and knex for understandable, asynchronous code.\nGER was developed in a TDD style, and is well tested using Mocha, Chai and Sinon.\nAccuracy It would be impossible to state how accurate GER is for a specific application. However, in the interest of testing GER, I have performed some of my own little experiments.\nI took 7 months’ worth of events where users viewed or purchased products. I then took the first month of events and added them to GER. I weighted the purchasing action to 100, and the viewing action to 1. These were intuitive weights gained through trial and error, not rigorous optimisation.\nI then selected user events from the last 6 months where:\nthe user had at least 10 events in the first month (to select users with a reasonable chance for prediction)\nthe user purchased at least one thing in the last 6 months that GER knew about\nFrom there I selected 500 random purchase events and compared each of them to GER’s top 10 recommendations for the purchasing person. 
The results were:\nabout 16% of the purchases were in GER’s 10 predicted things\nthe mean position for a correct prediction was 2.5\nGER recommendations took on average 124.8ms to complete\nFor a control, I compared GER’s recommendations against just always recommending the top 10 purchased things. The results for this control were:\n5% of the purchases were in the top 10\nthe mean position for a purchase in the top 10 was 3.5\nThese are very positive results that show that with limited bootstrapping (only one month’s worth of data) and a naive configuration GER was able to:\npredict a significant number of user actions.\nhave a high accuracy in what was predicted.\nreturn results significantly better than a top 10 list\nThings Left To Do GER is not finished; there are many things left to complete and this post was just an introduction to get some feedback. I still need to make GER an easily deployable micro-service and increase documentation and tools to support GER’s use.\nGER would be more usable if it were wrapped into a micro-service. To easily deploy GER it could be further wrapped into a Docker container which would greatly simplify integration.\nDocumentation for GER is lacking. I will iteratively improve this as I continue to develop it.\nConclusion Watch this space, as GER continues to be developed. 
Go try it out, and as always, comments and feedback are welcome.\nThanks to Cam Evans for the thumbnail image\nOther Recommendation Engines and Links There are some great recommendation engines and other resources I came across when developing GER:\nA Comparative Study of Collaborative Filtering Algorithms is a paper going over many different collaborative filtering algorithms, implementations and characteristics.\nThe Algorithm Design Manual is a really useful book when thinking about designing and implementing algorithms.\nPredictionIO is a larger, more product based recommendation application.\nApache Mahout is a massively scalable engine that can use clusters of computers executing many different algorithms.\nRaccoon, a recommendation engine written in Node.js using Redis, was the initial inspiration for GER.\n","permalink":"https://maori.geek.nz/posts/2014/2014-07-25_good-enough-recommendations-with-ger/","summary":"\u003cp\u003eRecommendation engines could be beneficial for many applications as they can directly add value and lead to greater engagement for users. However, there is significant overhead in implementing a custom solution and many off-the-shelf engines have \u003ca href=\"http://easyrec.sourceforge.net/wiki/index.php?title=REST_API_v0.98#view\"\u003eovercomplicated APIs\u003c/a\u003e, or try to be \u003ca href=\"https://mahout.apache.org/\"\u003einfinitely scalable\u003c/a\u003e which is not needed by most applications.\u003c/p\u003e\n\u003cp\u003eIn this post I introduce the \u003ca href=\"https://github.com/grahamjenson/ger\"\u003e\u003cstrong\u003eG\u003c/strong\u003eood \u003cstrong\u003eE\u003c/strong\u003enough \u003cstrong\u003eR\u003c/strong\u003eecommendation (\u003cstrong\u003eGER\u003c/strong\u003e)\u003c/a\u003e engine. 
GER (pronounced like \u003ca href=\"https://www.youtube.com/watch?v=Cw00jTxHlMk\u0026amp;amp;t=53s\"\u003ethis\u003c/a\u003e) is built to be easily usable through a simple API, as well as being reasonably fast and scalable, to let developers focus on their applications and not a recommendation engine.\u003c/p\u003e","title":"Good Enough Recommendations with GER"},{"content":"There was this time, known as the bad old days, where programmers were seen as the assembly line workers and mechanics. The perception was that programmers didn’t create, they were the people who merely put together and fixed software systems, systems that were really ‘created by’ people who wrote the specifications.\nGiant requirements documents were dumped onto programmers desks, then they were told that most of the work had been done already and all they had to do was put it together. Then after slaving away building a system for months or years, they would find the thing they created was not what the customer wanted, and was essentially useless and/or fundamentally flawed, e.g. INCIS or Novapay.\nI did not work in that bad world, but I have heard stories from elder programmers, who, with a glazed over thousand-yard stares, describe their experiences with the waterfall process, requirements documentation and TPS reports.\nWith newer, better agile methodologies of developing software, there is now more room for programmers become the creators, and masters of their domain. Where the user is the centre of the process and the programmer is building what they actually need.\nIn this post, I am going to briefly look at Waterfall, Scrum, Extreme Programming, Agile, and Kanban. This is not an in-depth examination or comparison between them, but a quick introduction to their history and concepts.\nWhy use Software Methodologies? As programmers we create and maintain software. The process by which we do this will determine the outcome of the code. 
Will the code we write be good and the project a success, or will the code be broken, and the project be over-budget, unmaintainable and not needed by the user?\nTo increase the chances that a software project will be successful, previously successful practises were grouped into Software Methodologies. Basically, these are sets of good ideas that were formalised into methods to help developers build successful software.\n1970 — Waterfall Royce, when first describing Waterfall wrote that it is:\nrisky and invites failure.\nFrom the outset Waterfall was seen as a bad idea.\nWaterfall proposes that you develop software by going down through phases of development: Requirements, Design, Implementation, Verification, and ending at Maintenance.\nThe idea being that a software system is defined and refined at each stage, and a document is passed to the next stage to complete their part. The core benefit of this approach is that spending time upfront on requirements and design will save more money later as solving problems earlier is significantly more efficient.\nHowever, the main criticism of Waterfall is that customers and users will not know what they want upfront. They do not know or understand what is possible or feasible or the best way to get either. Software instability can also be due to a feedback cycle (Lehman describes Laws of Software Evolution), where once a user has a working system it effects the environment the system is deployed in, thus changing and adding requirements.\nAdditionally, as stated by Steve McConnell in Code Complete, design requirements and limitations cannot be entirely known before completion. Where following Waterfall means all the requirements and design being done before any implementation is done.\nWaterfall has fallen out of favour in recent times, though it is still used for many projects. 
A good discussion on some of the reasons why it continues being used, particularly in government, is given in the Ruby Rogues podcast Ruby in Government with Sarah Allen.\n1995 — Scrum Although the ideas in the Scrum methodology have been around since the 1980s, it was not formalised and presented until OOPSLA in 1995 by Jeff Sutherland and Ken Schwaber. The book Agile Software Development with Scrum written by Schwaber and Beedle was later published as the core work defining the Scrum methodology.\nScrum is an iterative approach to development. That is, Scrum quickly repeats the same steps until the project is finished. This is in response to the often changing mind of the user, and is used to increase the ability to quickly adapt to changing requirements.\nFeatures are written from the perspective of the user and are called User Stories. All the user stories are combined into the product backlog.\nThe product owner is the person who decides what goes into the product backlog, and they decide the product’s direction as they represent the customers and users of the product.\nThe scrum master (basically the manager) is there to make sure the project moves smoothly and that everyone has what they need. They also set up meetings and facilitate release planning.\nThe developers build the product by implementing the user stories.\nThe testers test the product to make sure the user stories are completed correctly.\nThe customer uses and (hopefully) pays for the product.\nRelease Planning is where the team identifies stories from the product backlog and moves them to a release backlog. There the stories are prioritised and the amount of work to implement them is estimated. If a story is too large it is broken down into smaller stories. 
By summing up all the estimates from the stories the estimate of the total work required for release can be given.\nThere are many ways to estimate the work amount. A common way is to play a game called Planning Poker (described in Agile Estimating and Planning) where each player is given a deck of Fibonacci numbers (0, 1, 2, 3, 5, 8, 13, 21…); the large difference between the higher numbers is a reflection of how difficult it is to estimate large values. Each story is discussed, then each person in the team picks a card to estimate the amount of work they think it will take. The members that pick the highest and lowest numbers get to discuss why they picked them, then the process is repeated till a consensus is reached.\nWith the prioritised and estimated user stories the sprints can now be planned. Sprints are short-duration milestones that allow teams to tackle a manageable chunk of a product, and get it to a ship ready state. Sprints can range from a few days to a month, and there are two to a dozen sprints in a release. The release backlog is split into the sprint backlog, where the goal of each sprint is to get a set of user stories that will be ready to ship by the end of the sprint.\nAs the sprints progress it is very important to monitor the progress of the stories. This is because the measurement of the team’s progress will show if the project is behind schedule. This is where the burndown chart comes into play.\nThe burndown chart is a day by day measure of the remaining work in a sprint or release. This is calculated each day as developers update the stories they are working on with the estimated amount of effort remaining till completion.\nThe total amount of work remaining should be trending downwards as stories are completed. The rate at which it is falling, the slope of the graph, is known as the velocity, which is the average rate of productivity of the team per day. 
Knowing the velocity allows the team to calculate the estimated completion date for the sprint or release.\nThe burndown chart is a large reason for the popularity of scrum as it is a great visualisation tool to show the overall progress of the product. It is accessible to all non-technical and technical team members, and it conveys a massive amount of data in a few seconds. It is also useful for seeing the progress being made early on in the project, enabling the team to compare actual velocity with needed velocity and make changes if necessary.\nThe daily scrum is a communication tool that is used to let information flow freely between team members. It is a fast paced standup meeting where people list the work they have completed and the blocking obstacles they are facing. It lets major issues come to light and be dealt with quickly.\nAt the end of each sprint there is a sprint retrospective that lets everyone discuss what went right and where to improve. This allows for the method to be altered iteratively till it fits the team best.\n1999 — Extreme Programming (XP) In 1996 Kent Beck became lead of the Chrysler Comprehensive Compensation System (C3) payroll project, which intended to replace several independent payroll systems with one. While managing this project Beck refined the development methodology that would eventually become extreme programming. This resulted in Beck writing Extreme Programming Explained in 1999.\nExtreme programming is so called because it takes its practices to the extreme, like always writing tests before code in test-driven development.\nThe introduction of Extreme Programming had a significant impact on the software development landscape. 
It was especially notable for its emphasis on test driven development practices, with things like ‘Keep it simple stupid’ (KISS), ‘You ain’t gonna need it’ (YAGNI) and ‘Fake it till you make it’.\nSome of the methods that XP prescribes are seen as too extreme by many, and often people reduced or removed some of the practices. The overall effect was that many of XP’s practices have been widely adopted and are followed, if not to the extreme originally intended.\n2001 — Agile Many iterative software methodologies like Scrum and XP were now out in the wild being used by developers. They had similar core philosophies and ideas like adapting to requirements and allowing for change. In February 2001, 17 developers met to discuss their development methods and out of this meeting came the Agile Manifesto:\nIndividuals and interactions over processes and tools\nWorking software over comprehensive documentation\nCustomer collaboration over contract negotiation\nResponding to change over following a plan\nThis broad set of guidelines was created and signed by many influential people including Kent Beck (creator of XP and Test Driven Development), Ken Schwaber and Jeff Sutherland (creators of Scrum), and Martin Fowler (author of Patterns of Enterprise Application Architecture and Refactoring: Improving the Design of Existing Code).\nMore than 10 years later the agile manifesto is still an important document in the world of software engineering. Although software has rapidly advanced over this time, the guidelines that Agile prescribes are still applicable. This demonstrates that as technology changes the nature of programming and development does not. 
This is one reason why understanding methodologies is an important aspect of software development.\nThe Agile Alliance was formed by some of the signatories with the goal of supporting:\nthose who explore and apply Agile principles and practices in order to make the software industry more productive, humane and sustainable.\n2003 — Kanban In September 2003 the book Agile Management for Software Engineering: Applying the Theory of Constraints for Business Results described the Kanban System (or the Toyota System) to organise and deliver work.\nYou start with a board that has lanes with titles such as ‘Backlog’ and ‘In Progress’. Then you represent user stories as cards and place them in the lanes from left to right to represent their completion. This way you can quickly gauge where there is a backup of stories on the board. The number and titles of the lanes are up to you, but the simpler the better.\nTo help ensure items are being completed each of the lanes can have a work-in-progress or WIP limit, that is, a limit on the number of cards in a single lane. When there are too many cards you can easily identify the problem and then take steps to find a solution.\nKanban is more about getting things done than an entire software specific methodology. It can be used within other methodologies like Scrum to help progress through stories.\nSimilar to Kanban is the Lean Software Development (LSD) method described by Mary and Tom Poppendieck in Lean Software Development: An Agile Toolkit. It also takes inspiration from Toyota but contains more concepts that are software specific.\nConclusion These methodologies are tools for your belt. However, they are all still evolving (even Waterfall) in the way they are used. So like any tool, they must be kept sharp. 
Adding tools to your belt is also necessary to dodge the trap where “every problem is a nail, when I only have a hammer”.\nSome More Useful Links Scrum in less than 10 minutes\nIntro to Kanban in Under 5 Minutes\n","permalink":"https://maori.geek.nz/posts/2014/2014-07-07_waterfall-to-agile-an-introduction-to-the-waterfall-scrum-and-kanban-software-methodologies/","summary":"\u003cp\u003eThere was this time, known as the bad old days, where programmers were seen as the assembly line workers and mechanics. The perception was that programmers didn’t \u003cem\u003ecreate\u003c/em\u003e, they were the people who merely put together and fixed software systems, systems that were really \u003cem\u003e‘created by’\u003c/em\u003e people who wrote the specifications.\u003c/p\u003e\n\u003cp\u003eGiant requirements documents were dumped onto programmers desks, then they were told that most of the work had been done already and \u003cstrong\u003eall\u003c/strong\u003e they had to do was put it together. Then after slaving away building a system for months or years, they would find the thing they created was not what the customer wanted, and was essentially useless and/or fundamentally flawed, e.g. \u003ca href=\"http://en.wikipedia.org/wiki/INCIS\"\u003eINCIS\u003c/a\u003e or \u003ca href=\"http://en.wikipedia.org/wiki/Novopay\"\u003eNovapay\u003c/a\u003e.\u003c/p\u003e","title":"Waterfall to Agile: An Introduction to the Waterfall, Scrum and Kanban Software Method(ologies)"},{"content":"After some time spent looking at Docker from afar, hearing everyone talk about how awesome it is and how all the cool kids are already using it, I decided to test drive Docker by using it in my development environment. In this post I will describe how to set up Postgres, Elasticsearch, and Redis as Docker containers with Vagrant on Mac OS X.\nWhat is Docker? Docker uses lightweight containers to separate an application from the operating system it is running in. 
It puts the application in an isolated box that only exposes selected folders or ports required for that application to be used.\nThis makes each container a reusable, shareable knowledge base of how to set up and use an application. There already exist over 15,000 containers ready to be used at the Docker Hub. Docker is like a shopping cart, where you go and pick out the services you need to build the application you want, then just download and turn them on.\nSetting Docker Up in OS X Docker does not run in OS X natively; it requires a Linux kernel with LXC (LinuX Containers). So if you are on OS X like me, you will require some virtualisation.\nDon’t use boot2docker While trying to get Docker working I found the “easy” install described here. This uses a tool called boot2docker which is a thin wrapper on a virtual machine like VirtualBox.\nI soon discovered this tool has some significant problems, like this, which halted any progress towards getting Docker in a stable state. I did not feel like hitting my head against a virtual wall any longer, so I continued looking for an alternate solution.\nUse Vagrant I eventually found out that Vagrant has had built-in support for Docker since version 1.6. 
Vagrant is a thin wrapper around virtualisation software like VirtualBox, and it uses a declarative Ruby DSL to describe the environment you want to have.\nI like this way of defining the virtual environment, because if something fails you can burn it down and start again without a lot of leftover mess, like environment variables, littering your machine.\nInstalling stuff First, let’s quickly go over all the things you need to have installed.\nHomebrew, installed with: ruby -e \u0026quot;$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)\u0026quot;\nCask, installed with: brew tap caskroom/homebrew-cask brew install brew-cask\nVagrant and VirtualBox, installed with: brew cask install virtualbox brew cask install vagrant\nThe Vagrant Files A Vagrant file describes the required virtual machine environment using a Ruby DSL. When describing Docker containers, Vagrant makes each container look like it is its own virtual machine. But this is a lie, as each Docker container is actually running in a “proxy” virtual machine.\nTherefore, two Vagrant files are required, one to define the proxy virtual machine (the provisioner) and one to define the Docker containers (the providers).\nThe Proxy VM Vagrant file The proxy Vagrant file is called Vagrantfile.proxy: VAGRANTFILE_API_VERSION = \u0026quot;2\u0026quot; Vagrant.configure(VAGRANTFILE_API_VERSION) do |config| config.vm.box = \u0026quot;hashicorp/precise64\u0026quot; config.vm.provision \u0026quot;docker\u0026quot; config.vm.provision \u0026quot;shell\u0026quot;, inline: \u0026quot;ps aux | grep 'sshd:' | awk '{print $2}' | xargs kill\u0026quot; config.vm.network :forwarded_port, guest: 6379, host: 6379 config.vm.network :forwarded_port, guest: 5432, host: 5432 config.vm.network :forwarded_port, guest: 9200, host: 9200 end\nThis uses the hashicorp/precise64 Ubuntu 12.04 64 bit image for the proxy VM. 
It also provisions Docker and does some magic with the shell to make Docker work (explained here).\nThe last thing is to set up the port forwarding. This uses config.vm.network to map the ports for Redis, Elasticsearch and Postgres from the proxy VM to OS X.\nThe Docker Containers Vagrant File This is the main Vagrantfile: VAGRANTFILE_API_VERSION = \u0026quot;2\u0026quot; Vagrant.configure(VAGRANTFILE_API_VERSION) do |config| config.vm.define \u0026quot;redis\u0026quot; do |v| v.vm.provider \u0026quot;docker\u0026quot; do |d| d.image = \u0026quot;dockerfile/redis\u0026quot; d.volumes = [\u0026quot;/var/docker/redis:/data\u0026quot;] d.ports = [\u0026quot;6379:6379\u0026quot;] d.vagrant_vagrantfile = \u0026quot;./Vagrantfile.proxy\u0026quot; end end config.vm.define \u0026quot;elasticsearch\u0026quot; do |v| v.vm.provider \u0026quot;docker\u0026quot; do |d| d.image = \u0026quot;dockerfile/elasticsearch\u0026quot; d.ports = [\u0026quot;9200:9200\u0026quot;] d.vagrant_vagrantfile = \u0026quot;./Vagrantfile.proxy\u0026quot; end end config.vm.define \u0026quot;postgres\u0026quot; do |v| v.vm.provider \u0026quot;docker\u0026quot; do |d| d.image = \u0026quot;paintedfox/postgresql\u0026quot; d.volumes = [\u0026quot;/var/docker/postgresql:/data\u0026quot;] d.ports = [\u0026quot;5432:5432\u0026quot;] d.env = { USER: \u0026quot;root\u0026quot;, PASS: \u0026quot;abcdEF123456\u0026quot;, DB: \u0026quot;root\u0026quot; } d.vagrant_vagrantfile = \u0026quot;./Vagrantfile.proxy\u0026quot; end end end\nThis file defines the three containers for Redis, Elasticsearch, and Postgres with the images dockerfile/redis, dockerfile/elasticsearch and paintedfox/postgresql.\nEach container sets vagrant_vagrantfile to the proxy VM file, which makes them all run in the same proxy virtual machine.\nThe volumes for Redis and Postgres are defined so that their information is stored in the proxy VM, and not in the container. 
This is so the container can be deleted or upgraded without the data being lost. The next step would be to map those folders from the proxy VM to OS X, but this is not necessary to get things working.\nThe ports on each container define which ports to forward to the proxy VM. These need to match up with the ports that the proxy VM forwards to OS X.\nThe Postgres container also defines the environment variables needed to set up its server. Postgres clients in OS X can be pointed at this server by default by setting the environment variables PGHOST=localhost PGUSER=root PGPASSWORD=abcdEF123456.\nWorking with Vagrant In the same directory as your Vagrant file, you can now run: vagrant up --provider=docker\nThe first time you run this, Vagrant will download then start the proxy VM, then it will download and start the Docker containers. Each time Vagrant is run after these initial downloads it will reuse the existing images.\nThe status of the Docker containers can be seen with: vagrant status\nThis should output something like: Current machine states: redis running (docker) elasticsearch running (docker) postgres running (docker)\nTo test that the Docker containers are working correctly, the Redis and Postgres clients, and curl for Elasticsearch, can be used. Just check that redis-cli and psql connect to their servers, and curl http://localhost:9200 responds.\nIf you need to connect to the proxy VM, which can be very useful for debugging, run vagrant global-status which will list all VMs including the proxy. Then call vagrant ssh with the ID of the proxy. I would recommend not changing this proxy VM manually; instead use a Chef (or similar) script so that the changes can be more easily tested and distributed.\nPerformance When using virtualisation, the first question always asked is \u0026ldquo;How much of a performance hit is there?\u0026rdquo;. 
To find out how bad this performance hit is, my colleague and I both ran a Postgres, Elasticsearch and Redis intensive test suite on identical hardware. The only difference was one test suite had natively installed software and the other had Docker-ized containers. The native suite ran in 2 minutes and the containers ran in 3 minutes.\nThis performance hit is not as small as I would like, but it could be worse. Even with this, I will continue using Docker for development, but not recommend it to everybody as a panacea for all development environment problems.\nNote: Some other limitations of using Vagrant and Docker are listed here.\nConclusion I cannot yet see where this \u0026ldquo;Vagrant with Docker\u0026rdquo; path is going. However, after seeing what is possible I cannot help but think about how else it can be used. Plus, it is the most fun I have ever had with virtualisation, and fun is what programming is all about.\nFurther Reading The Docker Book: Containerization is the new virtualization\nVagrant: Up and Running\n","permalink":"https://maori.geek.nz/posts/2014/2014-07-02_vagrant-with-docker-how-to-set-up-postgres-elasticsearch-and-redis-on-mac-os-x/","summary":"\u003cp\u003eAfter some time spent looking at \u003ca href=\"http://www.docker.com/\"\u003eDocker\u003c/a\u003e from afar, hearing everyone talk about how awesome it is and how all the cool kids are \u003ca href=\"http://techcrunch.com/2014/06/10/google-bets-big-on-docker-with-app-engine-integration-open-source-container-management-tool/\"\u003e\u003cstrong\u003ealready\u003c/strong\u003e\u003c/a\u003e **** \u003ca href=\"https://speakerdeck.com/teddziuba/docker-at-ebay\"\u003e\u003cstrong\u003eusing\u003c/strong\u003e\u003c/a\u003e **** \u003ca href=\"http://aws.amazon.com/about-aws/whats-new/2014/04/23/aws-elastic-beanstalk-adds-docker-support/\"\u003e\u003cstrong\u003eit\u003c/strong\u003e\u003c/a\u003e. I decided to test drive Docker out by using it in my development environment. 
In this post I will describe how to set up \u003ca href=\"http://www.postgresql.org/\"\u003ePostgres\u003c/a\u003e, \u003ca href=\"http://www.elasticsearch.org/\"\u003eElasticsearch\u003c/a\u003e, and \u003ca href=\"http://redis.io/\"\u003eRedis\u003c/a\u003e as Docker containers with \u003ca href=\"http://www.vagrantup.com/\"\u003eVagrant\u003c/a\u003e on Mac OS X.\u003c/p\u003e\n\u003ch3 id=\"what-is-docker\"\u003eWhat is Docker?\u003c/h3\u003e\n\u003cp\u003eDocker uses lightweight \u003cem\u003econtainers\u003c/em\u003e to separate an application from the operating system it is running in. It puts the application in an isolated box that only exposes selected folders or ports required for that application to be used.\u003c/p\u003e","title":"Vagrant with Docker: How to set up Postgres, Elasticsearch and Redis on Mac OS X"},{"content":"Creating maps to be used in visualisations can be a very difficult task. Although I have written about creating visualisations using maps, I have not yet written how I created those maps which can be just as important.\nIn this post I will describe how to use a few tools (koordinates, QGIS, ogr2ogr and TopoJSON) to generate, edit and transform maps to be used in visualisations.\nNote: TopoJSON command line tool has implemented much of the ogr2ogr functionality described in this post. There is also a free online tool mapshaper that can be used instead as well.\nShapefiles The most common distribution format for maps is the shapefile (.shp). A great place to get shapefiles from is koordinates, which is a site that shares geo-spatial data downloadable in various formats.\nOnce you have downloaded a shapefile, there are two useful tools to view and manipulate them, QGIS and ogr2ogr. QGIS is a GUI tool for opening and editing shapefiles, and ogr2ogr is a command line tool for converting between different mapping file formats.\nIntersection Clipping Sometimes shapefiles do not come with the boundaries you want. 
For example, the New Zealand Regions shapefile includes the areas in the ocean belonging to each region, so drawing them makes New Zealand look like a blob.\nTo fix this:\ndownload the shapefile including the New Zealand coastlines; import both the regions shapefile and coastlines shapefile into QGIS; go to Vector \u0026gt; Geoprocessing Tools \u0026gt; Intersect; select the layers to intersect and export to a new shapefile. Resulting in New Zealand looking like:\nSimplifying the Map Shapefiles are a vector format, which means that any additional detail increases their size. If the visualisation does not need the detail, then the viewer will have to wait to download lots of useless data.\nUsing ogr2ogr, a shapefile can be simplified to reduce its size; you just need to specify a tolerance, e.g.: ogr2ogr -simplify .001 out.shp in.shp\nTo demonstrate the effect of the tolerance on the resulting map, I used the New Zealand regions shapefile to perform a small experiment. I took the map and simplified it with 4 different tolerances: 0.01 (top-left), 0.05 (top-right), 0.1 (bottom-left) and 0.5 (bottom-right):\nThe original shapefile size was 9.6MB; with 0.01 tolerance it was 70kB, with 0.05 14kB, with 0.1 7kB and with 0.5 3kB. In the image you can see that 0.01 and 0.05 are very similar, 0.1 shows more simplification (especially when compared with 0.01) and 0.5 is clearly oversimplified.\nShapefile to GeoJSON GeoJSON is a JSON format for encoding geographic data structures (features), and is the format used by D3.js to visualise geographical information.\nCreating GeoJSON files from shapes uses the ogr2ogr tool: ogr2ogr -f GeoJSON out.json in.shp\nThis will create a larger file than the shapefile because GeoJSON is a plain text format, e.g. the 70kB shapefile will create a 200kB GeoJSON file.\nCompressing GeoJSON to TopoJSON GeoJSON is a verbose format that will repeat the same information over and over. 
For example a border between two countries will be described twice in GeoJSON, once for each country.\nTopoJSON is a compression tool for GeoJSON that reduces a GeoJSON file by removing its repetition.\nCompress a map using TopoJSON by: topojson -o out.json in.json\nYou can use D3 to fetch out.json then convert it using the TopoJSON javascript library to the original GeoJSON features:\n\u0026lt;script src=\u0026#34;topojson.js\u0026#34;\u0026gt;\u0026lt;/script\u0026gt; \u0026lt;script\u0026gt; d3.json(\u0026#34;out.json\u0026#34;, function(error, out) { var geojson_features = topojson.feature(out, out.objects.regions) ... // }) \u0026lt;/script\u0026gt; Once you have the map you want, at the resolution you want, compressed down to the size you want, you can start to make your visualisation.\nFurther Reading New Zealand Regional Maps\nData Visualization with D3.js Cookbook\nInteractive Data Visualization for the Web\n","permalink":"https://maori.geek.nz/posts/2014/2014-06-22_d3.js-tips-tricks-and-tools-for-creating-and-working-with-maps/","summary":"\u003cp\u003eCreating maps to be used in visualisations can be a very difficult task. Although I have written about \u003ca href=\"http://maori.geek.nz/post/d3_js_geo_fun\"\u003ecreating visualisations using maps\u003c/a\u003e, I have not yet written how I created those maps which can be just as important.\u003c/p\u003e\n\u003cp\u003eIn this post I will describe how to use a few tools (\u003ca href=\"https://koordinates.com/\"\u003ekoordinates\u003c/a\u003e, \u003ca href=\"http://www.gdal.org/ogr2ogr.html\"\u003eQGIS\u003c/a\u003e, \u003ca href=\"http://www.gdal.org/ogr2ogr.html\"\u003eogr2ogr\u003c/a\u003e and \u003ca href=\"https://github.com/mbostock/topojson\"\u003eTopoJSON\u003c/a\u003e) to generate, edit and transform maps to be used in visualisations.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eNote: TopoJSON command line tool has implemented much of the ogr2ogr functionality described in this post. 
There is also a free online tool\u003c/em\u003e \u003ca href=\"http://www.mapshaper.org/\"\u003e\u003cem\u003emapshaper\u003c/em\u003e\u003c/a\u003e \u003cem\u003ethat can be used instead as well.\u003c/em\u003e\u003c/p\u003e","title":"D3.js: Tips, Tricks and Tools for Creating and Working with Maps"},{"content":"\nI recently picked up a copy of Youtility: why smart marketing is about help not hype by Jay Baer and found it interesting enough to blog about.\nYoutility is about how to engage your customer by being useful to them, thus gaining the limited and valuable resource — their attention. In this post I will highlight the interesting parts of the book, and also try to convey the overall themes and content.\nNote: Given that I know almost nothing about marketing and many of the concepts in the book are brand new to me, this may not be the most eye-opening post to some.\nYoutility The message of the book, “being useful is the best kind of marketing”, follows from the arguments that:\nThe amount of time and attention that people have is limited and valuable, to both them and to marketers. The increasing amount of marketing people are exposed to is reducing their engagement with it, making it more difficult to convert a person to a customer. The increasing amount of information available has made people require more information before they make any decision. The best way to gain a person’s attention and engage with them is to be informative and useful. The goal of this should not be to sell something, but to encourage a relationship between a potential customer and a company. Jay Baer first wrote about the idea of Youtility as a blog post, which describes this non-linear approach to marketing. 
It is non-linear because this form of marketing is not done with the direct goal of sales, but to be sincerely useful so that a customer voluntarily engages with the company.\nYoutility follows the “if you give a man a fish” proverb:\nIf you sell something to someone, you make a customer today; if you help someone, you make a customer for life.\nWho Provides Youtility? Throughout the book there are many examples of how companies are currently providing youtility to their customers and how it has worked for them.\nMcDonald’s Canada created a website that answers any questions about their food that people might have. Questions like ‘Is it really 100% beef?’ and ‘Why does the advertised burger look different to the actual burger?’ have resulted in answers including the viral video Behind the scenes at a McDonald’s photo shoot. These answers were candid and not trying to sell the menu, but openly inform the customer about their food. This openness has paid off as many myths about McDonald’s food have been dispelled and the company has been able to grab the attention of many people.\nGeek Squad used a similar method of youtility by creating free videos of how to fix and maintain computers, software and other technology, e.g. How to install an amplifier and sub-woofer into a car. Giving out this information gave them credibility as experts in the domain, so when someone needed help they would call the place they trusted.\n@HiltonSuggests is a Twitter account that provides information to anyone about restaurants, hotels and more. This is a place that people trust to ask where to get a meal in a city they are unfamiliar with. This means that the people who interact with @HiltonSuggests are in an unfamiliar city, which makes them Hilton’s main customer base. 
So when these people ask where to find a hotel, it will try to direct them to a nearby Hilton hotel.\nThe Informed Customer Marcus Sheridan (The Sales Lion) in the foreword of Youtility discusses how an informed customer is a double-edged sword. He says that more informed customers are easier to deal with and have a better understanding of what is currently on offer. However,\nEducated consumers are sometimes threatening to sales-people, […] information changes the balance of power\nSheridan, a ‘pool-guy’ business owner, chalks up his business’s success to his helpful blog which answers common questions about pool design and installation. His example of using a blog that is helpful, as compared to other companies who use blogs to brag, encouraged customers to trust his advice and company. This is a similar Youtility strategy to Geek Squad.\nHis blog ultimately created more leads and sales in a time where the rest of his industry was suffering because of the economy. He did not advertise directly; instead, by providing information and being useful, the blog generated traffic and served as the way in which customers found out about his company. When they were ready to buy a pool, they knew where to go: the place they got all their information from.\nconsumers of all types expect to find answers on the Internet now, and companies that best provide that information garner trust and sales and loyalty\nSheridan also noted that informed customers have knowledge before they buy something, which means that you have less work to do to explain their options to them. Additionally, he saw his closing rates increase because by the time customers contacted him, they had already seen the options and made a decision to buy.\nTypes of Marketing A large portion of the book is used to compare marketing with Youtility against other classic marketing strategies. 
Baer breaks down marketing strategies into three types:\nTop-of-mind awareness, Frame-of-mind awareness, Friend-of-mine awareness. Top-of-mind awareness Top-of-mind awareness is constant marketing so when the customer is ready to buy they think of you first. For example, Coca-Cola advertisements everywhere encourage people to want their drinks when they are thirsty. This strategy requires lots of advertising, and your product to be available everywhere. It also has one message for everyone, a dumb (if effective) marketing strategy. Increasingly, however, as the media landscape becomes more fractured and there are more places to find and consume content, reaching everyone with this approach has never been more difficult. This shotgun strategy might work for some but is not for all.\nFrame-of-mind awareness Frame-of-mind awareness is reaching customers when they are ready to buy, e.g. a store-front display that attracts the attention of a shopper walking past. This also includes directories, like the yellow pages and Google, for when someone knows they want something and so are looking for a place to get it. Being the top result with an appealing name will drive people to your company. This means that the company is passively waiting for a customer to choose them, making it difficult to predict when the customer will walk in or click a link.\nFriend-of-mine awareness Friend-of-mine awareness is where a company attempts to become a trusted resource to a person, so when they are ready to be sold something the company is already there. Providing value to your customers, beyond your sales pitch, gives customers a reason to be constantly in touch. This strategy of helping the customer is not about directly making a sale, it is about the relationship.\nThe three aspects Baer points out as important to successfully use friend-of-mine awareness are:\nSelf-serve information: helping people find the information they want. 
Radical transparency: providing all answers, being sincere. Real time relevancy: Be useful at the particular time, location and context a user is in, then be in the background till next required. Zero Moment of Truth The Zero Moment of Truth is the instant where the customer can either progress down the sales path, or retreat; it is the fight-or-flight response in sales. It is also the name of an e-book publicised by Google that covers many concepts, one of which is that customers increasingly need more information before buying something. This is true for me: before I buy anything I require at least two independent reviews, and the full specifications of what I am buying.\nThe increase in required information is not a reflection of our increasing distrust, but more because there are more information sources available. If a customer has all the necessary information at their fingertips, then making good decisions is easier.\nAlways on Internet access has made us all passive-aggressive.\nThis assertion follows from the observation that a person would more likely read a review of a product than send an email directly asking a question, and much more likely than walking into a store and talking to a sales rep. This is what Baer describes as the Self Serve Culture, where sales representatives are less involved when convincing a customer to purchase.\nYou are in two businesses. You’re in the ‘whatever business you’re in’ business, and you’re in the media business. […] Both B2B and B2C companies with 101–200 pages generate 2.5 times more leads. Companies that blog 15 or more times per month get 5 times more traffic. source: Hubspot 2012\nBeing available at the Zero Moment of Truth is where the real time relevancy comes into play in the friend-of-mine strategy.\nYou’re either sufficiently useful at any given moment, and thus can connect with the customer, or you’re not. 
It’s real-time relationship building.\nFor this strategy context is key: understanding where customers are, what they are doing, what they like and what they might like will give better indicators of how to be helpful.\nThe future of this hyper-relevancy may be taking applications […] to their logical extreme, using “anticipatory computing” to push information to participants before they even realise they need it.\nThis future is coming to fruition with the rapid adoption of mobile, as Baer says:\nWithin a generation every customer and prospective customer of every company in every developed nation will have never known a world without the ability to access information at any time through a mobile device.\nHow You can be Yousful with Youtility This book also gives a few helpful tips on how to use friend-of-mine marketing strategies and how to create youtility. First the questions must be asked:\nHow do your customers discover information? What are their preferences for consumption? What motivates them to take action? Use Google tools to understand your customers: Google Analytics, Google Trends, Google Suggest, Google WebMaster, Google Keyword Planner, Google Correlate. All these provide insight into your customers on the other side of the screen. WebTrends, HubSpot and KISSmetrics are also useful tools.\nThese tools are used to find your customers’ problems, which is not the same as finding a solution. Turning the real customer needs into actual results is the difficult part. Before creating an app, writing a blog, making videos or using social media, make sure that is the right decision, otherwise you may end up with disappointing results.\nThe next hurdle is telling people about the solution. Just creating something useful is not enough if no one knows about it. This may seem self-referential, marketing your marketing, but you have to get the ball rolling somehow. 
If it is useful and you are not selling it then it should not be difficult to gain adoption.\nThe mindset to create such useful solutions for customers is not a project, it is a process that can be refined and reinvented. It is a long term goal, not a short term gain. More sales or getting widespread adoption is not the goal, it is the result. The goal is to provide useful, valuable, helpful assistance to people who are, or could become, your customers.\nProviding something for people does not mean you should not measure its value to the company. Some metrics to measure your success are:\nConsumption metrics: How many people are using your product? Advocacy and Sharing metrics: How many people mention it on social media or blog about it? Lead Generation Metrics: How many people considered a purchase? Sales Metrics: How many people were sold something? Return on Investment: How much did it cost vs. the leads and sales it generated? This can be compared to your other forms of marketing, with the understanding that the goal is long term and not necessarily immediate. Measuring things like trust may be a bit difficult, but as long as the youtility created is sincere and not just a business as usual marketing attempt, people will start to associate your company with being genuinely helpful. That is not a bad thing.\nActual Book review This book makes some great points about marketing during the digital and mobile revolution. It is also a great place to start if you want to dig deeper as it references many good books, companies, studies and websites. It even comes with an excellent references list at the back (like any good publication should).\nI think the target audience for this is marketing or management people looking to understand the direction that technology is taking in their domains. Some of the book for me felt like it was preaching to the choir, e.g. the discussion of why it is a good idea to have a blog. 
However, at the same time I found many of the well made arguments in the book to be counter-intuitive, so I learnt a few things and think it was still worth the read.\nThe main problem I had with the book is its constant self referencing to Youtility, like it is desperately trying to coin the word. But I think that is how marketers usually write, so I will let that slide :)\nNote: please leave a comment or suggestion. This is the FIRST marketing book I have ever read, and I would love people to have a discussion and point out other resources that would be interesting.\n","permalink":"https://maori.geek.nz/posts/2014/2014-06-15_marketing-by-being-useful-how-you-can-be-yousful-with-youtility/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2014/2014-06-15_marketing-by-being-useful-how-you-can-be-yousful-with-youtility/images/1.jpg#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003cp\u003eI recently picked up a copy of \u003ca href=\"http://www.amazon.com/gp/product/1591846668/ref=as_li_qf_sp_asin_il_tl?ie=UTF8\u0026amp;amp;camp=1789\u0026amp;amp;creative=9325\u0026amp;amp;creativeASIN=1591846668\u0026amp;amp;linkCode=as2\u0026amp;amp;tag=maor01-20\"\u003e\u003cstrong\u003eYoutility: why smart marketing is about help not hype\u003c/strong\u003e\u003c/a\u003e by Jay Baer and found it interesting enough to blog about.\u003c/p\u003e\n\u003cp\u003eYoutility is about how to engage your customer by being useful to them, thus gaining the limited and valuable resource — their attention. In this post I will highlight the interesting parts of the book, and also try to convey the overall themes and content.\u003c/p\u003e","title":"Marketing by being Useful: How You can be Yousful with Youtility"},{"content":"Lately I have been learning Redis and experimenting with the features it provides. One such feature is its publisher/subscriber functionality that allows a publisher to push messages to many subscribers. 
This functionality is deceptively easy to use and a very handy tool to know, so in this post I will describe how to use this Redis functionality in Ruby.\nNote: If you want to learn about Redis, see Redis in Action\nSending Messages There are many problems that can be simplified to the publish subscribe pattern (pub/sub). Twitter is a great example of a massive publish and subscribe service, where someone publishes or tweets a message to many subscribers or followers.\nThe pub/sub pattern can also be used as an enterprise service bus to help scale an organisation’s internal messaging. As described in my previous post Enterprise Software and Building Infinite Staircases, there are many dispersed teams in enterprise organisations. Using the pub/sub pattern allows teams to communicate while limiting interdependencies that could cause technology to become fragile.\nThese benefits of the pub/sub pattern can be summarised as:\nReduced coupling: The publisher does not need to know anything about the subscriber and vice versa. The only things they both depend on are the channel the message is published to and the message content. Scalability: A publisher does not know if it has one subscriber or thousands, and a subscriber does not know how many other subscribers there are. This independence of size means that this pattern can increase in scale without increasing in complexity. Redis Pub/Sub Redis is the most popular Key/Value store according to the monthly rankings at db-engines.com. It has functionality for lists, sets, sorted sets, and a few other data structures. It is also memory based, holding its entire data-set in memory at once, making it very fast.\nRedis also provides the functionality to publish a message to a channel, and for subscribers to listen to messages published on various channels. 
To experiment with this functionality first the redis gem needs to be installed to handle the communication aspects with Redis.\nNote: I assume you already have a Redis server installed. If not brew install redis on OSX or apt-get install redis on Ubuntu is a good start gem install redis\nThen a subscriber script can be written: require 'redis' connection = Redis.new connection.subscribe 'stream' do |on| on.message do |channel, msg| puts \u0026quot;#{channel} says #{msg}\u0026quot; end end\nRedis.new will create an object that connects to a Redis server on localhost with the default Redis port of 6379. The subscribe function takes a channel ‘stream’ and a block that yields an on callback. A block is given to the message function on the callback which is executed when a message is received. In this example the block just prints the channel and message to the console.\nNote: This subscriber will run in an infinite loop listening for incoming messages.\nA subscriber could also explicitly listen to multiple channels with: connection.subscribe 'stream:1', 'stream:2' do |on| on.message{ |channel, message|... } end\nOr using wild cards to listen to channels described by a pattern: connection.psubscribe 'stream:*' do |on| on.pmessage{ |query, channel, message| ... } end\nThe publisher can then be written: require 'redis' connection = Redis.new connection.publish 'stream', 'Hello'\nIt is that easy; just call publish on a Redis connection with a channel stream with a message Hello. This will cause the subscriber above to write “stream says Hello” to the console.\nConclusion The more I learn about Redis, the more I see potential uses for it in applications I create. 
It is a great tool to have on my tool belt and I will continue to experiment with Redis’s pub/sub, as well as the rest of its features.\n","permalink":"https://maori.geek.nz/posts/2014/2014-05-06_become-a-publisher-with-redis-and-ruby/","summary":"\u003cp\u003eLately I have been learning \u003ca href=\"http://redis.io/\"\u003eRedis\u003c/a\u003e and experimenting with the features it provides. One such feature is its \u003ca href=\"http://redis.io/topics/pubsub\"\u003e\u003cstrong\u003epublisher/subscriber\u003c/strong\u003e\u003c/a\u003e functionality that allows a \u003cem\u003epublisher\u003c/em\u003e to push messages to many \u003cem\u003esubscribers\u003c/em\u003e. This functionality is deceptively easy to use and a very handy tool to know, so in this post I will describe how to use this Redis functionality in Ruby.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eNote: If you want to learn about Redis\u003c/em\u003e \u003ca href=\"http://www.amazon.com/gp/product/1617290858/ref=as_li_qf_sp_asin_il_tl?ie=UTF8\u0026amp;amp;camp=1789\u0026amp;amp;creative=9325\u0026amp;amp;creativeASIN=1617290858\u0026amp;amp;linkCode=as2\u0026amp;amp;tag=maor01-20\u0026amp;amp;linkId=OCCKGZ6G3K575DMS\"\u003e\u003cstrong\u003e\u003cem\u003eRedis in Action\u003c/em\u003e\u003c/strong\u003e\u003c/a\u003e\u003c/p\u003e\n\u003ch3 id=\"sending-messages\"\u003eSending Messages\u003c/h3\u003e\n\u003cp\u003eThere are many problems that can be simplified to the \u003ca href=\"http://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern\"\u003epublish subscribe pattern\u003c/a\u003e (\u003cstrong\u003epub/sub\u003c/strong\u003e). 
Twitter is a great example of a massive publish and subscribe service, where someone \u003cem\u003epublishes\u003c/em\u003e or tweets a message to many \u003cem\u003esubscribers\u003c/em\u003e or followers.\u003c/p\u003e","title":"Become a Publisher with Redis and Ruby"},{"content":"Being new to an enterprise organisation, I found that reading Martin Fowler’s book Patterns of Enterprise Application Architecture (PoEAA) was a cathartic exercise. It described the confusing and often frustrating world I found myself in, and prescribed solutions without claiming to be a panacea.\nEnterprise systems are difficult, and not only because they can be technically challenging. In this post I will briefly look at two problems of enterprise systems (illogical requirements and modelling objects across teams) and try to describe ways to mitigate them.\nNote: In a previous post I discussed When to be a Software Architect, which was also inspired by reading PoEAA.\nIllogical Enterprise Software Systems What is enterprise software?\n[Enterprise Software is] purpose-designed computer software used to satisfy the needs of an organisation rather than individual users — Wikipedia\n[Enterprise Systems] support business processes, information flows, reporting, and data analytics in complex organisations — Wikipedia\nThat is, enterprise systems are business focused systems that are purpose designed for complex organisations. They are focused on the business instead of the individual.\nSo enterprise systems are large and complex. What does Fowler have to say about them?\nenterprise applications often have complex data — and lots of it — to work on, together with business rules that fail all tests of logical reasoning. 
— PoEAA\nMy manager put this idea another way:\nJust because you can draw an infinite staircase doesn’t mean you can build it!\nSo, enterprise systems are:\nBusiness Focused Purpose Designed Complex Illogical The illogical nature of enterprise systems is probably due to the first three points. Any big business will be complex and have many intricate internal flows and relationships. Given the enterprise system is meant to represent these complexities, it will have to try and implement the complex logic which seems illogical (from a sane developer’s view).\nFixing the problem seems straightforward: fix the illogical aspects of the system. However, this can be very difficult, as Fowler states:\nbusiness rules are just given to you, and without major political effort there’s nothing you can do to change them. — PoEAA\nIt therefore may be necessary to incorporate the illogical aspects of the business into the enterprise system.\nOrganisation is the Key Fowler’s core technical advice for building an illogical enterprise application is:\norganise the business logic as effectively as you can, because the only certain thing is that the logic will change over time — PoEAA\nSandi Metz (author of Practical Object-Oriented Design in Ruby) describes a more specific way of organising illogical code in her presentation Go Ahead, Make a Mess. She advocates allowing bad code as long as it is in an Omega Mess, which should:\nBe at the end of the line Have no Dependencies Have no Dependents This way the omega mess of illogical code is hidden from the main flow of the application, and it will then not impact the design of the rest of the application.\nAs Sandi Metz states:\nSince it has no dependants and no dependencies, there is no way that changes to it can affect your app or changes in your app can affect it. 
It is not connected in one of the threads on your tapestry.\nSo to deal with the illogical parts of an enterprise system:\nOrganise the illogical parts into places where they will have limited dependencies, and where they can be easily changed\nSame Model, Different Properties Enterprises are big and comprise many teams; they also generally focus on a few core concepts like product or service. This leads to a problem where the same concept (or model) can have vastly different uses and properties between teams.\nIn Brandon Byars’s post Enterprise Integration Using REST he gives an example of a product model where a marketing team may require each product to have a picture and a blurb, and uses them in promotions, whereas the accounting team requires them to have a supplier and a cost price, and uses them as lines on invoices.\nThe marketing and accounting teams will have different rules and constraints on the properties and uses of the product model, and these may not be easily merged into a single definition, e.g. an accountant may join multiple similar products together to reduce lines in invoices and a marketer may break down products into variants to advertise.\nCreating each model with all the necessary properties and logic for each team may be very difficult. Organising and reusing code can cause fragility in the system by creating dependencies between teams, e.g. 
if one team changes how they use a product, it may affect another team.\nByars further states that:\nAttempting to rationalise the entire enterprise view of a product into a single catalog simply makes that catalog both fragile and inflexible.\nThis echoes Domain-Driven Design, which states:\nthe total unification of the domain model for a large system will not be feasible or cost-effective\nBoundaries are Key Byars’s solution is inspired by Domain-Driven Design, where for each team the domain model is split into bounded contexts where the interrelationships are explicitly defined and can be mapped to and from a universal terminology.\nAn example of a domain model with bounded contexts is:\nThe benefits of this approach are:\nHaving code specific to a team separated from other teams and core logic Minimising the dependencies between teams Reuse of the universal terminology across teams No upward dependence from the universal terminology to the team specific logic Drawbacks of Layers:\nCascading changes if the universal terminology changes Lost reuse between the team contexts Drawing the boundary between universal and team specific logic can be difficult Conclusion I found reading Patterns of Enterprise Application Architecture useful for understanding enterprise systems and environments. Understanding that illogical requirements and complexity are not unique to my work, but are actually a property of the business, makes it easier to identify and mitigate problems as they occur. Plus, I no longer go crazy when a specification comes in telling me to build an infinite staircase.\nP.S. 
An excellent example of building infinite staircases in enterprise is given in this video:\n","permalink":"https://maori.geek.nz/posts/2014/2014-04-23_enterprise-software-and-building-infinite-staircases/","summary":"\u003cp\u003eBeing new to an enterprise organisation, I found that reading Martin Fowlers book \u003ca href=\"http://www.amazon.com/gp/product/0321127420/ref=as_li_qf_sp_asin_il_tl?ie=UTF8\u0026amp;amp;camp=1789\u0026amp;amp;creative=9325\u0026amp;amp;creativeASIN=0321127420\u0026amp;amp;linkCode=as2\u0026amp;amp;tag=maor01-20\"\u003ePatterns of Enterprise Application Architecture\u003c/a\u003e (\u003cstrong\u003ePoEAA\u003c/strong\u003e) a cathartic exercise. It described the confusing and often frustrating world I found myself in, and prescribed solutions without claiming to be a panacea.\u003c/p\u003e\n\u003cp\u003eEnterprise systems are difficult, and not only because they can be technically challenging. In this post I will briefly look at two problems of enterprise systems (illogical requirements and modelling objects across teams) and try to describe ways to mitigate them.\u003c/p\u003e","title":"Enterprise Software and Building Infinite Staircases"},{"content":"Elasticsearch is a great way to store lots of documents that need to be quickly searched and retrieved. In addition to a broad query API, Elasticsearch also provides scrolling functionality that lets you query the server and incrementally download the results. This can be really useful for processing any large result set, e.g. for reindexing.\nUsing this scrolling, however, can be difficult because it first requires setting up then repeatedly calling the server to access the results. In order to simplify scrolling, I have implemented a package called ElasticScroll using Node.js with the Q promises library. 
In this post I will describe how ElasticScroll works and how to use it.\nNote: I am going to use CoffeeScript in this post for reasons described here.\nOverview The two libraries that ElasticScroll uses are Q and Q-IO. Q is a Promises A+ library (read more about promises here) and Q-IO is a wrapper for promises around the Node.js IO interfaces.\nFirst, ElasticScroll must import the libraries: Q = require 'q' qhttp = require 'q-io/http'\nThen define the class that will contain the functions and variables: class ElasticScroll constructor: (@url, @query, @process_fn) -\u0026gt;\nTo initialise ElasticScroll it must have a url that defines where the Elasticsearch server is, a query to send and a function process_fn that will process each document.\nThe main scrolling function scroll is defined as: scroll: -\u0026gt; Q.fcall( =\u0026gt; @set_scroll_id(@query)) .then( =\u0026gt; @get_next_set()) .then( (hits) =\u0026gt; @process_hits(hits)) .then( (hits) =\u0026gt; @continue_scroll(hits))\nscroll returns a promise for the results created using Q.fcall. It gets these results by first setting up the scroll_id, then getting the first set of results, processing the hits, then continuing to scroll. Each of these promises are described below.\nNote: the use of =\u0026gt; (fat-arrow) syntax in CoffeeScript just defines a function where this (@) is bound to the object where the function is defined, read more here\nGetting the Scroll ID The first stage to scroll Elasticsearch is to send the query and it returns a scroll id. This id is used to access the cached results in Elasticsearch. 
set_scroll_id: -\u0026gt; request = { method: \u0026quot;POST\u0026quot; body: [JSON.stringify(@query)] url: \u0026quot;#{@url}/_search?search_type=scan\u0026amp;scroll=10m\u0026quot; } qhttp.request(request) .then((response) -\u0026gt; response.body.read()) .then((resp) -\u0026gt; JSON.parse(resp.toString())) .then((json) =\u0026gt; @scroll_id = json._scroll_id)\nIn the set_scroll_id function, first the request is defined to post the query to Elasticsearch. Then Q-IO\u0026rsquo;s http module is used to send the request, read the body, parse the response, then assign the instance variable scroll_id as the returned scroll id.\nGetting the Results Getting the results involves sending the obtained scroll id to Elasticsearch, then parsing the results. get_next_set: () -\u0026gt; request = { method: \u0026quot;GET\u0026quot; url: \u0026quot;#{@url}/_search/scroll/#{@scroll_id}?scroll=10m\u0026quot; } qhttp.request(request) .then((response) -\u0026gt; response.body.read()) .then((resp) -\u0026gt; JSON.parse(resp.toString())) .then((json) -\u0026gt; json.hits.hits)\nProcessing the Results Processing the results just uses CoffeeScript list comprehensions and the function passed into the constructor. process_hits: (hits) -\u0026gt; (@process_fn(hit) for hit in hits)\nContinuing to Scroll The function stops scrolling by returning when there are no hits. Otherwise, it returns a promise to get the next set, process the results, then recursively call itself to see if it should continue scrolling. continue_scroll: (hits) -\u0026gt; return if hits.length == 0 @get_next_set() .then( (hits) =\u0026gt; @process_hits(hits)) .then( (hits) =\u0026gt; @continue_scroll(hits))\nUsing ElasticScroll To use ElasticScroll you have to install the module with: npm install elasticscroll\nThen you have to import it, define a query, a processing function, and a url, and initialise ElasticScroll with them, then call scroll. 
ElasticScroll = require 'elasticscroll' query = { \u0026quot;query\u0026quot;: { \u0026quot;query_string\u0026quot; : { \u0026quot;query\u0026quot; : \u0026quot;some query string here\u0026quot; } } } print_to_console = (hit) -\u0026gt; console.log(hit) es = new ElasticScroll(\u0026quot;http://localhost:9200\u0026quot;, query, print_to_console) es.scroll().fail(console.log)\nConclusion I really like working with Elasticsearch and think that Node.js is an excellent platform to build tools that interact with it. Additionally, I really enjoyed learning more about Q promises and Q-IO, because it made writing this reasonably complex function much more enjoyable.\nResources ElasticSearch ElasticSearch Cookbook ","permalink":"https://maori.geek.nz/posts/2014/2014-04-14_scrolling-elasticsearch-using-node.js-and-promises/","summary":"\u003cp\u003eElasticsearch is a great way to store lots of documents that need to be quickly searched and retrieved. In addition to a broad query API, Elasticsearch also provides \u003ca href=\"http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html\"\u003escrolling\u003c/a\u003e functionality that lets you query the server and incrementally download the results. This can be really useful for processing any large result set, e.g. for \u003ca href=\"http://euphonious-intuition.com/2012/08/reindexing-your-elasticsearch-data-with-scanscroll/\"\u003ereindexing\u003c/a\u003e.\u003c/p\u003e\n\u003cp\u003eUsing this scrolling, however, can be difficult because it first requires setting up then repeatedly calling the server to access the results. In order to simplify scrolling, I have implemented a package called \u003ca href=\"https://www.npmjs.org/package/elasticscroll\"\u003eElasticScroll\u003c/a\u003e using \u003ca href=\"http://nodejs.org/\"\u003eNode.js\u003c/a\u003e with the \u003ca href=\"https://github.com/kriskowal/q\"\u003eQ\u003c/a\u003e promises library. 
In this post I will describe how ElasticScroll works and how to use it.\u003c/p\u003e","title":"Scrolling Elasticsearch using Node.js and Promises"},{"content":"If you want to create visualisations for the web, the tool you will want to learn is D3.js. It helps you build powerful visualisations that can express your data while remaining light-weight and flexible. However, D3 is one of those tools that has its own way of doing things that is a little bit different from other visualisation libraries. It has a philosophy that focuses on data and functions, and not visual elements. This may make it difficult to learn because to wrap your head around it you need an epiphany, where you just get how to use it.\nIn this post I will not go over all of D3, but I will try to convey its philosophy and style in the hopes that you will have a moment where you just get D3.\nQuick Introduction I have been using D3 for a few years now on projects such as 100 companies and Mihmihi, I have also written tutorials on things like how to draw maps. However, I have never written an introduction to D3 because it is difficult to convey its awesomeness without just using it. So lets get started.\nData Driven Documents Above all else D3 is a functional toolkit for visualising data, which binds data to visual elements. Using the bound data, the attributes and look of the elements can be defined using utilities and functions. This is the power of D3; it is not visualisation centric, it is focused on data and functions to use that data.\nNote: D3 can use any element in the Document Object Model (DOM), such as SVG or HTML elements. This tutorial will use D3 with HTML and CSS (instead of SVG) as more developers are familiar with them.\nBinding Data Lets bind a piece of data to the body element: d3.select('body').datum('red')\nThe string ‘red’ is now bound to body.\nNow lets retrieve that datum: d3.select('body').datum() // 'red'\nWhat can the datum be? 
Any Javascript thing like objects, string, arrays, and so on. D3 just remembers the relationship between the element and the datum so that it can be used later.\nYou can use D3 to set attributes on the element you bound the data to: // changes the body's CSS color to red d3.select('body').style('color', function(datum){ // datum == 'red' return datum } )\nYou can also chain many of the functions D3 offers for easier use: // changes the CSS background-color to 'blue' d3.select('body') .datum('blue') .style('background-color', function(datum){ // datum == 'blue' return datum } )\nIf you have a lot of data then you can select multiple elements with selectAll and bind to those elements with the data function.\nNote: Many functions in D3 return a Selection including selectAll data and select // Selection of all p elements p = d3.selectAll('p') // bind data to the first 5 p elements p.data(['red','green','blue','black','white']) // changes the CSS background-color to bound datum p.style('background-color', function(datum){ return datum } )\nD3 also helps if you have more (or less) data than elements that exist. The functions enter and exit are used with append and remove respectively to let you create and destroy elements.\nNote: the data function returns a restricted Selection of only elements that can be bound to the data. // Selection of all p elements in body p = d3.select('body').selectAll('p') // bind data to the first 5 p elements, get restricted Selection pdata = p.data(['red','green','blue','black','white']) // add p elements if they don't already exist pdata.enter().append('p') // remove p elements if there are too many pdata.exit().remove()\nCreating Functions Binding data to elements using D3 is cool, but do you know what is really cool? Functions! 
D3 has a massive set of higher-order functions (functions that return functions) that let you easily create useful tools to manipulate your data.\nSpecifically, my favourite functions in D3 are Scales.\nFor the following example I will use the Scoville scale of how spicy different peppers are. Here is some data (shu stands for Scoville Heat Unit): data = [ { name: 'Pimento', shu: 100}, { name: 'Anaheim', shu: 2500}, { name: 'Jalapeno', shu: 10000}, { name: 'Cayenne', shu: 50000}, { name: 'Habanero', shu: 350000}, { name: 'Ghost', shu: 1500000} ]\nQuantize Scales I will start by creating a function to describe this data with words like ‘medium’ and ‘hot’. For this the 0 to 1,500,000 SHU needs to be converted to words using a Quantize Scale that has a continuous domain (input) and a discrete range (output).\nThe first thing to notice is that the SHU data looks exponential, so to make it useful for other functions a Log Scale can be used. logscale = d3.scale.log() .domain([1,1500000]) .range([0,1])\nThis scale creates a function that takes a number from 1 to 1,500,000 and returns a number from 0 to 1 on a log curve.\nNote: This is not a very precise scale and you could tweak the numbers to get what you needed, but for this example it is good enough.\nNow let’s define the list of words using a quantize scale: qscale = d3.scale.quantize() .domain([0,1]) .range(['nothing', 'mild','medium','hot' ,'burning','scorching','hellfire'])\nThis function takes a number from 0 to 1 and returns an element from the list of words provided. For example, qscale(0) will return ‘nothing’ and qscale(1) will return ‘hellfire’.\nTo be useful, a function that combines these scales and maps to the data provided is needed: textscale = function(datum){ return qscale( logscale(datum.shu) ) }\nThe textscale function will then take a pepper object and return a word that matches its hotness. 
To check the results, map can be used: data.map(textscale) // [\u0026quot;medium\u0026quot;, \u0026quot;hot\u0026quot;, \u0026quot;burning\u0026quot;, \u0026quot;scorching\u0026quot;, \u0026quot;hellfire\u0026quot;, \u0026quot;hellfire\u0026quot;]\nLinear Scales Some places use a number of icons to visualise how hot something is, e.g. 0 to 4 chilis. A Linear Scale can be used for this: linearscale = d3.scale.linear() .domain([0,1]) .range([0,4])\nThis scale takes a number from 0 to 1 and returns a number from 0 to 4 on a straight line, e.g. linearscale(.5) returns 2. However, it may return non-integer numbers, so when creating the mapping function it should round the output. chiliscale = function(datum){ return Math.round( linearscale( logscale(datum.shu) ) ) } data.map(chiliscale) // [1, 2, 3, 3, 4, 4]\nColor Scales Another way to visualise how spicy a pepper is, is to use color. One of the best features of D3 is that it lets you use hexadecimal colors in the ranges of the scale functions. Here is how you can use a linear scale to define color: colorscale = d3.scale.linear() .domain([0,1]) .range(['#FFF', '#933'])\nThis scale takes a number from 0 to 1 and returns a color from white to red-ish. 
Again, a simple mapping function is needed: burnscale = function(datum){ return colorscale( logscale(datum.shu) ) } data.map(burnscale) // [\u0026quot;#debdbd\u0026quot;, \u0026quot;#c78f8f\u0026quot;, \u0026quot;#bd7b7b\u0026quot;, \u0026quot;#b16464\u0026quot;, \u0026quot;#a34848\u0026quot;, \u0026quot;#993333\u0026quot;]\nAdding Visual Elements Using the tools I discussed earlier, let’s create a simple visualisation for the peppers data-set.\nFirst let’s create some divs for our peppers: peppers = d3.select('.peppers').selectAll('div') pepperssdata = peppers.data(data) .enter().append('div').attr('class','pepper')\nNote: I added the class pepper for some styling\nUsing the text function, the text inside of the div can be set to the name of the pepper: pepperssdata.text( function(datum){ return datum.name } )\nA div is added with the class how-spicy with text assigned by the textscale function: pepperssdata.append('div').attr('class','how-spicy').text( function(datum){ return 'is ' + textscale(datum) } )\nThe background color is set by the burnscale function: pepperssdata.style('background-color', burnscale)\nThis is the most complicated bit of code, but contains nothing that has not been seen before: pepperssdata.append('div').attr('class','fire').selectAll('i') .data( function(datum){ return new Array(chiliscale(datum)) } ) .enter().append('i').attr('class','fa fa-fire')\nFirst this appends a div with class fire, then selects all the i elements (there are none yet). Then by passing to data a function that creates an empty array whose length is the chiliscale value, a new set of data is created for each pepper. Finally, it appends an i element with the icon classes for each piece of data in the empty array.\nNote: Unfortunately, Font Awesome does not have a chili icon (see here)\nAdd some CSS and:\nhttp://bl.ocks.org/grahamjenson/raw/9950335/\nView code here\nThis is not the prettiest, most useful, or most interesting visualisation in the world. 
It is however a good start to learning D3.\nConclusion D3 is a powerful tool, and this post has barely scratched the surface. If you want to learn D3 I recommend finding some data that you are interested in, then making a small visualisation for it. D3 is definitely a tool that you learn by using, so go and use it.\nRead More Every so often I go over to bl.ocks and have a look at what Mike Bostock (the creator of D3) is building. You can get some amazing D3 examples from there.\nVisual Display Quantitative Information — Edward Tufte\nGetting Started with D3 — Mike Dewar\n","permalink":"https://maori.geek.nz/posts/2014/2014-04-07_d3.js-tutorial-using-html-scales-and-chili-peppers/","summary":"\u003cp\u003eIf you want to create visualisations for the web, the tool you will want to learn is \u003ca href=\"http://d3js.org/\"\u003eD3.js\u003c/a\u003e. It helps you build powerful visualisations that can express your data while remaining light-weight and flexible. However, D3 is one of those tools that has its own way of doing things that is a little bit different from other visualisation libraries. It has a philosophy that focuses on data and functions, and not visual elements. This may make it difficult to learn because to wrap your head around it you need an epiphany, where you just \u003cstrong\u003eget how to use it\u003c/strong\u003e.\u003c/p\u003e","title":"D3.js Tutorial using HTML, Scales and Chili Peppers"},{"content":"The motto for the Royal Society is Nullius in verba, which translates to Take nobody’s word for it. This comes from a scientific culture of scepticism that does not care how renowned, well known, popular, or respected someone is, you should always demand evidence for any claim put forward.\nI have been thinking about how scientific ideas like this are analogous in software engineering when writing tests. These days it is common to write tests to demonstrate that your code is working, no matter how good a programmer you claim to be. 
I think this is the software community rediscovering that they should Take nobody’s word for it, not even your own.\nI think there are more lessons from the world of science that could be used when testing software. So, in this post I am going to briefly explore the similarities and differences between science and testing software, and maybe provide a different way to look at your test suite.\nIsaac Newton was an Awesome Tester I am going to start this post with a story of Isaac Newton, the tester.\nWhen Newton started writing tests for the universe in the 1670’s it was believed that a feature of the universe was that white light was colourless. An implication of this is that when white light is passed through a prism, the colors that appear are created by the prism, not the light itself. Newton could see this when he performed the test: describe :prism do it 'will create many colors with white light' do white_light = Light.new(:white) Prism.new(white_light).should include(:red, :green, :blue) end end\nThis test passes and Newton probably thought everything was fine with the universe.\nNewton realized that an implication of this colorless white light would be that when you passed just red light through the prism it should separate again into more colors. So Newton created a test to take the red light created by passing it through one prism, then passed it through another to see all the colors: describe :prism do it 'will create many colors with red light' do white_light = Light.new(:white) red_light = Prism.new(white_light).get(:red) Prism.new(red_light).should include(:red, :green, :blue) end end\nThis test fails, and Newton probably thought “Oh crap, the universe doesn’t implement this feature. Then what feature does it have? Maybe the universe has the feature where the colors are inside the white light. This would mean that if we focus the colors with a lens we could get the white light back.” 
So Newton went about writing a test for that: describe :prism do it 'will create white light with many colors' do white_light = Light.new(:white) colored_light = Prism.new(white_light) Prism.new(colored_light).should eq(white_light) end end\nThis test passed, and Newton discovered something about the system he was testing. He then published his test, and the implication it had, so others could see for themselves the feature about our universe that was just discovered.\nTesting is Science Richard Feynman once wrote that:\nExperiment is the sole judge of scientific truth. The Feynman Lectures on Physics, Introduction, Richard Feynman, 1961.\nAn analogous idea for software testing is:\nRepeatable tests are the sole judge of your system’s functionality\nDocumentation, intuition, and people (even developers) can be wrong. However, your tests show and describe the truth about your system. If you have no tests, you have no evidence that your system is working the way it should be. What is the evidence that is demanded? It is the same in both domains: repeatable tests.\nBut Software Systems Change We are like sailors who on the open sea must reconstruct their ship but are never able to start afresh from the bottom. Willard Van Orman Quine, Word and Object, 1960\nThe above quote describes the Neurathian bootstrap (named after Otto Neurath) which describes the nature of scientific verification and identity. To me, it sounds more like my day job, where I am maintaining and developing systems while having to have them in production and working. I am sailing a ship and rebuilding it at the same time.\nThe biggest difference between software development and science is that science is trying to discover a static set of knowledge about the Universe, whereas developers are trying to alter their systems while ensuring all previous knowledge remains true. 
For example, an experiment by Newton in the 17th century will still return the same result today, but the tests that I wrote last week broke with the latest feature.\nThis lack of change means that scientists do not need to run their experiments again and again to ensure they have a working universe, whereas developers continuously run their tests because they are always breaking them. This is why scientists put much more weight on experimental accuracy than developers do.\nScientists Started with Integration Tests Imagine showing up to a job where you had to maintain a mission-critical system that has no tests and no documentation, and everyone who knew about it has left (probably for the lack of tests and documentation). Where would you start? I would start by writing integration tests, then moving downward towards unit tests.\nThis is the same problem people had when we showed up here. We got this world without a manual or specification. We started with big integration tests, things that tested behaviours of the system like how to start a fire. Slowly we have moved downwards towards more and more specific unit tests, like how friction is caused.\nWhich tests are more profound? It is always the unit tests! Gravity, special relativity, quantum mechanics all fundamentally altered the way in which we view our world. The things that impact our understanding of the whole system are the tests that are fundamental to many things. Where is the unit test for your system that explains how all the models move relative to one another, you know, the test for gravity?\nImplications of the Hypothesis “Testing code is Science” What I cannot create, I do not understand Richard Feynman, 1988\nWhat are the consequences of thinking about testing using this scientific metaphor? 
Will you write your tests with the rigour of scientists, and if they fail will you see your universe as broken?\nIf you ignore your tests, with things like “Oh that test always fails”, you are ignoring the only good evidence that your system works. It is like those people who only accept some of the scientific evidence, then derive conclusions from that, while ignoring the stuff that conflicts with their view (I am looking at you, young earth creationists).\nIf you say “this javascript doesn’t need tests, it is too abstract and my unit tests are still working”, I see this as like a biologist saying “the fundamental chemical reactions are well tested by chemists, so I don’t need to experiment”. No matter how abstract you are from the core, not having tests means you don’t know if it is doing what you think it should.\nConclusions When I program sometimes I think I am the master of this small, simple universe that I created. Someone, someday may come across my universe and if that happens I would like to leave some evidence of how it works. A little bit of the science about my universe written as tests.\nI am not saying this analogy holds up under microscopic inspection; I think it breaks down somewhere around TDD and peer review. I am just saying that it is fun to imagine your tests as tiny scientists experimenting on your system, making sure everything is right with their universe.\nRead More Surely You’re Joking, Mr. Feynman! — Richard P. Feynman\nCosmos: Carl Sagan\nApplying the scientific method to software testing\n","permalink":"https://maori.geek.nz/posts/2014/2014-03-30_lessons-from-science-on-how-to-test-your-code/","summary":"\u003cp\u003eThe motto for the \u003ca href=\"http://en.wikipedia.org/wiki/Royal_Society\"\u003eRoyal Society\u003c/a\u003e is \u003cem\u003eNullius in verba\u003c/em\u003e, which translates to \u003cstrong\u003eTake nobody’s word for it\u003c/strong\u003e. 
This comes from a scientific culture of scepticism that does not care how renowned, well known, popular, or respected someone is, you should always demand evidence for \u003cstrong\u003eany\u003c/strong\u003e claim put forward.\u003c/p\u003e\n\u003cp\u003eI have been thinking about how scientific ideas like this are analogous in software engineering when writing tests. These days it is common to write tests to demonstrate that your code is working, no matter how good a programmer you claim to be. I think this is the software community rediscovering that they should \u003cstrong\u003eTake nobody’s word for it\u003c/strong\u003e, not even your own.\u003c/p\u003e","title":"Lessons from Science on How to Test your Code"},{"content":"I recently created a visualisation that showed the difference between rich and poor schools in New Zealand. This visualisation used a series of relatively complex animations to introduce and convey information to the viewer.\nIn this post I will describe how I used Q and JQuery promises to compose the complex animations and give examples of how using promises can be beneficial for such a visualisation.\nNote: I am going to use CoffeeScript in this post for reasons described here. To learn CoffeeScript perhaps you could try The Little Book on CoffeeScript.\nThe Visualisation http://bl.ocks.org/grahamjenson/raw/9168767/\nView the full code here\nFor this visualisation I wanted to try the style where statistics are presented as a set of icons that each represent many people, similar in style to this video. I find this type of visualisation quite powerful as it provides an easy way for people to understand large scale issues by compressing many data points into single icon.\nQ and JQuery For this project I used the Q promises to direct the overall flow of the visualisation, and JQuery promises (which I have previously discussed here) to create the animation effects. 
One of the great things about Q is that it is able to use JQuery promises, so I had no problems with their integration.\nI decided to use Q, instead of just JQuery, because I wanted to start using promises that conform to the Promises/A+ specification. Q also added some useful functions, like delay, which waits a set time to resolve a promise.\nThe Basic Promises The way in which I chose to animate using promises is to create very simple functions that return a promise, then compose them together to be completed either in parallel with Q.all or in sequence with then.\nThis is the show_message function I wrote to display text in the visualisation: show_message = (message, delay = 2000) -\u0026gt; Q.fcall( -\u0026gt; $('.messages').html(message) ) .then(-\u0026gt; $('.messages').show('scale',100)).delay(delay) .then(-\u0026gt; $('.messages').hide('scale',100)) .then(-\u0026gt; $('.messages').html(''))\nThis function takes a message string and a delay, which is how long the message stays visible. First show_message uses Q.fcall to set the message as the html of the messages class. Then it shows the message using the scale effect, waits for the delay time, then hides the message. Finally, it removes the message from the html.\nThe Steps With the complexities of the smaller animation functions hidden away, I could begin to think of the steps the visualisation would need. Each step is a unit of animation described with a promise, for example: step3 = -\u0026gt; p = show_message(\u0026quot;Let's distribute them into NCEA Level 3\u0026quot;) return p.then( -\u0026gt; all_subjects.get_deciles().sort_out_deciles() )\nThis step3 function returns a promise to display a message with show_message then execute the sort_out_deciles animation. 
sort_out_deciles returns a promise for its completion, therefore the step3 promise will not be resolved until the sort_out_deciles animation is complete.\nIf I were to change this step, by adding more animations or changing the length of time the animations run for, the returned promise will not be resolved till the entire step is completed. This means that all following steps will also wait for this step to complete, meaning no further changes are needed.\nPutting it all together By linking the steps together using promises and delay to alter the timing of the animations, you can finely tweak the resulting visualisation. $('.play').on('click', -\u0026gt; $('.play').hide() step1().delay(500) .then(step2).delay(500) .then(step3).delay(500) .then(step4).delay(500) .then(step5).delay(500) .fail(console.log)\nI decided to include a play button in this visualisation to enable the user to select when to start the visualisation. Once this button is clicked, then a series of steps are promised to be executed. In this visualisation the final step, step5, alters the visualisation to enable further interaction with the visualisation.\nNote: Q will swallow your errors unless you log, or otherwise handle, the errors using the fail function. This is necessary to debug any problems.\nConclusion Using Q and JQuery promises is a really elegant way to compose a visualisation and it let me cleanly build complex animations out of simple promises. It also had the benefit of being able to change the complex functions and have all the steps adapt without any additional work. 
I enjoyed writing this project, in part because of the use of promises, and I would like to further explore this idea in the future.\nRelated reading\nI promise this will be short a post about JQuery promises Learning jQuery Deferreds: Taming Callback Hell with Deferreds and Promises JavaScript with Promises ","permalink":"https://maori.geek.nz/posts/2014/2014-03-23_using-q-and-jquery-promises-to-compose-complex-animations/","summary":"\u003cp\u003eI recently created a visualisation that showed \u003ca href=\"http://maori.geek.nz/post/the_difference_between_rich_and_poor_schools_in_new_zealand\"\u003ethe difference between rich and poor schools in New Zealand\u003c/a\u003e. This visualisation used a series of relatively complex animations to introduce and convey information to the viewer.\u003c/p\u003e\n\u003cp\u003eIn this post I will describe how I used \u003ca href=\"https://github.com/kriskowal/q\"\u003eQ\u003c/a\u003e and \u003ca href=\"http://maori.geek.nz/post/i_promise_this_will_be_short\"\u003eJQuery\u003c/a\u003e promises to compose the complex animations and give examples of how using promises can be beneficial for such a visualisation.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eNote: I am going to use CoffeeScript in this post for reasons described\u003c/em\u003e \u003ca href=\"http://maori.geek.nz/post/why_should_you_use_coffeescript_instead_of_javascript\"\u003e\u003cem\u003ehere\u003c/em\u003e\u003c/a\u003e\u003cem\u003e. 
To learn CoffeeScript perhaps you could try\u003c/em\u003e \u003ca href=\"http://www.amazon.com/gp/product/1449321054/ref=as_li_qf_sp_asin_il?ie=UTF8\u0026amp;amp;camp=1789\u0026amp;amp;creative=9325\u0026amp;amp;creativeASIN=1449321054\u0026amp;amp;linkCode=as2\u0026amp;amp;tag=maor01-20\"\u003e\u003cem\u003eThe Little Book on CoffeeScript\u003c/em\u003e\u003c/a\u003e\u003cem\u003e.\u003c/em\u003e\u003c/p\u003e","title":"Using Q and JQuery Promises to Compose Complex Animations"},{"content":"\nMany of the constraints I have in my day-to-day job as a developer come from this mysterious world of the software architect. After listening to Martin Fowler on the Ruby Rogues podcast talk about his book Patterns of Enterprise Application Architecture (PoEAA), I decided to pick up a copy (i.e. I ordered it from Amazon) and read a bit more into architecture. Not understanding the domain very well, I also watched a few talks and read a few different articles trying to get a feeling for the important aspects.\nThis post goes over a bit of Fowler’s book, goes over some of the different opinions on software architecture, and tries to answer the questions “what is software architecture?” and “when should you be an architect?”\nArchitecture is Important architecture boils down to the important stuff — whatever that is. — PoEAA\nFowler describes architecture as the important decisions in a project. The kind of decision that would require massive effort to change down the line. It is comprised of decisions that\ndevelopers wish they could get right early on because they’re perceived as hard to change. 
— PoEAA\nIf you can imagine a request to change something in an application and your answer is “it will require a major change to the system”, that difficulty was probably caused by an architectural decision in the past.\nDecisions like which persistency patterns should be used, how objects will interact with remote systems, how domain logic is separated in the system will significantly impact the way the system is developed. They will also have ripple effects into other areas like performance, maintenance and design.\nIt is important to understand what an architectural decision is, and what it looks like. Spending a lot of effort making good decisions for unimportant aspects of the application is time wasted. However, making important decisions lightly and without an appreciation for how hard it will be to change later, may cause years of pain.\nConveying the importance of architectural decisions can be difficult because some people dislike software architecture as it seems Waterfall-ish. Uncle Bob Martin put it like this:\nthere has been a feeling in the Agile community since about ’99, that “architecture is irrelevant, we don’t need to do architecture, all we need to do is write lots of tests, and do lots of stories, and do quick iterations and the code will assemble itself magically”. This has always been horse shit!\nJust because you are planning ahead, does not mean that you are somehow violating agile development principles, just don’t go overboard.\nWhen to make Architecture Decisions? Fowler states that architectural decisions should be made towards the beginning of the project. Others, like Kent Beck and Uncle Bob Martin, say you should defer making architectural decisions until they are absolutely needed.\nIn Fowler’s article Is Design Dead? he compares his “cowardly” approach of making decisions early to the “aggressive” approach to defer decisions until necessary. 
Fowler argues that you may have a broad understanding of the domain from experience, and you can make architectural decisions early in a project, with the understanding that it might be necessary to change them later.\nIn Uncle Bob’s talk Architecture the Lost Years from Ruby Midwest 2011, he asserts that decisions like which framework or database to use are merely details of an application and should not dictate the way in which it is designed. He then gives an example where, by deferring the decision of which database to use, he was able to create a fully functioning application without any database. Finally, when a database was eventually needed, they were able to quickly add it because of this deferred decision. Uncle Bob’s advice is:\nA good architecture allows major decisions to be deferred\nThis battle between deferring and upfront architecture is something that I deal with day-to-day. For example, the first major decision in most of my projects is what framework to use (Ruby with Rails or Sinatra, Node.js with Hapi or Express), where:\naccording to Uncle Bob, the decision should be deferred until it is necessary to be made, with the understanding that a price may need to be paid for adapting the existing project to a framework according to Fowler, existing experience should be used to make the decision now, with the understanding that at some point a price may need to be paid for changing the framework Jim Coplien in a discussion with Uncle Bob (video here) gave a nice middle ground where you should:\ncapitalize on what you know [and] take some hard decisions up front, because that will make the rest of the decisions easier later.\nThat is, make the decisions you can now, and instead of guessing defer the other decisions until later, when more information is available.\nHow I break down these views is:\nIf the domain is not well understood or I have little experience with it, I will take Uncle Bob’s advice and wait to make any architectural decisions If I 
have experience with a domain, then I will take Fowler’s advice and make architectural decisions with the understanding they may need to be revised later Conclusion Software architecture is a complex topic, and I still have much more to read and learn. In this post I looked at the kinds of decisions, and when to make them. I did not cover the hardest topic, how to make the best architectural decisions with the available information. I think that topic would fill many books.\nLearn More Patterns of Enterprise Application Architecture — Martin Fowler\nAgile Software Development, Principles, Patterns, and Practices — Uncle Bob Martin\n","permalink":"https://maori.geek.nz/posts/2014/2014-03-17_when-to-be-a-software-architect/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2014/2014-03-17_when-to-be-a-software-architect/images/1.jpg#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003cp\u003eMany of the constraints I have in my day-to-day job as a developer come from this mysterious world of the \u003cem\u003esoftware architect\u003c/em\u003e. After listening to Martin Fowler on the \u003ca href=\"http://rubyrogues.com/097-rr-book-club-patterns-of-enterprise-architecture-with-martin-fowler/\"\u003eRuby Rogues\u003c/a\u003e podcast talk about his book \u003ca href=\"http://www.amazon.com/gp/product/0321127420/ref=as_li_qf_sp_asin_il_tl?ie=UTF8\u0026amp;amp;camp=1789\u0026amp;amp;creative=9325\u0026amp;amp;creativeASIN=0321127420\u0026amp;amp;linkCode=as2\u0026amp;amp;tag=maor01-20\"\u003ePatterns of Enterprise Application Architecture\u003c/a\u003e (\u003cstrong\u003ePoEAA\u003c/strong\u003e), I decided to pick up a copy (i.e. I ordered it from Amazon) and read a bit more into architecture. 
Not understanding the domain very well, I also watched a few talks and read a few different articles trying to get a feeling for the important aspects.\u003c/p\u003e","title":"When to be a Software Architect"},{"content":"\nCoffeeScript is currently my favourite language to write in! That is because CoffeeScript contains three things that I like in a language:\neasy function and lambda definitions syntactically significant whitespace straight forward class definitions However, for all its benefits, CoffeeScript is complicated by its intertwined relationship with Javascript. This is because CoffeeScript is a language that doesn’t compile to binary or VM code, but it is transpiled to Javascript. So, to understand why you should use CoffeeScript you should probably understand why you should use it instead of Javascript.\nIn this post I will go over how to get started with CoffeeScript, and give a few useful examples where it is a great improvement over JavaScript. This is not a complete tutorial of CoffeeScript, it is more a primer to get you excited to learn it.\nNote: to follow this post a basic level of JavaScript is needed. If you are unfamiliar with CoffeeScript perhaps you could try The Little Book on CoffeeScript\nHistory Underneath that awkward Java-esque patina, JavaScript has always had a gorgeous heart. CoffeeScript is an attempt to expose the good parts of JavaScript in a simple way. — coffeescript.org\nCoffeeScript was created by Jeremy Ashkenas (creator of underscore.js and backbone.js) in 2009, to make Javascript simpler and more readable. It takes elements from Ruby, Python and Haskell to create its blend of functional and OO features.\nIt has seen a massive uptake in the community, in part because of its inclusion in Ruby on Rails since version 3.1. 
It has also had upstream effects on Javascript, where Brendan Eich (creator of Javascript) has proposed CoffeeScript-like changes for future JavaScript versions (here).\nPhilosophy The golden rule of CoffeeScript is: “It’s just JavaScript” — coffeescript.org\nCoffeeScript is just a prettier JavaScript. It will let you have smaller and more readable code by taking core elements of JavaScript and giving you a better way of writing it.\nThe main reason why some people dislike CoffeeScript is they don’t want to write in CoffeeScript and debug in JavaScript. CoffeeScript tries to be nicer JavaScript, and to keep the output code similar to the input so that debugging is trivial. Personally, I have never found this a hurdle to debugging, and I don’t think you will either.\nGetting Started One reason why JavaScript is so popular is the ease with which anyone can access it. In that vein, I will get you up and running with CoffeeScript ASAP. You can quickly start using CoffeeScript by either:\nGoing to http://coffeescript.org/ and clicking Try CoffeeScript at the top. Including CoffeeScript in an HTML page with \u0026lt;script src=\u0026quot;https://rawgithub.com/jashkenas/coffee-script/master/extras/coffee-script.js\u0026quot;\u0026gt; in the page, then wrapping your code in script tags. Installing the CoffeeScript console with Node.js and npm with npm -g install coffee-script then running it with coffee. Note: These are not ways to use CoffeeScript in production, they are tools for learning and testing CoffeeScript.\nSyntax I will go over the CoffeeScript syntax in the order of my favourite parts; functions, whitespace, classes.\nFunctions Let’s start with functions in CoffeeScript: times_two = (x) -\u0026gt; x*2\nRead this as\ntimes_two equals a function that takes an input (x) moves it to (-\u0026gt;) output of x*2 I think this is the most beautiful function definition syntax that exists. 
It uses syntax similar to Haskell’s function description Integer -\u0026gt; Integer, and Ruby’s implicit return of the last statement.\nThis CoffeeScript will compile to the JavaScript: var times_two; times_two = function(x) { return x * 2; };\nFirst, JavaScript has to define a variable, because of the worst gotcha in programming, default global scope! Then it uses the long function keyword to define a function, wrapping the code in { and }. The function then explicitly defines the returned value with return.\nI think this JavaScript is ugly, it has lots of code junk (like chart junk), i.e. unnecessary elements not required to convey the function’s purpose. However, it clearly maps to the CoffeeScript. The variable names are the same, and although more verbose, the function definition is the same. This is due to CoffeeScript\u0026rsquo;s philosophy to output readable JavaScript.\nThere are a few more benefits of CoffeeScript functions, like the optional parentheses and argument definitions, e.g. this code: log = -\u0026gt; console.log arguments\nwill compile to the JavaScript: var log; log = function() { return console.log(arguments); };\nThere are many more things to learn about CoffeeScript functions, such as the closure-creating do keyword, default values in constructors, and the fat arrow; they are the icing on the Coffee cake.\nWhitespace Similar to Python, CoffeeScript uses syntactically significant whitespace to group blocks of code. For example: fn = (x) -\u0026gt; x += 2 x *= 4\nMost people will indent their code to give an immediate visual clue as to its organisation. 
CoffeeScript takes this convention and makes it part of the language to tidy up all those curly braces.\nIndentation works for any statement that requires a block, like if and for statements: for num in [1..10] if num % 2 == 0 console.log \u0026quot;#{num} is even\u0026quot; else console.log \u0026quot;#{num} is odd\u0026quot;\nwill compile to the JavaScript: var num, _i;``for (num = _i = 1; _i \u0026lt;= 10; num = ++_i) { if (num % 2 === 0) { console.log(\u0026quot;\u0026quot; + num + \u0026quot; is even\u0026quot;); } else { console.log(\u0026quot;\u0026quot; + num + \u0026quot; is odd\u0026quot;); } }\nIndentation adds to the readability of CoffeeScript and it removes lots of the unnecessary braces.\nClass Definitions It seems silly, when put into the context of other programming languages, that a common question in JavaScript is \u0026rsquo;What\u0026rsquo;s the best way to define a class in JavaScript?\u0026rsquo;.\nWhen asking how to define a class in any other language, would it (as the StackOverflow mod put it) \u0026lsquo;likely solicit debate, arguments, polling, or extended discussion\u0026rsquo;? I would say NO. Most languages have a clear way to define classes, e.g. Ruby\u0026rsquo;s, Java\u0026rsquo;s, C#\u0026rsquo;s, PHP\u0026rsquo;s \u0026hellip; it is the class keyword.\nJavaScript is a language limited by few rules and its ability to define OOP-ish classes in many ways is powerful, yet confusing to many. 
CoffeeScript solves this by adding the class keyword: class Person constructor: (@name) -\u0026gt;``hello: -\u0026gt; console.log \u0026quot;Hello #{@name}\u0026quot;\nNote: @ is short for this., and (@name) -\u0026gt; is short for (name) -\u0026gt; @name = name.\nThis compiles to the ugly JavaScript: var Person;``Person = (function() { function Person(name) { this.name = name; }``Person.prototype.hello = function() { return console.log(\u0026quot;Hello \u0026quot; + this.name); };``return Person;``})();\nThis JavaScript code is not as straight forward or trivial as the CoffeeScript code. It includes in it non-obvious fixes to some JavaScript gotchas.\nNote: in the podcast JavaScript Jabber Jeremy Ashkenas discusses why this particular class definition is used.\nConclusion There are loads of other things that I like about CoffeeScript, but the things I discussed in this post are the reasons I love CoffeeScript. They remove the ugliest parts of JavaScript to show its real power. Other things like list comprehensions, for loops, array slicing, splats\u0026hellip; will have to wait for another post.\nLearn More Go to http://coffeescript.org/ and click Try CoffeeScript, listen to the JavaScript Jabber podcast on CoffeeScript,\nalso\nThe Little Book on CoffeeScript\n","permalink":"https://maori.geek.nz/posts/2014/2014-03-12_why-should-you-use-coffeescript-instead-of-javascript/","summary":"\u003cp\u003e\u003cimg alt=\"image\" loading=\"lazy\" src=\"/posts/2014/2014-03-12_why-should-you-use-coffeescript-instead-of-javascript/images/1.jpeg#layoutTextWidth\"\u003e\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"http://coffeescript.org/\"\u003eCoffeeScript\u003c/a\u003e is currently my favourite language to write in! 
That is because CoffeeScript contains three things that I like in a language:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eeasy function and lambda definitions\u003c/li\u003e\n\u003cli\u003esyntactically significant whitespace\u003c/li\u003e\n\u003cli\u003estraight forward class definitions\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eHowever, for all its benefits, CoffeeScript is complicated by its intertwined relationship with Javascript. This is because CoffeeScript is a language that doesn’t compile to binary or VM code, but it is \u003ca href=\"http://en.wikipedia.org/wiki/Source-to-source_compiler\"\u003etranspiled\u003c/a\u003e to Javascript. So, to understand why you should use CoffeeScript you should probably understand why you should use it \u003cstrong\u003einstead\u003c/strong\u003e of Javascript.\u003c/p\u003e","title":"Why should you use CoffeeScript instead of JavaScript?"},{"content":"In New Zealand we split our schools into 10 groups called Socio-economic Deciles that each represent 10% of schools ranked by the relative poverty of the students family. That is, Decile 1 contains 10% of our schools where the poorest students attend up to Decile 10 that contains 10% of our schools where the wealthiest students attend.\nNew Zealand uses this measure to target funding to more needy schools and try and ensure that compulsory education is equal for all students, some of which may not be as privileged as others.\nWhat I want to know is:\nHow effective is the decile strategy at equalling out education outcomes in New Zealand? What are the differences of education outcomes between the 10 deciles? 
I have tried to answer these questions by taking a dataset of how well students perform in different deciles and different subjects and created a small visualisation exploring their differences.\nThe Visualisation I recently read an article on the New Zealand Herald website about the differences between internal and external examinations from NCEA (New Zealand’s school qualification system).\nThis article has a visualisation that explores this topic, and provides the means to compare the results.\nAfter looking around a bit, I realised that it is using an accessible dataset located here, and that I can use it to answer some questions I have about the equality and differences in the decile system.\nHere is my attempt at such a visualisation:\nNote: These results may be biased because of an uneven distribution of students across the deciles. A post on the real distributions can be found here\nIn this visualisation I looked at three dimensions:\nAmount of students in each decile The outcomes of study measured by the grades Not Achieved, Achieved, Merit, and Excellence Some of the different subjects that are studied Amount of Students\nFor this visualisation I only looked at results for the highest level of high school qualification (NCEA level 3) in the year 2012. In an ideal world the amount of students who get to NCEA level 3 should be the same for all deciles. However, I discovered it is drastically different, where there are nearly 5 times more students in decile 10 studying than in decile 1. This may be a slightly distorted number because more students are in decile 10 schools than decile 1 schools, but this is still quite a difference.\nOutcomes\nBy looking at the students outcomes after studying, you can see that although there were more students studying in Decile 10 than Decile 9, less received a Not-Achieved grade.\nSubjects\nI grouped together Calculus and Statistics into Math; Physics, Biology and Chemistry into Science; and English as English. 
The least equal of these subjects was science, where more than 10 times as many students study science in decile 10 compared to decile 1.\nCritiques of the Visualisation I would have liked to spend more time on this visualisation but I have to stop somewhere. I would have liked to add:\nTake into consideration the distribution of students in each decile the ability for the user to select different years other than 2012, to see if there is a trend over time a way to examine the exact numbers so that the user can drill down into the actual figures a wider selection of subjects to be selected functionality to let the user to select different levels other than just NCEA level 3 It may be very difficult and time consuming to implement these while keeping the visualisation relatively clean from clutter. Cluttering a visualisation with controls is too easy, and it may only remove from the core message that I want to convey.\nThe Technology This is my first attempt at creating a visualisation with complex animations. To accomplish this I used D3.js, Q Promises and Font-Awesome for icons.\nThe main reason for the use of Q promises over just JQuery promises was for the helper function delay. This allowed me to better pace the flow of the animations with much greater clarity. 
These technologies really came together and the overall definition of the visualisation looked like this: $('.play').on('click', -\u0026gt; $('.play').hide() step1().delay(500) .then(step2).delay(500) .then(step3).delay(500) .then(step4).delay(500) .then(step5).delay(500) .fail((err) -\u0026gt; console.log err)\nWhere each step looked like: step3 = -\u0026gt; p = show_message(\u0026quot;Let's distribute them into NCEA Level 3\u0026quot;) return p.then( -\u0026gt; all_subjects.get_deciles().sort_out_deciles())\nOne of my favourite code snippets from this is the show_message function: show_message = (message, delay = 2000) -\u0026gt; Q.fcall( -\u0026gt; $('.messages').html(message) ) .then(-\u0026gt; $('.messages').show('scale',100)).delay(delay) .then(-\u0026gt; $('.messages').hide('scale',100)) .then(-\u0026gt; $('.messages').html(''))\nUsing promises is a really elegant way to create these complex animations as you can change any step, and all following steps will adapt.\nTo further look at my (messy) code have a look at my portfolio.\nIn this project, I did not use D3.js for its ability to render great looking visualisations, but for the tools it provides like scales. I am finding I am hitting the limitations of SVG more and more, and rendering using CSS with rendered HTML provides a nicer way to layout items. 
But… if I ever need to draw a line I will go back to SVG!\nLearn More To learn more about data visualisation you could read:\nEnvisioning Information — Edward Tufte The Visual Display of Quantitative Information — Edward Tufte Interactive Data Visualization for the Web ","permalink":"https://maori.geek.nz/posts/2014/2014-03-06_difference-between-rich-and-poor-schools-in-new-zealand/","summary":"\u003cp\u003eIn New Zealand we split our schools into 10 groups called \u003ca href=\"http://en.wikipedia.org/wiki/Socio-Economic_Decile\"\u003e\u003cstrong\u003eSocio-economic Deciles\u003c/strong\u003e\u003c/a\u003e that each represent 10% of schools ranked by the relative poverty of the students family. That is, \u003cem\u003eDecile 1\u003c/em\u003e contains 10% of our schools where the poorest students attend up to \u003cem\u003eDecile 10\u003c/em\u003e that contains 10% of our schools where the wealthiest students attend.\u003c/p\u003e\n\u003cp\u003eNew Zealand uses this measure to target funding to more needy schools and try and ensure that compulsory education is equal for all students, some of which may not be as privileged as others.\u003c/p\u003e","title":"The Difference Between Rich and Poor Schools in New Zealand"},{"content":"I have been working as a Ruby Programmer for over a year. Now I am thinking about how much I have learnt and how happy it has made me.\nIn this post I will briefly go over the history and community of Ruby, then give a small example that I think demonstrates why Ruby is made for developer happiness. This is not a tutorial, more like an introduction to Ruby.\nHistory Yukihiro Matsumoto, known as Matz, in 1993 conceived of a programming language that is made for developer happiness and productivity. This makes Ruby older than Java, Javascript, and PHP; all created in 1995.\nMatz took inspiration from the languages Perl and Smalltalk. 
From Perl he took things like the optional parentheses and the close shell integration; from SmallTalk he took object orientation and its emphasis on message passing.\nThe first Ruby implementation, known as the “Matz Ruby Implementation” (MRI), was initially seen as the reference implementation of Ruby. From MRI many other implementations were created:\nJRuby: Java Virtual Machine implementation Rubinius: Mostly pure Ruby to LLVM machine code IronRuby: .NET implementation MagLev: SmallTalk implementation Note: recently MRI has been replaced as the Ruby standard by RubySpec. This is a suite of tests of Ruby semantics to better enable the standardisation of semantics between implementations.\nCommunity Matz is nice, so we are nice.\nThe community of Ruby I have found to be very open and inviting, as well as self moderating in those ‘get the pitch-fork out’ type of controversies. A lot of this is because the philosophy of Ruby is geared towards your happiness as a developer.\nThe community has many interesting resources to learn Ruby and general software development, including books like Confident Ruby by Avdi Grimm, interactive tutorials like exercism.io, and screencasts like RailsCasts. However, I really like the many ‘soft’ discussions that the Ruby community has about how to be a developer, with talks like:\nDisabilities by James Edward Grey II Developers and Depression by Greg Baugues Loyalty and Layoffs by the Ruby Rogues This is summed up by James Edward Grey’s statement at GoGaRuCo:\nLearning how to program, that’s the easy part. Learning how to be a programmer, that’s the advanced stuff. That’s what gets you to the next level, I think we should work on that.\nOn a personal note, interacting with a community can be difficult for many programmers as we are often introverted or isolated. One way in which I am trying to interact is by writing this blog and giving talks about these topics. 
So please comment and join the conversation :)\nProductivity When cycling, either there is a headwind or you are having a good day. I am always ready to take a wind assisted ride as my accomplishment, as if I really am that strong. But I can’t forget that if my doppelgänger was out riding the same road in the same conditions but in the opposite direction, that she would work just as hard but accomplish far less\n— Sandi Metz Tells your Future\nThis is Sandi Metz talking about the feeling of productivity that you get when you write Ruby, and how it is like cycling without a headwind. However, she points out that her productivity is not entirely her own doing: she is a beneficiary of others’ work to make her life easier.\nThe productivity that you feel when writing Ruby is down to two things:\nRuby being very malleable to a particular problem and developer The ability to quickly re-use Ruby components, called Gems A Ruby Gem is the unit of distribution for a Ruby library or component. It describes the environment that is required for its use, including its dependencies on other Ruby Gems.\nThere are two separate topics to discuss here: writing gems and using gems.\nWriting a Gem To describe the environment and distribution of a gem a gemspec file is used, which looks like: Gem::Specification.new do |s| s.name = 'spellchecker' s.version = '0.1.0' s.summary = \u0026quot;This is a spell checker\u0026quot; ... s.files = [\u0026quot;lib/spellchecker.rb\u0026quot;] s.add_development_dependency 'testinggem', '~\u0026gt; 2.1' s.add_runtime_dependency 'wordsgem', '~\u0026gt; 1.1' s.homepage = 'https://rubygems.org/gems/example' end\nAn interesting thing to note is that this file is actually Ruby code.\nNote: the full specification for a gemspec file is here\nUsing a Gem To effectively use RubyGems, the standard tool to use is bundler. 
bundler calculates the gems to install and encapsulates them into sets so that each project you create can use different versions of any Gem.\nTo describe the Gems you want to use, you use a Gemfile: source 'https://rubygems.org' gem 'nokogiri' gem 'rack', '~\u0026gt;1.1' gem 'rspec', :require =\u0026gt; 'spec'\nThen you can run bundle install, which will install all the selected Gems and their dependencies. It will also create a file called Gemfile.lock; this file includes the exact versions that are installed, so others can use your exact environment.\nYou can explore the available Gems on sites like RubyGems and Ruby Toolbox.\nPerformance Often people, especially computer engineers, focus on the machines. They think, “By doing this, the machine will run faster. By doing this, the machine will run more effectively. By doing this, the machine will something something something.” They are focusing on machines. But in fact we need to focus on humans, on how humans care about doing programming or operating the application of the machines. We are the masters. They are the slaves. — Matz\nPeople sometimes dismiss Ruby because Ruby is slow. Well it is, and I don’t care. The biggest performance bottlenecks that I have day to day are the database, the file system, the internet and bugs. These are not problems with Ruby, so speeding it up will have a negligible effect.\nIf I was writing a matrix transformation application, I would write it in C because it will be very performance intensive. However, if I was writing a web app that needed a matrix transformer I would write it in Ruby and call the C library, because I would find the Ruby code much more fun and productive.\nAdditionally, hardware is cheap and developers are not. If you can use a programming language like Ruby to make your programmers more productive, then you can increase the performance of your code with better hardware. 
You can buy the better hardware with the money you saved by making your developers more productive.\nKiller App: Rails You call a library’s code, a framework calls your code\nRuby on Rails (or just Rails) is a web application framework that provides lots of useful tools. Rails implements many helpful patterns like Active Record (from Martin Fowler’s Patterns of Enterprise Application Architecture) and Model View Controller (MVC). It also includes many helpers like routers and useful methods, and provides a standard way in which to use them.\nThe core philosophy of Rails is:\nConvention over Configuration\nThat is, instead of spending large amounts of time configuring a project, do it in a standard way and only specify the unconventional. This Rails philosophy is aligned to the core Ruby philosophy, to make developers happier. I think this is a significant reason why Rails has seen so much success.\nI use Rails because I can create a complex web application in hours rather than days. It also helps with getting to the core problem of the application, not some sideline problem like setting up a server’s configuration correctly.\nI will not go into an example of how to use Rails in this post. I will try to fill this gap later with the typical ‘Blogging Site in 10 Minutes’; for now I will just mention two of my favourite features:\nThe Asset Pipeline: The asset pipeline takes a list of assets that need to be pre-compiled, like CoffeeScript and Sass assets, and then manages their lifecycle through development and deployment. Engines: A Rails Engine is a RubyGem that includes parts of the MVC pattern. It enables a Rails application to be separated into components. 
I fully intend to write three blog posts with an intro to Rails, the Asset Pipeline, and Rails Engines, but this post is about Ruby.\nFun With Ruby Dates To quickly demonstrate the power of Ruby I have decided to implement two of the best helper methods that Rails provides, day and ago. These methods sit on the Numeric class and let the developer say 1.day.ago and have it return yesterday’s date.\nCompare this to the Java code to get yesterday’s date: Calendar cal = Calendar.getInstance(); cal.add(Calendar.DATE, -1); System.out.println(cal.getTime() + \u0026quot;\u0026quot;);\nWhat part of this code makes you happy? Where is the bit that says what it is doing?\nAfter seeing that horrible Java code, let’s make ourselves happy by implementing day and ago in Ruby.\nThe first thing to know about Ruby is that you can open any class or module, anywhere, and add anything! You do this by simply defining the class again; this does not override the class but adds to it. So let’s open up Numeric (the top level number class) and add day: class Numeric def day self * 60 * 60 * 24 end end\nday returns the Numeric multiplied by the number of seconds in a day. This is because we use seconds as the atomic unit in dates. Let’s now add the ago function to Numeric: class Numeric def ago Time.now - self end end\nago returns the Numeric subtracted from the current Time. Now 1.day.ago returns yesterday’s date. Because we defined it this way, addition and subtraction work with these numbers, e.g. (1.day + 1.day).ago will return 2 days ago.\nThere is one more thing I want to do: saying 2.day.ago looks silly and does not read very well, so: class Numeric alias :days :day end\nThis final addition means when the days function is called it is passed to day so that 2.days.ago will work.\nHow easy was that? Now you have made your Ruby code more readable and saved yourself heaps of time. This is why Ruby will make you happy. 
Now you could go and implement methods like week, hour, second and from_now to fill out the rest of these helpers.\nNote: There are many problems with this implementation, but as an example of the power of Ruby I think it serves its purpose. Still, please comment with the problems!\nConclusion Ruby makes me happy; just about every day I find a new, better way to do something I thought would be tedious. I think this is because Ruby does nothing unexpected, yet constantly makes me surprised at its ability.\nLearn More The Ruby Rogues podcast is one of the best software engineering podcasts out there. This podcast uses Ruby more as a backdrop to talk about how to write code, and be a developer.\nProgramming exercises at exercism.io, where you can get feedback on your code.\nAvdi Grimm: Confident Ruby\nSandi Metz: Practical Object-Oriented Design in Ruby\nMartin Fowler: Patterns of Enterprise Application Architecture\n","permalink":"https://maori.geek.nz/posts/2014/2014-03-03_what-is-ruby-it-is-fun-and-makes-you-happy/","summary":"\u003cp\u003eI have been working as a \u003ca href=\"http://en.wikipedia.org/wiki/Ruby_%28programming_language\"\u003eRuby\u003c/a\u003e Programmer for over a year. Now I am thinking about how much I have learnt and how happy it has made me.\u003c/p\u003e\n\u003cp\u003eIn this post I will briefly go over the history and community of Ruby, then give a small example that I think demonstrates why Ruby is made for developer happiness. This is not a tutorial, more like an introduction to Ruby.\u003c/p\u003e\n\u003ch3 id=\"history\"\u003eHistory\u003c/h3\u003e\n\u003cp\u003e\u003cimg alt=\"Matz!\" loading=\"lazy\" src=\"/posts/2014/2014-03-03_what-is-ruby-it-is-fun-and-makes-you-happy/images/1.jpg#layoutTextWidth\"\u003e\u003c/p\u003e","title":"What is Ruby? It is fun and makes you happy!"},{"content":"Since I spent 4.5 years completing my Ph.D. 
at Massey University, I have decided 1 year on to give an overview of what I actually did.\nIf you want my full thesis, you can download it here. Be warned, it is really long, really dense and an academic document. I am trying to make this post a bit more reader friendly.\nSo here it goes…\nIntroduction to Components and Evolution In order to agree to talk, we just have to agree we are talking about roughly the same thing. The Feynman Lectures on Physics, Motion, Richard Feynman, 1961.\nAccording to Feynman (and I always refer to Feynman when trying to discuss a complex topic), before I get too far down the rabbit hole of my thesis, I need to make sure we are talking about the same things. So let’s break apart the title of my thesis into three parts:\nSoftware Components Component Systems Software Evolution I will go over each of these things, to make sure we are on the same page.\nSoftware Components A software component is a piece of software that can be composed into a component system. Well… that is a useless definition; it is like saying ‘a piece of cake is defined as a part of a cake’. Of course, what you want to know is: what does a software component look like? Well, that turns out to be a very difficult question to answer.\nA component is an intuitive concept, where each person has their own internal metaphor with which to describe it. Software components are:\nLego blocks, where a developer just clicks them together to build software No, wait, they are more like biological systems where a developer puts them near each other and they communicate and self-organise Nope, they are like an electrical system, where they are wired up together by a developer Hold on, they are mechanical like cogs in an engine where one connection is nothing like another and you have to put the right piece in the right place These are all existing metaphors that have been used to describe software components in academia. 
Finding a definition that will satisfy all these ideas is impossible. Even well-respected and well-known engineers and researchers have trouble defining what a software component is. For example, Bertrand Meyer (author of Object-Oriented Software Construction) and Clemens Szyperski (author of Component Software: Beyond Object-Oriented Programming) discussed the definition of a component in Dr. Dobb’s. Szyperski defines a component as having three characteristic properties:\nBeing a unit of independent deployment. Being a unit of third party composition. Having no externally observable state. Meyer defines a software component as something that:\nMay be used by other software elements (clients). May be used by clients without the intervention of the component’s developers. Includes a specification of all dependencies (hardware and software platform, versions, other components). Includes a precise specification of the functionality it offers. Is usable solely on the basis of that specification. Is composable with other components. Can be integrated into a system quickly and smoothly. What hope have I in defining what a software component is when these two cannot come to an agreement on the most basic question: “should a software component have state?” I think that after many such arguments Szyperski wrote that it is impossible to\nenumerate a fixed agreeable set of features that is necessary and sufficient for a natural concept such as component\nI still needed a definition for my thesis. Since there didn’t exist a definition that would satisfy everyone, I decided to create my own. I chose to only define the parts of a software component I needed for my research. My definition is that a software component:\nhas explicitly declared constraints for its execution environment, e.g. dependencies on (or conflicts with) other components includes mechanisms to automatically alter a component composition, e.g. 
change the components in a system without manual intervention I found these two attributes were sufficient for me to continue to study software components without being paralysed by trying to define exactly what it is I am studying. So now, hopefully, you roughly know what I mean by software component.\nComponent Systems Some examples of component systems that fit my definition are:\nthe Ubuntu operating system the Eclipse IDE There are an estimated 20 million users of the Ubuntu operating system and millions of users of the Eclipse IDE. Ubuntu and Eclipse systems are constructed from components, called packages and bundles respectively, and can be changed by adding or removing components to and from their systems.\nUbuntu has a Debian-based package system where each package declares its constraints in a control file that looks like this: Package: textEditorPackage Version: 0.0.1.alpha Depends: spellChecker Conflicts: otherTextEditorPackage\napt-get will use these constraints to change the component system by adding, removing or updating the components in the system.\nEclipse has a component system based on OSGi bundles. These bundles describe the constraints on their execution environment in a bundle manifest file that looks like this: Bundle-Name: TextEditor Bundle-Vendor: Graham Jenson Bundle-SymbolicName: nz.geek.textEditor Bundle-Version: 0.0.1.alpha Bundle-RequiredExecutionEnvironment: J2SE-1.4 Export-Package: nz.geek.textEditor;version=\u0026quot;0.0.1.alpha\u0026quot; Require-Bundle: nz.geek.fonts Import-Package: nz.geek.spellchecker;version\u0026gt;\u0026quot;0.0.1\u0026quot;\nP2 is the core of how Eclipse changes its component composition. It uses the constraints from the bundles to allow a user to install, remove and update components in their system.\nSoftware Evolution If an object has all its component parts replaced, is it the same object? 
paraphrased from Plutarch, the Life of Theseus\nBrooks in The Mythical Man-Month states that over 90% of the cost of a system occurs after deployment in the maintenance phase, and that any successful piece of software will inevitably need to be maintained. This realisation, that the cost of software is not in its construction, but in its maintenance made many people look at how software changed after its deployment. There are two views you can take on software changing after deployment:\nthe deliberate activities a developer does to software after deployment, software maintenance the unplanned changes that happen because of these activities, software evolution Software maintenance has been formalised into types in ISO/IEC 14764 as:\nAdaptive Maintenance: adapting to new system or technical requirements. Perfective Maintenance: adapting to new user requirements. Corrective Maintenance: fixing errors and bugs. Preventive Maintenance: adapting to prevent future problems. Using these definitions studies from Lientz and Swanson (1980) found that 75% of the maintenance effort was on the first two types, and corrective maintenance took about 21% of the effort.\nSome laws of software evolution were identified by Lehman; these are:\nContinuing Change: Software systems must be continually adapted, otherwise they become progressively less satisfactory. Increasing Complexity: As the system evolves its complexity increases unless work is done to reduce it. Self Regulation: The system evolves with statistically determinable trends and invariances. Conservation of Organisational Stability: The average effective activity rate to evolve a system is invariant over its lifetime. Conservation of Familiarity: As the system evolves, its incremental growth must remain invariant to ensure users maintain mastery over the system. Continuing Growth: The system must continually grow to maintain user satisfaction. 
Declining Quality: The quality of the system will decline unless rigorously maintained. Feedback System: The function a system performs is changed by the effect it has on its environment. The studies of both software maintenance and software evolution show that a software engineer’s objective of creating a satisfactory system is difficult, expensive, and not always achievable. Additionally, the continual evolution of a software system is necessary, and this evolution reduces quality, increases complexity, and is costly. These are the core problems of being a software engineer.\nComponent and Component System Evolution We are like sailors who on the open sea must reconstruct their ship but are never able to start afresh from the bottom. Willard Van Orman Quine, Word and Object, 1960\nTo maintain and evolve a software component requires technical skills. You need to know how the component system you are developing for works, and you need to be able to write software. Originally, that was the assumption when creating component systems: adding, removing or updating a component would require technical knowledge and understanding of the system. However, applications like apt-get started to make changing a system trivial, to the point where a complete novice can install, upgrade, and remove any component in their system.\nWhen a user is the one who composes their system, and not a developer, this breaks apart two processes:\nthe evolution of the component the evolution of the component system. Component Evolution is very similar to software evolution: it requires a developer. The units of this evolution are versions. Using these versions, you can tell if a version of software is more or less evolved than another.\nComponent System Evolution (CSE) is not like software evolution. There are no versions of a system, so telling if one is more evolved than another is impossible. 
However, changing a component system is much more restrictive than changing an individual component, as it must satisfy all contained component constraints. If component a depends on component b, written as a -\u0026gt; b, then if a is installed b must be as well.\nChanging a component system would be impossible for a non-technical user if it were not for tool support. I named a tool that helps change a component system a Component Dependency Resolver (CDR). CDRs help users alter their systems by calculating a system that satisfies all the constraints of a requested install, removal or upgrade of a component. CDRs like Eclipse P2 and apt-get allow users to simply state they want to add a component.\nAdditionally, CDRs typically try to minimise the change to, and the out-of-dateness of, a system. If you, as a user, wanted to install a component and in doing so your CDR installed all components, it technically did what you asked, but it changed too much for you to be satisfied. Also, if it installed the oldest version of every component it could find, that is also not good as most users want the most up-to-date software.\nWhat I Did and How I Did it So, knowing all that I just described about Software Components, Evolution, and Component System Evolution, what did I actually do?\nLike all the science fair projects I did in primary school, I needed to define a hypothesis, a method, do some experiments, then analyse the results.\nHypothesis The objective of this research has led to the thesis:\nIt is possible to reduce the negative effects of component system evolution by altering the mechanisms by which systems are changed.\nI broke this down into the necessary steps:\nTo develop a reproducible and controllable environment in which to measure the effects of CSE. To use this environment to study how systems evolve. To alter the mechanisms by which systems are changed and study their impact on CSE. 
To demonstrate a reduction in change and out-of-dateness using such alterations. Method Simulation is a great way to study stuff. It lets you control all the variables, cheaply run hundreds of experiments without interruption and all it takes is a little (or a lot of) time on a computer.\nTo make a simulation realistic and non-trivial you need data of significant size and complexity.\nFor these reasons I selected Ubuntu as the component system I simulated.\nUbuntu Ubuntu systems follow the Unix philosophy of components:\nWrite programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.\nFirst two rules of Unix:\nRule of Modularity: Write simple parts connected by clean interfaces. Rule of Composition: Design programs to be connected to other programs. Ubuntu also has massive sets of data to use:\na full history of every component in its core repository and the time it was uploaded one of the largest automated surveys in the world just about what packages are popular, Ubuntu popcon millions of users who are easily accessible for collecting data an open and transparent community of developers and researchers all actively sharing what they are doing Validation The biggest hurdle with simulation is making it similar enough to reality. Showing that a simulation is similar to reality is called validation. That is:\nValidation is the process of determining whether a simulation is an accurate representation of the system, for the particular object of the study.\nThis quote comes from “How to build valid and credible simulation models” by Law (2005), which is the methodology that I used to guide me in creating and validating my simulation. I chose this method because of the practical advice it gives on creating models and validating them to the degree I needed. 
The method is outlined as follows:\nFormulate the problem Collect information to construct a conceptual model Validate the conceptual model Implement the conceptual model Validate the implementation Design, conduct and analyse experiments Document and present results What I Created Experiment is the sole judge of scientific truth The Feynman Lectures on Physics, Introduction, Richard Feynman, 1961.\nTo test my hypothesis I needed to answer some smaller questions:\nHow can CSE be modelled? How can a user who changes their component system be modelled? How can a CSE simulation be implemented? How can the negative effects during CSE be reduced? To answer these questions I needed to:\ncreate a formal model, CoSyE (Component System Evolution), that describes CSE. create the CUDF* language that is used to define documents that describe the evolution of a component system. create SimUser (Simulated User), which models a user who changes their system. create GJSolver, which is an efficient implementation that calculates the changes made to a system as it evolves (called resolving). build a simulation of the evolution of Ubuntu operating systems using CoSyE, CUDF*, SimUser and GJSolver. test two methods to reduce the out-of-dateness and change during CSE. analyse the results from experiments and draw conclusions. I will briefly go over each of these things as their creation and validation make up the majority of my thesis.\nCoSyE (Semantics) To exactly describe what I am talking about, I decided to describe it as a model. The qualities I wanted from the model were:\nAbstractness — it is a reduction of reality Understandability — it is intuitive Accuracy — it is a representation of the real system Predictiveness — can be used to predict non-obvious properties Inexpensiveness — cheaper to study than the actual system The model of Component System Evolution (CoSyE) that I created is described in mathematical notation. 
The reason for using maths (instead of, say, UML) is that the complexities of the model constraints made them difficult to describe in a rigid framework such as UML.\nIt took a very long iterative process to create this model. This is because I had never created a model before and I was learning mathematical notation while building it. This model is very specific: I had to define what a version was, what a component was, even what a name was. It is a complete definition of what I was studying.\nThe benefit of such a precise definition is that it provided a solid foundation on which to discuss the domain I was studying.\nThe important parts of the model to understand are the possible constraints that describe a component’s relationship to other components and to a component system:\nExclusion: When a component must not be in the system Conflict: When two components cannot both be in the system Inclusive Disjunction: When at least one of a set of components must be in the system Dependence: If one component is in a system, at least one of a set of other components must be in the system Exactly One: When exactly one of a set of components must be in a system. And the parts of the model that are needed to create an instance of it:\nA series of times The set of available components at each time User requests to change the system at each time Sets of system constraints at each time (can be extracted from the components that exist) Optimisation criteria at each time, used to satisfy the user request The initial components in the system Why these parts are necessary and how they interact will become clearer soon.\nCUDF* (Syntax) For the CoSyE model to be useful, I had to be able to describe an instance of it. 
I extended an existing language called Common Upgradeability Description Format (CUDF) from Mancoosi.\nHere is an example of CUDF: preamble: property: size: int = [0] package: syslib version: 1 installed: true package: syslib version: 2 conflicts: syslib package: textEditor version: 1 depends: spellChecker | spellCheckerService, syslib \u0026gt; 1 package: spellChecker version: 1 size: 1 package: tpspeller version: 1 provides: spellCheckerService size: 2 request: install:textEditor\nMancoosi was a European research project looking at the problem of upgrading a component system. Since the evolution of a component system occurs over many upgrades, and CUDF was a syntax to describe just one upgrade, I had to extend CUDF to CUDF* to suit my domain.\nThe differences between CUDF and CUDF* are that CUDF* adds:\nthe Mancoosi Optimisation Format (MOF) to describe optimisation criteria the times different components became available the initial time of the simulation multiple user requests Here is an example of CUDF*: preamble: 100 property: size: int = [0] package: syslib version: 1 time: 50 installed: true package: syslib version: 2 time: 150 conflicts: syslib package: textEditor version: 1 time: 150 depends: spellChecker | spellCheckerService, syslib \u0026gt; 1 package: spellChecker version: 1 time: 150 size: 1 package: tpspeller version: 1 time: 250 provides: spellCheckerService size: 2 request: 200, -change,-size install:textEditor request: 300, -size,-change install: textEditor, tpspeller\nThis example describes five components, an initial system with only syslib installed at time 100, and two changes to a component system:\nA request to install the textEditor component at time 200 while minimising change then size.\nA request to install the textEditor and tpspeller at time 300 while minimising size then change.\nIt is a reasonably straightforward description, and your intuition about what will happen is probably right. 
However, if you want an exact description of this format please refer to the thesis.\nSimUser So I have a way to describe the evolution of a component system, but how do I create descriptions that look realistic? How do component systems actually evolve as users upgrade, install and remove components?\nModelling user behaviour is hard. I mean really hard, and it is even harder to verify. The first step though is generally talking to users, so I chose to start my modelling by doing an online survey.\nNote: Everything involving people outside your research requires ethics approval, including online surveys.\nI put a query on an online forum and got about 50 respondents who use various Linux systems. Each one gave me loads of great information as I let them have lots of free text. I also did something that would help a great deal later: I asked people to submit their logs. I collected about 30 useful logs, which are better than a user survey because they are real data.\nFrom the survey I was able to describe two types of user behaviour:\nProgressive behaviour prioritizes the potential risk of becoming out-of-date over the risk of introducing new problems. Conservative behaviour prioritizes the potential risk of changing the system over having less functionality and having old problems persist. For example, a desktop user who wants the latest and greatest is more progressive than a conservative server admin who only wants things to keep working the way they are.\nNext, I created SimUser, which is a formal and simple description of how a user upgrades their system. It includes four variables:\nis the probability a user requests to upgrade the system per day. is the probability a user requests to install any component per day. is the MOF criteria used to select an optimal system for an upgrade request. is the MOF criteria used to select an optimal system for an install request. 
What values should I assign these variables to get a realistic user?\nFor the criteria to upgrade and install a component, I can assign what apt-get uses and tweak it a bit to see if I can make it better.\nThe probability a user requests to upgrade their system can be determined by looking at the survey and the logs the participants provided.\nHere, I was able to graph the participants’ probabilities to install and upgrade components, and cluster them using k-means.\nThe probability a user requests to install any component per day can be extracted from an online survey called The Ubuntu Popularity Contest or popcon. This survey records every package and its chance of being installed in an Ubuntu system. However, because there are many packages that are only installed because they are dependencies of another package, I had to filter this list down to the 2399 packages from the package called app-install-package that contains information about popular packages to install.\nValidation\nI validated this model in four ways:\nI had discussions with project supervisors (Jens and Hans) and other stakeholders (i.e. Giovanni, Marsland, Catherine and anyone else I could). I compared SimUser to responses from the survey I compared generated CUDF* documents to the supplied user logs I created a virtual Ubuntu system and looked at its changing repository over a month. GJSolver What I cannot create, I do not understand Richard Feynman, 1988\nProgramming is what a programmer does, and the component resolution algorithms are what got me interested in this topic to begin with. With that in mind I wanted to implement my own solver, so that I could experiment with it.\nDPLL What I found when I looked deep into component resolvers like Eclipse P2 was one of the hardest to implement and most beautiful algorithms: Davis-Putnam-Logemann-Loveland (DPLL). 
This algorithm is used to find if a Boolean equation can have its variables assigned to make the equation true.\nGiven the equation a OR b, is it satisfiable? Yes: assigning a = true, b = true satisfies it. DPLL does not care if there are more solutions, it just needs one to show that the equation is satisfiable.\nGiven the equation a AND (!a OR b), is this one satisfiable? Well, we know a must be true, and in the second part we know that !a will be false, so b must be true. This is the kind of reasoning that DPLL can employ to derive values for variables.\nAnother way to write (!a OR b) is a -\u0026gt; b, i.e. a implies b. This is how we map these Boolean equations to components: if a and b were components, then a depends on b.\nHere is the description of the DPLL function from my thesis:\nIn this function F is just the set of constraints, P is the currently assigned variables (or partial solution) and:\nunit-propagate is a particular way in which you can infer variable assignments from already inferred knowledge, e.g. you know a=true and you know a -\u0026gt; b, therefore b=true. decide makes a guess at which variable should be true or false. If decide picks the right assignments all the time, then this algorithm will be really fast. It is recursive, e.g. it guesses a=true then it calls itself to see if there is an answer where a=true. If there isn’t then the answer must be a=false. If a cannot be true or false, then there is no answer and DPLL knows there is no solution. Simple, right! Well, the devil is in the details. Getting fast unit-propagation, good heuristics for decide, and efficient handling of hundreds of thousands of variables and constraints is very difficult.\nOptimisation Finding just any solution to a component problem isn\u0026rsquo;t good enough. 
There are many solutions to any given component upgrade problem; the hardest part is finding a good one.\nFor this I used an algorithm called Lexicographic-iterative-strengthening, which is another way of saying it finds a solution and keeps trying to make it better till it can\u0026rsquo;t any more.\nThe last thing I needed to do was to describe how I mapped my CUDF* instance to the implementation, and that is pretty tedious so I will leave it out here.\nVerification of GJSolver Verification: Did I build it right?\nSo after implementing GJSolver I needed to make sure:\nIt correctly solves component upgrade problems It solves the problems at least as well as other implementations I found out if it did both of these by entering the Mancoosi International Solver Competition (MISC). In this competition GJSolver competed against other implementations in solving CUDF problems. A competition like this has the benefit of being conducted by an impartial third party, meaning I couldn\u0026rsquo;t tamper with the results.\nThe outcomes of the competition were:\nGJSolver consistently got good results, reasonably quickly. It won the most difficult of the three tracks it was entered in. No CUDF problem was solved incorrectly by GJSolver. When compared to the other solvers I produced similar solutions in similar time frames. Validation of GJSolver\nValidation: Did I build the right thing?\nTo validate GJSolver and SimUser I:\nsimulated 200 package install requests and compared the results to the logs of apt-get users. simulated updating an Ubuntu system once a day for a month and compared it to the results I collected from a virtual Ubuntu system doing the same thing. The results were close, but not exactly the same. This validation is not done to make sure the simulation exactly matches reality; it never will. 
The validation is done to see where it differs from reality.\nExperiment: Alter Simulation and Measure Effect\nI had four questions I wanted to answer with my simulation:\nWhat consequences do a user’s choices have on their system (i.e. their probabilities to upgrade and install) when using the apt-get criteria? Can the out-of-dateness of a system be reduced? Can the total change of a system be reduced? How do the systems of realistic users evolve? To answer these questions I used SimUser to create CUDF* documents for various users, then used GJSolver to resolve those CUDF* documents.\nResults\nI will not go over all the results but my initial experiments were with four simulated users:\nAlways Install users install one component every day Always Update and Install users install and update every day Always Upgrade users upgrade every day Control users do nothing Note: The measure UTTDpC stands for Up-To-daTe-Distance per Component, a measure of how many newer versions exist of a component.\nThe results from these users looked like:\nHow up to date a component system is (lower is better):\nHow much change a system goes through:\nReducing out-of-dateness\nWhen creating the optimisation criteria for apt-get I learnt something that I didn\u0026rsquo;t previously know: apt-get will not install an entirely new component during an update. So if a newer version of an already installed component requires a component that you do not currently have installed, it will be unable to upgrade that component.\nBy altering the optimisation criteria I was able to get a system about 24% more up to date over a year of upgrading every day.\nReducing Change While experimenting I noticed that many times a component would be upgraded in quick succession. 
Wondering if this was a common issue I looked to the data and created this graph:\nUsing this information I came up with the theory that sometimes a component is released to the package repository with a bug that is quickly found and fixed, and then a new version is released. This means that any person who installed the original package would have been better off waiting a little bit because:\nthey will not have to download a package twice they will not have a bug in their system they will not have to change their system twice So I created an optimisation that waited 7 days after a package is added to the repository to install it and simulated the results. I found that if you update frequently then you will save yourself about 30 instances per year where you install the same package twice in a week. If you increase that waiting time to 28 days, you can save yourself over 100 changes a year.\nSummary of Experiments In addition to showing how to reduce change by waiting before installing a component, and how to decrease out-of-dateness by letting updates install new components, I also showed:\nThe majority of change during evolution is caused by a user upgrading. Installing new components increases the amount of change when upgrading. Systems become out-of-date at the rate at which components evolve. Components evolve at a higher rate during release cycles. Reuse decreases the rate of change during CSE. This is due to two effects: reuse decreases the installation rate of components, and this decreases the number of components that need to be upgraded. Increasing the frequency of upgrading has diminishing returns on reducing a system’s out-of-dateness. It may also increase change due to components being repeatedly upgraded if they quickly release multiple versions. Research Conclusion The research I did gave an understanding of how component systems evolve and it can provide users and developers with insights into the effects of their choices. 
Additionally, the research has proposed two novel ways to reduce negative effects during CSE and the tools with which to measure the effectiveness of these techniques.\nWhat I learnt\nOnly do a Ph.D. if you can live on ramen noodles: you are not paid much (if anything), there is a lot of stress involved, and it will take a while. Doing a Ph.D. will force you to learn how to give presentations, write research papers and organise conferences, in addition to all the academic things you must learn. So get good at learning things. Look at what others have done; it saved me a lot of problems. If I had not found Mancoosi, or Daniel Le Berre, I would have never finished. Understanding their work allowed me to put my work in their context. Writing is difficult, feedback is necessary. I had many people look at my writing and try to understand what I was saying. Formalism is not for formalism’s sake, but for your understanding and communication. Learn to write and read mathematical notation. I believe it has helped me be a better programmer, by forcing me to think about every function I write. An experiment is more powerful than any amount of words. Feynman says “Experiment is the sole judge of scientific ‘truth’”. If you have a repeatable experiment no one can argue with you. I am much more critical of science, especially scientific reporting when they use the words prove and proof. Most of the time what they do is demonstrate something; proof is a long way off. References This will be short, because if you want to see all my references there are 7 pages of them in my thesis!\nBrooks The Mythical Man-Month, this book is a must for a developer who wants to understand what their job is.\nSzyperski Component Software: Beyond Object-Oriented Programming, a great (and massive) tome of knowledge about all aspects of software components.\nEverything written by Richard Feynman, Surely you\u0026rsquo;re joking Mr Feynman, What do you care what other people think?, even QED. 
He has a way of telling a story while explaining something that makes you able to understand (even a bit).\nBonus: This is a timelapse of me writing my thesis:\n","permalink":"https://maori.geek.nz/posts/2014/2014-02-22_grahams-ph.d.-thesis-a-study-of-software-component-system-evolution/","summary":"\u003cp\u003eSince I spent \u003cstrong\u003e4.5 years\u003c/strong\u003e completing my Ph.D. at Massey University, I have decided \u003cstrong\u003e1 year on\u003c/strong\u003e to give an overview of what I actually did.\u003c/p\u003e\n\u003cp\u003eIf you want my full thesis, you can download it \u003ca href=\"https://s3-ap-southeast-2.amazonaws.com/maorigeek/documents/GrahamJensonThesis.pdf\"\u003ehere\u003c/a\u003e. Be warned, it is really long, really dense and an academic document. I am trying to make this post a bit more reader friendly.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003eSo here it goes…\u003c/em\u003e\u003c/p\u003e\n\u003ch3 id=\"introduction-to-components-and-evolution\"\u003eIntroduction to Components and Evolution\u003c/h3\u003e\n\u003cblockquote\u003e\n\u003cp\u003eIn order to agree to talk, we just have to agree we are talking about roughly the same thing. \u003ca href=\"http://www.amazon.com/gp/product/0465023827/ref=as_li_qf_sp_asin_il_tl?ie=UTF8\u0026amp;amp;camp=1789\u0026amp;amp;creative=9325\u0026amp;amp;creativeASIN=0465023827\u0026amp;amp;linkCode=as2\u0026amp;amp;tag=maor01-20\"\u003e\u003cem\u003eThe Feynman Lectures on Physics\u003c/em\u003e\u003c/a\u003e\u003cem\u003e, Motion, Richard Feynman, 1961.\u003c/em\u003e\u003c/p\u003e","title":"Graham’s Ph.D. Thesis: A Study of Software Component System Evolution"},{"content":"To test my Rails projects I typically use rspec. I really enjoy the way it helps me layout and describe my tests.\nSo when I started writing my first node.js package (back-on-promise), I wanted a similar way in which to write my tests when testing promises. 
I decided to use mocha for running the tests, chai for test assertions, and sinon to mock and stub objects. In this post I will describe how to test with these tools in node.js, specifically looking at promises.\nNote: I use CoffeeScript in this post because I like the syntax and my fingers get sore typing function hundreds of times. If you are unfamiliar with CoffeeScript maybe [The Little Book on CoffeeScript] (http://www.amazon.com/gp/product/1449321054/ref=as_li_qf_sp_asin_il_tl?ie=UTF8\u0026amp;camp=1789\u0026amp;creative=9325\u0026amp;creativeASIN=1449321054\u0026amp;linkCode=as2\u0026amp;tag=maor01-20)_can help._\nTesting in node.js Since it is typical for node packages to be very granular, and testing packages are no exception, a node testing framework will consist of many complementary packages. This is different to other frameworks like rails, where the testing framework rspec is a single package that provides assertions, stubbing and a test runner. These aspects of testing have been broken up into individual node packages, where:\nMocha is a testing framework for describing and running tests Chai is an assertion library Sinon is a mocking and stubbing library Note: the granular packages for nodes testing frameworks allows replacement of parts based on preference or suitability, e.g. 
you could replace chai with should.js.\nThe Setup Note: node and coffee and npm are requirements of this project\nFirst, let’s create a project called cachy with a package.json: { \u0026quot;name\u0026quot; : \u0026quot;cachy\u0026quot;, \u0026quot;description\u0026quot; : \u0026quot;Lets test some promises\u0026quot;, \u0026quot;url\u0026quot; : \u0026quot;https://github.com/grahamjenson/test_promises.git\u0026quot;, \u0026quot;author\u0026quot; : \u0026quot;Graham Jenson \u0026lt;grahamjenson@maori.geek.nz\u0026gt;\u0026quot;, \u0026quot;dependencies\u0026quot; : { \u0026quot;q\u0026quot;: \u0026quot;1.0.0\u0026quot;, \u0026quot;q-io\u0026quot;: \u0026quot;1.10.9\u0026quot; }, \u0026quot;devDependencies\u0026quot;: { \u0026quot;mocha\u0026quot;: \u0026quot;1.17.1 \u0026quot;, \u0026quot;chai\u0026quot;: \u0026quot;1.9.0\u0026quot;, \u0026quot;sinon\u0026quot;: \u0026quot;1.8.2\u0026quot;, \u0026quot;coffee-script\u0026quot;: \u0026quot;1.7.1\u0026quot; }, \u0026quot;scripts\u0026quot;: { \u0026quot;test\u0026quot;: \u0026quot;mocha --compilers coffee:coffee-script/register\u0026quot; } }\nThen do the standard npm install to install all the node packages.\nThe first thing to note about the cachy package is that it uses Q promises with the Q-IO library for the http IO. This is a Promises/A+ spec promises library so its promises can manage exceptions, unlike JQuery promises.\nNote: I have previously posted a description of JQuery promises here.\nThe second part to note is the definition of the test script mocha — compilers coffee:coffee-script/register. The only argument for this is to tell mocha to compile tests with the extension coffee with coffee-script/register.\nNote: if you are using coffee-script 1.6 then the argument is just coffee:coffee-script\nThe Testy To demonstrate testing promises I am going to implement an http cache for remote JSON. For this I need a get function that takes a url and returns a promise for the data. 
The benefit of such a method is that it uses the same interface regardless of whether the data has been cached or not. I call the object Cachy and put it in a file called cachy.coffee: qhttp = require(\u0026quot;q-io/http\u0026quot;) q = require('q')``Cachy = { _cache : {}``write_cache: (key, data) -\u0026gt; @_cache[key] = data``read_cache: (key) -\u0026gt; @_cache[key]``reset_cache: -\u0026gt; @_cache = {}``get: (url) -\u0026gt; if @_cache[url] return q.fcall(=\u0026gt; @_cache[url]) return qhttp.read(url).then( (buf) =\u0026gt; json = JSON.parse(buf) @write_cache(url,json); return json ) }``if (typeof module != 'undefined' \u0026amp;amp;\u0026amp;amp; module.exports) module.exports = Cachy;\nFirst it imports q and q-io/http. q-io is a package that offers a tidy wrapper around http IO so that calls return a promise.\nSecond, I define Cachy, this object has a few helper methods to manage the cache (write_cache, read_cache, reset_cache) and the core method get.\nget returns a promise for the data at a url. It first looks in the cache to see if the url has already been called; if it is there it returns a promise (created using q.fcall) for the cached object. If the call has not been cached yet it will get a return a promise that takes a qhttp.read (which is a http get call), then parses the returned object to JSON and writes the object to cache.\nThe final part of this file is just defining what to export when required.\nThe Test Now Cachy is ready to be tested! Mocha will look for tests with the .coffee extension in a folder called test. So I created a file test/tests.coffee.\ntests.coffee will start similar to cachy.coffee with an import of all the required modules. The only non-standard line is chai.should(), which is called to inject the should methods on objects. 
chai = require 'chai' should = chai.should()``sinon = require 'sinon'``q = require 'q' qhttp = require(\u0026quot;q-io/http\u0026quot;)``Cachy = require '../cachy'\nMocha lets you group tests together using the describe method. describe takes a description of the tests and a function defining all the tests. describe 'Cachy.get', -\u0026gt;\nIn this describe function there are two tests for Cachy.get:\na test for when the data is already cached a test for when the data must be fetched using qhttp.read Already Cached Test describe 'if the data is cached', -\u0026gt; it 'returns a promise for the data from cache', (done) -\u0026gt; url = 'http://www.maori.geek.nz' data = {name: 'maori.geek'} Cachy.write_cache(url, data) Cachy.get(url).then( (data) -\u0026gt; data.name.should.equal 'maori.geek' done() ) .catch((error) -\u0026gt; done(error) ) .fin( -\u0026gt; Cachy.reset_cache() )\nMocha defines its tests using the it function, which takes a description of the test and a function whose first parameter is a done callback. This done callback is used for asynchronous tests; a test will wait some time (default 2 seconds) for the done() callback at which point the test is finished. However, if done(error) is called it will immediately fail the test.\nThe first test defines a url and some data, then caches it using Cachy.write_cache. Calling Cachy.get(url) should return a promise for the data. If the promise is satisfied then will be called to assert (using Chai\u0026rsquo;s should method) that the data is correct. 
If the promise fails or errors, catch will be called to execute done(error), which fails the test by passing the error to mocha.\nNote: As the Promises/A+ spec defines a promise to internally handle errors, the catch method is necessary or the promise will silently fail and the test will incorrectly pass\nNot Yet Cached Test describe 'if the data is not cached', -\u0026gt; it 'should fetch data, cache and return it', (done) -\u0026gt; url = 'http://www.maori.geek.nz' data = {name: 'maori.geek'}``sinon.stub(qhttp, 'read', (curl) -\u0026gt; curl.should.equal url return q.fcall(-\u0026gt; JSON.stringify(data)) ) Cachy.get(url).then( (data) -\u0026gt; data.name.should.equal 'maori.geek' Cachy.read_cache(url).should.equal data done() ) .catch( (error) -\u0026gt; done(error) ) .fin( (value) -\u0026gt; Cachy.reset_cache() qhttp.read.restore() )\nThis test starts off similar to the previous test, defining url and data. Then it uses sinon to stub the call to the http server qhttp.read to instead return a promise (created using q.fcall) for a stringified JSON object.\nCalling Cachy.get(url) should call the stub to get the data with the url provided, which will then return the data. Once returned, the data is asserted to be correct and to have been cached.\nIf an error occurs, it is caught by catch and the test will fail.\nFinally, the test is cleaned up in fin by resetting the cache, and removing the qhttp.read stub with the method restore.\nRunning the tests By calling npm test in the console you will get an output similar to this: `\u0026gt; cachy@ test /home/graham/test_promises\nmocha \u0026ndash;compilers coffee:coffee-script/register -C\n․․\n2 passing (15ms)`\nConclusion I like testing; the more I test the more I see its benefits. However, promises and node.js have some idiosyncrasies, due to their asynchronous nature, that must be understood to test. 
Although, this was not a complete guide to testing in node, I hope it will help you get started.\nNote: the code in this post is available on github here\nSome more places for information\nNode.js in Action\nNode.js the Right Way\nDerick Bailey: Asynchronous Unit Tests With Mocha, Promises, And WinJS\nNot Yet Released: JavaScript with Promises\nO\u0026rsquo;Reilly Learning jQuery Deferreds: Taming Callback Hell with Deferreds and Promises\n","permalink":"https://maori.geek.nz/posts/2014/2014-02-15_testing-promises-in-node.js-with-mocha-chai-and-sinon/","summary":"\u003cp\u003eTo test my Rails projects I typically use \u003ca href=\"http://rspec.info/\"\u003erspec\u003c/a\u003e. I really enjoy the way it helps me layout and describe my tests.\u003c/p\u003e\n\u003cp\u003eSo when I started writing my first node.js package (\u003ca href=\"https://www.npmjs.org/package/back-on-promise\"\u003eback-on-promise\u003c/a\u003e), I wanted a similar way in which to write my tests when testing promises. I decided to use \u003ca href=\"http://visionmedia.github.io/mocha/\"\u003emocha\u003c/a\u003e for running the tests, \u003ca href=\"http://chaijs.com/\"\u003echai\u003c/a\u003e for test assertions, and \u003ca href=\"http://sinonjs.org/\"\u003esinon\u003c/a\u003e to mock and stub objects. In this post I will describe how to test with these tools in node.js, specifically looking at promises.\u003c/p\u003e","title":"Testing promises in Node.js with Mocha, Chai and Sinon"},{"content":"Envisioning Information was recommended to me by a friend as a way to improve how I think about and design visualisations. Although less popular than his other work Visual Display Quantitative Information (which is still on my to-read list), this book has many interesting examples and ideas on how to present complex information. Of particular interest to me, this book gives a large discussion on how cartographers design and present geographical maps. 
It also gives practical rules on how to design, colour and layout data in a visualisation.\nThis post is part review, part write-up and part discussion about what I found interesting in this book. Take from it what you will, and if you like or disagree with anything, please leave a comment.\nVisualisation In this book, Tufte gives a wide definition of what a visualisation is. He describes visualisations of maps, train lines, periodic tables, planetary movements, sun spots, scientific data, memorials, calligraphy, engine construction, train signals, river lengths, user interfaces, dance step instructions and much more. The amount of things that he considers visualisations reminds me that many times I fall into the trap of having a very narrow view on what a visualisation is. I have to remind myself that there are more visualisations than just good looking charts or graphs of big data on the internet.\nWhile reading this book it reminded me of the documentary Objectified which is about how everything is designed. Looking around has shown a constant meme in society that visualising data is seen as a worthy endeavour. For example, rather than merely stating that people in Scandinavia are more likely to be blonde, giving a visualisation to make the point is better.\nAfter widening my view on visualisations, I started to realise how often I use them day-to-day, e.g. the metservice’s daily forecast:\nMy folders are also a great visualisation that gives depth to my very flat and sequential hard drive.\nThese are just examples that are readily available to me now, but there are many examples that go beyond the computer. For example I recently had to construct furniture and the instructions looked something like this:\nGoals of a Visualisation According to Tufte, the goals when creating a visualisation is to increase the number of dimensions on a flat surface (computer or paper) while increasing the density of presented data. 
This is a ‘cognitive art’ (as described by Philip Morrison): a presentation that is aesthetically pleasing and rich in information.\nHowever, as Augustus Pugin notes:\nIt is alright to decorate construction, but never to construct decoration\nVisualisations are meant to contain information, yet often they are polluted with chart junk (meaningless decoration) that only detracts from a visualisation’s value.\nThe closer you look at a visualisation, the more information you should get, not just decorated fluff.\nAdditionally, Tufte says that a visualisation should not be judged by how much information is shown, but how effectively it is presented. Showing too much information can be just as bad at lowering a visualisation’s value as chart junk.\nTufte also talks about the importance of the visualisation’s design for its consumers. Good graphic design, typography, object representation, layout, colour, production techniques and good visual principles are required for a visualisation. Any clutter or confusion in a visualisation is a failure of design, and not the fault of the viewer. This message echoes the theme of the book The Design of Everyday Things, which is all about adapting to a user’s expectations and not blaming them for misunderstanding your design.\nThroughout the book, Tufte described and defined the characteristics of good visualisations. I have tried to distil what he said into one sentence:\nIncrease data density and dimension without cluttering with superfluous information or adding unnecessary junk to the design.\nPractical Advice for Visualisation I suck at design! That is, when I get my ideas out of my head and onto paper (or in the browser) they are not how I imagined them to look. So the main reason why I liked this book is that it had many practical guidelines to follow when designing a visualisation, and designing in general. 
Tufte, although he gave a lot of advice, drew much of this information from other books like Josef Alber’s Interaction of Color and Eduard Imhof’s Cartographic Relief Presentation. This external knowledge was distilled and added to by Tufte, which created a great combination of general design information and visualisation specific info. Some of which I will try an present here.\nColour Colour is an important aspect of how a visualisation presents its information. Colour can be used to:\ndifferentiate — colour to discern between annotation and annotated join — colour to show relatedness label — colour as a noun measure — colour as a quantity imitate reality — colour as a representation decorate — colour as beauty Given the importance of colour, Tufte gave many guidelines around its use.\nHe used the rules from cartographers when creating maps (specifically from Eduard Imhof’s Cartographic Relief Presentation) and described their general applicability to all visualisation creation:\nFirst Rule — pure, bright or very strong colours have very loud, unbearable effects when next to each other, but they have striking effects on a dull background Second Rule — bright colours next to white is bad Third rule — background should be dull, let the foreground do the work Fourth rule — if the picture is divided by colour, put colours from one area intermingled in the other, and all colours should be represented in the background Some of these rules you can see applied to make this visualisation of marshalling signals more understandable and striking. Using red arrows and yellow sticks to draw the attention, while drawn on the less important dull grey person.\nTufte also described how to layer a visualisation using similar colours. For example, if a river is visualised as light blue then labels for the river should be darker blue to link the two together. 
Additionally, to draw attention to specific points use saturated red as it stands out as separate from the blue and green layers. This kind of straight forward reasoning about design and colour lets me critically look at my own visualisations, and see if I have used colour to its full efficacy.\nThese types of rules you can see applied in a map of New Zealand’s Marlborough sounds, published in Atlas of Design by geographx:\nNote: I got this book for my dad, and highly recommend it\nIn addition to using colour, the weight of the line can be altered. In the example below, the thick black line is underlined by the fine red line:\nTypes of Visualisation Tufte also described a few different types, or techniques, of visualisations. These examples are practical ways in which to think about layout and presentation of data.\nMicro/Macro This is where Tufte introduces micro/macro design, what I think of as small picture/big picture. The idea is to have the\nsame ink serve more than one informational purpose.\nUsing large quantities of data with a high density, give an overall picture using the smaller ones. This way you can immediately convey the meaning, while allowing for deeper analysis.\nAn interesting example that was given for this kind of visualisation was the Vietnam War Memorial in Washington D.C. This memorial lists the names of 58,000 soldiers who died in Vietnam, in order of the date they died. Tufte saw this as a good example of a Micro/Macro visualisation because each name has three functions; to memorialise the person who died, to show the sequence of when they died, and to add to the overall visual representation of the number who died. It demonstrates the big picture tragedy that 58,000 soldiers lost their lives without diminishing the small picture tragedy that each one had a name and a story. It is well thought out, powerful, emotive, yet simple.\nSmall Multiple A Small Multiple visualisation is the same display repeated, but each time with different data. 
Visualising data in the same manner side by side will highlight differences that are present across the data.\nSuch a small multiple visualisation can be seen in the Trilogy Meter by Dan Meth.\nThis visualisation shows the IMDB ratings per movie for many different trilogies. In this visualisation you can see how audiences enjoyed the movies between and within trilogies. An alternative way of visualising this data could have been to display it on a single bar chart. Such a visualisation would be far less striking, more complicated and without conveying any more information than this example does.\nSome Inspiration For me this book is a great read if you are stuck or in a mental block. It makes me want to go and create something beautiful and meaningful. When describing themes, Tufte often gave lists instead of trying to define something. This is a great method he used to convey something complicated. Here is the list he gave of the things a visualisation can do for you to read and take ideas from if you get stuck:\nselect, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organise, condense, reduce, choose, categorise, catalogue, list, abstract, scan, idealize, isolate, sort, integrate, blend, inspect, filter, smooth, cluster, summarise\nConclusion I really enjoyed this book. I spent a long time going over the examples and visualisations trying to extract every point that Tufte made in his writing. 
His ability to pull apart visualisations and describe why they are effective is a skill I think is worth fostering in a world that wants to visualise everything.\nTufte finished his book with the thought:\nPerhaps one day high-resolution computer visualisations […] will lighten the laborious complexity of encodings — and and yet still capture some worthwhile part of the subtlety of human itinerary.\nI like that this became reality.\nReferences Envisioning Information The Visual Display of Quantitative Information The Design of Everyday Things Interaction of Color Cartographic Relief Presentation Atlas of Design ","permalink":"https://maori.geek.nz/posts/2014/2014-02-07_envisioning-information-with-edward-r.-tufte/","summary":"\u003cp\u003e\u003ca href=\"http://www.amazon.com/gp/product/0961392118/ref=as_li_qf_sp_asin_tl?ie=UTF8\u0026amp;amp;camp=1789\u0026amp;amp;creative=9325\u0026amp;amp;creativeASIN=0961392118\u0026amp;amp;linkCode=as2\u0026amp;amp;tag=maor01-20\"\u003eEnvisioning Information\u003c/a\u003e was recommended to me by a \u003ca href=\"https://twitter.com/PrototypeAlex\"\u003efriend\u003c/a\u003e as a way to improve how I think about and design visualisations. Although less popular than his other work \u003ca href=\"http://www.amazon.com/gp/product/0961392142/ref=as_li_qf_sp_asin_tl?ie=UTF8\u0026amp;amp;camp=1789\u0026amp;amp;creative=9325\u0026amp;amp;creativeASIN=0961392142\u0026amp;amp;linkCode=as2\u0026amp;amp;tag=maor01-20\"\u003eVisual Display Quantitative Information\u003c/a\u003e (which is still on my to-read list), this book has many interesting examples and ideas on how to present complex information. Of particular interest to me, this book gives a large discussion on how cartographers design and present geographical maps. It also gives practical rules on how to design, colour and layout data in a visualisation.\u003c/p\u003e","title":"Envisioning Information with Edward R. 
Tufte"},{"content":"I am always looking for beautiful solutions to complex problems, and recently I have been experimenting with promises to solve the ugly problem of asynchronous actions in javascript. Promises are a simple metaphor that makes complex operations easy to understand. In this post I will describe what promises are, why they are beneficial, how to use them, and a project I have been working on called Back-on-Promise which integrates promises into backbone.js.\nNote: this post will focus on the JQuery 1.8 and up promises API, with a small discussion at the end about other libraries\nWhat are Promises and Deferred Objects? To understand what promises are you need to understand deferred objects. A deferred object describes the state of availability of something, and offers a nice interface to help when that state changes.\nA deferred object starts in a pending state, which means it is not yet completed. While in the pending state if the resolve() function is called then the state is changed to resolved, and if the reject() function is called then the state is changed to rejected. $.Deferred().state() // 'pending' $.Deferred().resolve().state() // 'resolved' $.Deferred().reject().state() // 'rejected'\nCallbacks can be attached to these state changes using the done() and fail() functions, or to all state changes using the always() function. 
The arguments given to the resolve() or the reject() functions are passed to the callbacks so the object that resolved or rejected the deferred object can be used.\nFor example: log = function () { console.log(Array.prototype.slice.call(arguments).join(' ')) } // logs a function's arguments``d = $.Deferred() d.done(log) d.resolve('maorigeek') // maorigeek\nAll these functions return the deferred object so they can be chained together to make the code more concise, like this: d = $.Deferred().done(log).resolve('maorigeek') // maorigeek\nOne last thing to understand is that attaching a callback after the state has changed will execute the right callbacks immediately. This means you do not need to worry about when a deferred object is resolved or rejected, and that the correct arguments will be passed to the callbacks at any time. For example: d = $.Deferred() d.resolve('maorigeek') d.done(log) // maorigeek\nPromises\nIt rarely makes sense to resolve or reject deferred objects that a third-party has made. For example, resolving an AJAX call before the server has returned or rejected the request. This is where promises are used. A promise is just a deferred object that does not have functions to change its state. Using a promise instead of a deferred object stops others from breaking promises you made. To get a promise for a deferred object, you only need to call its promise() function, e.g. p = $.Deferred().promise().\nThere are a few helper methods to deal with promises and deferred objects.\nThe when() function takes a list of deferreds and returns a single promise that will resolve once all its arguments are resolved.\nIt will also accept variables as arguments and wrap them in an already resolved promise. p = $.when(\u0026quot;maorigeek\u0026quot;) p.state() // resolved p.done(log) // maorigeek\nTherefore, when() is like a logical AND and will wait for all deferreds to resolve. 
For example: d = $.Deferred() $.when(d, 'World').done(log) d.resolve('Hello') // Hello World\nThe then() function takes two callbacks, one that will be executed if the promise is resolved, the other if it is rejected; and returns a new promise. If the callback returns a deferred object then the returned promise will be of that deferred object. gate = $.Deferred() promise = $.when('Hello').then(function(h){ return $.when(h,gate) }) promise.done(log) gate.resolve('World') // Hello World\nThe above example first creates a deferred object called gate, which is used as a switch. Using when() a promise for \u0026lsquo;Hello\u0026rsquo; (which is already resolved) is created. Then using then(), a callback is attached that returns a promise for the input and gate (the returned promise will not be resolved till gate is resolved). Finally, a resolved callback is attached to the promise that logs the input.\nOnce gate becomes resolved the promise returned by then() becomes resolved, and the final log callback is executed.\nWhy use promises? Now that you understand what promises are, why should you use them? To demonstrate the advantages of using promises I am going to use a contrived (borrowed) example of having to wait for three asynchronous requests to render a sidebar on a website. Here are three possible solutions to that problem\u0026hellip;\nThe bad, the ugly, and the good\nGiven the problem of waiting on three AJAX calls before rendering a side-bar, there are a few different ways to go\u0026hellip;\nSerial Calling Ajax (bad) a.k.a. callback hell $.ajax({ success: function() { $.ajax({ success: function() { $.ajax({ //Yo Dawg, I heard you like callbacks... }); } }); } });\nThis is not pretty, efficient, or smart. 
It fetches the data serially, it will quickly get out of hand when handling errors or branches, and it is difficult to refactor or understand.\nParallel with call-backs (ugly) var pseudo_promises = [];``$.ajax({ success: function() { pseudo_promises.push('resolved'); check(); } });``$.ajax({ success: function() { pseudo_promises.push('resolved'); check(); } });``$.ajax({ success: function() { pseudo_promises.push('resolved'); check(); } });``var check = function() { //checks for all 3 values in the pseudo_promises array, then //renders sidebar }\nThis is nearly as good as code that uses callbacks can get. It fetches the data in parallel and separates the logic into readable and refactorable functions. However, it is verbose, difficult to expand, and brittle, as handling errors in this fashion will introduce significantly more complexity across all functions.\nPromises in Action (good) var address = $.ajax({});``var tweets = $.ajax({});``var facebook = $.ajax({});``render_side_bar = function(address, tweets, facebook){ //render sidebar }``render_no_side_bar = function(){}``$.when(address, tweets, facebook).then( render_side_bar, render_no_side_bar )\nThis is where promises make understanding your code easier. First we make three AJAX calls to get the address, tweets and facebook data. When all three have resolved, render_side_bar is called; if any one of them fails, render_no_side_bar is called instead.\nFun with Promises Great! Now you know what promises are and I have convinced you of their ability to help clean up asynchronous code, but what situations can they be used in? Here I will describe some useful (and fun) things that you can do with promises.\nComplex animations\nNot knowing when an animation finishes but having to do something when it completes (e.g. another animation) can result in code that is really difficult to read. Especially if it involves other operations like rendering, or form manipulation. 
JQuery will return a promise (if you ask) for most of its animations, so they can easily be chained together: $('body').toggle('blinds').promise().then( function(){ $('body').toggle('blinds') } )\nThe wait promise\nA promise can be used to wrap existing functions to increase the vocabulary of promises. Chris Webb uses promises to redefine setTimeout to make using it significantly more understandable. function wait(ms) { var deferred = $.Deferred(); setTimeout(function(){deferred.resolve()}, ms); return deferred.promise(); }``wait(1500).then(function () { // After 1500ms this will be executed });\nHandling a Queue\nThis may not be a complete (good) idea, but promises can be used to manage a queue of events, e.g.: window.queue = $.when() $('#list').on('click', function() { window.queue = window.queue.then(function() { //do the thing }) } )\nHere is an example I made that has a reasonably complex animation when you click on it. To make sure that the animation executes once for every click, and that it finishes before executing the next click's animation, I use the above method for event queueing and execution.\nhttp://bl.ocks.org/grahamjenson/raw/8309901/\nBack on Promise Note: back-on-promise is not production ready, with further interest I may be able to change that.\nTo have some fun with promises I wanted to see if I could implement an extension to Backbone that would allow a user to get a remote model with minimal code. This turned into back-on-promise (github) an npm module that is still very beta. The main idea is to change Backbone.Model.get to return a promise for the data you want, whether it is async. or not. This way the user of the model does not need to care if the model needs to go to the server and fetch the data. 
It also caches the data in the resolved promise so that the data does not need to be fetched twice.\nFor example: class Posts extends Backbone.Collection url: -\u0026gt; \u0026quot;http://user/#{@user.id}/posts\u0026quot; model: Post``class User extends BOP.BOPModel @has 'posts', Posts, method: 'fetch', reverse: 'user'``user = new User(id: 1) $.when(user.get('posts')).done( (posts) -\u0026gt; #render posts)\nPromises/A+ v.s. JQuery There are many different promise libraries out there, like when.js, rsvp.js and Q to name a few.\nJQuery, although definitely the most distributed promises library, does not conform to the most popular promises specification, Promises/A+.\nAs described by Domenic Denicola (author of the Q promise library and co-author of the Promises/A+ spec.) from the podcast Javascript Jabber:\n[JQuery has] a fail flaw in their chaining implementation, which is that they don’t do thrown exception handling at all. So the whole abstraction breaks down when you can no longer throw an exception and have it turn into a rejected promise\nFor Example d = $.Deferred() d.then(function(){ throw new Error('err') }).fail(function(){ console.log('fail') }) d.resolve() // throws Error: err, //instead of calling fail with the exception\nThis post is not about comparing the Promises/A+ and JQuery promises, but more trying to get you excited to use promises. If you read this post and want to see the differences, here is a document on the Q wiki about moving from JQuery promises to Q.\nThere are a few other differences, but this argument may become a moot point in the future with the announcement that ES6 will contain a native version of the Promises/A+ spec. Domenic Denicola helped spearhead this inclusion, which is described in his excellent presentation on working with standards bodies. This means browsers and Node.js may encourage (force?) libraries, like JQuery, to eventually use the standard promise implementation that will be provided.\nWhat else? 
I have not discussed many aspects of promises here, or provided an opinion on all matters of promises. All this post was written for was to encourage you to go and try them out. You could just open up a console and try them\u0026hellip; right here\u0026hellip;. right now. How easy is that!\nReferences Some references that might be useful:\nO\u0026rsquo;Reilly Learning jQuery Deferreds: Taming Callback Hell with Deferreds and Promises\n[JQuery issue to make promises spec-compliant](http://bugs.jquery.com/ticket/14510)``[ConFreaks vid from JQuery Conf](http://www.youtube.com/watch?v=juRtEEsHI9E) ([github](https://github.com/alexmcpherson/jquery-talk))``[CommonJS Promises Specs](http://wiki.commonjs.org/wiki/Promises)``[Q promises](https://github.com/kriskowal/q)``[Javascript Jabber: 037 Promises with Domenic Denicola and Kris Kowal](http://javascriptjabber.com/037-jsj-promises-with-domenic-denicola-and-kris-kowal/)``[Back on Promise](https://npmjs.org/package/back-on-promise)([github](https://github.com/grahamjenson/back-on-promise))\n","permalink":"https://maori.geek.nz/posts/2014/2014-01-31_jquery-promises-and-deferreds-i-promise-this-will-be-short/","summary":"\u003cp\u003eI am always looking for beautiful solutions to complex problems, and recently I have been experimenting with promises to solve the ugly problem of asynchronous actions in javascript. Promises are a simple metaphor that makes complex operations easy to understand. In this post I will describe what promises are, why they are beneficial, how to use them, and a project I have been working on called \u003ca href=\"https://npmjs.org/package/back-on-promise\"\u003eBack-on-Promise\u003c/a\u003e which integrates promises into \u003ca href=\"http://backbonejs.org/\"\u003ebackbone.js\u003c/a\u003e.\u003c/p\u003e","title":"JQuery Promises and Deferreds: I promise this will be short"},{"content":"SAT Solver Last night I decided that I am going to write a SAT solver. 
But I had some conflicting requirements:\nI want it to be fast (SAT solvers are very resource heavy) I want it to be modular (Easily replace parts for more efficiency) I want the code to be very understandable. SAT4J is the SAT solver I am most familiar with. It is as efficient as Java lets it be, it is very modular, and it is reasonably understandable on the surface, but the lower you get the more it is obfuscated.\nSAT4J is based on the solver MiniSAT, which is meant “to help researchers and developers alike to get started on SAT”. MiniSAT is fast, implemented in C, and modular; however, I ended up needing the published papers about MiniSAT to decipher the code that I was seeing.\nBoth of these are excellent projects, but I want to make the code easier to understand without sacrificing speed or modularity.\nRPython RPython is a subset of python (i.e. rpython code is python code but not vice versa); it is hopefully my solution. It is described in this paper\nand in this talk.\nI will quickly go over the setup on my OSX system, and then later hopefully post about how my SAT solver implementation is going.\nSetup First thing is to install pypy: brew install pypy\nThis will install pypy, but we need the pypy source for the translator: cd /usr/local/Cellar/pypy curl -O https://bitbucket.org/pypy/pypy/get/release-1.9.zip unzip release-1.9.zip\nThis will create a horrible directory name with a SHA so… mv pypy-pypy-341e1e3821ff/ pypy-src\nNow that we have an installation let's get an rpython example up and running.\nSimple Fibonacci Example Let's do the default thing and create a Fibonacci calculator. cd ~ mkdir fib cd fib touch fib.py\nThe final thing to do is to link the translate.py script which will compile the rpython ln -s /usr/local/Cellar/pypy/pypy-src/pypy/translator/goal/translate.py\nThis code was taken from the above talk. 
#fib.py def fib(n): if n \u0026lt; 2: return 1 else: return fib(n-1) + fib(n-2)``def main(argv): print fib(int(argv[1])) return 0``def target(*args): return main, None``if __name__ == '__main__': import sys main(sys.argv)\nTo make rpython work it requires a function called target, which returns the entry point (here main, which must itself return an integer), and to make it work as a python script it requires the final if statement.\nPython script Let's run this as a python script using CPython: time python fib.py 34 9227465``real 0m3.319s user 0m3.285s sys 0m0.015s``Now with pypy:``time pypy fib.py 34 9227465``real 0m1.017s user 0m0.586s sys 0m0.032s\nRPython\nTo run with RPython, we first need to compile the script pypy translate.py fib.py\nThis will generate the executable fib-c: time ./fib-c 34 9227465``real 0m0.081s user 0m0.077s sys 0m0.003s\nSummary So pypy is about 3 times faster than CPython, and the compiled RPython is about 12 times faster than pypy.\nHowever a Fibonacci calculator is significantly less complicated than a SAT solver, both in terms of computation and code complexity.\nGiven the understandability of python code, and the efficiency gained when compiling it to C, I think two of my initial requirements are satisfied.\nSome potential modularity may be sacrificed because of the lack of some dynamic aspects of python. Soon, I hope to have a simple SAT solver implementation in RPython, at which point I will write about it\u0026hellip;\n","permalink":"https://maori.geek.nz/posts/2014/2014-01-12_rpython-compiling-python-to-c-for-the-speed/","summary":"\u003ch4 id=\"sat-solver\"\u003eSAT Solver\u003c/h4\u003e\n\u003cp\u003eLast night I decided that I am going to write a SAT solver. 
But I had some conflicting requirements:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eI want it to be fast (SAT solvers are very resource heavy)\u003c/li\u003e\n\u003cli\u003eI want it to be modular (Easily replace parts for more efficiency)\u003c/li\u003e\n\u003cli\u003eI want the code to be very understandable.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003e\u003ca href=\"http://www.sat4j.org/\"\u003eSAT4J\u003c/a\u003e is the SAT solver I am most familiar with. It is as efficient as Java lets it be, it is very modular, and it is reasonably understandable on the surface, but the lower you get the more it is obfuscated.\u003c/p\u003e","title":"RPython: Compiling Python to C (for the speed)"},{"content":" The exact definition is “RPython is everything that our translation toolchain can accept” :)\nThe above quote is from the coding guidelines for RPython. RPython is not a typical language, in that it is not described by a syntax, but is defined by whether or not a tool chain can compile the code.\nRPython is a subset of the Python language. That is, any RPython code can run in a Python interpreter. The difference is that you can compile RPython code, with the RPython tool-chain down to C code. So the advantage of RPython is speed after compilation, and the disadvantage is that you cannot use all of Python’s features.\nThe borders and rules of RPython can be blurry, context dependent, and very difficult to understand. Teaching RPython by example can be very tricky as sometimes code will work, and sometimes it wont! Therefore, tutorials for RPython are very different from other languages, e.g. the vague rules expressed in the coding guidelines.\nRPython is a language that is largely learnt by trial and error. I have only just scraped the surface, but I will post what I have learnt so far.\nStatic Typing To demonstrate RPython I will give a few examples. 
The main code will be wrapped inside the main function, with other functions defined outside, i.e.: #FUNCTIONS HERE``def main(argv): #MAIN CODE HERE return 0``def target(*args): return main, None``if __name__ == '__main__': import sys main(sys.argv)\nTo set up the RPython tool-chain to compile these examples, please look at my other post.\nThe first example shows the dependence on context the code can have: #FUNCTIONS def add(x,y): return x + y``#MAIN CODE print add(1,2)\nThis code is RPython code as it will compile and output the integer 3. However, if you add the code add(\u0026lsquo;Graham\u0026rsquo;,\u0026lsquo;Jenson\u0026rsquo;), it will fail to compile with a large stack trace including the error: [translation:ERROR] In \u0026lt;FunctionGraph of (rtest:9)main at 0x103f7dc58\u0026gt;: [translation:ERROR] Happened at file rtest.py line 11 [translation:ERROR] [translation:ERROR] print add(1,2) [translation:ERROR] ==\u0026gt; print add('Graham','Jenson')\nNow, if you delete the line add(1,2) it will compile again. This is because RPython will generate statically typed functions, and if its input or return value are different at different parts of your code, then it will throw an error.\nHINT 1: Remember the types of your functions\nPython Functions\nMany functions that are defined in the core Python may work in RPython, may not work, or may work with certain arguments. For example \u0026lsquo;Graham Jenson\u0026rsquo;.split() is not RPython, but \u0026lsquo;Graham Jenson\u0026rsquo;.split(\u0026rsquo; \u0026lsquo;) is.\nHINT 2: Be careful when using previously defined functions\nThe addition to this hint is that many of Python\u0026rsquo;s built-in functions that will not work, e.g. 
map.\nThe work-around to this is to redefine these functions yourself; there may be an RPython utility library somewhere with these defined.\nHINT 3: Redefine built-in functions that you want to use\nAs an example, here is the function I use to read_lines from a file and return a list: import os def read_lines(inputfile): f = os.open(inputfile, os.O_RDONLY, 0777) x = os.read(f,1) lines = [] tmpstr = \u0026quot;\u0026quot; while x != '': if x == '\\n': lines.append(tmpstr) tmpstr = \u0026quot;\u0026quot; else: tmpstr += x x = os.read(f,1)``lines.append(tmpstr) os.close(f) return lines\nHigher Order Functions Lambdas in RPython seem not to work. However, you can still define and use higher-order functions (functions that take functions as parameters).\nHere is an RPython example using a custom map: #FUNCTIONS def map(fun,ls): nls = [] for l in ls: nls.append(fun(l)) return nls``def add_one(x): return x + 1``#MAIN CODE map(add_one, [1,2,3])\nHINT 4: Don\u0026rsquo;t use lambdas, use functions\nOperators Watch out when using operators as Python will attempt to call \u0026lsquo;rich\u0026rsquo; methods that may not be defined, e.g. the == operator calls the __eq__ method: #FUNCTION def hello(): return 'hi' #MAIN CODE hello() == None\nThis will break with the error: MissingRTypeOperation: unimplemented operation: \u0026rsquo;eq\u0026rsquo; on (, )\nYou can fix it by replacing the == with an is, i.e. hello() is None.\nNote: for some reason \u0026lsquo;a\u0026rsquo; == None is RPython, does anyone know why?\nHINT 5: Use explicit functions instead of operators\nWork Flow I found the biggest advantage to working in RPython is the fact that at any point you can execute your code in the python interpreter and see what happens. 
This means tools like the assert statement can be used to help write valid Python code, but are ignored by the RPython compiler when optimisation is required.\nMy workflow quickly became\nWrite and get it working in Python Compile with RPython tool-chain and fix any errors that occur Rinse and Repeat HINT 6: Make it work in Python before worrying about making it work in RPython\nSAT Solver: Case Study Previously, I posted an article about RPython in which I compared the performance between interpreted python and compiled RPython when calculating Fibonacci numbers. I noted that this was a very basic experiment and further, more complicated examples were needed.\nTo compare Python against compiled RPython in a more complicated function I created an object oriented, conflict driven, learning SAT solver called SATRPy (SATrippy).\nIf you do not know what a Boolean Satisfiability (SAT) problem is, it is basically trying to answer the question\nFor a given Boolean formula, can you assign values to the variables to make the equation true?\nIf the answer is yes the problem is said to be satisfiable, if it is no then the problem is said to be unsatisfiable. A SAT solver is a function that tries to find if a problem is satisfiable or not. The SAT problem is most (in)famous as it is the first identified NP-Complete problem. I have studied SAT problems towards my research into Component Dependency Resolution and found the algorithms and heuristics around this domain very interesting.\nRPython SAT Solver I found it fun to implement a complex function using RPython. The lack of the dynamic aspects of Python can be worked around once you get used to not using them.\nThe biggest annoyance for me was the time it takes to compile the source. Generally, I would compile just to identify problems with the RPython code. This problem is a result of the lack of definition for the language. 
That is, to find out if my code is RPython, I have to ask RPython.\nThe most interesting RPython \u0026lsquo;hack\u0026rsquo; I used to get SATRPy working involved the heap implementation I used. Previously when implementing a SAT solver I found the priority queue ( that is used to order literals in order to be chosen) to be incredibly slow. This was because I was using the built in priority queue that came in the Java standard library. To solve the issue I ended up borrowing the heap queue from SAT4J.\nTo get an efficient heap implementation I edited the heapq core implementation. First by removing the reference to the C implementation (I am going to compile the heap to C with RPython anyway). Then I changed the cmp_lt function from return (x \u0026lt; y) if hasattr(x, '__lt__') else (not y \u0026lt;= x)\nto the RPython friendly return x.heur() \u0026lt; y.heur()\nWhere heur() returns the value of the heuristic that is used to sort the items.\nExperiment \u0026amp; Results I ran my SAT solver against 200 (100 satisfiable, and 100 unsatisfiable) Uniform Random 3-SAT problems, each have 75 variables and 325 clauses. This set was taken from the SATLIB Benchmark suite.\nIn addition to testing my solver using both the pypy interpreter and after compiling with the RPython tool-chain, I tested it against the MiniSat solver.\nTo time the results I ran the solvers over all the files and timed them with the unix time function. 
I used the real time value as the measurement.\nThe results are:\npypy Interpreted Solver\nSatisfiable problems: 259.177s Unsatisfiable problems: 742.133s Compiled RPython Solver\nSatisfiable problems: 10.514s Unsatisfiable problems: 59.729s MiniSat Solver\nSatisfiable problems: 0.295s Unsatisfiable problems: 0.441s These results show that for satisfiable problems the Compiled RPython code is about 25 times faster than the interpreted code, and 12 times faster for the unsatisfiable problems.\nIt also shows how much room for improvement there is for my algorithms and heuristics, as minisat was about 35 times faster for satisfiable problems, and about 135 times faster for unsatisfiable problems.\nConclusion The more complicated the algorithm the more is gained by compiling to RPython. The simple calculation of Fibonacci numbers is 12 times faster after compilation, but finding if a SAT problem is satisfiable is 25 times faster after compilation.\nThis experiment measured the gain of using RPython as opposed to interpreted python. Although we compared my solver to minisat, this is not a fair test as minisat uses many different optimised algorithms, heuristics, and data structures. A more interesting experiment would be to re-implement minisat directly into RPython. Then we could see what is lost when using RPython instead of coding directly in C.\n","permalink":"https://maori.geek.nz/posts/2014/2014-01-12_rpython-is-not-a-language-an-introduction-to-the-rpython-language/","summary":"\u003cblockquote\u003e\n\u003cp\u003eThe exact definition is “RPython is everything that our translation toolchain can accept” :)\u003c/p\u003e\n\u003c/blockquote\u003e\n\u003cp\u003eThe above quote is from the \u003ca href=\"http://doc.pypy.org/en/latest/coding-guide.html\"\u003ecoding guidelines\u003c/a\u003e for RPython. 
RPython is not a typical language, in that it is not described by a syntax, but is defined by whether or not a tool chain can compile the code.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eRPython\u003c/strong\u003e is a subset of the Python language. That is, any RPython code can run in a Python interpreter. The difference is that you can compile RPython code, with the RPython tool-chain down to C code. So the advantage of RPython is speed after compilation, and the disadvantage is that you cannot use all of Python’s features.\u003c/p\u003e","title":"RPython is not a Language: An introduction to the RPython language"},{"content":"Recently, I was in need of a testing environment for Mahara’s functionality. For those that do not know, Mahara is an ePortfolio and social networking system, written in PHP, and developed right here in New Zealand.\nTo test Mahara, I came up with three possible approaches:\ndownload\u0026amp;install mahara from apt-get (with all its dependencies). Then setup and configure a local environment in which I would be working, as described here. Download a virtual machine (maybe from here) and configure it to host Mahara. Or create and deploy Mahara to my favourite app server, Heroku. I tried for an hour or so to install Mahara locally, however I just ended up where I always do, Apache config hell.\nI skipped trying to set up the VM as I was tired of looking at configuration files. So… I decided to try my hand at PHP apps on Heroku.\nI had read a post a while back about getting PHP applications deployed to Heroku. This was motivation enough to get it running, here is how.\nPre-requisites:\ngit heroku toolbelt other things First we need to download and prepare the app directory. 
`mkdir mahara-heroku\ncd mahara-heroku\ncurl -OL https://launchpad.net/mahara/1.6/1.6.2/+download/mahara-1.6.2.zip\nunzip mahara-1.6.2.zip\nmv mahara-1.6.2/htdocs/* .\n#clean up\nrm mahara-1.6.2.zip\nrm -rf mahara-1.6.2``#git setup\ngit init\ngit add .\ngit commit -m \u0026quot;Init Commit\u0026quot;\n`\nThe second stage is setting up the Heroku server: heroku create mahara-heroku --buildpack https://github.com/grahamjenson/heroku-buildpack-mahara\nThis will create a Heroku app with a buildpack I have forked from heroku-buildpack-php. This was inspired by ngson2000\u0026rsquo;s heroku-buildpack-mahara, though I think this is still in flux (I may go back to this when it becomes stable).\nNow let's set up the database and config: heroku addons:add heroku-postgresql:dev\nThis will return the reference to the database, e.g. HEROKU_POSTGRESQL_MAROON_URL.\nNow check the credentials: heroku pg:credentials HEROKU_POSTGRESQL_MAROON_URL\nThis will return something like Connection info string: \u0026quot;dbname=abcdefg host=ec1-2-3-4-5.compute-1.amazonaws.com port=5432 user=iamauser password=thisisapassword sslmode=require\u0026quot;\nUsing this info we can create the Mahara config config.php: \u0026lt;?php $cfg = new StdClass; $cfg-\u0026gt;dbtype = 'postgres8'; $cfg-\u0026gt;dbhost = 'ec1-2-3-4-5.compute-1.amazonaws.com'; $cfg-\u0026gt;dbport = 5432; $cfg-\u0026gt;dbname = 'abcdefg'; $cfg-\u0026gt;dbuser = 'iamauser'; $cfg-\u0026gt;dbpass = 'thisisapassword'; $cfg-\u0026gt;dbprefix = ''; $cfg-\u0026gt;dataroot = '/app'; $cfg-\u0026gt;emailcontact = '';\nOne more change to mahara is necessary as I have not got gd working with freetype yet.\nSo comment out these lines in lib/mahara.php:81 ... //Check for freetype in the gd extension $gd_info = gd_info(); if (!$gd_info['FreeType Support']) { throw new ConfigSanityException(get_string('gdfreetypenotloaded', 'error')); } ...\nWe need to add this to git. 
git add config.php git add lib/mahara.php git commit -m \u0026quot;added config\u0026quot;\nLet's Deploy! git push heroku master\nThen we can visit our app at http://mahara-heroku.herokuapp.com/.\nFuture TODO\u0026rsquo;s/Refactorings\nUploading zip files does not work, however setting $cfg-\u0026gt;pathtounzip in your config file to point at an unzip binary (that you can add to your project) will fix this. Get GD with Freetype working. See UPDATE Getting rid of error warnings, by altering php.ini in the buildpack. If you want background workers, or to execute the ~bin/php on the server, you need to add the paths heroku config:add LD_LIBRARY_PATH=/app/php/ext:/app/apache/lib. Heroku will wipe Mahara data, as it is set to a local folder One of the more difficult TODOs, is getting GD with Freetype working.\nI have attempted this by recompiling php (as suggested here) with GD and Freetype. I have had limited success; here is a link to help with config errors.\nA possible solution is to use vulcan to build php for heroku.\nUPDATE:\nWith the help of ngson2000 in compiling PHP and apache, not to mention much of the config, GD with Freetype is now available.\nOverall I found this very rewarding, as it required minimal configuration. If this idea were taken further, I imagine it would be an excellent way of testing out Mahara.\n","permalink":"https://maori.geek.nz/posts/2014/2014-01-12_how-to-deploy-mahara-on-heroku/","summary":"\u003cp\u003eRecently, I was in need of a testing environment for \u003ca href=\"https://mahara.org\"\u003eMahara\u003c/a\u003e’s functionality. For those that do not know, Mahara is an ePortfolio and social networking system, written in PHP, and developed right here in New Zealand.\u003c/p\u003e\n\u003cp\u003eTo test Mahara, I came up with three possible approaches:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003edownload\u0026amp;install mahara from apt-get (with all its dependencies). 
Then setup and configure a local environment in which I would be working, as described \u003ca href=\"https://wiki.mahara.org/index.php/System_Administrator%27s_Guide/Installing_Mahara/How_to_install_Mahara_in_Ubuntu\"\u003ehere\u003c/a\u003e.\u003c/li\u003e\n\u003cli\u003eDownload a virtual machine (maybe from \u003ca href=\"http://www.turnkeylinux.org/mahara\"\u003ehere\u003c/a\u003e) and configure it to host Mahara.\u003c/li\u003e\n\u003cli\u003e\u003cstrong\u003eOr\u003c/strong\u003e create and deploy Mahara to my favourite app server, Heroku.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eI tried for an hour or so to install Mahara locally, however I just ended up where I always do, Apache config hell.\u003c/p\u003e","title":"How to Deploy Mahara on Heroku"},{"content":"I recently decided to create some mapping visualisations. Mostly because using a map is an awesome way to present many data sets, and creating such visualisation is a skill I lacked.\nSo I looked around and found that D3.js has geographical features.\nI had used D3.js in the past on projects like 100 companies, so I understood how to use it and could apply that knowledge to make visualisations with maps.\nIn this post I will go over a few examples of how to use D3’s geographical API to create visualisation with maps.\nNote: I also wrote a post about creating and editing maps for use by D3 here\nExamples Here is an example I edited from here.\nhttp://bl.ocks.org/grahamjenson/raw/8168412/\n... 
var width = 480, height = 250;``var projection = d3.geo.equirectangular() .scale(75) .translate([width/2,height/2]) .rotate([-180,0]);``var path = d3.geo.path() .projection(projection);``var svg = d3.select(\u0026#34;#js-map-nz-center\u0026#34;).append(\u0026#34;svg\u0026#34;) .attr(\u0026#34;width\u0026#34;, width) .attr(\u0026#34;height\u0026#34;, height);``svg.selectAll(\u0026#34;.graticule\u0026#34;) .data([topojson.object(worldtopo, worldtopo.objects.land)]) .enter() .append(\u0026#34;path\u0026#34;) .attr(\u0026#34;class\u0026#34;, \u0026#34;land\u0026#34;) .attr(\u0026#34;d\u0026#34;, path); ... The basic steps are:\nCreate a projection function. Create a path function. Using a GEOJson object as the data, draw the map using the path function. In my opinion understanding these three things (along with d3.js in general) is all you need to understand this library.\nProjection function The projection function takes a location [longitude, latitude] and returns a Cartesian coordinates [x,y] (in pixels).\nThe pros and cons of many projections are well explained by xkcd.\nThe other functions that were used are:\nscale is the linear scale to scale the map. rotation rotates the entire map. translate, moves the returned points. Scale Scale is the function that determines the scale transformation from a location (latitude and longitude) to point (x,y).\nHere is an example that demonstrates its purpose.\nhttp://bl.ocks.org/grahamjenson/raw/8192666/ setInterval(function(){ currentScale = (currentScale + 1) % 350; projection.scale(currentScale); svg.selectAll(\u0026quot;.land\u0026quot;) .attr(\u0026quot;d\u0026quot;, path); },100);\nRotation Rotate takes [longitude, latitude, roll] and moves the projection (roll is defaulted to 0 if none is given). To centre the map on a specific location then negative values are necessary, i.e. 
[- longitude, - latitude].\nExample of rotation:\nhttp://bl.ocks.org/grahamjenson/raw/8192485/ setInterval(function(){ currentRotation += 1; projection.rotate([currentRotation,0]); svg.selectAll(\u0026quot;.land\u0026quot;) .attr(\u0026quot;d\u0026quot;, path); },100);\nTranslate Translation moves each point that is drawn. This function makes no assumptions about the projection, and thus takes a point as argument.\nhttp://bl.ocks.org/grahamjenson/raw/8192665/ setInterval(function(){ currentX = (currentX + 1) % width; projection.translate([currentX,height/2]); svg.selectAll(\u0026quot;.land\u0026quot;) .attr(\u0026quot;d\u0026quot;, path); },100);\nPath function The path function translates GEOJson features into svg path data.\nGEOJson features\nGEOJson is a JSON format for encoding geographic data structures (features). The worldtopo object (in the code above) is a compressed set of GEOJson objects. The compression is handled by the topojson library.\nA GEOJson feature looks like {type: \u0026quot;Point\u0026quot;, coordinates: [-180,0]}\nSome GEOJson features are:\nPoint, a single point [longitude, latitude] MultiPoint, a list of points LineString, a list of points (they are meant to be connected) MultiLineString, a list of LineStrings Polygon, a list of LineStrings (they will be closed) MultiPolygon, a list of Polygons All features can be handled by the path function.\nAn example I have created is an approximation of James Cook\u0026rsquo;s first voyage.\nhttp://bl.ocks.org/grahamjenson/raw/8192663/ cook = {\u0026quot;type\u0026quot;: \u0026quot;LineString\u0026quot;, \u0026quot;coordinates\u0026quot;: [[-4.1397, 50.3706], [-43.2436, -22.9083] , [-67.2717, -55.9797] , [-149.4500, -17.6667], [172.1936, -41.4395] ,[151.1667, -34] , [147.70, -18.3] ,[106.7, -6], [18.4719, -34.3], [-5,-15], [-25.6, 37.7],[-4.1397, 50.3706]] }``svg.selectAll(\u0026quot;.geojson\u0026quot;).data([cook]) .enter() .append(\u0026quot;path\u0026quot;) 
.attr(\u0026quot;class\u0026quot;,\u0026quot;geojson\u0026quot;) .attr(\u0026quot;d\u0026quot;, path);\nI can easily see the possibilities for such a format to be used in many projects.\n[longitude, latitude] Gotcha\nTo find the longitude and latitude of any place in the world you can use Google.\nThe problem with this method is that the returned results are backwards relative to mathematical and programming convention. The first measurement is latitude then longitude, which is the y co-ordinate before the x.\nAlso, instead of using negative values they may use South, or West. For example, 30.1S, 20.2W will translate to [-20.2,-30.1].\nThese difficulties came about because of my lack of experience with geographic co-ordinate systems.\nAuto Scaling Projection to a GEOJson feature When using this library I found few utility functions available. One utility I would have found useful would be an auto scaling function for rendering maps of an appropriate scale for a particular GEOJson feature.\nThere is a bounding function d3.geo.bounds, however there is a gotcha with this function. On a sphere (the earth) given any two points returned from the bounding function, TWO squares can be calculated: the smaller square and that square\u0026rsquo;s inverse. For example, if a person travelled the length of New Zealand, their bounding box would be the same as a person who travelled around the world from the top left point of NZ to the bottom right point. I found this out when plotting Cook\u0026rsquo;s voyage above.\nAnother function provided is finding the centre of a feature. By finding the centre, and measuring the distance from one of the corners of the bounding box, the real box can be found and the scale calculated.\nOnce again, there is another annoying gotcha: the distance between two points on a sphere is not the same as on a plane. I found this algorithm (which assumes the earth is a sphere) that calculates the distance between points. 
calcDist: (p1,p2) -\u0026gt; #Haversine formula dLatRad = Math.abs(p1[1] - p2[1]) * Math.PI/180; dLonRad = Math.abs(p1[0] - p2[0]) * Math.PI/180; # Calculate origin in Radians lat1Rad = p1[1] * Math.PI/180; lon1Rad = p1[0] * Math.PI/180; # Calculate new point in Radians lat2Rad = p2[1] * Math.PI/180; lon2Rad = p2[0] * Math.PI/180;``# Earth's Radius eR = 6371; d1 = Math.sin(dLatRad/2) * Math.sin(dLatRad/2) + Math.sin(dLonRad/2) * Math.sin(dLonRad/2) * Math.cos(lat1Rad) * Math.cos(lat2Rad); d2 = 2 * Math.atan2(Math.sqrt(d1), Math.sqrt(1-d1)); return(eR * d2);\nOne final gotcha is that a point on a map can be zoomed infinitely as it covers 0 area. Therefore it is important to ensure that you define limits on the zoom.\nThe final code: [x,y] = d3.geo.bounds(feature)[0] [xc,yc] = d3.geo.centroid(feature) distToCenterOfBbox = @calcDist([x, y],[xc,yc])``minScale = 79 maxScale = 300 scaleCalc = d3.scale.linear().range([maxScale,minScale]).domain([0,5000]).clamp(true) s = scaleCalc(distToCenterOfBbox) projection = d3.geo.equirectangular().scale(s)\nThis was hastily written, and is therefore not perfect code (e.g. the scaling function needs to take into account Pythagoras).\nOverall impression After using the geo functionality provided in d3.js I was able to get a mapping visualisation up and running. There was a significant amount of learning on my part to understand the co-ordinate system and GEOJson. However, once these hurdles were overcome I was able to quickly and easily create the visualisations that I wanted.\nLearn More Learn more from:\nData Visualization with D3.js Cookbook\nOr\nInteractive Data Visualization for the Web\nFuture You may have noticed the efficiency is horrible in these examples. To increase efficiency a dynamic simplification algorithm is needed (like the one implemented here) with auto-scaling. With such an algorithm the precision of the projections can be based on the size and scale. 
An algorithm to simplify paths may be less expensive than rendering unnecessary path data.\n","permalink":"https://maori.geek.nz/posts/2014/2014-01-12_drawing-maps-with-d3.js-and-other-geographical-fun/","summary":"\u003cp\u003eI recently decided to create some mapping visualisations. Mostly because using a map is an awesome way to present many data sets, and creating such visualisation is a skill I lacked.\u003c/p\u003e\n\u003cp\u003eSo I looked around and found that \u003ca href=\"http://d3js.org/\"\u003e\u003cstrong\u003eD3.js\u003c/strong\u003e\u003c/a\u003e has geographical features.\u003c/p\u003e\n\u003cp\u003eI had used D3.js in the past on projects like \u003ca href=\"http://100companies.co.nz/\"\u003e100 companies\u003c/a\u003e, so I understood how to use it and could apply that knowledge to make visualisations with maps.\u003c/p\u003e\n\u003cp\u003eIn this post I will go over a few examples of how to use D3’s geographical API to create visualisation with maps.\u003c/p\u003e","title":"Drawing Maps with D3.js and Other Geographical Fun"}]