
IM protocols and clients and FOSS


I'm publicly announcing my move off of the Signal chat protocol controlled by Whisper Systems. I'll still be on there occasionally for now, but I'm almost always available on matrix

I've been considering various options. I'm unwilling to consider anything that doesn't support end-to-end encryption, or anything with more central control than Signal. But these days there are actually options in this space! Signal is not the most FOSS/libre player anymore, and their insistence on only supporting the Android/iPhone ecosystems is a major downer. The other two options I found are Jami and Matrix. Here's a "feature" matrix for matrix, signal, and Jami.


System | Functional | e2ee | Open source | FOSS/libre | Centralized? | Clients available | Required OS
Signal | yes | yes | mostly | no | yes | Whisper Systems' client | Android or iPhone
Matrix | yes | yes | yes | yes | federated | dozens available | Android, iPhone, Linux, OSX, Windows, ARM and x86; easily ported to other OSes/architectures
Jami | no | yes | yes | yes | fully decentralized | one FOSS client | Android, iPhone, Linux, OSX, Windows

For me Matrix seems like the clear winner. Jami *sounds* good, but when I tried it I couldn't actually get it to work. So Matrix it is. More details below.

Problems with Signal

Whisper Systems has talked about a first-class desktop client for years, but nothing of the sort has materialized. They have a desktop client, but for mere mortals it can only be used secondarily to having a phone. A few people have managed to run it as a primary client, but you have to do a lot of hacking around the system design. As a result, to really use Signal you must own an iPhone or an Android phone with a valid phone number, and be willing to register that number with Signal. For me this is a practical matter: I don't own an Android phone and don't intend to any time soon. Many folks have also pointed out that in many countries you can't get a phone without essentially registering your identity with the government as attached to that phone. That raises privacy-related safety issues for those dissenting from their government.

While the Signal protocol is public, and some code is open source, it is neither FOSS/libre nor an open system. Whisper Systems has singular control of the server side of the system. My understanding, at least, is that the server code they publish is out of date. But most importantly, they change the protocol whenever they feel like it. This means that in practical terms only they can provide consistently functional clients. This has been demonstrated by attempts like axlotl to fill the application gap for first-class desktop IM clients. axlotl does work for some people some of the time, but it's flaky at best. In reading about axlotl it appears this is due to the constantly moving target of the protocol, as represented by the server-side software run by Whisper Systems. When I tried axlotl I got stuck in a perpetual auth loop.

Let me be clear. I laud Whisper Systems and the developers there for their efforts. They have changed the entire conversation around e2e encryption and chat protocols, making e2ee available to normal users, and doing it well. For a centralized chat ecosystem they are doing a better job than anyone else out there, and I'm NOT condemning that effort. They take security seriously, they open source the important parts of their software so anyone can audit it. From a pure security perspective they are doing everything right, and that is not only rare but basically unheard of in the software industry.

What I am saying is that I demand more. There is no reason we should be locked into a single provider for our chat services. While they provide a valuable service, they do so on their terms, with heavy centralized control of the overall ecosystem. Sure, you can run a third-party client, but they don't announce ahead of time when server support is changing, so while they *allow* it, they don't *support* it. I'm not actually a FOSS/libre purist. I just want to be able to use my software in certain ways, and control my own systems, and this is just one more case where those desires are in practice incompatible with more closed software ecosystems.

Problems with Jami

I attempted to use Jami for some time. Jami sounds ideal in that it basically drops the server side entirely, using only a very thin server to aid in NAT traversal. The problem? It doesn't work (yet, at least). While I did manage to chat with my wife occasionally with this software, it was so flaky as to be useless. I couldn't predict when it would or wouldn't work, and I had to restart clients frequently. Whether this is the protocol or the client I'm not positive, but whatever the reason, it's not currently a usable option. After a couple of months of experimentation I gave up. Besides these problems, Jami is harder to use for normal users. For example: with no server there is no such thing as account recovery. The model is confusing, and the chance of lockout for normal users is high.

Is matrix really better than these?

While I think Jami's vision is absolutely the ideal, in practice Matrix seems like a better balance between usability and encryption, striking a similar balance to Signal, and most importantly, Matrix actually works. Matrix is a fully open federated protocol that supports e2ee, has dozens of fully FOSS/libre clients, and has multiple FOSS/libre server-side implementations. The default homeserver provides an easy option for the normal user, much like Signal, but unlike Signal, anyone dissatisfied with it can go off and run their own server and still chat with folks on other servers. And because the protocol spec is a well-defined standard, there are many clients available for basically anything, many of which work consistently. And if you have an extremely weird use-case (you're running OpenBSD on a VAX) you can always port a client yourself.

I've run into one significant flaw in Matrix: e2ee support in the client is *optional*. This is sensible, as Matrix can also be used as an IRC replacement (open chatrooms), where encryption is likely not that valuable. As a result, though, many clients don't support e2ee yet. I recommend "fluffychat", as I've had the most luck with that client. My wife runs it on Android. I run it on an ARM64 linux phone and an x86_64 linux tablet. There was an issue with it getting stuck in right-to-left language mode on linux, caused by an underlying library, but it's been resolved. Element would be my next recommendation. It worked well for a while, then broke on me last week. To be fair, though, I am using debian unstable, so it may not be the application's fault.


As someone who has opted out of the iPhone/Android duopoly, Signal is basically not an option anymore. So I'm using Matrix as my primary chat platform. I hope I can drag a few other folks along with me.

librem5 configuration


I got a librem5 about a year ago now. It was quite useless until recently, when I realized that to get newer software I had to flash it myself once; THEN I would get updates. I've been playing with it since, and it's somewhat usable. I wouldn't say it's ready for primetime, but while the stability is so-so, I can actually do useful things with it now. Given that, I wanted to document what I've done to set it up.

First, I reflashed with byzantium. Then during setup I connected it to the Nextcloud instance I'm already running... easy enough. As a tip, since I hate running sudo over and over: "sudo su" doesn't work on PureOS, but "sudo /bin/bash" does.

Now for software

Lastly, I removed all the apps I don't actually use (e.g. the text editor). Unfortunately I didn't record the names of all of the apps, but you can always use "apt list --installed" and grep to work it out.

Finding software: I believe the librem5 is basically mobian/phosh. If you take a dpkg for mobian/phosh and try to install it, it won't just work (I tried it), because the library versions aren't identical. But if you're searching for software that will run well on the librem5, software that works on mobian/phosh probably will, and is most likely already available in the PureOS repos if you use the command-line apt. The PureOS store is incredibly limited and I found it largely useless; I'm sure it'll become more useful in the future. A surprising amount of software is available via flatpak as well, though I'm not a fan of it.

Overall I think PureOS is getting there in terms of functionality, and I'm excited to really use my device now. Battery life is just about long enough to last a work day. What it needs most now is stability. For example, the bluetooth functionality is pretty sketchy. I'm unsurprised by this, as the Gnome bluetooth stuff is no *more* functional on laptops, last I tried it. I gave up and use a command-line tool for bluetooth on my Microsoft Surface Go linux tablet (my other, and primary, machine). Unfortunately this has been common for Gnome. Many of my friends still call "NetworkManager" "NetworkMangler" due to its long history of stability problems. I love the idea of linux phones, but the Gnome development philosophy has, sadly, historically failed to deliver reliable and functional tooling. It's quite possible that librem and pinephone and others will actually help fix those longstanding problems. I'm crossing my fingers.

The best thing about the Librem, though, is that if they fail that's not the end of the story. Folks over in the pinephone world are playing with several other, even more linuxy/gnu'ish, phone GUIs. Running that stuff on the 100% open and free software librem is comparatively easy. I'm hoping purism can really deliver themselves, but it's nice to have a backup plan :). This is a HUGE difference from the days of Nokia and maemo. Although it was open then, the ecosystem wasn't there; it's finally here.

One last note... my device has a problem with the SD card at least, and likely a problem with the SIM slot as well. I identified this the day the device arrived, but Purism dropped the ball on customer support and never replied after their first response. I've emailed them back now and am hoping they'll get me a fully functioning device. I gather online that this isn't a terribly unusual story. They are trying hard but seem to be perpetually overloaded. In a way that's a good thing, as it means people want what they are selling, but it means the customer service experience might not be the best.


I got my phone back from purism with a working microSD slot... WOOT! I have a 1TB card arriving tomorrow so I can sync everything from nextcloud and have more than enough spare space for storing maps of the U.S. and Canada for puremaps, a copy of wikipedia, and whatever else I feel like.

I need to fix up some issues with my nextcloud server (I just moved from Hughesnet to Starlink and it's way better, but I lost my whacky ipv6 hacks), but once that's properly online again most of my use-cases are met by this suite of apps. Some could use some cleanup but purism is getting there.


Although I like my device, I wouldn't encourage others to get it. The price-tag is just too high for a device with poor battery life and still-unreliable software. I *really* like the idea of running Linux on a phone, but I think the PinePhone project is a more economical way to get there. Sure, the performance is a bit lower, but the devices are cheaper and the ecosystem is far better. In the end that's going to make for a more usable device. Ecosystems matter; that's why Linux won and BSD is now a barely used OS.

String Sorts


You'll recall from my recent post about Fast Sort that Radix Sort is significantly faster at sorting 32-bit integers than normal comparison-based sorts, particularly as the array gets large. That's a result every computer scientist knows theoretically, yet with median-find we saw that theoretical bounds don't always translate into practice.

Well, a friend asked me: what about sorting variable-length strings? Does radix sort still beat comparison-based sorts? It's a fair question, so I decided to take a shot at it. Here are my results:

In my mind that graph is pretty clear: above 10 elements radix sort is clearly winning. Note that this is currently a 128-bucket radix sort that only handles ASCII... though I'm actually only feeding it uppercase strings currently. So, let's talk about how this algorithm works, because it's not an entirely trivial conversion of radix sort.

String Radix Sort Algorithm

This is a little bit interesting. You may recall that there are two types of radix sort: Least Significant Digit first and Most Significant Digit first, referred to as LSD and MSD. My binary radix sort from earlier benchmarks was an example of an MSD sort, and the one I just referred to as "radix sort" is an LSD sort. LSD sorts are generally preferred because they are stable, simpler to implement, require less temp space, AND are generally more performant.

There's just one problem: with variable-length strings, LSD sorts don't work very well. We'd have to spend a lot of time scanning over the array just looking for the longest string so we can compute what counts as the least significant digit. Remember that in lexicographic ordering it's like all the strings are left-justified. The left-most character in each string is equivalent in precedence, not the rightmost.

MSD sorts must be recursive in nature. That is, they need to work only on the sublist we've sorted into a certain bucket so far. I'm going to call this sublist a "slice". To keep our temporary space vaguely in order I'm using a total of 5 lists.

Here's the algorithm. Start by looking at the first bytes of the strings. Look in slice list A and get the next slice. Bucket everything in this slice. Each of these buckets (if non-empty) becomes a new slice, so write the strings back out to string list B, and write the end index of each new slice into slice list B. Swap the A and B lists, move to the next byte, and do it again. We terminate when every slice is either of length 1 or we run out of bytes. To see the full implementation take a look at string_sort.h in my github repo.
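To make the slice machinery concrete, here's a minimal sketch of the algorithm above. This is my own simplified illustration, not the code in string_sort.h: for brevity it copies each slice back after scattering rather than ping-ponging between full A and B lists, and all of the names are mine.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

void msd_string_sort(std::vector<std::string>& v) {
    std::vector<std::string> scratch(v.size());
    // A slice is a [begin, end) range of strings still tied at this byte.
    std::vector<std::pair<size_t, size_t>> slices{{0, v.size()}}, next;

    for (size_t byte = 0; !slices.empty(); ++byte) {
        next.clear();
        for (auto [begin, end] : slices) {
            // Bucket 0 = "string ended before this byte" (sorts first);
            // buckets 1..128 = the 7-bit ASCII value at this byte.
            auto bucket = [&](const std::string& s) -> size_t {
                return s.size() > byte ? static_cast<size_t>(s[byte]) + 1 : 0;
            };
            std::array<size_t, 129> count{};
            for (size_t i = begin; i < end; ++i) ++count[bucket(v[i])];
            // Prefix sums turn counts into bucket start offsets in the slice.
            std::array<size_t, 129> offset;
            size_t sum = begin;
            for (size_t c = 0; c < 129; ++c) { offset[c] = sum; sum += count[c]; }
            // Stable scatter into scratch, then copy the slice back.
            for (size_t i = begin; i < end; ++i) {
                size_t b = bucket(v[i]);
                scratch[offset[b]++] = std::move(v[i]);
            }
            std::move(scratch.begin() + begin, scratch.begin() + end,
                      v.begin() + begin);
            // Character buckets holding 2+ strings become the next slices.
            size_t start = begin;
            for (size_t c = 0; c < 129; ++c) {
                if (c > 0 && count[c] > 1) next.push_back({start, start + count[c]});
                start += count[c];
            }
        }
        slices.swap(next);
    }
}
```

Each pass walks the strings in order and bins them in order, which is what makes the whole sort stable.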

Conveniently, the way my algorithm works, it is in fact stable. We walk the items in order, bin them in order, then put them in the new list still in order. If two items are equal there is no point where they'd get swapped.

It's a LOT of temporary space, which is kind of ugly, but it's pretty performant as you saw above. Another optimization I haven't tried is short-circuiting slices of length 1. We should be able to trivially copy these over and skip all the bucketing work. Testing would be required to see if the extra conditional was worth it... but it seems likely.

Data tested on

To test this I'm generating random strings, using a simple algorithm where, with probability 9/10, I add another random uppercase letter, always stopping at 50 characters. I mention this because the distribution of the data can obviously impact the overall performance of a given algorithm. Note that this means we're functionally only using 26 of our 128 buckets. On the other hand, real strings are usually NOT evenly distributed, since languages carry heavy biases towards certain letters. This means my test is not exactly representative, but I haven't given it a clear advantage either.
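A generator matching that description might look like the following sketch (my own reconstruction; the function name and choice of RNG are assumptions, not the benchmark code):

```cpp
#include <random>
#include <string>

// Build a random uppercase string: after each letter there is a 9/10 chance
// of appending another, with a hard cap of 50 characters.
std::string random_test_string(std::mt19937& rng) {
    std::uniform_int_distribution<int> letter('A', 'Z');
    std::bernoulli_distribution another(0.9);  // 9/10 chance of one more letter
    std::string s;
    do {
        s.push_back(static_cast<char>(letter(rng)));
    } while (s.size() < 50 && another(rng));
    return s;
}
```

The geometric length distribution means most strings are short, so the early bytes do most of the sorting work.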


I can't say that this is a clear win for Radix Sort for sorting mixed-length strings. The temporary-space issue can be non-trivial, and time certainly isn't always worth trading for space; we're using about 3N additional space for this sort. That said, there are some obvious ways to reduce the space overhead if you need to: radix-sort smaller chunks of the array and then merge them, use 32-bit instead of 64-bit pointers, or come up with a cuter radix sort.

Note that my radix sort was a morning's work: figure out the algorithm, write and validate an implementation, find a couple of optimizations, and benchmark it. I wrote this post later. Oddly, "inline" made a huge difference to gcc's runtime (it's needed for the loop unrolling that handles the A/B list cases). In any case, I've little doubt someone can beat my implementation, and maybe find something using a bit less space. I just wanted to prove it was viable, and more than competitive with comparison-based sorts.

Median Find


Similar to Radix Sort, I thought it might be interesting to see how worst-case linear-time medianfind actually performs. Since the algorithm is identical to expected-case linear-time medianfind (usually called quickselect) except for the pivot selection, I elected to add a boolean to the template to switch between them (since it's a template parameter it gets compiled out anyway). Before we get to the results, here's a quick refresher on these algorithms:

Problem Statement

Imagine you have a sorted array. If you want the K'th largest element, you can simply look at location K in the array. But comparison-based sorting takes O(N log(N)) time (which is also the theoretical lower bound for comparison sorts). What if we want the K'th element without sorting the array first?

Quick Select

Quickselect chooses a pivot and runs through the array, throwing elements into two buckets: smaller and larger than the pivot. Then it looks at the number of elements in each bucket to tell which one contains the k'th element, and recurses. We can prove we usually choose a good pivot, making this expected O(N) time. But its worst case is much worse.
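As a sketch, the two-bucket quickselect described above might look like this (my own illustration; a real implementation would partition in place rather than allocating buckets each level):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the k'th smallest element (0-indexed). Pivot choice here is just
// the middle element, so this is the expected-O(N) variant with a bad
// worst case; the worst-case-linear version swaps in a smarter pivot.
uint32_t quickselect(std::vector<uint32_t> v, size_t k) {
    uint32_t pivot = v[v.size() / 2];
    std::vector<uint32_t> smaller, larger;
    size_t pivot_count = 0;
    for (uint32_t x : v) {
        if (x < pivot) smaller.push_back(x);
        else if (x > pivot) larger.push_back(x);
        else ++pivot_count;  // elements equal to the pivot need no recursion
    }
    // Bucket sizes tell us which side holds index k.
    if (k < smaller.size()) return quickselect(std::move(smaller), k);
    if (k < smaller.size() + pivot_count) return pivot;
    return quickselect(std::move(larger), k - smaller.size() - pivot_count);
}
```

Only one side is ever recursed into, which is where the expected N + N/2 + N/4 + ... = O(N) bound comes from.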

Worst-case-linear Quick Select

What if we always chose the right pivot? Or at least... a good *enough* pivot. This is how we build our worst-case-linear quickselect algorithm. It's a really cool trick, but it's been covered in many places, so if you want to know how it works you can check wikipedia, or this nice explanation.
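For reference, the classic "good enough" pivot is the median-of-medians: sort groups of 5, collect each group's median, then take the median of those. The sketch below is my own simplified version — the textbook algorithm finds the true median of the medians by calling the full select routine recursively, which is what earns the worst-case guarantee; this sketch just recurses on the medians list directly, which is easier to show but gives weaker guarantees.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pick a "good enough" pivot: medians of groups of 5, repeated until
// only one candidate remains.
uint32_t median_of_medians(std::vector<uint32_t> v) {
    if (v.size() <= 5) {
        std::sort(v.begin(), v.end());
        return v[v.size() / 2];
    }
    std::vector<uint32_t> medians;
    for (size_t i = 0; i < v.size(); i += 5) {
        size_t end = std::min(i + 5, v.size());
        std::sort(v.begin() + i, v.begin() + end);  // sorting 5 items is O(1)
        medians.push_back(v[i + (end - i) / 2]);
    }
    return median_of_medians(std::move(medians));
}
```

The point of the group-of-5 construction is that a guaranteed fraction of elements sits on each side of the pivot, so the recursion can never degenerate the way plain quickselect can.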

Real World performance

All of that is great in theory, but what is the *actual* performance of these things? Well, in a benchmark at least, but on a real computer.

As usual I'm working on arrays of test_size elements, k/test_size times, so we work over the same number of array cells at every point on the graph: small arrays many times on the left, and large arrays fewer times on the right.

For a while I was pretty upset about these results. The runtimes for "linear-time" quickselect look more like quicksort (the algorithm displayed as the "sort" line) than they do like basic quickselect. In short... that doesn't look linear at all. What the heck?

I must have a bug in my code, right? This algorithm was proved linear by people much smarter than me. So, my first step was to double-check my code and make sure it was working (it wasn't, but the graph above is from after I fixed it). I double, triple, and quadruple checked it. I wrote an extra unittest for the helper function that finds the pivot, to make sure it was returning a good pivot. Still, as you see above, the graph looked wrong.

I finally mentioned this to some friends and showed them the graph. Someone suggested I count the "operations" to see if they looked linear. I implemented my medianfind algorithm using a separate index array, so that I could output the index of the k'th element in the *original* array. From there everything is done "in place" in that one index array. As a result, swapping two elements is my most common operation, which seemed like a pretty accurate representation of "operations". So, here's what that graph looks like.

Now THAT looks a bit more linear! It's not exactly a flat line, but it looks asymptotic to a flat line, and thus classifies as O(N). Cool... so why doesn't the first graph look like this?

Real machines are weird. That index array I'm storing is HUGE. In fact, it's twice the size of the original array, because the original is uint32_t's and my index array is size_t's for correctness on really large datasets. The first bump is similar in both graphs, but then a little farther to the right in the time graph we see it go crazy... that is probably the algorithm starting to thrash the cache. Presumably if I made it big enough we'd see it go linear again. That said, if we go that large we're soon running on a NUMA machine, with even more layers of slowness, or hitting swap.

So, should you *ever* use guaranteed linear-time medianfind? Probably not. If there is a case, it's vanishingly rare. It happens that the pivot selection distributes well, so maybe there's a use there? But if you just used the "fastsort" we talked about in my last post you'd get even better performance, and it's still linear, AND it distributes well too! It's not comparison-based of course, but there are very few things that can't be usefully radixed if you're stubborn enough about it.

Big O notation didn't end up proving all that useful to us in this case, did it? Big O is usually a good indicator of which algorithm will be faster on large amounts of data. The problem is: what is large? Computers are finite, and sometimes "large" is so large that computers aren't that big, or their behavior changes at those sizes.

Fast Sort


In my last post I benchmarked various sorts against each other to test a couple of radix algorithms. Radix Sort came out the clear winner for large data sizes (as we would expect), but that experiment left some questions.

In particular, on small arrays (less than ~100 elements), Radix Sort performed quite poorly. I posited a couple of theories: maybe it's prefetching and branch prediction, or maybe it's all the passes over the radix array itself that cost more (as a friend of mine suggested). If the former is the main factor, then we should see little performance difference as the radix size changes. If the latter is the main factor, then as the radix size goes up our runtimes on these small arrays should shoot through the roof.

Radix Sort arity test results

As you'll recall from my last post, every point on this graph involves sorting the same total number of elements. The X axis indicates the length N of the array sorted, which is then sorted K times where N*K is a constant. This way the graph shows nonlinearities in the algorithms as they change with N. The subscript for each algorithm is the number of *bits* in the radix, so the actual radix size is 2 to that power (e.g. 256 for the yellow line). Note that the constant is not the same as in my last post, so please don't cross-compare the graphs.

This graph gives us a pretty clear answer: our slow performance on small arrays is definitely due to the radix array. With a radix of 256 it doesn't start to look linear until around 1000 elements, and doesn't beat a radix of 2 until ~20 elements.

You're probably wondering about the weirdness out on the right. Testing on larger arrays gets time-consuming quickly; logarithmic graphs are annoying that way. That said, I generated several graphs like this one, and all had similar artifacts towards the right, especially for larger radix sizes. It looks pretty noisy overall. My current guess is caching behavior. The machine I'm testing on is far from sanitary; lots of other stuff is going on in the background. It could be that as radix sort uses nearly the entire cache it becomes highly sensitive to other workloads. A larger radix might give more opportunity for chunks of the radix array to get kicked out repeatedly, and thus be more sensitive.

Mixed sort

We saw in my last post that radix sort is extremely valuable for larger data sizes; in fact, it beat out everything else by around 40 elements. On the other hand, we also saw that it has pretty bad behavior on really small arrays. Those results were found with a radix size of 32 (or 5 bits, which would land between the blue and green lines on today's graph).

From this new experiment we see that there is a significant cost to larger radix sizes. For most uses on modern computers 256 integers is a non-issue memory-wise, but with a large radix our performance is abysmal on really small arrays. A standard solution to this type of problem is to blend algorithms (this is frequently done with quicksort, using selection sort for extremely small arrays). Mergesort is stable, and heapsort, while not stable, is in-place with good worst-case behavior, so they both seem like good candidates. Let's see how they compare.

Radix sort arity test results

The graph is a bit ugly because I didn't do a ton of points, but the results are still pretty clear. If we just take the best algorithm at every point, it looks like we should use heapsort below ~20 elements, then radix sort with 4 bits up to 30, then radix sort with 6 bits up to 60, and then radix sort with 8 bits. No doubt if I tested larger radix sizes this trend would continue.

Of course, if our conditional choice gets too complex, its cost becomes a problem as well. To make radix sort fast it's compiled with the radix size as a template parameter, so each switch is going to thrash the code cache, etc. If you're getting THAT picky you'll be benchmarking for your specific use-case anyway. So for general use I propose we keep it simple: heapsort up to about 30 elements, then radix sort with 6 bits above that. That's a nice simple algorithm that gets us pretty close to our ideal speeds, and it certainly wipes the floor with any other single algorithm. We'll call our new mixed algorithm fast sort.
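A minimal sketch of that dispatch (my own illustration, with a compact LSD radix sort inlined so it's self-contained, and std::make_heap/std::sort_heap standing in for a hand-rolled heapsort; the cutoff of ~30 comes from the graphs above):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Compact LSD radix sort; BITS is the per-pass radix width.
template <unsigned BITS>
void radix_sort(std::vector<uint32_t>& v) {
    constexpr size_t buckets = size_t{1} << BITS;
    constexpr uint32_t mask = static_cast<uint32_t>(buckets - 1);
    std::vector<uint32_t> scratch(v.size());
    for (unsigned shift = 0; shift < 32; shift += BITS) {
        size_t count[buckets] = {};
        for (uint32_t x : v) ++count[(x >> shift) & mask];
        size_t pos = 0;
        for (size_t c = 0; c < buckets; ++c) {
            size_t n = count[c];
            count[c] = pos;
            pos += n;
        }
        for (uint32_t x : v) scratch[count[(x >> shift) & mask]++] = x;
        v.swap(scratch);
    }
}

// "Fast sort": heapsort for tiny arrays, 6-bit radix sort otherwise.
void fast_sort(std::vector<uint32_t>& v) {
    if (v.size() < 30) {
        std::make_heap(v.begin(), v.end());
        std::sort_heap(v.begin(), v.end());
    } else {
        radix_sort<6>(v);
    }
}
```

The branch runs once per sort call, not per element, so the dispatch overhead is negligible next to either underlying algorithm.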

Radix sort arity test results

It's a little hard to see the green line that is fastsort, but that's because it performs exactly how we would hope: like heapsort up until 30, and then like radix sort with a 6-bit radix.


Discussions of sort speeds are usually theoretical (how many comparisons are being done). In reality, as we showed with AVL trees being just as fast or faster than Red-Black trees, moves and cache behavior play at least as big a role as comparison count. Most of the literature I've seen says heapsort is not a good choice for speed alone, while here we've found it to be a very good option for smaller arrays. Our final fast sort algorithm differs from anything you are likely to find in a textbook, yet is extremely performant and, once the radix sort kicks in, stable by happenstance.

There are of course caveats to these conclusions. As noted in earlier posts, these are my implementations; it's always possible (even probable) that I'm farther from the ideal implementation for one algorithm than another. Additionally, our fast sort has one rather annoying property drastically limiting its utility: the radix algorithm has to work on the type being sorted. At the moment it's written for unsigned types, but it would have to be changed for signed types, strings, integral types larger than size_t, etc. With comparison-based sorts you can sort largest-to-smallest simply by swapping the comparator; with radix sort (and thus fast sort), significantly less trivial adjustments have to be made to the sort itself if you don't want to reverse it in a separate pass. In practice, these limitations are probably why radix sort gets so little attention despite its obvious strengths: the details can't be abstracted away as cleanly as with comparison-based sorts.
