Tiernan's Comms Closet

Geek, Programmer, Photographer, network engineer…


Day 61 of #100daysofhomelab – swapping disks in a Hetzner Dedicated Machine

It’s been a while… So, for Day 61 of #100daysofhomelab, I thought I should write up how to swap a disk in a Hetzner dedicated machine.

I have a dedicated server I rent from Hetzner in Germany. It has a Xeon E5-1650 V2 processor (6 cores, 12 threads, 3.5GHz base, 3.9GHz turbo), 128GB RAM, and a pretty impressive 15 6TB HDDs. All drives are hooked to a MegaRAID controller, but because I am running Proxmox, I left it in JBOD mode and set up the 15 drives in RAIDZ-2. All 15 drives are in a single pool (probably not ideal, but it works for me). Now and again, I get a message from Proxmox telling me about bad blocks… and every time it happens, I have to remember what to do: find the bad drive, report it to Hetzner, wait for them to replace the drive, and then add it back to the pool. Today, it happened again, so I thought I’d better document it, to help future me, and hopefully someone else out there…

First, we need to find the drive in question. Usually, in my alerts, I get the serial number of the drive causing problems. So, I ran the following command:

megacli -PDList -aAll | egrep "Enclosure Device ID:|Slot Number:|Inquiry Data:|Error Count:|state"

This gives me a full list of drives along with the Slot Number (needed when reporting it to Hetzner) and the serial number. The output for each drive starts with “Enclosure Device ID:”, so when you find the serial number, look above it for the Slot Number. In my case, the issue was with the disk in Slot 10. I opened a support ticket with Hetzner requesting a replacement disk. It can take an hour or more, but sometimes it’s faster; it depends on their load…

Once you get confirmation that the disk swap is done, you need to get the new disk into the zpool.

First, we must check that the new drive is set up correctly. Run the following:

megacli -PDList -a0 | grep Firmware

We are looking for “Firmware state: Online, Spun Up”. If the new drive shows anything else (a foreign or unconfigured state, for example), we need to run the following:

megacli -CfgForeign -Scan -a0

This shows us any foreign configurations. If the count is more than 0, we run:

megacli -CfgForeign -Clear -a0

This clears out that configuration. Next, we need the Enclosure Device ID and Slot Number of the new drive from:

megacli -PDList -aAll | egrep "Enclosure Device ID:|Slot Number:|Inquiry Data:|Error Count:|state"

because we need them to run:

megacli -PDMakeGood -PhysDrv [<enclosure>:<slot>] -a0

Finally, run:

megacli -CfgEachDskRaid0 WB RA Direct CachedBadBBU -a0

Note: If that fails with a message about cache data, you may need to run:

megacli -DiscardPreservedCache -L"10" -a0

This clears the preserved cache, and then you can run the CfgEachDskRaid0 command again. That command marks all the new disks as JBOD-style disks, which is what I use for ZFS. If your setup is different, check the docs from Hetzner below.

Next, we need to swap disks in ZFS. Run

zpool status

to get the info about the missing disk. The missing disk will show as unavailable. Next, find the ID of the disk that was added:

cd /dev/disk/by-id/

ls
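
The new disk usually won’t have any partitions on it yet, so a quick way to narrow down the list is to filter out anything that does (a rough sketch, assuming the replacement drive arrived blank):

ls /dev/disk/by-id/ | grep -v part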

Find the new disk in the list. Now, it’s a matter of running the following:

zpool replace rpool /dev/disk/by-id/scsi-3600605b008f498802aa37da51674ea7e-part3 /dev/disk/by-id/wwn-0x600605b008f498802b2a3a683752e088

Swap the scsi-36xxx and wwn-0x6xxx parts for the ones you found, and rpool for your ZFS pool name.

Finally, run

zpool status

to see the status. To watch it with more detail, refreshing every second, run:

zpool status -v 1

ZFS is now running in the background, resilvering the drives and swapping out the old one. Since the old disk is missing, it will wait until the new drive is fully resilvered before removing the old one from the pool. This can take some time, depending on your disks and data size.

Hopefully, this helps someone!

Some links for info:

LSI RAID Controller – Hetzner Docs

Bulk updating Tasmota Devices over MQTT #100daysofhomelab

I have a load of these GoSund smart plugs around the house (currently around 11, but more are still in boxes). The handy part of these is that they can be re-programmed using Tuya Convert, and with the right template config you can get power usage readings and an on/off switch. I have mine hooked up to an MQTT server, and with the MQTT integration in Home Assistant, I get all the details about power usage and can control each device I need to (hence the 11 of them!).

But MQTT can be used for more than monitoring: I can send commands to the devices too. Given that all of them are on a locked-down network, and only have access to the NTP server, internal DNS and the MQTT box, I needed to figure out how to get OTA updates to them. Luckily, you can change the OTA update URL in the web interface and have it download from a local endpoint… But I am a lazy git, so I needed to figure out an automated way. Enter MQTT again.

First, you will need to log in to your Tasmota device, go to Configuration, then MQTT, and enter your MQTT host. Also, get your topic name while you are there and keep it handy.

Hit save and wait a few seconds for it to update. Then watch the MQTT messages going through; I am using MQTT Explorer to see all messages.

For me, the tele topic has all my devices listed, plus stats around power, status, etc.

On the Tasmota site, they have documentation on sending commands over MQTT. I then installed the MQTT CLI on my Mac (so I could automate this later) and ran the following commands:

mqtt pub -t cmnd/tasmota_<deviceid>/OtaUrl -m "<internal url hosting tasmota updates>/tasmota.bin" -h <mqtt host> -p <mqtt port>

mqtt pub -t cmnd/tasmota_<deviceid>/Upgrade -m "1" -h <mqtt host> -p <mqtt port>

Update <deviceid>, URLs, host and ports as required. For the internal URL, I just have a small copy of Nginx running in Docker, serving a folder with copies of the latest OTA files from the Tasmota releases page. I grabbed all the files I wanted and put them in the folder shared by Nginx. I need to automate it a bit better… maybe next time there is a new release?

The first command tells the device where the latest OTA file is; the second kicks off the update. If your devices are not segregated, you can just leave the existing OTA URL there and kick off the upgrade task on its own… I wasn’t that brave…
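
With a pile of these plugs, running the commands one by one gets old fast, so a small loop helps. A rough sketch (the device IDs, URL, host and port below are placeholders for your own):

for id in ABC123 DEF456 GHI789; do
  # point the device at the local OTA file, then kick off the upgrade
  mqtt pub -t "cmnd/tasmota_${id}/OtaUrl" -m "http://10.0.0.10/tasmota.bin" -h mqtt.local -p 1883
  mqtt pub -t "cmnd/tasmota_${id}/Upgrade" -m "1" -h mqtt.local -p 1883
done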

Within a couple of seconds, you will start to see messages showing up in MQTT Explorer. After a couple of minutes, all devices will have been upgraded and rebooted (no power down, luckily) and all is good!

DNSControl and Github Actions #100daysofhomelab

I am participating in the #100daysofhomelab challenge and have been posting a lot on Twitter as @tiernano, but some posts and tasks I am doing require longer-form write-ups. So, some updates will include either videos (which will be published on my YouTube channel) or blog posts, which will go here. This is the first of the blog posts.

DNSControl is a tool written by the Stack Overflow lads (back when they called themselves Stack Exchange). It is designed to update DNS records and can work with both DNS providers and registrars. I use it to update records in Cloudflare and Route53, but many providers are available. I wrote an article a while back about creating reverse DNS records for IP space with Route53 and DNSControl; most of it is still relevant, and the main DNSControl documentation site has a lot of useful tips.

Up until this morning, if I wanted to update a record, I checked out the DNS records from my private GitHub repo, made the change, and ran the DNSControl commands on my machine (check to syntax-check the file, preview to show what will change at the provider level, and push to make the changes). But I wanted to have some automation for this. So, enter GitHub Actions.
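
For reference, that manual workflow is just three commands, run from the checked-out repo:

dnscontrol check      # syntax-check the config file locally
dnscontrol preview    # show what would change at each provider
dnscontrol push       # actually make the changes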

I did a bit of digging and found a GitHub Action from koenrh called dnscontrol-action. The docs on this are quite simple to go through, so I created 2 action files for my repo: preview and push. A gist for preview is below:
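
A minimal sketch of what the preview workflow looks like; the action version and the secret/env names (which depend on your creds.json) are assumptions, so check the dnscontrol-action docs for your providers:

name: DNSControl preview

on: pull_request

jobs:
  preview:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: DNSControl check
        uses: koenrh/dnscontrol-action@v2
        with:
          args: check
      - name: DNSControl preview
        uses: koenrh/dnscontrol-action@v2
        with:
          args: preview
        env:
          CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }}
          NAMECOM_API_TOKEN: ${{ secrets.NAMECOM_API_TOKEN }}
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}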

and the one for push is as follows:
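
Again a sketch, differing only in the trigger and the command:

name: DNSControl push

on:
  push:
    branches: [master]

jobs:
  push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: DNSControl push
        uses: koenrh/dnscontrol-action@v2
        with:
          args: push
        env:
          # same block of provider secrets as the preview workflow
          CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }}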

The important parts are as follows:

In both preview and push, the check command does a syntax check of your DNS config file. Then preview will check the providers to see if any records need an update. When push runs, it will make the changes.

All my required keys are set in the GitHub repo as secrets, so when the action runs, it pulls them out and puts them into environment variables. I use name.com as a registrar for some domains (though most have now moved to Cloudflare, and some, like my .ie domains, are with Blacknight, who are not supported by DNSControl). Cloudflare is used by the majority of my domains, and Route53 is used for 2 domains currently. There are around 53 domains currently managed by this, and the plan is to add more. I also plan on getting some more automation around checking configs and sending alerts if anything changes.

So, enough “How it works” and show us it working!

Right. Let’s update my zt.tiernanotoole.net domain, which is used for ZeroTier IPs internal to my network. It’s been a while since I did this, so it will be mostly removals and a few adds… First, I created a new branch called zt-update and checked it out in VSCode. I made my changes, git committed, and git pushed the branch.
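
The git side is nothing fancy; roughly this (assuming the default dnsconfig.js file name):

git checkout -b zt-update
# edit the zone's records in dnsconfig.js
git add dnsconfig.js
git commit -m "clean up zt records"
git push -u origin zt-update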

At this stage, the actions have NOT run, since this is neither checked in to master nor a PR against master.

I go into the create PR section, and I can see the changes I have made. In my case, I removed a load of unused records and added a few extras:

I now create my PR and wait for the checks to complete:

Within a short time, I get an alert that all checks have passed, and I can find the results of the changes in the build. (It was meant to add a note to the PR with the details, but I might be missing something in my config…)

Also, not sure why it is redacting part of my name here…

I check the rest of the list, and other than the deletes and creates in Route53 for this domain, there are no other changes. Happy with that, I click Merge Pull Request; the code is checked into master, and the DNSControl push command runs:

If I now go into Route53, I can see the records on the site:

Happy days! Next challenges to fix:

  • fix the PR to include the output of check and preview
  • only run check and push on the master branch; no need to run preview again…
  • run preview once a week and send alerts if anything has changed

Till next time, good luck!

Ubiquiti UDM Pro Failover to Speedify

So, this has been a blog post in the making for a while now but never got around to fully writing it up, so here goes nothing…

I run a UDM Pro in the house. It has 2 WAN links: a 1Gb link and a 10Gb link. I also run AS204994, my own ASN, with its own transit and peering connections, mostly in Europe. There is a VM in the house which acts as a connection to AS204994, giving me a full connection to the internet through my own ASN. More details are on my AS204994 blog, here.

That connection is hooked up to the 10Gb link on the UDM Pro, which is listed as the primary internet link. Details on how this works are in this video I uploaded to YouTube:

In the video above, I was using OpenMPTCPRouter to connect to the internet, but since it’s been causing some issues lately, I decided to try something else.

The new setup is an Intel NUC (i3 with 32GB RAM and 2x512GB SSDs… VERY overkill for the job at hand) running Ubuntu Linux. It has a USB hub with 3 USB ports and an Ethernet port, giving me 2 Ethernet ports on the box in total. 2 of the USB ports have Huawei USB 4G modems plugged in, and the hub’s Ethernet port is directly connected to my cable modem.

USB Hub with 1 Huawei Modem and connection to second

Both modems and the Ethernet port are connected to the NUC with full internet connections (the Huawei boxes give out NATed IPs, but the cable modem has a full public IP), and then Speedify takes those 3 connections and does some bonding magic. Speedify is a handy little VPN service that does connection bonding. You can use it to make sure your internet is rock solid using multiple links, keep streams stable, etc. It can bond WiFi links, LTE modems, cable modems, DSL… anything that can connect can be bonded. The only issue I have with it, compared to OpenMPTCPRouter, is that you don’t control the upstream server…

Speedify is set to shared mode, so the internal port on the NUC shares the internet connection. This is hooked to the 1Gb WAN port on the UDM Pro, which is set for failover only (currently the only option on a UDM Pro), so if my AS204994 link goes down (VM reboots, VM host dies, cable modem connection goes out, etc.) I will still have a connection. If the cable goes out, it will use just the 4G links, but if everything is running, I get all 3 connections.

Connecting to my car over ZeroTier

I use ZeroTier on my network for a good few things, including internal network peering between BGP VMs, management of machines, and now, connecting to my car over LTE. This is one of those posts that sounds silly but is very handy! First, the parts list:

  • Car…
  • 3G/4G/5G modem of some sort. I am using a Huawei Wingle… It can be used without the router below, but I wanted ZeroTier, so I have it in modem-only mode…
  • A router that supports ZeroTier. I am using a modified TP-Link TL-WR703N, upgraded to 16MB ROM and 64MB RAM, which is required for newer OpenWRT builds
  • A dashcam that connects over WiFi. I am using a BlackVue DR750S-2CH
  • The latest ROOter software from Of Modems and Men
  • Patience…

After installing the latest copy of ROOter on the TP-Link (or router of your choice) and getting the modem configured correctly (this took a while), you need to install the ZeroTier software through the dashboard. Once it was installed, I joined my ZeroTier network using the CLI (SSH into the router) and then approved it through the my.zerotier.com dashboard. Once it’s approved and connected, you can go to the ZeroTier IP and get to the router directly. From here, you can either set up a route in ZeroTier to point at the internal network behind the router or, in my case, set up an SSH tunnel to the dashcam. I found the IP given to the dashcam and used SSH forwarding to get to it. Finally, I used the URLs from Digital-Nebula’s hackview repo to get to the different endpoints. I use this to download stuff like GPS logs, emergency videos, etc. I still have to clean up some scripts for this, and plan to upload them at some stage.
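
The two useful bits of CLI, roughly sketched (the network ID, dashcam IP and router IP are placeholders, and I’m assuming the dashcam’s HTTP interface is on port 80):

# on the router, over SSH: join the network, then approve the member at my.zerotier.com
zerotier-cli join <network id>

# from your machine: forward a local port to the dashcam through the router's ZeroTier IP
ssh -L 8080:<dashcam ip>:80 root@<router zerotier ip>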

If anyone has any questions, leave a comment!

Domain Joining a machine over VPN and Password Resets/Changes with Azure AD

With the whole work-from-home thing probably becoming more and more normal in the years to come (I can count on 2 hands how many times I have physically been in my main office in the last 7 months), there are a couple of certainties that people will come up against. One is passwords expiring and needing to be changed, one is password resets being required, and finally there are laptops or desktops needing to be domain joined or connected to the domain before they can be fully provisioned. As the (currently only) IT guy in our office, I have had to deal with these first-hand, and decided to write this post to help both my fellow employees and possibly other IT admins stuck with this challenge.

So, as the IT person, there are a couple of assumptions:

  • You have on-premises AD
  • You have Azure AD (P1 and above seems to be required if users are a mix of Azure AD and on-prem; Free allows cloud-only users).
  • Azure AD Sync installed and enabled

If all of the above are set, you will need to follow the steps to enable Azure Active Directory self-service password reset. I have enabled this on our domain. Next, you need to get your users to set up their secondary authentication methods as backup. All our users have a 2FA requirement, so most of them had that already; new users need to go through that setup. Finally, if a user needs to change or reset their password, they can do so through https://aka.ms/sspr. If all is done well, that reduces the number of support calls I (and you) get.

Now, the next task: domain joining over VPN. This is a bit more “fun” to play with.

First, you need a VPN connection. We use Meraki gear with Active Directory for RADIUS auth. I won’t go into too much detail on setting that part up, but the script we use to build the VPN connections for users is below. This will probably be different for different VPNs, but it is our starting point.
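
The core of the script is along these lines; a minimal sketch only, with the connection name, server address, PSK and subnet as placeholders (Meraki client VPN is L2TP/IPsec, with PAP inside the IPsec tunnel):

$Name   = "Office VPN"
$Server = "vpn.example.com"
$PSK    = "<pre-shared key>"

# create the connection for all users on the machine (and the login screen)
Add-VpnConnection -Name $Name -ServerAddress $Server -TunnelType L2tp `
    -L2tpPsk $PSK -AuthenticationMethod Pap -EncryptionLevel Optional `
    -SplitTunneling -AllUserConnection -RememberCredential -Force

# one of these per internal subnet; drop -SplitTunneling above for full tunneling
Add-VpnConnectionRoute -ConnectionName $Name -DestinationPrefix "10.0.0.0/24" -AllUserConnection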

You will need to change the connection name, server address and pre-shared key for your own setup. The script can also be modified to switch between split tunneling (only sending traffic to internal subnets) and full tunneling (all traffic over the VPN), and if you have multiple internal subnets, the route line can be copied for each.

The most important part is the -AllUserConnection flag. It makes the connection available to all users on the machine, but also on the login screen. This is important.

So, with all that in place, you will need to connect to the VPN.

You should now be able to join the domain as if you were on your local network.

Enter the domain details and change the name of the machine if required
When asked, enter your domain username and password
You will be welcomed to the domain
and then asked to reboot

Reboot your machine as usual, and when it boots, you should see a new option on the login screen

VPN login option

Click this icon, and if you only have one VPN connection, the screen below will show up. If you have more than one, you will be given a list of options to choose from.

Login to VPN at the login screen

Enter your domain credentials. Since our AD and VPN use the same credentials, it will automatically log you in as well.

Machine is now domain joined and logged in, and in my case, finishing setup

So, there you have it: how to domain join a machine outside the network. In reality, Azure Active Directory and Intune would probably be the better option, but that’s future work…

Auto deploying to multiple servers with GitHub and Webhooks

In yesterday’s post, I mentioned that I wanted to try to get auto deploy working for this site. It already builds automagically using Forestry and puts the static HTML into a GitHub repo, but I needed to manually update the servers hosting the site… Well, not anymore!

Using the magic of GitHub webhooks, the Webhook project and a small piece of bash shell script, I have managed to get this auto deploying…

First, download the Webhook project (it’s a Go application, so it works pretty much anywhere) and copy it somewhere on your machine. Next, you need a config. I used the GitHub sample config from the project site and tweaked which script to run and what I was passing in.
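
Mine ended up close to the project’s GitHub sample; roughly like this, with the script path and secret as placeholders (note the trigger rule matching the branch ref, which comes up again in the update at the end of this post):

[
  {
    "id": "github",
    "execute-command": "/opt/scripts/redeploy.sh",
    "command-working-directory": "/var/www/localfolder",
    "trigger-rule": {
      "and": [
        {
          "match": {
            "type": "payload-hash-sha1",
            "secret": "mysecret",
            "parameter": { "source": "header", "name": "X-Hub-Signature" }
          }
        },
        {
          "match": {
            "type": "value",
            "value": "refs/heads/master",
            "parameter": { "source": "payload", "name": "ref" }
          }
        }
      ]
    }
  }
]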

Next, the script to pull from GitHub was simple enough:
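
Something along these lines, using the folder mentioned below:

#!/bin/bash
# pull the latest built site into the web root
cd /var/www/localfolder || exit 1
git pull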

The repo should already be cloned into the folder, /var/www/localfolder, and your web server should be pointing at that too. Then it’s just a matter of running the command:

./webhook --hooks github.json --verbose

The --verbose flag gives you lots of info, so it’s handy for testing. Your app is now running and listening on the default port, 9000.

Next, head over to your project on GitHub and go to Settings:

select Webhooks and add a new webhook

Fill in the required details on the page, and click save.

Github will go out and have a chat with the webhook and verify it can send and receive stuff from the hook. You can see this in the deliveries section:

Clicking on these will show you the headers that were sent, along with the payload, and you can also see the response from your server. Finally, you have the option of re-sending the payload, just in case anything goes wrong.

So, there you have it. A complete automated deploy across multiple servers! Any questions, leave a comment below!

[UPDATE] Yesterday I mentioned I had to modify the sample that was included on the Webhook site. Well, I noticed something this morning. The reason I needed to modify it was that the trigger rule was checking the header and the ref for the branch, but any time I ran it, it would not trigger… The reason was simple: the Webhook app expects application/json, but I had the hook set to application/x-www-form-urlencoded, which is the default… so the Webhook app couldn’t parse the payload correctly. Changing that fixed the problem! Happy days!

pfSense with Multiple Public IPs

So, a few weeks back, I got my hands on a Hetzner dedicated box. It has a quad-core Xeon, 32GB RAM, 3x3TB HDDs, a RAID controller and KVMoIP. One of the first things I did was get myself a /29 IP pool (8 addresses in total, 6 usable). There were already 3 IPs given to me: 1 for the KVM, 1 for the box itself, and 1 as the router for the IP block.

So, I needed to set up my own router, and I picked pfSense, since it’s what I run in the house. I gave it 2 network connections: 1 connected to the main network adapter on the VMware ESXi box (public) and 1 to a virtual switch, which is only used by VMs. The public one is the WAN link and gets a static IP from Hetzner, and the virtual switch is then my “LAN” link. This allows me to have standard NATed network connections to any VM I have… but then, what do I do with those extra IPs?

After a lot of digging, I found the answer, so this should help.

  • Under Firewall, click on Virtual IPs.
  • Click the plus. I selected IP Alias, selected the WAN interface, and set the IP to the first public IP I wanted to use. In my case, I was given a /29 block, and my first address ended in .176: that is the network address, so I used .177. Likewise, my last address is .183, but that cannot be used either, as it’s the broadcast address. Give it a description and then hit OK. Repeat for all IPs you want to use. TIP: give each a meaningful description!
  • Next, click Firewall, NAT and then 1:1. Click the add button and select WAN as the interface. Set the External Subnet IP as the public IP you want to use, and the Internal IP as the machine that will have it. That’s all I did on that screen…
  • Then go to Firewall, NAT, Outbound… this is where things got complicated. Set the mode to “Manual outbound NAT rule generation (AON – Advanced Outbound NAT)” and click save.
  • Then create a new rule: Interface: WAN; Source: Network, with the internal machine’s IP; and then under Translation, Address, select the public IP you want to give it. If you followed my tip in step 2, you will see the descriptions in here.

After saving everything and reloading the firewall, visiting a page like WhatsMyIP or ICanHazIP from the VM should show your new public IP. You can then create firewall rules to allow access from outside. A quick example would be:

Firewall, Rules, Add; Interface: WAN; Destination: the local IP you want to use; and then whatever “normal” rules you would give it (HTTP, locked down to a source address, etc.). Click apply, and hitting that address using whatever method you allowed (SSH, HTTP, etc.) should work.

YMMV, but hopefully this helps! Any questions, leave a comment.

Quick tip for internet facing ESXi servers

Quick tip for all of you with internet-facing VMware ESXi hosts. I have just got my hands on a box on the Hetzner network (more on that later), and using their LARA system I installed ESXi on it. All was good; then I tried logging in a couple of hours later and kept getting errors about my password being wrong… So, I tried a few more times, got pissed off and rebooted the box (had to do a hard reboot, since I couldn’t even get in over the KVM). I thought this was a hardware issue, or a config issue, and left it… Yesterday, I had the console open most of the day, and while looking at something I noticed this:

Well, that’s why I couldn’t log in! So, the tip: create a second user account, name it something other than root, give it a secure password and use that to log in to your ESXi box. Ideally, your ESXi box should be behind a firewall, but in the case of a dedicated server, that may not be financially feasible… Hope this helps someone!
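
If you would rather do that from the ESXi shell than the host UI, it’s roughly the following (the username and password are placeholders):

# create the user and give it the Admin role
esxcli system account add -i adminuser -p 'S0mePassw0rd!' -c 'S0mePassw0rd!'
esxcli system permission set -i adminuser -r Admin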

Bulk compressing images for the Web

Now that all my sites are running Jekyll, I am trying to get them optimized for SPEED, which means looking at all the stuff that takes time to download… There are more tweaks (and possibly posts) coming down the road, but to start, I needed to look at images.

First things first: I’m running this on a Sabayon Linux box, so some of the install commands will be different… (Also, I do need to explain why I moved from Windows to Linux on the GodboxV2, but that’s a different post…)

First, install OptiPNG (they have a Windows build too…) and JPEGOptim

sudo equo install optipng
sudo equo install jpegoptim

[UPDATE] I tried this on an Ubuntu box, and the package names are the same there. So, to install both:

sudo apt-get install optipng jpegoptim

Next, using the Linux find command (this should also work on OSX…), run OptiPNG and JPEGOptim on all PNGs and JPEGs in your chosen directory. Note that -iname takes a glob, not a regex, so the JPEGs need two patterns:

find . -iname "*.png" -exec optipng {} \;
find . \( -iname "*.jpg" -o -iname "*.jpeg" \) -exec jpegoptim {} \;
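
If you want to squeeze a bit more out of them, both tools take extra flags; for example (note that capping quality with jpegoptim is lossy):

optipng -o5 image.png
jpegoptim --max=85 image.jpg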

Depending on how many images you have (and how fast your machine is), it should take a minute or two…

I did a git status, which showed me all the changed images, and then deployed the Jekyll sites… All good! That’s it!