A tool for file recovery cleanup

If you’ve ever run file recovery tools on a disk, you know that you can end up with multiple copies of recovered files. Well, I made a little script that can help reduce the number of duplicates for you to clean up.

https://github.com/eltopo1971/file-duplicate-nuker

fileDuplicateNuker takes a directory as an argument, then recursively goes through that directory and computes a hash signature for each file it finds. When it encounters a file whose hash signature matches one it has already seen, it deletes that file.

Does this take care of all the duplicates? Oh heavens no. That’s a feature, not a bug — call it erring on the side of safety. The script has no idea what kind of file it’s dealing with. All it does is take a hash signature and base the decision of whether to delete the file on that. If there is so much as one byte of difference in the file it’s examining, it’s counted as a unique file and not deleted.

That being said, from my testing it does delete a good number of files, and when you have thousands of files to wade through, any little bit helps.
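To make the idea concrete, here is a minimal sketch of hash-based duplicate detection. It is not the code from the repository above; the function names, the choice of SHA-256 and the dry-run default are all assumptions made for illustration.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Walk root recursively and return paths whose content hash was already seen."""
    seen = {}  # digest -> first path found with that content
    duplicates = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a predictable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            digest = file_sha256(path)
            if digest in seen:
                duplicates.append(path)
            else:
                seen[digest] = path
    return duplicates


def remove_duplicates(root, dry_run=True):
    """Delete duplicates under root; with dry_run=True only report them."""
    duplicates = find_duplicates(root)
    if not dry_run:
        for path in duplicates:
            os.remove(path)
    return duplicates
```

Calling remove_duplicates("/path/to/recovered") lists what would be deleted, and passing dry_run=False actually deletes. The first copy encountered is the one that survives, so a single changed byte gives a different digest and the file is kept.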

Running a Windows virtual machine on Linux using an Existing Windows Installation

I’ve been a Linux user for years. At home I’ve kept a file server which runs off Linux, but that was “classic Linux” — without a graphical user interface, using the command line, and doing things the “hard way”. My main desktop PC was still running Windows, and had to because the one game I play regularly, Fortnite, was not available on Linux.

Well, someone recommended that I check out GeForce NOW, a cloud gaming service you can use from Linux to play a large number of games that run on remote servers. I’ve always been skeptical about that type of play, largely because the capacity to deliver a good gaming experience over the network had not previously been available. However, since I live alone and have a good broadband connection, I tried it out and loved it. To me it’s indistinguishable from playing the game on your own computer.

So, since most of the apps I use are open-source or otherwise available for Linux, I decided to switch from Windows to Ubuntu 24.04. My PC has two hard drives as well as an NVMe drive which hosts the Windows system, so I deleted or transferred what I could from a 1TB drive and installed Ubuntu on it. It was a breeze and I was quickly up and running. Ubuntu has very good hardware support so most things just worked, and the only thing I had to hunt down was a driver for the Logitech G13 gamepad I constantly use. I’ve been running this for about 3 weeks now and I have no desire to go back.

However, today I came up against an obstacle. On Windows I used Bitvise SSH client to connect to this server, and Bitvise saves its files in its own binary format. I found myself in a situation where I would have to go back to Windows to use Bitvise to connect to my server. Also while my current situation does not require that I have access to a Windows machine, that can change all too easily. So I decided to create a Windows virtual machine (VM from now on), but instead of using a virtual hard drive file, I would simply use the disk on which Windows is already installed. That makes a lot more sense. It’s just more efficient.

I found one set of instructions to help me do that, but it dates back to 2021 and hasn’t been updated since, so I spent some time figuring out the more current way to go about it.

How to run a Windows VM from an existing Windows disk

For this you will naturally need a computer running Linux, preferably something Debian-based like Ubuntu. You will also need to install VirtualBox on that computer (downloads here). You should also disable BitLocker encryption on your Windows drive before proceeding. I didn’t have it enabled on my drive so I don’t know how that would affect the installation.

Set up your user

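Your user will need to be part of two groups: “disk” (to enable raw disk access) and “vboxusers”. Group changes only take effect at your next login, so log out and back in after running this command: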

sudo usermod -aG disk,vboxusers [user]

Also let’s create a folder “vms” in your home directory in which you will keep your VMs:

mkdir vms

Set up your Windows disk

First, find out where your Windows drive is mounted. By default when you install Ubuntu on a computer with FAT32 or NTFS drives, they will be accessible to Linux.

lshw -short -class disk,volume

This will show you a list of your hard disks and their partitions. Look for a disk that contains a “Windows FAT volume”, a “reserved partition” and one or more “Windows NTFS volumes” and note the entry in the Device column for the disk (not partition). Usually this will be “/dev/nvme0n1” (if you have an NVMe disk) or “/dev/sda”.
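If you would rather cross-check the result programmatically, the following is a small sketch that picks out whole disks carrying at least one NTFS partition. It assumes you feed it the JSON produced by running lsblk --json -o NAME,TYPE,FSTYPE (a different tool from the lshw command above); the function name and the sample data are made up for illustration.

```python
import json


def disks_with_ntfs(lsblk_json):
    """Return /dev paths of whole disks that have at least one NTFS partition."""
    tree = json.loads(lsblk_json)
    matches = []
    for device in tree.get("blockdevices", []):
        if device.get("type") != "disk":
            continue  # ignore loop devices, optical drives, etc.
        children = device.get("children") or []
        if any(child.get("fstype") == "ntfs" for child in children):
            matches.append("/dev/" + device["name"])
    return matches


# Hypothetical sample shaped like lsblk's JSON output.
SAMPLE = """
{"blockdevices": [
  {"name": "sda", "type": "disk", "fstype": null,
   "children": [{"name": "sda1", "type": "part", "fstype": "ext4"}]},
  {"name": "nvme0n1", "type": "disk", "fstype": null,
   "children": [
     {"name": "nvme0n1p1", "type": "part", "fstype": "vfat"},
     {"name": "nvme0n1p2", "type": "part", "fstype": null},
     {"name": "nvme0n1p3", "type": "part", "fstype": "ntfs"}]}
]}
"""

print(disks_with_ntfs(SAMPLE))
```

A machine with NTFS data drives as well as a Windows system drive will return more than one entry, so you still need to confirm by eye which disk holds the Windows installation.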

Next, we’re going to create a special file that points to that disk using a utility that is installed with VirtualBox. Enter this as your regular user (not root):

VBoxManage createmedium disk --filename=vms/[disk file].vmdk --variant=RawDisk --format=VMDK --property RawDrive=[Windows drive]

This file points at the location of the Windows drive.
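If you want to script this step, here is a sketch that assembles the same command as an argument list. The file name "vms/windows.vmdk" and the device "/dev/nvme0n1" are placeholders rather than values from my setup; substitute your own.

```python
import shlex


def createmedium_command(vmdk_path, raw_device):
    """Build the VBoxManage argument list for a raw-disk VMDK pointer file."""
    return [
        "VBoxManage", "createmedium", "disk",
        f"--filename={vmdk_path}",
        "--variant=RawDisk",
        "--format=VMDK",
        "--property", f"RawDrive={raw_device}",
    ]


# Placeholder values for illustration only.
command = createmedium_command("vms/windows.vmdk", "/dev/nvme0n1")

# Print the command for review before running anything.
print(shlex.join(command))
```

Once the printed command looks right, it can be run with subprocess.run(command, check=True). Passing the arguments as a list avoids shell quoting problems if your path contains spaces.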

Create a VM

Now we’ll start using VirtualBox itself. But before we do, let’s install the VirtualBox Extension Pack (download from here). To install it, just double-click on the downloaded file. It will launch VirtualBox and the installation will take place.

In the VirtualBox dashboard, click on the Create a new virtual machine (VM) link.

  1. In the VM Name field, enter a name for the virtual machine
  2. The VM Folder should be “/home/[user]/vms”
  3. In the OS field select Microsoft Windows
  4. In the OS Version field select the version of Windows installed on your Windows disk.
  5. Click on the Next button at the bottom right of the New Virtual Machine window.
  6. Under Specify virtual hardware, adjust the Base Memory and Number of CPUs. Bring the Disk Size slider to the lowest value, since we won’t be using the virtual disk it creates. Giving the VM 16GB (or more) of RAM is highly recommended, otherwise you’ll find the VM experience very taxing, but keep in mind that this memory will not be available to your Linux apps while the VM is running.
  7. Select Use EFI
  8. Click on the Next button.
  9. Click on the Finish button.

Attach The Windows Disk to the VM

Now we’ll attach the pointer file we created in the “Set up your Windows disk” section to the VM.

  1. In the Machines tab of VirtualBox Manager, right-click on the new VM and click on Settings.
  2. In the [VM]-Settings window, select the Storage tab.
  3. You’ll see a .vdi file which we won’t be using. Click on the Add Attachment button at the bottom right of the Devices box and select Hard disk from the dropdown menu.
  4. In the Hard Disk selector window, click the Add button.
  5. Select the .vmdk file you created earlier in the file selection dialog box.
  6. With the .vmdk file selected, click on the Choose button.
  7. In the [VM]-Settings window, select the .vdi file, and click on the Remove Attachment button (next to the Add Attachment button).
  8. Click OK to save your VM configuration.

Run Your VM

Nothing left to do but to run the VM and make sure it works, so in the Machines tab of VirtualBox Manager, right-click on the VM you just edited and select Start > Start with GUI.

You should be able to log into your Windows installation.

If you have tried in the past to install Linux and modified the UEFI partition of your boot disk… well, you will then have to navigate around the disk using the GRUB CLI to fix your boot sequence. This is beyond the scope of this particular tutorial, but instructions are easily found online. I had to do this myself.

Something else that came to mind as I was writing this was to try and see if I can do the same thing using QEMU instead of VirtualBox, which I will also write a tutorial for if I can manage to do it.

Keep in mind that virtualization at the local level can be a bit tricky and resource-intensive. It’s also one of the rare things that can completely freeze up your system and force you to reboot it. When the kernel itself is what crashed, that’s called a kernel panic.

Elon Musk Keeps Spreading a Very Specific Kind of Racism

If nothing else positive comes out of his acquiring Twitter, it will at least have provided the public with an opportunity to see how obsessed Elon Musk is with race-related conspiracy theories.  A surprising amount of it is just plain creepy.

Elon Musk Keeps Spreading a Very Specific Kind of Racism (Mother Jones)

With the spotlight on him Elon Musk doesn’t look like much of a genius anymore

A highlight: “Musk is a right wing demagogue and a pathetic, narrow-minded racist, one that endorses causes that suppress the marginalized and champion white supremacy.”

The Descent of Elon Musk by Ed Zitron

What drives a man to “suicide”?

On March 9th a man named John Mitchell Barnett was found dead in a Charleston, SC hotel parking lot, victim of an apparent suicide. But this wasn’t just any ordinary schmoe: Barnett was the main whistleblower for assembly and quality issues at the Charleston Boeing plant that produces the 787 Dreamliner.

Funny how that happened. The man dedicated the last few years of his life to exposing problems that put the entire flying public at risk, and just when the issues he warned about are making headlines, suddenly, he “commits suicide”. Come on people. Sure, he died of a gunshot wound, but I would stake a large amount of money to say that he was not the one to pull the trigger.

The Charleston County coroner ruled the wound was self-inflicted, but when you think about the amount of pull that a huge employer like Boeing has on a place the size of Charleston, you realize how the wheels of justice are sometimes greased just enough by major economic players that everyone ends up “being a team player”.

Here is a video on the man and the major safety issues he tried to warn the public about.

YouTube player

It’s time to admit I overestimated humanity

Does the internet make people stupid? I mean, it seems like a silly idea. After all the internet gives you instant access to the entire knowledge of the world (along with a whole lot of BS) so surely that can’t be a bad thing… well, after seeing this screenshot I am not so sure.

A person on the internet asking if ramadan is a new tiktok challenge

Maybe AI isn’t ready to take over the world yet

Shared today on BlueSky — apparently this is the result of asking ChatGPT to illustrate what its core values are. So, there’s probably not much cause to be worried that ChatGPT is going to steal your job, because most jobs out there require communicating in, well, *a* language, and not some weird babble invented by randomly throwing syllables together.

ChatGPT values: weird icons with gibberish captions.

Twitter: a post-takeover poison pill

When a company is facing a hostile takeover, its board has a strategy available to it called a “poison pill”. The idea of the “poison pill” is that the board, worried about the effects of the takeover on the long-term health of the company, lets existing shareholders buy additional shares at a discount, diluting the acquirer’s stake and making the company unattractively expensive to take over.

This is obviously *not* quite what’s taking place at Twitter right now.

@elonmusk’s offer was so much over the realistic valuation of the company that the shareholders just saw $$$$ and went with it. However, Twitter isn’t a traditional company. Twitter is a social network. Its technology stack is robust but it’s not particularly outstanding. It works, it doesn’t have a huge number of features, but it can handle the traffic. Its real value is in the users and the connections it brings to the party. To remain at its baseline of “value” compared to before the takeover, it has to retain its userbase. If users leave, the site’s value is diminished. And this is something that @elonmusk doesn’t grok.

While he doesn’t get it, users do get it. And their response to Musk’s “comedy of errors” tenure ever since he took over the site is to look elsewhere for a new social network to spend time on, because it’s become clear that Musk wants to take this site and turn it into his personal sandbox. I wouldn’t pay $44 billion for a sandbox, but then I don’t have the sort of detachment from reality that being the world’s richest man engenders.

However, for a couple of weeks now we’ve had a look at what @elonmusk considers entertainment for himself, and we’re all pretty much horrified, from Nazi imagery to petty personal fighting to non-stop lying by Musk himself. And that’s why, sadly, now is the time to ditch this platform. Remaining a part of it at this point means risking immeasurable personal reputational damage. Think of the repercussions in your life if it came out that you were a user of “stormfront” (or whatever KKK-affiliated web site exists out there). This is what Twitter will turn into in the hands of a spoilt man-child with highly questionable morals and a reputation as a con man, who has no board to answer to and who is growing more and more embittered that he can’t just buy a positive image for himself. Or friends.

And if that sounds like I’m describing Donald Trump, it’s not a coincidence; both Trump and Musk are trust fund babies whose lives are led by their malignant narcissism. 

So there’s an understandable urge to leave a platform that’s devolving into a giant cesspit of xenophobia in all its diseased forms, because users don’t want the taint of it.

It’s probably a bad idea to deactivate one’s account, however. All this will do is leave your handle open to a malignant actor taking it over and attempting impersonation. A much better approach is this: make sure you set multi-factor authentication on your account, and then log off. This way no one can use your handle, and you are protecting your reputation.

There are many alternate social networks out there that don’t belong to snake-oil-selling egomaniac billionaires, such as mastodon, counter.social and tribel. Check them out and give them your time and eyeballs instead of watching someone who should know better tank a platform to flatter his own malignant ego.

What is a blockchain?

2018 is poised to be the year when cryptocurrencies become mainstream. The original cryptocurrency, Bitcoin, entered the common jargon of the modern world last year as its valuation hit a record high of nearly 20k USD/BTC, and stayed in the news as its valuation dropped to more reasonable levels. Ethereum is also gaining recognition as it became the #2 cryptocurrency in terms of market capitalization. In short, a little over 8 years after the creation of Bitcoin, cryptocurrencies are gaining recognition and acceptance in the “real” world.

Cryptocurrencies are created as part of something called a blockchain. And more than cryptocurrencies, it is the blockchain idea which is expected to have a huge impact on the computing world, at least for the next couple of years. As such it is a good idea to learn what a blockchain is, at both a basic and more advanced level.

The Basics

At its core, a blockchain is a distributed ledger. Those with an accounting background will immediately recognize what a ledger is — it is a record of transactions. A blockchain is distributed, which means that entries in the ledger are written by many parties, as opposed to by one centralized authority.

Like an ordinary paper ledger, blockchains are write-once. Once a block has been verified and added to the blockchain it cannot be erased or modified. This ensures that transactions cannot be taken back.

The Nodes

All these “parties” are actually computers running a node for the blockchain’s network on the internet. This involves executing software which contributes to the blockchain network. Depending on the network involved there may be several types of nodes in a blockchain; this will be explored in depth later.

The Blocks

Nodes compile a number of transactions into a block. How large the blocks are, and how often they are verified, varies widely between blockchains. For example, the Bitcoin blockchain generates a block roughly every 10 minutes on average. The Ethereum blockchain, in comparison, generates a block in less than 20 seconds, and Bitshares blocks are generated every 3 seconds at most. A number of factors affect block time; if you’re not intimidated by math check out this article for more information.

The Chain

Blockchains are so named because each new block is appended to the previous block, effectively forming a chain. In fact one can always look at certain information in the latest block of any given blockchain and trace the blockchain’s history all the way back to its very first block.

Hashing

Since blocks are appended to the blockchain by several different nodes, there needs to be a way to ensure that only the block with the right data can be added at any given time. Otherwise there would be no way of ensuring the continuity of the blockchain from the genesis block to the most current one.

This is where hashing comes in. Hashing is a cryptographic technique that is used to generate an effectively unique code that can be used to identify a set of data, rather like a fingerprint. The hash is generated from the transactions contained in the block and recorded as data in the block, which also includes the hash from the previous block. This is one of the mechanisms used to verify any new blocks. If the previous-block hash stored in a new block does not match the previous block’s recorded hash, then the new block is invalid and cannot be added to the chain.
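The linking described above can be sketched in a few lines. This is a toy model, not how any real blockchain stores its blocks: the dictionary layout, the all-zero placeholder for the first block, and the decision to hash the previous hash together with the transactions are assumptions made for illustration.

```python
import hashlib
import json

GENESIS_PREVIOUS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(transactions, previous_hash):
    """Hash a block's transactions together with the previous block's hash."""
    payload = json.dumps(
        {"transactions": transactions, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Create a block that points at the current tip and add it to the chain."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_PREVIOUS_HASH
    block = {
        "transactions": transactions,
        "previous_hash": previous_hash,
        "hash": block_hash(transactions, previous_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Check every block's own hash and its link to the block before it."""
    expected_previous = GENESIS_PREVIOUS_HASH
    for block in chain:
        if block["previous_hash"] != expected_previous:
            return False  # link to the previous block is broken
        if block["hash"] != block_hash(block["transactions"], block["previous_hash"]):
            return False  # contents no longer match the recorded hash
        expected_previous = block["hash"]
    return True
```

Changing a transaction in an old block changes that block's hash, which breaks the link stored in the next block, so is_valid returns False unless every later block is recomputed as well.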

The actual hashing algorithm used depends on the blockchain. SHA-256 is a popular one and is used by Bitcoin. Other algorithms include scrypt, X11, CryptoNight and Ethash.

Hashing produces a completely different string if there is any change whatsoever to the original hashed content. SHA-256 can produce a very large number of distinct values (2^256, or about 1.16e77), so arriving at the same value from two different pieces of content is extremely unlikely. By comparison, the chances of winning the Powerball lottery in the USA are 1 in 2.92e8. One could win this lottery 9 times in a row and that would still be more likely than generating the same hash from 2 different sources. Thus the use of hash values makes blockchains virtually tamper-proof.
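Both claims are easy to check for yourself. The sketch below hashes two strings that differ by one letter and then works out the arithmetic, taking a SHA-256 digest as 256 bits (so 2**256 possible values) and the Powerball jackpot odds as 1 in 292,201,338. The sample strings are arbitrary.

```python
import hashlib

# Two inputs that differ by a single character.
first = hashlib.sha256(b"blockchain").hexdigest()
second = hashlib.sha256(b"blockchaim").hexdigest()
print(first)
print(second)

# Count how many of the 256 output bits differ between the two digests.
differing_bits = bin(int(first, 16) ^ int(second, 16)).count("1")
print("bits that differ:", differing_bits)

# Size of the SHA-256 output space versus repeated lottery wins.
digest_space = 2 ** 256
powerball_odds = 292_201_338

# Largest number of consecutive jackpots that is still more likely
# than two given inputs sharing the same SHA-256 digest.
wins = 0
while powerball_odds ** (wins + 1) < digest_space:
    wins += 1
print("consecutive jackpots still more likely than a match:", wins)
```

Python integers have arbitrary precision, so 2 ** 256 is computed exactly rather than as a floating-point approximation.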

This was a very basic overview of blockchains. We’ve barely scratched the surface. In my next few articles I will be providing more in-depth coverage on subjects such as consensus algorithms, blockchain node types, the relationship between blockchains and cryptocurrencies, and how the blockchain can be used by businesses to streamline processes and reduce processing costs.

Did Postmedia attempt to smear the NDP in the @vikileaks30 affair?

After a most momentous week in Canadian politics — namely, one in which a government with an absolute majority in both the House of Commons and the Senate was at least momentarily thwarted in its efforts to pass Bill C-30 — the @vikileaks30 twitter account has been retired. It simply no longer exists. However it has had one hell of an effect, and the way in which it was reported about should definitely raise a lot of eyebrows.

For those who don’t know about this story, @vikileaks30 was an anonymous account launched on Wednesday that broadcast certain salacious details about Vic Toews, including parts of affidavits from his 2007 divorce (largely his ex-wife’s testimony) and many interesting details of expense claims by Mr. Toews as a government minister.

Soon after the novelty twitter account appeared on the scene, Ottawa Citizen tech news reporter Vito Pilieci came up with an interesting plan to figure out who was posting on it: send the twitterer a web site link which was unique for that particular user. There’s nothing wrong with that technique; I’ve used it myself a couple of times, and twitter’s use of URL shorteners makes it discoverable only with some difficulty. The IP address which was used to visit the link turned out to have been one connected with the Parliament buildings. That much can be reliably established.
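For the curious, the unique-link technique amounts to very little code. The sketch below is a generic illustration, not a reconstruction of what Pilieci actually did: the "/t/" path, the function names and the shape of the access log are all assumptions.

```python
import secrets


def issue_tokens(recipients):
    """Give each recipient a distinct, hard-to-guess token; return token -> recipient."""
    return {secrets.token_urlsafe(8): recipient for recipient in recipients}


def identify_visitors(tokens, access_log, base_path="/t/"):
    """Match (ip_address, path) log entries against the tokens that were issued."""
    hits = []
    for ip_address, path in access_log:
        if not path.startswith(base_path):
            continue  # ordinary traffic, not one of the tracking links
        token = path[len(base_path):]
        if token in tokens:
            hits.append((tokens[token], ip_address))
    return hits
```

Each recipient is sent a link ending in their own token, so any request for that path in the server's access log can only have come from someone who received (or was forwarded) that specific link.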

What I find a little more difficult to understand is the way that the story was reported both by Pilieci himself and Postmedia flagship paper the National Post. Starting with the title, which was surely written by a higher-up: “Vikileaks Twitter account on Vic Toews linked to ‘pro-NDP’ address in House of Commons”. Indeed the original Ottawa Citizen story used the considerably less “inciteful” (if you will) “Vikileaks30 linked to House of Commons IP address”. But this is only the start of the smear. In the story itself we see this paragraph:

Aside from being used to administer the Vikileaks30 Twitter feed, the address has been used frequently to update Wikipedia articles — often giving them what appears to be a pro-NDP bias, actions that have attracted the attention of numerous Internet observers in recent months.

I’ve taken the liberty here to put in bold type the second instance of the smear. Note the use of “weasel language” here — the author (almost undoubtedly Pilieci himself) double-qualifies the statement so as to obviate the necessity of backing that statement with actual evidence, which he indeed does not provide.

So, that’s interesting. Without any more specifics this certainly looks like an attempt to smear the party that currently holds the position of Official Opposition in the House of Commons. Now why would someone do that and be this specific about it?

Well, the Ottawa Citizen, which currently employs Pilieci, is owned by the Postmedia Network, which is a group encompassing several newspapers, including my hometown’s The Gazette newspaper and Canada’s second national daily, the National Post (which should be no surprise to you as the link shown above goes to a NatPo story). The National Post has, pretty much since its inception, been regularly accused of running a pro-Conservative slant on the political stories it covers, which clearly explains why they chose to change the title of Pilieci’s story from the rather more neutral “Vikileaks Twitter account traced to House of Commons” (the title of the story on Thursday) to the, well, deliberately less equivocal title they chose to run on Friday. Am I supposed to think that this is just some kind of “oversight” or absent-minded error? Maybe others can think so, but I’m not that gullible. The smear is clear and deliberate.

OK, so maybe you think, this is a one-off thing… well, no. On Friday the Citizen ran this Stephen Maher editorial, this time with a neutral, toned-down title: “Maher: Toews made himself Twitter target with ‘pornographers’ crack” about how the @vikileaks30 story started. Read the story, though, and the ugly smear rears its head again in connection with the IP address:

That IP address also was linked to some Wikipedia pages where someone had written pro-NDP comments, which the Citizen reported.

Actually I do wish that Postmedia hired better editors because what Maher is saying now is not quite the same as what Pilieci was saying earlier, but this seems to me little but a barely-disguised attempt at repeating the smear. And then not content with doing it once, Maher pipes up again soon after:

It may be that that person is a secret NDP supporter, and enemy of Vic Toews, or it may be that there is some confusion over the IP address.

Does Maher think we’re all blind here? This is getting pretty blatant. Again, note the use of the weasel phrase “it may be”. Overall the article is pretty weak stuff by a national Postmedia correspondent. In Canadian print journalism this is as senior as it gets without getting bumped up to a position involving more management duties; this isn’t the young guy who writes the computer column (that would be Pilieci, who is a staff member at the Ottawa Citizen and not really staff with the Postmedia “mothership”).

But that article isn’t what really rang a bell for me on the smear question. Rather, what made me see the big picture was the follow-up by Pilieci after the @vikileaks30 poster’s announcement that the account was now retired. See if you can spot the difference from the (youthful?) exuberance of his former column:

A further look into the IP address associated with Vikileaks30 found the address had been used in a range of online activities, including to edit several entries on the Internet encyclopedia Wikipedia ranging on topics from the history of ice hockey to a biography of Whitney Houston, as well as to alter content on a variety of politically charged topics that span the political spectrum. It does not appear the poster was targeting any specific political party or affiliation.

This went to publishing after it was clear that the NDP slur had failed to gain any traction in the House of Commons or indeed with public sentiment. What a difference a day makes, I say.

It still remains a good question as to whether there was a concerted effort by the Tory-friendly Postmedia to deliberately steer hostility towards the NDP at a time when the Conservative Party was in a bit of a crisis. The coverage in the first story mentioned actually led to quite a few angry words in the House of Commons, mostly coming (as the second story reports) from rather easily-influenced Tory attack dog John Baird:

“Not only have they stooped to the lowest of the lows, but they have been running this nasty Internet dirty-trick campaign with taxpayers’ money,” he said.

That’s the head of Canadian diplomacy shooting himself in the foot there, taking Pilieci’s story as gospel truth (his was the main story that included the smear). Oh dear.

I for one will be following further developments regarding this aspect of the C-30 story, and I certainly hope that others will start asking questions about the possibility of spin or even possible fabrications by the newspaper conglomerate that bills itself as “the largest publisher by circulation of paid English-language daily newspapers in Canada”.

Either that, or they need to take a serious look at who they keep on staff.

Note: in order to avoid any confusion if any of the four aforementioned stories should be edited or somehow deleted, I have taken screen captures of all 4:

  1. The original IP address story as it appeared on the National Post web site on 2/16
  2. The same story as it appeared on the Ottawa Citizen web site
  3. The Stephen Maher story as it appeared on the Ottawa Citizen web site on 2/17
  4. The later story by Pilieci as it appeared on the Ottawa Citizen web site on 2/17