Datacenter Provisioning - Installing a GlusterFS File Server - [Raspberry PI 4/Rock64]

(**) Translated with www.DeepL.com/Translator

[ LEVEL ] Beginner

This procedure covers the installation of a server used to store files and share them with the whole Datacenter through the “GlusterFS” service.

Prerequisite

To perform this operation you must:

  • have a Raspberry PI4 server with 64-bit RaspiOS, or a Rock64 with 64-bit ArmBian
  • have followed the procedure concerning the installation of operating systems
  • know how to execute a command in a Linux console
  • know how to use the “vi” editor
  • be logged in as “root” on the console; sudo users: type sudo bash
  • have a USB storage disk; for performance reasons, prefer an SSD on USB3. Adapt the volume size to your needs.

Why GlusterFS?

The simplest option would have been to install an NFS server, but the strength of GlusterFS is that it can be extended like a RAID service: if, in time, you want a “real-time” copy of your files, you just need to add a server and integrate it into the GlusterFS cluster.
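
As an illustration of this extension path, turning the single-brick volume created later in this procedure into a two-way replica mostly amounts to probing the new node and adding a second brick. A minimal sketch, assuming a hypothetical second server named raspi7 with the same disk layout:

# On the existing server: declare the new node in the cluster
gluster peer probe raspi7
# Turn the single-brick volume into a 2-way replica by adding a brick on raspi7
gluster volume add-brick gfsvol-nextcloud replica 2 raspi7:/Disk1/nextcloud
# Copy the existing data onto the new replica
gluster volume heal gfsvol-nextcloud full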

Installing and mounting the encrypted SSD disk.

Follow the procedure to encrypt the contents of a disk.
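
For reference, once the disk has been encrypted by that procedure, opening and mounting it typically looks like the sketch below. The device /dev/sda1 and the mapper name disk1crypt are assumptions to adapt to your own setup; /Disk1 is the mount point used in the rest of this procedure:

# Open the LUKS container (prompts for the passphrase)
cryptsetup open /dev/sda1 disk1crypt
# Mount the decrypted volume on /Disk1
mkdir -p /Disk1
mount /dev/mapper/disk1crypt /Disk1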

Installation of necessary packages

apt-get -y update
apt-get -y install glusterfs-server
# enable at boot
systemctl enable glusterd
systemctl start glusterd
systemctl status glusterd

The console must return “Active: active”:

glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/lib/systemd/system/glusterd.service; disabled; vendor preset: enabled)
   Active: active (running) since Wed 2020-12-23 18:18:22 CET; 8s ago

GlusterFS setup

Creating GlusterFS volumes

The purpose of this procedure is to create two GlusterFS volumes, one for Nextcloud and one for Matrix. The disk is mounted on "/Disk1" and contains two directories: “nextcloud” and “matrix”. So I am going to create two volumes that will then be shared by GlusterFS.

WARNING: For directories that already have files: see the “Problems encountered” chapter below.
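
If the two directories do not exist yet on the mounted disk, create them before declaring the volumes:

# Create the brick directories on the encrypted disk
mkdir -p /Disk1/nextcloud /Disk1/matrix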

# Initialization of the Gluster network
gluster peer probe $(hostname)
# Response :
#    peer probe: success.
# Creation of the volumes named gfsvol-nextcloud and gfsvol-matrix
gluster volume create gfsvol-nextcloud  $(hostname):/Disk1/nextcloud
# Response :
#    volume create: gfsvol-nextcloud: success: please start the volume to access data
gluster volume create gfsvol-matrix  $(hostname):/Disk1/matrix
# Response :
#    volume create: gfsvol-matrix: success: please start the volume to access data
# Starting the volume 
gluster volume start gfsvol-nextcloud
# Response :
#    volume start: glusterfsvolume: success
gluster volume start gfsvol-matrix
# Response :
#    volume start: glusterfsvolume: success

Volume management

Operation       | Command                                                                 | Notes
Create a volume | gluster volume create [volume name] $(hostname):[local directory path] |
Start a volume  | gluster volume start [volume name]                                      | starts sharing
Stop a volume   | gluster volume stop [volume name]                                       | stops sharing
Delete a volume | gluster volume delete [volume name]                                     | does not delete the data contained in the volume
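
Note that gluster volume stop and gluster volume delete ask for an interactive confirmation; in a script, the prompt can be answered automatically with --mode=script. A sketch, using a hypothetical volume name:

# Stop then delete a volume without the interactive confirmation
gluster --mode=script volume stop gfsvol-test
gluster --mode=script volume delete gfsvol-test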

Cluster status

gluster v status

It must return a result similar to this:

Status of volume: gfsvol-matrix
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick raspi6:/Disk1/matrix-synapse          49153     0          Y       1043 
 
Task Status of Volume gfsvol-matrix
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: gfsvol-nextcloud
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick raspi6:/Disk1/nextcloud               49152     0          Y       1000 
 
Task Status of Volume gfsvol-nextcloud
------------------------------------------------------------------------------
There are no active volume tasks

Post-Installation Checks

It is important to check, from another server, that the volumes are accessible:

apt-get -y install glusterfs-client
# Then mount a volume
mount -t glusterfs [Gluster server IP]:[volume name] [mount point]
# e.g.: mount -t glusterfs 192.168.1.53:nextcloud-config /mnt

Check :

  • The file system contains files (ls command)
  • The content of a file (cat command)
  • Create a folder/directory
  • Delete a folder/directory.

# Unmounting the file system
umount /mnt
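
If a client must mount a volume permanently, an /etc/fstab entry can replace the manual mount. A sketch reusing the server IP from the example above and the gfsvol-nextcloud volume; the mount point /mnt/nextcloud is an assumption:

# /etc/fstab - mount the GlusterFS volume at boot, once the network is up
192.168.1.53:/gfsvol-nextcloud  /mnt/nextcloud  glusterfs  defaults,_netdev  0  0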

Opening communication ports (Firewall)

Protocol | Port  | Direction | Remark
TCP      | 24007 | Input     | glusterd management port
TCP      | 49152 | Input     | first shared volume
TCP      | 49153 | Input     | second shared volume
TCP      | 4915X | Input     | Xth shared volume
TCP      | 491XX | Input     | XXth shared volume

GlusterFS uses one port per shared volume, starting at 49152.
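
Depending on the Gluster version, the port range used for the bricks can be capped in /etc/glusterfs/glusterd.vol, which keeps the firewall rules predictable. A sketch; the upper bound is an assumption to adapt to the number of volumes you plan to share:

# Excerpt of /etc/glusterfs/glusterd.vol - options inside the "volume management" block
# (restart glusterd after editing)
    option base-port 49152
    option max-port  49156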

List of ports opened by GlusterFS

The ports appear in the output of gluster v status; they can also be extracted from the glusterd service status:

echo "Open ports for Gluster volumes"
PORTS=`service glusterd status | grep "listen-port" | sed 's/.*\.listen-port=//'|sort -u`
echo "$PORTS"

Adding firewall rules

## Datacenter network interface
IFACE=eth0
## iptables path
IPT="/usr/sbin/iptables"
## ports list
PORTS=`service glusterd status | grep "listen-port" | sed 's/.*\.listen-port=//'|sort -u`
## Adding rules - Opening input/output firewall
for P in $PORTS
do
   echo "Opening port $P"
   $IPT -m comment --comment "[GLUSTERFS] Volume INPUT $P" -A INPUT -i $IFACE -p tcp -m tcp --dport $P -j ACCEPT
   if [ "$?" != "0" ];then
      echo "Error - iptables - INPUT $P"
      exit 1
   fi
   $IPT -m comment --comment "[GLUSTERFS] Volume OUTPUT $P" -A OUTPUT -o $IFACE -p tcp -m tcp --sport $P -m state --state RELATED,ESTABLISHED -j ACCEPT
   if [ "$?" != "0" ];then
      echo "Error - iptables - OUTPUT $P"
      exit 1
   fi
done
echo "Saving rules"
/usr/sbin/iptables-save > /etc/iptables/rules.v4
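
To check that the rules are in place, the comment added to each rule can be used to filter the listing:

# List the GlusterFS rules added above
/usr/sbin/iptables -L INPUT -n | grep GLUSTERFS
/usr/sbin/iptables -L OUTPUT -n | grep GLUSTERFS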

Backup/Restore files

Personally, I make a daily external backup to a storage box reachable via “ssh” or a “borgbackup” server. With “Borgbackup”, files are encrypted with a personal key, so the host cannot “see” the contents of the storage.
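
As an illustration of this kind of setup, a Borg backup of the brick directories could look like the sketch below. The repository URL and the retention policy are assumptions to adapt:

# One-time initialisation of the encrypted repository on the storage box
borg init --encryption=repokey ssh://user@storagebox/./gluster-backup
# Daily archive of the GlusterFS bricks
borg create --stats ssh://user@storagebox/./gluster-backup::'gluster-{now:%Y-%m-%d}' /Disk1/nextcloud /Disk1/matrix
# Keep the last 7 daily archives
borg prune --keep-daily 7 ssh://user@storagebox/./gluster-backup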

Conclusion

The GlusterFS service is started. This procedure does not address replication, but once initialized, it is quite trivial to add servers to implement replication.

Problems encountered

Moving volumes - volume create: xxxxxxx: failed: xxxx/xxx is already part of a volume

I moved a directory associated with a Gluster volume without stopping the volume first, so it was impossible to recreate it: the extended attributes of the directory have to be reset. For example, the problem here is located on /Disk/config; the volume cannot be created and the message is “volume create: nextcloud-config: failed: /Disk/config is already part of a volume”:

setfattr -x trusted.glusterfs.volume-id /Disk/config
setfattr -x trusted.gfid /Disk/config
rm -rf /Disk/config/.glusterfs

Alternatively, delete the directory and then re-create it; that works just as well :)

Creating a volume with existing data

I created a volume from a directory that already contained all the Nextcloud data files. This is not good practice, since it is the Gluster clients that update the GlusterFS metadata. The consequence: the data exists on the server, but the clients only “see” part of the files. I found a link with a fix, which I did not try (the link was found after I had reinjected the data through a Gluster client mount point):

Example:

  • move your Nextcloud data to /Disk1/datanextcloud,
  • create the GlusterFS volume from this directory, which already contains a multitude of files: gluster v create gfs-nextcloudata $(hostname):/Disk1/datanextcloud
  • connect with a client: the client only sees about ten files!
  • Chris’s proposal (shown below):
# On the Gluster server
# Mount the affected volume
mount -t glusterfs 127.0.0.1:gfs-nextcloudata /mnt
cd /Disk1/datanextcloud
find . -exec stat '/mnt/{}' \;

By doing this on each server, you access each and every one of those files and folders, which causes Gluster to create the metadata and makes the data appear in the Gluster volume. Hooray!
