Security Monkey deployment with CloudFormation template

In order to give back to the Open Source community what we take from it (actually from Netflix's awesome engineers), I wanted to make this work public: a CloudFormation template to easily deploy and configure Security Monkey in AWS. I'm pretty sure it will help many people make their AWS infrastructure more secure.

Security Monkey is a tool for monitoring and analyzing the security of our Amazon Web Services configurations.

You may be thinking of AWS CloudTrail or AWS Trusted Advisor, right? This is what the authors say:
“Security Monkey predates both of these services and meets a bit of each services’ goals while having unique value of its own:
CloudTrail provides verbose data on API calls, but has no sense of state in terms of how a particular configuration item (e.g. security group) has changed over time. Security Monkey provides exactly this capability.
Trusted Advisor has some excellent checks, but it is a paid service and provides no means for the user to add custom security checks. For example, Netflix has a custom check to identify whether a given IAM user matches a Netflix employee user account, something that is impossible to do via Trusted Advisor. Trusted Advisor is also a per-account service, whereas Security Monkey scales to support and monitor an arbitrary number of AWS accounts from a single Security Monkey installation.”

Now, with the provided CloudFormation template you can deploy Security Monkey, pretty much production-ready, in a couple of minutes.
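
If you prefer the command line over the AWS console, the stack can also be created with the AWS CLI. A minimal sketch, assuming you have cloned the repository; the stack name, template file name and region are placeholders, and you should check the project README for the parameters the template actually expects:

[bash]
# Create the Security Monkey stack from a local template file
# (stack name, file name and region are assumptions; adjust to your setup)
aws cloudformation create-stack \
  --stack-name security-monkey \
  --template-body file://security_monkey.template \
  --capabilities CAPABILITY_IAM \
  --region us-east-1
[/bash]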

For more information, documentation and tests visit my Github project: https://github.com/toniblyx/security_monkey_cloudformation

The 10 commandments to avoid disabling SELinux

Well, they are actually 10 ideas or commands 😉
Due to my new role at Alfresco as Senior DevOps Security Architect, I'm doing some cool new stuff (which I will be publishing here soon), learning a lot, and helping the DevOps team a little with my security knowledge.
One of the goals I promised myself was to never disable SELinux, even if that means learning more about it and spending time on it. I can say it is proving to be a worthwhile investment of my time, and here are some of the results.

This article is not about what SELinux is or is not; you have Wikipedia for that. But a brief description could be: a MAC (Mandatory Access Control) implementation in Linux that prevents a process from accessing other processes or files it is not supposed to have access to (open, read, write, etc.).

If you are here, it is because you want to finally start using SELinux and you are really interested in making it work, in taming this wild horse. Let me just say one thing: if you really care about security and have dozens of Linux servers in production, keep SELinux enabled, keep it "Enforcing", no question.
That said, here is my list. It is not exhaustive, and I'm looking forward to seeing your insights in the comments:
  1. Enable SELinux in Enforcing mode:
    • In configuration files (restart required)
      • /etc/sysconfig/selinux (RedHat/CentOS 6.7 and older)
      • /etc/selinux/config (RedHat/CentOS 7.0 and newer)
    • Through commands (no restart required)
      • setenforce Enforcing
    • To check the status use
      • sestatus # or command getenforce
  2. Use the right tools. To do cool things you need cool tools; we will need a few of them:
    • yum install -y setools-console policycoreutils-python setroubleshoot-server
    • policycoreutils-python comes with the great semanage command, the lord of the SELinux commands
    • setools-console comes with seinfo, sesearch and sechecker, among others
    • from setroubleshoot-server package we will use sealert to easily identify issues
  3. Get to know what is going on: dealing with SELinux happens mostly while installing, configuring and testing Linux services, when something does not work the way it did with SELinux disabled. When a service or application does not start as it should, you always think "Damn SELinux, let's disable it". Forget about that; instead, check the proper place to see what is going on: the audit logs. Check /var/log/audit/audit.log and look for lines containing "denied" (see the troubleshooting sketch after this list).
    • tail -f /var/log/audit/audit.log | perl -pe 's/(\d+)/localtime($1)/e'
    • the perl command converts the Epoch time (also known as UNIX or POSIX time) inside audit.log to human-readable time.
  4. See the extended attributes in the file system that SELinux uses:
    • ls -ltraZ # most important here is the Z
    • ls -ltraZ /etc/nginx/nginx.conf will show:
      • -rw-r--r--. root root system_u:object_r:httpd_config_t:s0 /etc/nginx/nginx.conf
      • where system_u is the user (not always a user of the system), object_r is the role and httpd_config_t is the object type; other objects can be a directory, a port or a socket, and an object's type can be a config file, a log file, etc. Finally, s0 is the level or category of that object.
  5. See the SELinux attributes that apply to a running process:
    • ps auxZ
      • You need to know this command in case of issues.
  6. Who am I for SELinux:
    • id -Z
      • You need to know this command in case of issues.
  7. Check, enable or disable SELinux booleans (the per-daemon switches):
    • getsebool -a # list all current status
    • setsebool -P docker_connect_any 1 # allow Docker to connect to all TCP ports
    • semanage boolean -l # is another alternative command
    • semanage fcontext -l # to see all contexts where SELinux applies
  8. Add a non-default directory or file to be used by a given daemon:
    • For a folder used by a service, e.g. changing the MySQL data directory:
      • Change your default data directory in /etc/my.cnf
      • semanage fcontext -a -t mysqld_db_t "/var/lib/mysql-default(/.*)?"
      • restorecon -Rv /var/lib/mysql-default
      • ls -lZ /var/lib/mysql-default
    • For a new file used by a service, e.g. a new index.html file for Apache:
      • semanage fcontext -a -t httpd_sys_content_t '/myweb/web1/html/index.html'
      • restorecon -v '/myweb/web1/html/index.html'
  9. Add a non-default port to be used by a given service:
    • e.g. if you want nginx to listen on an additional port:
      • semanage port -a -t http_port_t -p tcp 2100
      • semanage port -l | grep http_port # check that the change is effective
  10. Spread the word!
    • SELinux is not easy, but writing easy tips helps people use it and makes the Internet a safer place!
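
As a worked example of commandments 2 and 3, this is roughly what a troubleshooting session looks like when SELinux is blocking a service (sealert and audit2allow come from the packages installed in commandment 2, ausearch ships with the audit package; the module name below is just an example):

[bash]
# Show recent AVC denials in human-readable form
ausearch -m avc -ts recent

# Explain every denial found in the audit log and suggest fixes
sealert -a /var/log/audit/audit.log

# If the denied access is legitimate, build and load a local policy module
# (always review the generated .te file before loading it in production)
grep denied /var/log/audit/audit.log | audit2allow -M mylocalpolicy
semodule -i mylocalpolicy.pp
[/bash]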

Alfresco, NAS or SAN, that’s the question!

The main requirement on the shared storage is being able to cross-mount it between the Alfresco servers. Whether this is done via a NAS or a SAN is partly a decision about which technology your organization's IT department can best support. Faster storage will have positive implications on the performance of the system, with Alfresco recommending a throughput of 200 MB/sec.

NAS allows us to mount the content store via NFS or CIFS on all Alfresco servers, so they can all read and write the same file system at the same time. The only real requirement is that the OS on which Alfresco is installed supports NFS (which any Linux box does, actually). NFS tends to be cheaper and easier, but it is not the fastest option. It is typically sufficient, though; a sample mount entry is shown below.
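
For example, mounting the shared content store over NFS on every Alfresco node is a one-line fstab entry. A sketch, where the server name, export path, mount point and mount options are assumptions to adapt to your environment:

[bash]
# /etc/fstab on each Alfresco node
nas.example.com:/export/alf_data  /opt/alfresco/alf_data  nfs  rw,hard,intr,tcp  0 0
[/bash]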

SAN is typically faster and more reliable, but obviously more expensive and complex (dedicated hardware and configuration requirements). In order to read/write from all Alfresco servers to the SAN, special cluster file systems are necessary: on Red Hat we use GFS2; other Linux flavors use OCFS2 or others.

You may be wondering what happens when multiple Alfresco servers write to the same LUN: couldn't that result in corruption (especially in header files)? It sounds like NAS (NFS/CIFS) would take care of that issue, while with a SAN the file system must be managed properly to allow read/write from multiple servers. From the Alfresco standpoint, though, you don't have to worry about this with either SAN or NAS, because Alfresco manages its I/O so that no collisions or corruption occur.

Note: If using a SAN, ensure the file system is managed properly to allow for read/write from multiple servers.

I also wanted to share this presentation I gave internally some time ago; I think it may be useful.

Alfresco Tuning Shortlist

During the last few years, I have seen dozens of Alfresco installations in production without any kind of tuning. That makes me think that 1) nobody cares about performance, or 2) nobody cares about documentation, or 3) both!
I know people prefer to read a blog post instead of the official product documentation. Although Alfresco has improved our official documentation A LOT and most of the information below can be found there, I want to point out some tips that EVERYONE has to take into account before going live with an Alfresco environment. Remember, it's easy: Tuning = Live, No Tuning = Dead.
Tuning the Alfresco side:
  • Increase number of concurrent connections to the DB in alfresco-global.properties

[bash]
# Number below has to be the maxThreads value + 75
db.pool.max=275
[/bash]

  • Increase the number of threads Tomcat will use in server.xml, in the connectors for ports 8080 and 8443 (and 8009 if you use AJP)

[bash]
maxThreads="200"
[/bash]

  • Adjust the amount of memory you want to assign to Alfresco in setenv.sh or ctl.sh (which is the default one):

[bash]
export CATALINA_OPTS=" -Xmx16G -Xms16G"
[/bash]

In JAVA_OPTS, make sure you have the "-server" flag, which gives 1/3 of the heap to new objects; do not set "-XX:NewSize=" unless you know what you are doing, since Solr creates many new objects and will need more than 1 GB for them in production.
Also, use JodConverter instead of the direct OpenOffice integration by setting the following in alfresco-global.properties:

[bash]
ooo.enabled=false
jodconverter.enabled=true
[/bash]
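
JodConverter also needs to know where your office installation lives. A minimal sketch in alfresco-global.properties (the officeHome path is an assumption; point it at your own LibreOffice installation):

[bash]
# Port(s) used by the JodConverter-managed soffice process(es)
jodconverter.portNumbers=8100
# Assumed LibreOffice location; adjust to your server layout
jodconverter.officeHome=/opt/alfresco/libreoffice
[/bash]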

Tuning the Solr side:
In solrcore.properties for both the workspace and archive SpacesStore cores:

[bash]
alfresco.batch.count=2000
solr.filterCache.size=64
solr.filterCache.initialSize=64
solr.queryResultCache.size=1024
solr.queryResultCache.initialSize=1024
solr.documentCache.size=64
solr.documentCache.initialSize=64
solr.queryResultMaxDocsCached=2000
solr.authorityCache.size=64
solr.authorityCache.initialSize=64
solr.pathCache.size=64
solr.pathCache.initialSize=64
[/bash]

In solrconfig.xml for both the workspace and archive SpacesStore cores:

[bash]
mergeFactor change it to 25
ramBufferSizeMB change it to 64
[/bash]

April/9/2015 update! For Solr4 (Alfresco 5.x), add the following options to its JVM startup options:

[bash]
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
[/bash]

Tuning the DB side:
Max allowed connections: adjust this value to the sum of db.pool.max across all your Alfresco nodes plus 200 (e.g. two nodes with db.pool.max=275 would need 750), and consider increasing it if the database serves anything other than Alfresco.
  • For MySQL in my.cnf configuration file:

[bash]
innodb_buffer_pool_size=4G
max_connections=600
innodb_log_buffer_size=50331648
innodb_log_file_size=31457280
innodb_flush_neighbors=0
[/bash]

  • For Postgres in postgresql.conf configuration file

[bash]
max_connections = 600
[/bash]

Do maintenance on your DB often: run ANALYZE or VACUUM (MySQL or Postgres); a DB also needs love!
Tuning the OS side:
I'm not very good with Windows, so I will cover only a few tips for Linux:
  • Change limits in /etc/security/limits.conf for the user who is running your app server, for example "tomcat":

[bash]
tomcat soft nofile 4096
tomcat hard nofile 65535
[/bash]

If you start Alfresco with a "su -c" command in /etc/init.d/, on Ubuntu you have to uncomment the pam_limits.so line in /etc/pam.d/su (for login shells, e.g. via ssh, it is enabled by default). For RedHat/CentOS, the line has to be uncommented in /etc/pam.d/system-auth. You can verify the limits with the check below.
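
A quick way to verify that the new limits actually apply to the service account (assuming the user is called "tomcat"):

[bash]
# Should print the soft and hard open-file limits set above
su - tomcat -s /bin/bash -c 'ulimit -Sn; ulimit -Hn'
[/bash]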

  • Your storage throughput should be greater than 200 MB/sec, which can be checked with:

[bash]
# hdparm -t /dev/sda
/dev/sda:
Timing buffered disk reads: 390 MB in  3.00 seconds = 129.85 MB/sec
[/bash]

  • Allow more concurrent requests by editing /etc/sysctl.conf

[bash]
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.ip_local_port_range = 2048 64512
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 10
[/bash]

Run “sysctl -p” in order to reload changes.
  • A full server reboot is a good preventive measure before going live: it confirms that all needed services start automatically in case of contingency, and it reveals anything we left out of the configuration.
Remember, this is ONLY A SHORTLIST; you can do much more depending on your use case. Reading the documentation and taking our official training will be helpful, and take advantage of the fact that we have been polishing our training materials lately.

Alfresco security check list

The Alfresco security check list is a list of elements to check before going live with an Alfresco installation in a production environment. This check list is part of the Alfresco Security Best Practices Guide, but I wanted to give it its own post in case you missed it (something that happens easily given the 30+ pages of the guide).

[slideshare id=43292696&doc=alfrescosecuritybestpractices-checklistonly-150107130033-conversion-gate02&type=d]

Where and how to change any Alfresco related port

After a conversation on Twitter with @binduwavell, I realized there isn't a single place that documents how to change every port used by the services Alfresco runs. So I have decided to write a blog post about it, with some notes from Rich McKnight.

Here is a comprehensive list of all the ports and where to change them:

Tomcat:

  • HTTP 8080: tomcat/conf/server.xml
  • HTTPS 8443: tomcat/conf/server.xml
  • Shutdown Port 8005:  tomcat/conf/server.xml
  • AJP 8009:  tomcat/conf/server.xml
  • JPDA 8000: catalina.sh
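
For example, to move Tomcat's HTTP port you edit the matching Connector element in tomcat/conf/server.xml; a sketch with 8081 as an example value:

[bash]
<!-- tomcat/conf/server.xml -->
<Connector port="8081" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443" />
[/bash]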

Alfresco:

Alfresco context inside Alfresco configuration: alfresco-global.properties

  • alfresco.port=8080

Share:
Share context inside Alfresco configuration: alfresco-global.properties

  • share.port=8080

If the repository ports are changed, you must also update the Alfresco Share connection ports in web-extension/share-config-custom.xml, as sketched below.
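
As a sketch, the part of share-config-custom.xml to update is the Remote endpoint definition; change the port in the connector URL (8080 below is the default, and localhost assumes Share and the repository run on the same host):

[bash]
<config evaluator="string-compare" condition="Remote">
  <remote>
    <endpoint>
      <id>alfresco</id>
      <name>Alfresco - user access</name>
      <connector-id>alfresco</connector-id>
      <!-- Change 8080 here if you changed the repository HTTP port -->
      <url>http://localhost:8080/alfresco/s</url>
      <identity>user</identity>
    </endpoint>
  </remote>
</config>
[/bash]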

Alfresco SharePoint Protocol: alfresco-global.properties

  • vti.server.port=7070
  • vti.server.external.port=7070

OpenOffice – LibreOffice: alfresco-global.properties

  • ooo.port=8100

JodConverter: alfresco-global.properties

  • jodconverter.portNumbers=8100

FTP: alfresco-global.properties
Can be mapped to a non-privileged port, then use firewall rules to forward requests from the standard port (see the iptables sketch below)

  • ftp.port=21
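
For example, running the FTP server on a non-privileged port and redirecting the standard port to it with iptables (2121 is an example value; the same pattern applies to CIFS, IMAP and SMTP below):

[bash]
# In alfresco-global.properties: ftp.port=2121
# Then redirect incoming traffic on port 21 to 2121
iptables -t nat -A PREROUTING -p tcp --dport 21 -j REDIRECT --to-ports 2121
[/bash]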

CIFS – SMB shared drive: alfresco-global.properties
Can be mapped to non-privileged ports, then use firewall rules to forward requests from the standard ports

  • cifs.tcpipSMB.port=445
  • cifs.netBIOSSMB.sessionPort=139
  • cifs.netBIOSSMB.namePort=137
  • cifs.netBIOSSMB.datagramPort=138

IMAP: alfresco-global.properties
Can be mapped to non-privileged ports, then use firewall rules to forward requests from the standard ports

  • imap.server.port=143

Inbound Email (SMTP): alfresco-global.properties
Can be mapped to non-privileged ports, then use firewall rules to forward requests from the standard ports

  • email.server.port=25

NFS server: alfresco-global.properties
Mount/NFS server ports, 0 will allocate next available port

  • nfs.mountServerPort=0
  • nfs.nfsServerPort=2049

RPC registration port, 0 will allocate next available port
Some portmapper/rpcbind services require a privileged port to be used

  • nfs.rpcRegisterPort=0

To disable NFS and mount server registration with a portmapper, set nfs.portMapperPort to -1. The default is:

  • nfs.portMapperPort=111

Cluster in 4.2 with Hazelcast: alfresco-global.properties

  • alfresco.hazelcast.port=5701

Cluster in 4.1 with JGroups: alfresco-global.properties

  • alfresco.tcp.start_port=7800

Solr:
From Solr to Alfresco workspace queries:  ./alf_data/solr/workspace-SpacesStore/conf/solrcore.properties

  • alfresco.port=8080
  • alfresco.port.ssl=8443

From Solr to Alfresco archive queries:  ./alf_data/solr/archive-SpacesStore/conf/solrcore.properties

  • alfresco.port=8080
  • alfresco.port.ssl=8443

From Alfresco to Solr queries:  alfresco-global.properties

  • solr.port=8080
  • solr.port.ssl=8443

RMI service, JMX ports: alfresco-global.properties

  • alfresco.rmi.services.port=50500
  • avm.rmi.service.port=0
  • avmsync.rmi.service.port=0
  • attribute.rmi.service.port=0
  • authentication.rmi.service.port=0
  • repo.rmi.service.port=0
  • action.rmi.service.port=0
  • deployment.rmi.service.port=0

Monitoring RMI:

  • monitor.rmi.service.port=50508

Review of the book "Icinga Network Monitoring" from Packt Publishing

Icinga Network Monitoring Book

Packt Publishing has recently published a book I was able to help bring to life: I collaborated as its technical reviewer. The book is Icinga Network Monitoring, and in it you will find everything you need to learn the essentials of this monitoring software, which is not only on the rise but a reality I announced here back in 2009.

This book gets straight to the point from the first chapter, with useful examples and descriptions that let you learn this powerful system from scratch and on a solid foundation. It will also help you learn how to configure Nagios.

Possibly the most interesting part of the book is the way it describes the core of the application, how the different checks work, and how to understand and write plugins.

Finally, there is an entire chapter on the various (mainly web) graphical interfaces available for Icinga. Here is a sample:

[Screenshot: Icinga web interface]

You can read chapter 2 at this link. Enjoy monitoring!

My talk about "Alfresco Backup and Recovery Tool" at the Alfresco Summit

All recorded videos have been published recently on the Alfresco Summit portal, so here is my talk "Alfresco Backup and Recovery Tool: A Real World Backup Solution", which I gave in both Boston and Barcelona. It was the first public presentation about Alfresco BART.

Thanks to all who attended this session and made it one of the best-attended and highest-rated in both cities. I'm looking forward to keep talking about security topics as usual (I already have some "hack-ideas"…).

If you only want to see the demo, it starts at minute 33:

The presentation is published in Slideshare as well:

Remember that you can download the White Paper I mention during the talk here.

If you only want to see the practical demo (best resolution in the talk video above), you can enjoy it here:

Any questions and comments are always welcome!

Deploying an Alfresco cluster in Amazon AWS in just minutes

I have been playing with Amazon Web Services for a few months now. For a SysAdmin, AWS is like Disneyland for an 8-year-old child; I really enjoy doing this kind of stuff.
If you are not familiar with AWS products/services, let me describe, in Amazon's words and in my own, the most important services and concepts we have been using to deploy an on-premise-style Alfresco installation in AWS:

  • EC2: virtual servers in the cloud.
  • VPC: isolated cloud resources. Yes, a real isolated cloud architecture and resources.
  • S3: Scalable storage, like a CAS (Content Addressable Storage) for your local or cloud servers.
  • RDS: Managed Relational Database Service (MySQL, Oracle or MS SQL Server).
  • ELB: Elastic Load Balancer; part of EC2, it lets you create load balancers easily.
  • CloudFormation: templated AWS resource creation. *This is why I'm writing this article. A CloudFormation template is a JSON file that drives a creation wizard with options based on our needs.
  • AWS Region: a location with multiple AZs.
  • AZ: Availability Zone (data centers connected through low-latency links in the same region).

That said, my colleague Luis Sala has been working with the Amazon AWS crew and they have created a CloudFormation template to deploy an Alfresco cluster in just minutes. The template is available here: https://github.com/AlfrescoLabs/alfresco-cloudformation.

This CloudFormation template will create a 2-node Alfresco cluster inside a Virtual Private Cloud (VPC), a load balancer (ELB) with sticky sessions based on the Tomcat JSESSIONID, a shared content store on S3 and a shared MySQL DB on an RDS instance. Each Alfresco node will be in a separate Availability Zone, and the template includes auto-scaling rules to add extra Alfresco nodes when certain thresholds are reached.

We will have something like the diagram below; I say "like" because initially there will be only 2 Alfresco nodes in the cluster, and auto-scaling will add more nodes if the thresholds are reached (click to see it bigger).

[Diagram: Alfresco cluster architecture on AWS]

Finally, in the video below you can see a real CloudFormation deployment step by step. I think the screencast is self-explanatory; it has no audio. The video is 6 minutes long after cropping some dead time, but the whole deployment took around 15 minutes.

I thought it was a very interesting approach to Alfresco clustering, and worth sharing with you all. Any question or feedback is welcome, in Spanish or English 😉

Playing with Duplicity backup and restore tool and Amazon S3

Duplicity is a Python command line tool for encrypted, bandwidth-efficient backups.

In its creator's words: "Duplicity incrementally backs up files and directory trees by encrypting tar-format volumes with GnuPG and uploading them to a remote (or local) file server. Currently local, ftp, sftp/scp, rsync, WebDAV, WebDAVs, Google Docs, HSI and Amazon S3 backends are available. Because duplicity uses librsync, the incremental archives are space efficient and only record the parts of files that have changed since the last backup. Currently duplicity supports deleted files, full Unix permissions, directories, symbolic links, fifos, etc., but not hard links."

My brief description: a free and open source tool for doing full and incremental backups and restores from Linux to local or almost any remote target, compressed and encrypted. A charm for any sysadmin.

To explain how Duplicity works for backup and restore, I'm going to show how to back up a folder called "sample-data" to an Amazon S3 bucket called "alfresco-backup", into a folder called "test" inside the bucket (use your own bucket name). I created the bucket and folder before running any command, but Duplicity could create them the first time the command is run. If you want to let Duplicity create your own Amazon S3 bucket and you are located in Europe, please read the Duplicity man page.

Note: please don't get confused by my bucket name "alfresco-backup"; use your own bucket name. I will use this bucket name in future articles as well 😉

How to install Duplicity in Ubuntu:

[bash]
# sudo apt-get install duplicity
[/bash]

Create a gpg key and remember the passphrase, because it will be required by Duplicity; the default values work fine. Your backup will be encrypted with that passphrase. All files created by the command below will be in ~/.gnupg, but you won't need to touch them at all:

[bash]
# gpg --gen-key
[/bash]

Create the required environment variables (you can also set them from a script; see the wrapper sketch at the end of this post):

[bash]
# export PASSPHRASE=yoursupersecretpassphrase
# export AWS_ACCESS_KEY_ID=XXXXXXXXXXX
# export AWS_SECRET_ACCESS_KEY=XXXXXXXXXX
[/bash]

Backup:

To perform a backup with the Duplicity command (the easy and simple command):

[bash]
# duplicity sample-data/ s3+http://alfresco-backup/test
[/bash]

If you get errors, some dependencies for Python and S3 support may be missing: try installing librsync1 and the Python libraries python-gobject-2, boto and dbus, as shown below.
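
On Ubuntu, that would be something like this (the exact package names are an assumption and may vary between releases):

[bash]
# sudo apt-get install librsync1 python-gobject-2 python-boto python-dbus
[/bash]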

The duplicity command output should be something like this:

[bash]
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: none
No signatures found, switching to full backup.
--------------[ Backup Statistics ]--------------
StartTime 1368207483.83 (Fri May 10 19:38:03 2013)
EndTime 1368207483.86 (Fri May 10 19:38:03 2013)
ElapsedTime 0.02 (0.02 seconds)
SourceFiles 5
SourceFileSize 1915485 (1.83 MB)
NewFiles 5
NewFileSize 1915485 (1.83 MB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 5
RawDeltaSize 1907293 (1.82 MB)
TotalDestinationSizeChange 5543 (5.41 KB)
Errors 0
-------------------------------------------------
[/bash]

This will create 3 files in your S3 bucket:

  • duplicity-full-signatures.20130510T160711Z.sigtar.gpg
  • duplicity-full.20130510T160711Z.manifest.gpg
  • duplicity-full.20130510T160711Z.vol1.difftar.gpg

All files are stored in GNU tar format and encrypted. "duplicity-full" means this was the first (full) backup; for subsequent backups you will see "duplicity-inc" volumes.

  • the sigtar.gpg file contains file signatures, so Duplicity knows which files have changed and can perform incremental backups
  • manifest.gpg contains the list of all files backed up and a SHA1 hash of each one
  • the volume files (vol1 to volN, depending on your backup size) contain the data; each volume is up to 25 MB, which improves performance during backup and restore.

For more information about file format look at here: http://duplicity.nongnu.org/duplicity.1.html#sect19

To force a full backup if the last full backup is older than 30 days:

[bash]
# duplicity --full-if-older-than 30D sample-data s3+http://alfresco-backup/test
[/bash]

Verify if there are changes between last backup and your local files:

[bash]
# duplicity verify s3+http://alfresco-backup/test sample-data
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Fri May 10 19:38:03 2013
Difference found: File . has mtime Fri May 10 19:39:05 2013, expected Fri May 10 19:34:53 2013
Difference found: File file1.txt has mtime Fri May 10 19:39:05 2013, expected Fri May 10 18:25:36 2013
Verify complete: 5 files compared, 2 differences found.
[/bash]

In the last example we can see that a file called file1.txt has changed, as well as the modification date of the root directory ".".

List files backed up in S3:

[bash]
# duplicity list-current-files s3+http://alfresco-backup/test
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Fri May 10 18:32:59 2013
Fri May 10 19:34:53 2013 .
Fri May 10 18:25:36 2013 file1.txt
Fri May 10 18:54:31 2013 file2.txt
Fri May 10 19:35:03 2013 mydir
Fri May 10 19:35:03 2013 mydir/file3.txt
[/bash]

You can see 3 files and 2 directories; in the statistics report Duplicity counts directories as files.

Restore:

Duplicity can also manage the restore process, but it will never overwrite any existing file: you can either restore to a different location, or remove your corrupted or old data first if you want to restore to the original place. If Duplicity completes the restore successfully, it does not show any output.

How to restore the last full backup:

[bash]
# duplicity s3+http://alfresco-backup/test restore-dir/
[/bash]

How to restore a single file:

[bash]
# duplicity --file-to-restore mydir/file3.txt s3+http://alfresco-backup/test restore-dir/file3.txt
[/bash]

How to restore an entire backup from a given date:

[bash]
# duplicity -t 2D s3+http://alfresco-backup/test restore-dir/
[/bash]

This will restore the full backup from 2 days ago (see the -t option: seconds, minutes, hours, months, etc. may be used).

How to restore a single file from a given date:

If you are looking for a particular content in a file but don't know which version of the file you need to recover, you can try restoring different versions of the file from the backup:

[bash]
# duplicity -t 2D --file-to-restore file1.txt s3+http://alfresco-backup/test file1.txt.2D
# duplicity -t 30D --file-to-restore file1.txt s3+http://alfresco-backup/test file1.txt.30D
[/bash]

Note: you have to specify a different file name for the local restore; remember that Duplicity never overwrites existing content.

Delete older backups:

[bash]
# duplicity remove-older-than 1Y s3+http://alfresco-backup/test --force
[/bash]

You can also use, for example, 6M (six months), 30D (30 days) or 60m (60 minutes).

To see more information while running a duplicity command you can use the verbosity flag -v [1-9], and you can also check the logs in /root/.cache/duplicity/[directory with unique ID]/duplicity-full.YYYYMMDDT182930Z.manifest.part

When you are finished playing with Duplicity and Amazon S3, remember to clean your passphrase and Amazon keys from the environment variables:

[bash]
# unset PASSPHRASE
# unset AWS_ACCESS_KEY_ID
# unset AWS_SECRET_ACCESS_KEY
[/bash]
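
Putting it all together, a minimal wrapper script for a nightly cron job might look like this (the paths, bucket name and retention are assumptions; adjust them to your own policy):

[bash]
#!/bin/bash
# Minimal Duplicity wrapper: full backup every 30 days, incrementals
# in between, keeping one year of history in the S3 bucket.
export PASSPHRASE=yoursupersecretpassphrase
export AWS_ACCESS_KEY_ID=XXXXXXXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXXXXXX

duplicity --full-if-older-than 30D /path/to/sample-data s3+http://alfresco-backup/test
duplicity remove-older-than 1Y s3+http://alfresco-backup/test --force

# Clean the secrets from the environment when done
unset PASSPHRASE AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
[/bash]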

In future posts I will show how to use Duplicity to implement a solid backup and restore policy for Alfresco.