smartrics

Tuesday 2 July 2019

On privacy, data ownership and self sovereign identity and the race to get there first.

People in the business know that companies 2.0 compete for our attention; a fierce competition with smaller and smaller margins (of attention). The end game is clearly advertisement. The pattern, simplistically, is:

user uses the service for free (or for very little money),
company stores the data and profiles the user, aggregates profiles, does fancy Machine Learning magic, etc
company uses the data and derived information for targeted advertisement or (shock!) sells it to third parties for other lucrative endeavours.

Simple and effective.

Finally, people is getting more and more aware of what's going on (don't you, sometimes weirdly, feel like you're followed or even been scammed?); law makers are catching up to get a fair share of the revenue and economists are adjusting their theories to explain recent blips in their Capitalism models.

A trend is emerging where anonymity and privacy are becoming more and more important for users; and companies do make an effort to protect such users (from themselves?).
To the extent that some of them have announced changes to their business models. Facebook is paving the way.

This is definitely a step in the right direction but it's clearly not enough. Data is still stored in these companies' data stores and, moreover, still - somewhat - traceable to a person.
Yet, we're seeing more and more efforts to frame the problem of data ownership in a lots of ways, some more creative than others.

Technology is also catching up (one of the many systems trying to tackle the issue is OpenPDS). And in the space of digital identity, self sovereign identity - users owning their digital identity much like they own their passport or identity card - is also being developed as a concept with associated technologies.

The problem, right now, is that companies haven't caught up as they haven't yet figured out how this switch will eventually affect their bottom line (read: make shareholders richer).
Implications, then, are that

tools and processes for the end users aren't yet developed for mass market adoption: think, for a moment, about how easy(!) is to use a crypto wallet.
incentives to move to this new model are virtually non existent so we can't really talk just yet of a new 3.0 model... maybe we can settle with a 2.5 though.

But - in a way - it's easy to see how this will eventually work. Companies will compete for that data: as if you were strewed with gold dust they'll knock at your (digital) door to come and swipe some of it. "Please, give us access to your browsing history for last month of May for the opportunity to win a trip to the Maldives!"; or: "Give us your Facebook timeline for last year to get a 10% grocery store discount for a whole month". And - make no mistake - we'll give it away (agent Smith echoes in my mind).

There will also be less mundane opportunities, obviously, which will generate new revenue streams ad new creative opportunities. A simple example comes to mind: "Hello A.C.M.E, here is access to my smart home data, please monitor it for intrusion detection and to activate my irrigation system when needed, whilst I am away. Oh, also please feed the dog, will you?!").

As long as this is done with fair, open and clear rules that guarantee the weakest users (like children and people with mental health issues, or the poorest) it's all welcome.

In fact, forward looking enterprises should be pivoting right now and contribute to pave this new way in their respective market segment. Clearly, it's challenging: not only they need to embrace the risk of adopting cutting edge technologies, but also create new processes and redefine their identity if necessary.

But the sooner they start the less the risk is to be caught by surprise. And maybe some of them will hit the jackpot and make it into the list.

Thursday 2 July 2015

Logging compendium

Here follow some thoughts on how best to do application logging; they are very Java-centric and primarily based on my experience and various source of information googled along the way.

Implementation

Use log4j2/logback/name-your-modern-logging-framework

As soon as you can, move to a newer logging framework. Modern API/frameworks have higher performance, have a better API and more useful features.
I have settled for log4j2 although I use it via slf4j - for historic reasons. But others may do well too (namely logback!).

Use logOnce technique

When a LOG statement is in a loop and (a) that loop is infinite or (b) has the potential to be running many times, there's a risk to spam the log messages with the same entry on and on.

This could make the file grow indefinitely making it hard to read/analyse; it may also roll over causing loss of important logged information.

The logOnce technique prevents these nasty side effect and it works as following:

define a special logger L that logs a log entry.
log a message within a loop using the logger L, remember the last logged message
next time you need to log on the same logger, the new message with the last logged message.
if matches skip the statement log; if not go back to step 2.

Variations include

match new log statements using regex to allow for differences in some dynamic data
log every N entry: even if the new line matches the last logged line, log every N occurrences
record some metrics - like number of occurrences logged/skipped etc

Examples where you want to use this pattern are:

scheduled threads like Heartbeats
Log statements occurring in a for/while loop or recursive functions. especially when logs are on high log levels like ERROR or WARN

Other implementation details

Logging statements must be side effect free. Besides appending the statement to the configured appender, a LOG statement MUSTN'T change the internal application state.
Make sure that a log actually happens: either log BEFORE the action Or try { action } finally { LOG after } to make sure the LOG after always happens.

In a distributed application, make sure that

all nodes share the same wall clock - use NTP to sync
all applications share the same configuration; or at least the same message pattern layout.

Log levels

Logs *are* a user interface. And, as such, they deserve the right attention to details... (Yeah, right!)...

At a basic level I personally group INFO and above as forming Administration/Support user interfaces to the system runtime logs.

ERROR (aliases severe/fatal): for messages that need to be looked at immediately because a transaction or part of the system is in an un-recoverable state (albeit temporary).
WARN : for messages that need to be looked at immediately because a transaction or part of the system is went wrong but managed to recover.
INFO: descriptive messages about a meaningful system transaction happened. Or for long transactions start/stop.

On the other end, below INFO is for Developers.

DEBUG (alias fine): follow system transactions' execution with the intention of reporting the runtime flow.
TRACE (alias finer/finest): uber-details - to report application state or transmitted data, etc.

In essence, a support person looking at the logs should be able to determine whether an application transaction has occurred and what the final execution state is just by having INFO on.

Developers should be able to diagnose went on/wrong by looking at the DEBUG trace. That's why it is important to be able to simply turn DEBUG on/off possibly selectively per Logger or group of loggers.

Traces should only be used during development and never in production.

Markers

Experiment with Markers to provide a better partition of logs depending on application specific contexts.

Design

For transactions spawning across several threads in the same process OR different processes in the same host OR different apps in different hosts it's important that

Each log entry, in each file reports some ID to trace back the transaction
For example an operation starting from a browser client and going via a web services layer, consuming various microservices and/or middleware components should ideally report a single transaction id in all systems.

Provide a diagnostic context where appropriate to propagate internal state of a thread/process across other systems.

Speeds up diagnostics because it won't require trawling multiple files, dispersed in various locations

Log a single transaction contextual message in a single LOG line

Separate concerns by delegation and composition.

If you're logging latency of code blocks or entry/exit of code blocks, don't clutter the block, rather create a wrapper and use the wrapper (or use AOP)

Log different concerns in different files. For example

traffic data should go on a file,
application logs on another,
performance and metrics on another.

Link log entries with a common transaction ID.

NEVER log the same log entry in two different parts of the code.

NEVER log an entry in multiple lines (caveat informational banners etc)

You should never log at INF/ERR/WRN inside utility function or classes: let the caller log within the context of the application.

Antipattens

Antipattern 1 - Have to log caller class and line number
If you find the need to log the caller's file name and line number, something's wrong with your log OR with your pattern layout. Logging caller file and line number is very inefficient.

Antipattern 2 - Need to write a custom log file parser
If you find the need to write a parser to parse the logs, somethings wrong with your logs or with the way you use logs. Possible causes include:

You're logging too much and - for a given transaction - you cant work out what's going on
Your logs are too complex to read (Log statement don't have a constant and predictable format or are split in multiple lines - which will make writing the parser hard anyway)
You are using log files where you should instead use a proper "inter process communication" mechanism, or a proper auditing mechanism, etc

You should always get away with grep/awk/sed/tail - which are common tools in every platform.

Antipattern 3 - DEBUG (or even TRACE) has to be turned ON in production.
Your logs suck. The system performances will be degraded and you'll spend your days staring at your screen trying to figure out what's going on... the classic needle in the haystack.

Monday 13 April 2015

Minecraft server on a Synology NAS

It had to happen - my son "requires" a Minecraft server! One of those things... I have installed mine on my Synology 415play following these instructions.
Disclaimer
These instructions worked for me, and may not work for you. They are "AS IS" and I accept NO RESPONSIBILITY WHATSOEVER for any loss or damage to your data or functionality. You have been warned.
Also: these are unsupported... if something doesn't work for you, you'll have to fix it yourself.

Prerequisite

Java

The "Java Manager" package must be installed. You can install it via the Package Manager, under "Utilities".

Once the package is installed, the JVM will be located at

/var/packages/JavaManager/target/Java/bin/java

Terminal access

Enable terminal access via "Control Panel", "Terminal & SNMP". Then check "Enable SSH service" on port 22. Now you may need a SSH client, like PuTTY and connect to your NAS.

Manual installation

Create user

Create a user to run minecraft server as. To do so, I found easier to edit /etc/passwd and append the following line

minecraft:x:25565:100:Minecraft Server:/volume1/minecraft:/sbin/nologin

Create the server directory and download the server

I decided to put the server directory in /volume1/minecraft

mkdir /volume1/minecraft

then

 cd /volume1/minecraft  
 wget https://s3.amazonaws.com/Minecraft.Download/versions/1.8.3/minecraft_server.1.8.3.jar

Note that the server version may change in the future. Check the latest version of the multiplayer server from Mojang's website

Create the start script

I am using a script to run the server. Use your favourite editor to create a file in /volume1/minecraft/run.sh with this content

 cd /volume1/minecraft
/var/packages/JavaManager/target/Java/bin/java -Xmx1024M -Xms1024M -jar /volume1/minecraft/minecraft_server.1.8.3.jar -o true -d64 nogui

then

chown minecraft:users run.sh
chmod 755 run.sh

Run the server

cd /volume1/minecraft
./run.sh

The first run will start the server and stop it straight away. In fact you need now to edit the eula.txt and "accept" as per instructions.
Re-run the script and the server should now start successfully. You can now try to connect your minecraft client.

Automatic start/stop

Create the file "/etc/init/minecraft.conf" with this content:

description "start minecraft server"
author "smartrics"
console log
start on syno.share.ready and syno.network.ready

stop on runlevel [06]

setuid minecraft

pre-start script
    echo "netatalk start" || true
    date || true
    if [ ! -x /volume1/minecraft/run.sh ]; then
      stop; exit 1
    fi
end script

script
    exec /volume1/minecraft/run.sh
end script

post-stop script
    echo "minecraft server stop" || true
    date || true
    rm -f /var/run/minecraft.pid
end script

Then the server can simply be started/stopped with

start minecraft
stop minecraft

and automatically be started at the next NAS reboot.

Check that service is running

If the server is running, this command

ps -w | grep minecraft

should show something similar to

 4735 minecraf  3768 S    /bin/sh /volume1/minecraft/run.sh
 4737 minecraf 1188m S    /var/packages/JavaManager/target/Java/bin/java -Xmx1024M -Xms1024M -jar /volume1/minecraft/minecraft_serv

Wednesday 5 June 2013

My Twitter feed as Wordle

@Eastmad wins as the chattiest one

Thursday 31 May 2012

Hanging cURL

I have just come across a nasty issue that eventually turned out to be caused by cURL, on Cygwin.

curl http://localhost:18182/....

hangs. Forever.
Turning verbosity on with --ascii-trace this is the output of the happy path:

* About to connect() to localhost port 18182 (#0)
*   Trying ::1...
* Timeout
*   Trying 127.0.0.1...
* connected
* Connected to localhost (127.0.0.1) port 18182 (#0)

When it hangs, it does it when Trying ::1...

With the little help from RawCap I noticed that the TCP stack never receives the ACK when in SYN SENT state (see here for a picture). And cURL sometimes isn't forcibly timing out.

By specifying --connect-timeout the problem is solved.

Sunday 7 August 2011

Grammichele - Piazza C. M. Carafa

I am re-organising the pictures in my hard drive and I have just came across a set of snaps I took 3y ago in Grammichele (the small town in Sicily where I grew up). After some stitching and cloning, this is the result

Piazza Carlo Maria Carafa, Grammichele (CT) Italy by Fabrizio Cannizzo is licensed under a Creative Commons Attribution 3.0 Unported License.

If you need/would like to use the large version of the file (over 200Mb for 14328x3091 pixels), by all means, get in contact.

UPDATE: Link to Google Maps

View Larger Map
UPDATE 2: this picture has been uploaded to wikipedia :)
UPDATE 3: A new photo of Piazza Carafa is my new FB cover

Wednesday 11 May 2011

Continuously integrate FitNesse and RestFixture

Update: As of Jan 2015 the RestFixture CI is hosted by Travis.

I recently released the RestFIxture v2.0 with some additions and improvements. An effort, having worked on new features and code refactoring, that took about two months.

One thing I have always missed during the past two years of development is the lack of continuous integration with FitNesse; being the RestFixture a "plug-in" to FitNesse. Not a big deal maybe, given the size of the project and the narrow point of integration with FitNesse; but, in the true spirit of Agile software development, having being mildly burned in the past by the integration issues I decided to set up a CI build to gain confidence that both the latest versions of the tools work correctly at any point in time.
The problem was clearly where and how.

CI in the "cloud" with FaZend
I searched for a "cloud" provider of Continuous Integration solutions, and I soon discovered FaZend.com, a company offering hosted SVN, Trac and Hudson for small and medium projects. The service - at the time of writing - is still beta and free of charge.

I like their product; I am a user of such systems, but I also have contributed to the implementation a similar SaaS solution, for BT, two years ago, in our internal cloud and for our developers community. So I fully appreciate the value of a point and click hosted solution to sort out this curcial piece of the development environment.

FaZend is therefore a good fit: it's ideal for a Java based stack, runs Hudson - not Jenkins - one of the best CI servers out there, and it doesn't cost a penny. Having said that, as both RestFixture and FitNesse are hosted on GitHub, I don't need Trac and SVN (well, it turns out that I do need SVN, as I will explain later)

Few comments to summarise my experience (so far) with the service.
Reading the FaZend website I discovered that on Sign up, the CI features of the service are disabled and that it's at discrection of the company to enable users.
So I dropped an email to the support email address to explain my plan and check if it was possible to have CI. Straight after I got the thumbs up response from Yegor (the CTO) and an invite to sign up. And so I did.

Yegor enabled Hudson on my account and also bothered to create an initial project skeleton and so I was ready to go.

The project configuration is accessible via the "CI settings" page and offers the possibilty to set the essential information for the build to start. Pretty easy if edited alongside the tutorial available in the website.

The CI server offers SVN, GIT and Mercurial integration. It turns out, though, that the Git plugin used by Hudson isn't working very well and code can't be cloned from GitHub. Because that Hudson is very much locked down and only accessible via the CI settings (even the workspace is forbidden), it's hard to see what's going on. So I left with it to FaZend to sort it out (albeit I have offered help, if needed).

From the FaZend portal you can also set access control lists to setup everybody participating to the development of the project being build. As I am currently the sole developer of the ResatFixture, I didn't bother to go beyhond the default setup (allow r/w to the account and forbid everything to everybody else).

And there you go, here it is the RestFixture Integration build with logs and metrics published for reference.

Technicalities
Update: The RestFixture-CI is now obsolete, now that FaZend hudson has a working Git/GitHub plugin.

At the time of writing the RestFixture-CI project - holding the build file and the necessary dependencies is available on GitHub, but also on the FaZend SVN, where the Hudson picks it from. It's not ideal, but given that the project isn't changing that often it's OK to double commit (to git and svn) every time.

The build strategy is very simple: a) clone the FitNesse master branch, b) clone the RestFixture master branch, c) build FitNesse, d) deploy the new fitnesse.jar in the RestFixture lib directory and then d) build and test the RestFixture. It's all in the RestFixture-CI build file in it's entirity.

And, the FaZend CI configuration that allows the build to run in Hudson, for everybody's benefit, is:

RestFixtureIntegr:
  people: *
  source:
    scm: svn
    url: svn://svn.fazend.com/RestFixtureIntegr/trunk
  builder:
    tool: ant
    target: test-integration
  publish:
    junit: build/restfixture-git-ro/build/reports/unit/html
    fitnesse: build/restfixture-git-ro/build/reports/fitnesse
    cpd: build/restfixture-git-ro/build/reports/metrics/cpd
    pmd: build/restfixture-git-ro/build/reports/metrics/pmd
    checkstyle: build/restfixture-git-ro/build/reports/metrics/checkstyle
    ckjm: build/restfixture-git-ro/build/reports/metrics/ckjm
    coverage: build/restfixture-git-ro/build/reports/metrics/coverage
    crap4j: build/restfixture-git-ro/build/reports/metrics/crap4j
    findbugs: build/restfixture-git-ro/build/reports/metrics/findbugs
    jdepend: build/restfixture-git-ro/build/reports/metrics/jdepend

More about the ant build

To implement the above strategy I had to - somehow - clone the two git master branches. Not being able to rely on accessing the underlying OS for the git command (albeit it turns out that it may be possible, but I haven't tried), I opted a full java solution.

I therefore implemented two simple Ant clone and pull tasks wrapping the jGit API. The project is called jgit-ant and it's available on GitHub. At the moment only cloning and pulling is implemented, which is just about what I need for this build to execute.

Having the 'clone' sorted, it was time to build.

FitNesse was a pain to build: the target "createUpdateList" simply doesn't work unless the build is started from the same directory as the build file. The (ugly) solution was to call the ant launcher via %lt;java>; importantly, it's necessary to set the attribute dir to the master branch directory for it to work.

After that an ant call to the RestFixture build completes the job.

The build overall takes less than 4 minutes, which is not bad. This, incidentally, suggests that the network connection between the hardware hosting Hudson and GitHub is pretty fast. (Running the build locally takes 30% more, having the clone task as the bottleneck).

Furthermore, after having properly crafted the build file to check for pull failures (that trigger a clone) and repository modifications (that trigger a build), now the build takes only 1 second if no mods/builds are executed, improving even further the feedback loop.

Conclusions
After releasing RestFixture v2.0 I wanted to add another safety net for spotting integration isssues with FitNesse, given that RestFixture and FitNesse development lifecycles are completely disjointed.
A Java based continuous integration hosted solution is what I was looking for. And FaZend ticks all the boxes (other than being the cheapest).
Very few issues along the way and in a total of half a day I set up the build to periodically integrate the two tools.
I am pretty happy with the result. I'll keep an eye on the solution to the Git Hudson plugin situation and possibly clean up the build process.

Monday 9 May 2011

RestFixture v2.0

I just completed the release of the RestFixture v2.0.
With spare time to dedicate to it (courtesy of National Railways who has increased the number of carriages in the train to London to 12, so I can sit!), I took the opportunity to finish off the unfinished business.

The new codebase includes few new features (plus some bug fixes and improvements to the fixture APIs). The RestFixture code is available on GitHub.

This is the list of the main additions. Further details are available in the generated documentation file.

SliM support
It's now possible to run the fixtures with the SLIM runner.
To do so, it's sufficient to specify in the root page of the suites to be run with SliM the following line:


!define TEST_SYSTEM {slim}

then, as each fixture is implemented as a TableTable, the first row of each fixture should be written as


| !-Table:smartrics.rest.fitnesse.fixture.RestFixture-! base_host_uri |

Setting custom content-type to adapter map
Up until version 1.1.1, the way of interpreting the expectations in each response body cell was dictated by the content type of the response.
Now, it's possible to override the default behaviour by specifying a config property that maps the content type of a response to a body handler. The name of the property is restfixture.content.handlers.map; and the available body handlers are

XML to interpret the body as an xml string and the expectations as a list of \n separated XPath expressions

JSON to interpret the body as a JSON string and the expectations as either a list of JavaScript lines to be evaluated in AND or as a block of JavaScript, if the first line of the cell is the string /* javascript */

TEXT to interpre the body as a string and expectations as a list of regular expressions to be matched against the body

The 'comment' command
A new command to inject comments in a fixture; useful to evaluate labels


|comment| the value of 'replaceme' is %replaceme% |

The LET handler
Let now can evaluate JavaScript strings and assign the result to labels. This allows greater expressiveness in extracting data from a REST response.

Other additions

labels containing a null value are rendered by default with the string 'null'. It's possible now to override this behaviour by defining the config 'restfixture.null.value.representation'

body cells with large content can be rendered in a folding tag. A button is provided to toggle the visibility of the content.

Monday 17 January 2011

RestFixture with namespaces support

I finally found some time to fix namespaces support on the RestFixture. Now it's possible to specify a namespace context where aliases are defined, and use such aliases in the XPaths that match the response body.

The namespace context is defined via the RestFixtureConfig fixture, defining the key restfixture.xml.namespace.context. The alias can then be used in any XPath subsequently.

The following two snippets are from the auto generated RestFixture documentation (section XPaths and namespaces). This shows the executed RestFixtureConfig with two aliases defined

whilst this shows the usage of one of the two aliases:

Note: the default namespace of a XML body must be explicitly defined as alias!

The latest release (v 1.1) of the Rest Fixture is downloadable from Google code here, and the source code is on GitHub here

Saturday 31 July 2010

My code has a new home

I have just finished to migrate all my pet projects to GitHub: http://github.com/smartrics. In there you'll also find some of the forks I have made to stuff that I have modified.

I find GitHub easier to play with (albeit git is horribly more complicated than svn) and I like some of its features.

The projects in Google Code are now here:

* Rest Fixture: http://github.com/smartrics/RestFixture
* Pic Builder: http://github.com/smartrics/PicBuilder
* Rest Client: http://github.com/smartrics/RestClient
* JMeter Rest Sampler: http://github.com/smartrics/JMeterRestSampler

So, bye bye Google Code and thanks for the hospitality!