Tracking durations of login/logout sessions

asked 2019-02-05 05:23:31 -0500

rleyba

Hi Team,

Our VPN and firewall logs continuously stream login and logout events for users connecting via VPN. The events are very verbose, and every log line contains at least one field that I can use as a unique identifier to track a particular user and his/her login and logout times.

On a given day, more than 100 users log in and out continuously, each with different connect times and session durations.

I need to track the duration of each user's session for metrics and capacity planning. What I am looking for is a way to temporarily store the state of an individual's login session, so that when my StreamSets pipeline receives a logout event, it can look up that user's login time and do some date/time math to compute the duration.

I saw a video on YouTube where the author uses Redis to store state and do in-memory lookups, but I don't know Redis. I am currently using Apache Kafka and MySQL in my workflow.

Do you have any tutorials, videos, or links to sample pipelines that resemble this requirement? I imagine that a lot of StreamSets users work with raw timestamped log files and have found a way to extract time-based durations from their data.

Any advice on which processors I should be using would be helpful.

Thanks very much in advance.

1 Answer

answered 2019-02-05 10:55:19 -0500

metadaddy

You could do this by writing session details to MySQL. Here is a sample pipeline I created:

image description

I used a Dev Raw Data Source as the origin, configured with some sample JSON data:

{
  "type":"session_start",
  "user":12345,
  "session_id":23456,
  "timestamp":"2019-02-04 01:23:45.678"
}
{
  "type":"session_start",
  "user":34567,
  "session_id":45678,
  "timestamp":"2019-02-04 02:34:56.789"
}
{
  "type":"session_end",
  "user":12345,
  "session_id":23456,
  "timestamp":"2019-02-04 03:45:67.890"
}
{
  "type":"session_end",
  "user":34567,
  "session_id":45678,
  "timestamp":"2019-02-05 04:56:78.901"
}

The Field Type Converter converts the /timestamp string to a DATETIME:

image description

The Stream Selector routes records according to whether their /type is session_start:

image description

If /type is session_start, the record is written to a database via a JDBC Producer destination. I simply created a MySQL table to model the record structure:

CREATE TABLE session (
    session_id INT PRIMARY KEY,
    user INT,
    timestamp DATETIME
);

The default stream from the Stream Selector sends session_end records to a JDBC Lookup processor. A simple SQL query retrieves the relevant session start time, and I mapped the returned timestamp column to /session_start so it wouldn't overwrite the session end time already held in the record's /timestamp field:

image description
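
For readers who cannot see the screenshot, the lookup amounts to a single-row query keyed on the session ID, with the stored column aliased so that it lands in /session_start. The sketch below is not the pipeline configuration itself: it uses Python with an in-memory SQLite database as a stand-in for MySQL, and the helper name lookup_session_start is made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL table from the answer (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session (session_id INTEGER PRIMARY KEY, user INTEGER, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO session (session_id, user, timestamp) VALUES (?, ?, ?)",
    [
        (23456, 12345, "2019-02-04 01:23:45.678"),
        (45678, 34567, "2019-02-04 02:34:56.789"),
    ],
)


def lookup_session_start(record):
    """Hypothetical helper: return a copy of a session_end record with session_start added."""
    row = conn.execute(
        "SELECT timestamp AS session_start FROM session WHERE session_id = ?",
        (record["session_id"],),
    ).fetchone()
    enriched = dict(record)
    # No row means a logout arrived without a stored login; leave the field empty.
    enriched["session_start"] = row[0] if row else None
    return enriched


end_event = {"type": "session_end", "user": 12345, "session_id": 23456,
             "timestamp": 1549280767890}
result = lookup_session_start(end_event)
print(result["session_start"])  # start time from the table
print(result["timestamp"])      # end time, left untouched
```

Returning an empty value for an unmatched session ID is one possible choice; a real pipeline might instead route such records to error handling.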

An Expression Evaluator calculates the session duration in milliseconds. You could use a Field Renamer to rename /timestamp to /session_end to tidy things up, but I just left it as is:

image description
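
The arithmetic itself is a plain subtraction once both values are epoch milliseconds. Here is a minimal sketch in Python (not StreamSets expression language; the function name is illustrative), using the two pairs of values that appear in the sample output further down:

```python
from datetime import timedelta


def duration_ms(session_end_ms: int, session_start_ms: int) -> int:
    """Return the session length in milliseconds from two epoch-millisecond values."""
    if session_end_ms < session_start_ms:
        raise ValueError("session ends before it starts")
    return session_end_ms - session_start_ms


first = duration_ms(1549280767890, 1549272226000)
second = duration_ms(1549371438901, 1549276497000)
print(first, second)  # 8541890 94941901

# A human-readable form can be handy for reports.
print(timedelta(milliseconds=first))
```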

Finally, another JDBC Producer destination, with Default Operation set to DELETE, deletes the session from the database.
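
Deleting the row once the duration has been computed means the table only ever holds sessions that have started but not yet ended. A rough sketch of that cleanup step, again with Python and in-memory SQLite standing in for MySQL (an assumption for illustration, not the JDBC Producer configuration; both helper names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.execute(
    "CREATE TABLE session (session_id INTEGER PRIMARY KEY, user INTEGER, timestamp TEXT)"
)
conn.execute("INSERT INTO session VALUES (23456, 12345, '2019-02-04 01:23:45.678')")
conn.execute("INSERT INTO session VALUES (45678, 34567, '2019-02-04 02:34:56.789')")


def close_session(session_id):
    """Hypothetical helper: remove a finished session and return the number of rows deleted."""
    cur = conn.execute("DELETE FROM session WHERE session_id = ?", (session_id,))
    conn.commit()
    return cur.rowcount


def open_sessions():
    """Session IDs that have a start event but no end event yet."""
    rows = conn.execute("SELECT session_id FROM session ORDER BY session_id")
    return [row[0] for row in rows]


close_session(23456)
print(open_sessions())  # [45678]
```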

My sample pipeline uses a Local FS destination to write JSON output:

{"type":"session_end","user":12345,"session_id":23456,"timestamp":1549280767890,"session_start":1549272226000,"duration_milliseconds":8541890}
{"type":"session_end","user":34567,"session_id":45678,"timestamp":1549371438901,"session_start":1549276497000,"duration_milliseconds":94941901}
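
Note that both session_start values end in 000 even though the input timestamps carried milliseconds. A likely explanation is that a MySQL DATETIME column declared without a fractional-seconds precision stores whole seconds, so 01:23:45.678 was stored as 01:23:46. The sketch below checks this for the first session. It assumes the pipeline host was on UTC-8 (US Pacific time in February), which is inferred from the epoch values rather than stated in the answer.

```python
from datetime import datetime, timedelta, timezone

PACIFIC = timezone(timedelta(hours=-8))  # assumed pipeline time zone
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(text: str) -> int:
    """Parse 'YYYY-MM-DD HH:MM:SS.mmm' in the assumed zone into epoch milliseconds."""
    local = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f").replace(tzinfo=PACIFIC)
    # Integer timedelta division avoids floating-point rounding.
    return (local - EPOCH) // timedelta(milliseconds=1)


original_start = to_epoch_ms("2019-02-04 01:23:45.678")
stored_start = 1549272226000  # session_start for session 23456 in the output
print(original_start)                 # 1549272225678
print(stored_start - original_start)  # 322 ms added by rounding up to the next second
```

If sub-second accuracy matters for the metrics, declaring the column as DATETIME(3) should keep the milliseconds; for capacity planning, an error of under a second per session is probably acceptable.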

You would more likely write the session data to a Kafka topic or database table.

Comments

Hi Pat, thanks very much for this. That was quite a creative way to use the Stream Selector and JDBC Lookup combo to branch out and check the session status. Very useful indeed.

rleyba (2019-02-05 22:55:12 -0500)