My name
is
Jon Skeet

Parsing string timestamp with time zone in 3 digit format followed by 'Z'

In the Hadoop infrastructure (Java-based) I am getting timestamps as string values in this format:

2015-10-01T04:22:38:208Z
2015-10-01T04:23:35:471Z
2015-10-01T04:24:33:422Z

I tried different patterns, following the examples for the Java SimpleDateFormat class, without any success.

I replaced 'T' with ' ' and 'Z' with '', then tried:

"yyyy-MM-dd HH:mm:ss:ZZZ"
"yyyy-MM-dd HH:mm:ss:zzz"
"yyyy-MM-dd HH:mm:ss:Z"
"yyyy-MM-dd HH:mm:ss:z"

Without the replacement, I tried:

"yyyy-MM-dd'T'HH:mm:ss:zzz'Z'"

In fact, this format is not listed among the examples. What should I do with it? Maybe those 3 digits are milliseconds and the time is in UTC, like this: "yyyy-MM-dd'T'HH:mm:ss.SSSZ"? But then it should still look like "2015-11-27T10:50:44.000-08:00", as in the standardized ISO-8601 format.

Maybe, this format is not parsed correctly in the first place?

I use Ruby, Python, Pig, Hive to work with it (but not Java directly), so any example helps. Thanks!

I very strongly suspect the final three digits are nothing to do with time zones, but are instead milliseconds, and yes, the Z means UTC. It's a little odd that they're using : instead of . as the separator between seconds and milliseconds, but that can happen sometimes.

In that case you want

"yyyy-MM-dd'T'HH:mm:ss:SSSX"

... or use

"yyyy-MM-dd'T'HH:mm:ss:SSS'Z'"

and set your SimpleDateFormat's time zone to UTC explicitly.
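
As a minimal sketch of both options, assuming Java 7 or later (the X pattern letter isn't available before that): the class and method names below are made up for illustration, and wrapping ParseException in an IllegalArgumentException is just a convenience for the example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper class, for illustration only.
public class TimestampParsing {

    // Option 1: X parses an ISO-8601 zone designator, including a bare "Z".
    static Date parseWithX(String text) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss:SSSX");
        return parse(format, text);
    }

    // Option 2: treat Z as a literal character and set UTC explicitly,
    // otherwise the value would be interpreted in the JVM's default time zone.
    static Date parseWithLiteralZ(String text) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss:SSS'Z'");
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return parse(format, text);
    }

    private static Date parse(SimpleDateFormat format, String text) {
        try {
            return format.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable timestamp: " + text, e);
        }
    }

    public static void main(String[] args) {
        String sample = "2015-10-01T04:22:38:208Z";
        Date viaX = parseWithX(sample);
        Date viaLiteralZ = parseWithLiteralZ(sample);
        // Both approaches should yield the same instant.
        System.out.println(viaX.getTime());
        System.out.println(viaX.equals(viaLiteralZ));
    }
}
```

Note that SimpleDateFormat isn't thread-safe, which is why the sketch creates a new instance per call rather than sharing one in a static field.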

See more on this question at Stack Overflow