Real Time Chat Application with Kotlin and Firebase
September 6th, 2017
A growing practice across many organizations is to log as much information as is feasible, to allow for better debugging and auditing. Tools like Splunk and ELK make it even easier to index the logs, treating them almost like databases. However, with PCI and HIPAA standards, those same organizations may want to mask much of the data to prevent unauthorized or unprotected access to sensitive data. In this blog post I’ll detail one potential approach to masking that data, so developers do not need to worry about filtering individual log statements.
You’re going to need to use Log4j 2 (potentially with SLF4J as well). A sample pom.xml for just these dependencies would include the lines:
<properties>
<log4j.version>2.7</log4j.version>
<slf4j.version>1.7.22</slf4j.version>
</properties>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>${log4j.version}</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-slf4j-impl</artifactId>
<version>${log4j.version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>${slf4j.version}</version>
</dependency>
If you’re using a different logging framework, then I imagine this guide may not be very helpful.
I’m going to dive right in, as there are a few different files we need to create or modify to get log masking to work. The first file we will create is a pretty basic one. It’s going to hold all of our logging markers, so that we can tell Log4j to only run the masking on the log statements that need it. Masking our logs means we’re taking a performance hit, so we should not do it any more than we need to:
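import org.slf4j.Marker
import org.slf4j.MarkerFactory

// Central place for the markers that opt a log statement into masking
class LoggingMarkers {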
static final Marker JSON = MarkerFactory.getMarker('JSON-MASK')
static final Marker XML = MarkerFactory.getMarker('XML-MASK')
}
I’ve got two basic Markers in that class, one for JSON, and one for XML. You can define as many as you need — for different content types, data types, etc. For this tutorial we’re only going to be using the JSON marker.
Let’s continue by extending the LogEventPatternConverter:
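import java.util.regex.Matcher
import java.util.regex.Pattern
import org.apache.logging.log4j.core.LogEvent
import org.apache.logging.log4j.core.config.plugins.Plugin
import org.apache.logging.log4j.core.pattern.ConverterKeys
import org.apache.logging.log4j.core.pattern.LogEventPatternConverter

@Plugin(name = 'logmask', category = 'Converter')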
@ConverterKeys(['cm'])
class LogMaskingConverter extends LogEventPatternConverter {
private static final String NAME = 'cm'
private static final String JSON_REPLACEMENT_REGEX = "\"\$1\": \"****\""
private static final String JSON_KEYS = ['ssn', 'private', 'creditCard'].join('|')
private static final Pattern JSON_PATTERN = Pattern.compile(/"(${JSON_KEYS})": "([^"]+)"/)
LogMaskingConverter(String[] options) {
super(NAME, NAME)
}
static LogMaskingConverter newInstance(final String[] options) {
return new LogMaskingConverter(options)
}
@Override
void format(LogEvent event, StringBuilder outputMessage) {
String message = event.message.formattedMessage
String maskedMessage = message
if (event.marker?.name == LoggingMarkers.JSON.name) {
try {
maskedMessage = mask(message)
} catch (Exception e) {
maskedMessage = message // Although if this fails, it may be better to not log the message
}
}
outputMessage.append(maskedMessage)
}
private String mask(String message) {
StringBuffer buffer = new StringBuffer()
Matcher matcher = JSON_PATTERN.matcher(message)
while (matcher.find()) {
matcher.appendReplacement(buffer, JSON_REPLACEMENT_REGEX)
}
matcher.appendTail(buffer)
return buffer.toString()
}
}
OK, let’s stop and analyze the important bits. The “ConverterKeys” value and the “NAME” field we pass to the LogEventPatternConverter define the pattern that we will include in our “log4j2.xml” config. Without that pattern in the config, the masking never runs. I believe you cannot override the default “%m”, so we are defining our own custom pattern, “cm”. In fact, we will use “cm” INSTEAD of the default “m” in our configuration.
Next, the constructor and the “newInstance()” method are required for our converter to be properly invoked by Log4j. The “format()” method holds the crux of our work. You can see that it takes the formatted message and appends it unchanged if the current logging statement does not carry our JSON Marker. If the statement DOES carry that Marker, then and only then will we attempt to mask the message.
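To make that gating concrete, here is a minimal sketch in plain Java that uses strings as stand-ins for the Log4j types. The class name “MaskingDispatcher”, its methods, and the placeholder text are hypothetical and not part of Log4j or of the converter above. It also shows one way to act on the comment in the catch block: when masking fails, emit a placeholder rather than the raw message.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical helper: picks a masking strategy by marker name.
class MaskingDispatcher {
    private final Map<String, UnaryOperator<String>> strategies = new HashMap<>();

    void register(String markerName, UnaryOperator<String> strategy) {
        strategies.put(markerName, strategy);
    }

    String format(String markerName, String message) {
        UnaryOperator<String> strategy =
                markerName == null ? null : strategies.get(markerName);
        if (strategy == null) {
            // No marker, or a marker we do not handle: skip the masking cost.
            return message;
        }
        try {
            return strategy.apply(message);
        } catch (RuntimeException e) {
            // Assumption: suppressing the message is safer than leaking it unmasked.
            return "[masking failed; message suppressed]";
        }
    }
}
```

Whether to suppress or pass through on failure is a judgment call. Suppressing protects the data but can hide a log line you needed for debugging.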
I’ve implemented a simple JSON regex replacement for the mask method, but there are many different approaches you can take: you can hydrate the JSON and replace the values based on name/path, you can inspect an object to see if it’s annotated with a “DoNotMask” annotation, or you can even define simple regex values to replace (e.g. credit cards, SSNs). The implementation I provide is meant as a proof-of-concept example, and is not prod-ready. Also, if you DO decide to implement multiple strategies for different markers, it makes sense to move that logic into specific classes (I have included everything in one file for simplicity).
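As an illustration of the last option, masking by value rather than by key, here is a small Java sketch. The class name “ValueMasker” and both patterns are assumptions made for this example: it only recognizes SSNs written as 123-45-6789 and card numbers written as sixteen digits in groups of four, so treat it as a starting point rather than a complete detector.

```java
import java.util.regex.Pattern;

// Hypothetical helper: masks values by their shape, regardless of the JSON key.
class ValueMasker {
    // Assumed format: US SSN written with dashes, e.g. 123-45-6789.
    private static final Pattern SSN =
            Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");
    // Assumed format: 16 digits in groups of four, optionally split by space or dash.
    private static final Pattern CARD =
            Pattern.compile("\\b\\d{4}[ -]?\\d{4}[ -]?\\d{4}[ -]?\\d{4}\\b");

    static String mask(String message) {
        // Mask SSNs first, then card numbers in the already-masked text.
        String masked = SSN.matcher(message).replaceAll("***-**-****");
        return CARD.matcher(masked).replaceAll("****");
    }
}
```

For example, ValueMasker.mask("ssn=123-45-6789 id=ABC-123") leaves the id alone and replaces only the SSN. The trade-off against the key-based approach is that value patterns can catch sensitive data under unexpected keys, but they can also miss values in formats you did not anticipate.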
As a simple demonstration of this class, let’s also include the tests:
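import org.apache.logging.log4j.MarkerManager
import org.apache.logging.log4j.core.LogEvent
import org.apache.logging.log4j.core.impl.Log4jLogEvent
import org.apache.logging.log4j.message.SimpleMessage
import spock.lang.Shared
import spock.lang.Specification
import spock.lang.Unroll

class LogMaskingConverterSpec extends Specification {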
@Shared
LogMaskingConverter converter
void setup() {
converter = new LogMaskingConverter()
}
@Unroll
void 'format() should mask sensitive data'() {
setup:
SimpleMessage message = new SimpleMessage(input)
LogEvent logEvent = new Log4jLogEvent('LogMaskingConverterSpecLogger', new MarkerManager.Log4jMarker(LoggingMarkers.JSON.name), null, null, message, null)
StringBuilder builder = new StringBuilder()
when:
converter.format(logEvent, builder)
then:
assert builder.toString() == expectedOutput
where:
input | expectedOutput
'{"noMask": "foo"}' | '{"noMask": "foo"}'
'{"ssn": "1234567890", "id": "ABC-123", "private": "someKey"}' | '{"ssn": "****", "id": "ABC-123", "private": "****"}'
'invalidJson' | 'invalidJson'
}
}
At this point however, we are still not ready to use our class, as Log4j does not know to look for it. For this, we need to update the log4j2.xml file:
<Configuration packages='com.path.to.logging, com.your.other.packages'>
<Properties>
<Property name="maskingPattern">
%d, level=%p, %cm
</Property>
</Properties>
...
</Configuration>
The key parts here are to update the “packages” attribute of the “Configuration” node to include the package (or a parent package) of your LogEventPatternConverter, and to use “cm” rather than “m” in the pattern. If your logs should be masked but are instead prefixed by a “c”, then Log4j has not picked up your converter, and you should make sure that the names are correct, and that the package is included in the “Configuration” node!
So hopefully now we have everything hooked up so that our log statements can be masked. In order to take advantage of our converter, we need to log our statements with the appropriate Marker:
log.info(LoggingMarkers.JSON, '{"ssn": "1234567890"}') // Will mask
log.info('{"ssn": "1234567890"}') // Will NOT mask
log.info(LoggingMarkers.XML, '{"ssn": "1234567890"}') // Will NOT mask either: format() above only handles the JSON marker
If all went well, you should now see your sensitive data being replaced with your mask. As a final note, if you are using Spring Boot, by default Log4j is configured BEFORE Spring Boot components and @Value fields are initialized, so if you put your fields-to-mask into a properties file, it may take some extra configuration to make sure Log4j picks them up.
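One possible workaround, offered as an assumption rather than something tested against Spring Boot here, is to read the keys from a JVM system property (set with -D on the command line), since those are available before any Spring wiring happens. The sketch below builds the same kind of key pattern from a comma-separated list. The class name “MaskKeys” and the property name “logmask.keys” are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: builds the JSON key pattern from a comma-separated list.
class MaskKeys {
    static final String PROPERTY = "logmask.keys";           // assumed property name
    static final String DEFAULT_KEYS = "ssn,private,creditCard";

    static Pattern fromCsv(String csv) {
        String alternation = Arrays.stream(csv.split(","))
                .map(String::trim)
                .filter(key -> !key.isEmpty())
                .map(Pattern::quote)   // keys are literals, not regex fragments
                .collect(Collectors.joining("|"));
        return Pattern.compile("\"(" + alternation + ")\": \"([^\"]+)\"");
    }

    static Pattern fromSystemProperty() {
        // Falls back to the hard-coded list when the property is not set.
        return fromCsv(System.getProperty(PROPERTY, DEFAULT_KEYS));
    }
}
```

Quoting each key with Pattern.quote means a key such as “a.b” matches only that literal text, so a misconfigured key cannot accidentally widen the pattern.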
Igor Shults
Igor is a self-driven developer with a desire to apply and expand his current skill set. He has experience working with companies of all sizes on the whole application stack, from the database to the front-end. He enjoys working in collaborative environments involving discussion and planning around data modeling and product development.