How to secure a CQ5 instance

October 5th, 2011


I am not a security expert. When I was a kid I enjoyed breaking things. When I was a teenage wannabe-programmer, I enjoyed breaking software (mostly games). But I never got the hang of real hacking: finding security loopholes, stack overflows and such. So I joined Cognifide, learned how to develop in Communiqué, then CQ5 / ADEP (the new CQ5) and became a well-behaved programmer. Up until now.

I’ve run a few training sessions recently, either as an internal Cognifide trainer or as an Adobe consultant. As part of such training, I normally demonstrate how to secure a CQ5 instance and what happens if you don’t do it properly. The sad thing is that before each session I can Google for Geometrixx to find a live CQ5 site on which to demonstrate a few common mistakes, and each time I find a new one on the first page of the results. It makes me want to cry to see how one can spend a small fortune on a licence for a powerful platform and then save on security. Especially since it’s no brain science, and Adobe documented it quite a while ago in the security checklist.

Below you will find a list of those common faults: requests that shouldn’t make it to the publish instance, dangerous default configurations and, finally, denial of service holes.

home.json, home.xml

Number one and an absolute no-brainer for anyone who knows Sling.

How does it work?
Sling has a default handler for those requests that renders the content of a node: the whole page for .xml, just the requested node for .json. Additionally, .json supports selectors that allow recursion of unlimited depth, which may result in the instance rendering the entire content tree.

Why is it bad?

  • By default /.infinity.json renders recursively with a maximum depth of 200 nodes; that can overload a server, especially when looped (severity: critical).
  • /.infinity.json will also fill your cache quickly, especially when combined with cache workarounds (severity: critical).
  • /.infinity.json can dump the entire repository, including areas that may have some passwords saved — like replication agents (severity: critical).
  • Recursive calls may expose nodes that were not meant to be read (severity: major).
  • It’s exposing your implementation (severity: minor).

How to fix it?
Filter those requests on the dispatcher. If you require .json, make sure it’s available only where it’s necessary. If you generate those files yourself, add a custom selector to the request. This way, you will be able to separate your requests from the default handler and filter the latter on the dispatcher.
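To make the custom-selector idea concrete, here is a minimal Java sketch of the decision such a filter would take. It is only an illustration, not dispatcher configuration: the class name JsonRequestPolicy and the selector "myapp" are made up for this example, and the sketch assumes the path has already been stripped of any query string and that the site needs no .xml rendering at all.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: models the rule "block the default .json/.xml
// rendering, let through only JSON that carries our own selector".
final class JsonRequestPolicy {

    // Assumed custom selector added by your own JSON-generating code.
    private static final String CUSTOM_SELECTOR = "myapp";

    /** Returns true when the path may be passed on to the publish instance. */
    static boolean isAllowed(String path) {
        // Only the last path segment matters, e.g. "home.myapp.json".
        String name = path.substring(path.lastIndexOf('/') + 1);
        List<String> parts = Arrays.asList(name.split("\\."));
        if (parts.size() < 2) {
            return true; // no extension: outside the scope of this rule
        }
        String extension = parts.get(parts.size() - 1);
        if (extension.equals("xml")) {
            return false; // assumption: default XML rendering is never needed
        }
        if (extension.equals("json")) {
            // Selectors sit between the resource name and the extension.
            List<String> selectors = parts.subList(1, parts.size() - 1);
            return selectors.contains(CUSTOM_SELECTOR);
        }
        return true; // other extensions are handled by other rules
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("/.infinity.json"));
        System.out.println(isAllowed("/content/home.myapp.json"));
    }
}
```

With this rule /content/home.myapp.json gets through, while /content/home.json and /.infinity.json are rejected because they would be served by the default handler.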

access to consoles

There are three consoles that come out of the box:

  • /admin — CQSE (servlet engine)
  • /system/console — Felix Web Console (OSGi)
  • /crx — CRX console

They are never needed by the end user, so they must not be available externally, even if they are secured with a password.

Why is it bad?

  • It’s a quick way of doing a brute-force attack on your admin password. (severity: major)
  • If someone gets access to the CQSE console, you risk your site being switched off, or even removed; also, additional applications may be installed on the server, and that’s asking for trouble (severity: critical).
  • If someone gets access to the Felix console, you risk third party bundles being installed, or foundation bundles being replaced with modified, malicious versions, an attack virtually impossible to detect (severity: critical).
  • Exposing the CRX console, in the default setup, gives everyone read-only access to the whole content and to the repository configuration (your license key, available disk space, JVM version, etc.) (severity: critical).

How to fix it?
Again, the dispatcher is your friend. Filter out those requests. Whenever you need those consoles, use either direct port access via VPN, or a separate secure domain (e.g. with HTTP authentication, or hidden behind SSO).

wcm on publish

Did you know a publish instance has a fully functional authoring interface? All you need to get there is:

  • login at /libs/cq/core/content/login.html and then
  • hit /libs/cq/core/content/welcome.html

Also, /etc/packages.html is available, and the packages can be downloaded without logging in. Extremely handy if you have a user in the system, still dangerous even if you don’t. Severity: major.

How to fix it?
Dispatcher filters — deny requests to /libs.

anonymous user access

This is the main reason why you must filter the above requests. On a publish instance, the default configuration gives the anonymous user read-only access to the whole repository, including:

  • /etc/packages, where your installed packages are,
  • /etc/replication, where one can find the DES-encrypted password of the transport user (usually it’s admin),
  • /apps, where your application is exposed and further holes can be found.

(severity: major).

How to fix it?
Carefully trim the ACL for the anonymous user. The Geometrixx home page renders properly with access to:

  • /content,
  • /etc/clientlibs,
  • /etc/designs,
  • /etc/segmentation.

Each application will require a different set, so take care not to strip something you actually need.

It’s about time you asked: I already set my filters, so isn’t this step redundant? The answer is: yes, if your filters are perfect and there are no holes in your application, there should be no need to change the anonymous user’s permissions. Tight filters are the primary way of securing an instance and an ACL cannot replace them. That said, a safe ACL will supplement your filters and should be applied whenever possible. (severity: major).

default passwords

You would be surprised how many sites go with default passwords on a publish instance. I don’t think I have to explain why it’s bad. To change passwords, follow the guide here. (severity: critical).

geometrixx on publish instance

Another common mistake is leaving the Geometrixx site on your publish instance. While this will not give anyone access to your system, this site can be indexed by search engines, as you can see here and here. And this is bad because it exposes your site to people like me. (severity: major).

How to fix it?
Remove Geometrixx as documented on dev.day.com. You can also create an empty package with the areas mentioned there in the filter; installing such a package will wipe this content.

denial of service

In the last part I want to review the possible options for a Denial of Service attack on a CQ5 instance. The first attack can be done by just looping /.infinity.json with a GET parameter (to bypass the cache). We already secured this one, but it’s still worth mentioning as the weakest spot.

Secondly, for many sites it is possible to have 100% cache coverage. If that’s the case, perfect: once the cache is filled, publish instances won’t get any requests and DoS is much harder to perform. If not, stress test the areas that the instance will render to find out how much it can handle. You can then use Apache to limit the number of requests that make it to the publish instance. This will result in a denial of service when the load exceeds the limit, but your publish instance will not die or get corrupted as a result. And of course, optimise the code that gets most of the load.

The third type of attack targets the instance with randomly generated GET parameters, selectors, extensions and suffixes. By making each request unique, the attacker bypasses the cache. A quick fix is to accept only the selectors and extensions you actually use and return a 301 or 404 status for the rest. This can be applied via the dispatcher or directly on the instance. You can also change GET parameters to suffixes, so that results are cached, but be very careful here: there is no way to prevent unique suffixes from hitting the publish instance, plus it’s an easy way to saturate your cache (the dispatcher runs out of disk space). Obviously, if you don’t need GET parameters or suffixes, filter them out before they make it to the dispatcher, or return 301 / 404 status codes for those requests to avoid caching.
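The "accept only what you use" rule could look roughly like the Java sketch below when applied on the instance. Everything in it is an assumption made for illustration: the class name CacheBypassGuard, the allowed extensions and the single "print" selector are invented, the site is assumed to use neither GET parameters nor suffixes, and content paths are assumed to contain no dots in their directory part. It does not try to reproduce Sling's real URL decomposition.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical guard: decides whether a request URI should be served (200)
// or rejected (404) before any rendering or caching happens.
final class CacheBypassGuard {

    // Assumed allow-lists; a real site would list what it actually uses.
    private static final Set<String> EXTENSIONS =
            new HashSet<>(Arrays.asList("html", "css", "js", "png"));
    private static final Set<String> SELECTORS =
            new HashSet<>(Arrays.asList("print"));

    static int statusFor(String uri) {
        if (uri.indexOf('?') >= 0) {
            return 404; // assumption: this site needs no GET parameters
        }
        // A dot before the last slash means something follows the extension,
        // i.e. a suffix (naive check, valid only for dot-free content paths).
        String directory = uri.substring(0, uri.lastIndexOf('/') + 1);
        if (directory.indexOf('.') >= 0) {
            return 404;
        }
        String name = uri.substring(uri.lastIndexOf('/') + 1);
        String[] parts = name.split("\\.");
        if (parts.length < 2 || !EXTENSIONS.contains(parts[parts.length - 1])) {
            return 404; // missing or unknown extension
        }
        // Everything between the resource name and the extension is a selector.
        for (int i = 1; i < parts.length - 1; i++) {
            if (!SELECTORS.contains(parts[i])) {
                return 404; // unknown selector, probably generated at random
            }
        }
        return 200;
    }

    public static void main(String[] args) {
        System.out.println(statusFor("/content/home.print.html"));
        System.out.println(statusFor("/content/home.x1y2.html?r=123"));
    }
}
```

Because rejected requests get an error status, they are not written to the dispatcher cache, so random URLs can neither reach the renderer nor fill the disk.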

summary

As said, it’s not rocket surgery to secure your instance. Once you set your filters right, you’re pretty safe. Securing for DoS may seem a bit more complicated, but in the end it can be a few foundation classes reused across other projects. So don’t push your luck too long — go forth and secure your instances!

vibrant ink textmate theme for eclipse

June 9th, 2010

The first time I saw a MacBook was also the first time I saw the TextMate editor. On that day I fell in love with the Vibrant Ink colour theme its owner was using.


It was aesthetic, it was colourful, it didn’t tire my eyes as much as my default black-on-white Eclipse editor. On that day I found an Eclipse version of the theme which, with some small modifications, I’ve been using since then.

The Eclipse version I mentioned, while nice, had some problems. Colouring wasn’t consistent between editors, some of them were not configured at all, so code was unreadable on a black background, and some colours differed from the original. It was slightly irritating, but not enough to do something about it. Until today.

I took the original theme, merged it with what I had before and applied it consistently across editors. What I got is much closer to the original: the only things I changed deliberately are the background (greyish, to reduce contrast) and the keyword colour (a bit less saturated). Also, the 11px Monaco font is much better than Eclipse’s default Courier New used in the original Eclipse version. I configured all the editors I had, i.e.: text, structured text, ant, java, javascript, php, css, html, dtd, xml, xsl. The configuration was tested on Eclipse 3.6 (Helios) RC 2, but should work well with both earlier and future releases.

To install the theme you need to install the font (if you don’t have it yet) and then import the eclipse preferences file (File -> Import -> Preferences). The file is stripped and contains only syntax colouring configuration, but anyway, I recommend making a backup of your preferences before applying the theme (File -> Export -> Preferences).

Download here and enjoy!

update (2010-07-18):
I added support for CDT editors (C / C++ and Makefile)!

photo day 5.0 - teatr polski

October 24th, 2009

Today I decided to dust off my trusty camera and join a jolly company of photographers visiting Teatr Polski. The event was organised by mmpoznan.pl and we were allowed to visit the deepest recesses of the theatre. Which we did.

As usual I tried to do something out of the ordinary rather than run around and take dozens of pictures. Yet another picture of the stage is boring, plus I don’t have either an adequate lens or a proper tripod to take one. I focused on some details I dug up and took one good close-up by the entrance. The gallery is here.

communiqué as a windows service

March 22nd, 2009

When running CQ on Windows there is a common issue everyone runs into: how to install it as a Windows service? There are basically two ways: the built-in (automatic) and the manual.

built-in way

This one is available for CQ 4 only: open Communiqué Manager and click File -> Register Service. The service will appear in Windows’ Administrative Tools -> Services.

Ta-da-da, but… we’ve encountered multiple issues with this approach. The most serious is that all instances start up and stop with the service; one cannot choose to start or stop a single instance. If you need this functionality, you have to do it manually.

installing service manually

The first thing you need is the cqsvc.exe tool, located in the CQ 4 installation directory under opt\helpers\cqsvc.exe. To install a service, you execute it with the following arguments:

  • -i
  • -n <service name>
  • -e <full path to java.exe>
  • -w <workdir>
  • -a <arguments>
  • -o <outfile (log)>
  • -t <timeout>

example:
cqsvc -i -n "communique4401"
-e "C:\Program Files\Java\jdk1.5.0_16\bin\java.exe"
-w "D:\Day\CQ-4.2.1\server"
-a "-Xrs -Xms512m -Xmx768m -XX:PermSize=256m
-XX:MaxPermSize=256m -jar bin/bootstrap.jar"
-o "D:\Day\CQ-4.2.1\server\outfile.log"
-t 180

Ta-da-da - your service appeared under the name you selected. Now you can install instances as separate services, each one with a different name. Note that this tool can be used for CQ 5 as well.

writing mvc framework in cq4 (part 3)

March 15th, 2009

In the first part of this little tutorial I’ve shown three major problems I encountered when building a CQ 4 framework. Two of them were solved in the previous post, leaving the persistence one for today. Once again, what it’s about: let’s assume you’ve got your shiny new project with Java controller and JSP view, the data you’re about to use to create your model are stored in pages as a set of atoms, containers and container lists. The simplest way of accessing them is by using e.g.:

Container pageContent = page.getContent();
ContainerList list = pageContent.getContainerList("myData");
Container content = list.getContainer("Single");
Atom atom = content.getAtom("myAtom");
String data = atom.getString();

Though it’s scary, it works just fine. Though it works, it’s too damn easy to hide an error here. For pages storing multiple containers with many atoms inside, managing such a mapping has proven to be really tricky.

We at Cognifide have been struggling with this one for quite a while. The credit for the solution I am about to present goes mainly to Albert Cenkier, who invented an automated mapper from a CQ container to a Java object. In order to build it we made the following assumptions:

  • one CQ container is mapped to one java object (DTO),
  • an atom is mapped to the object field of the same name as atom’s label (case insensitive),
  • the above requires that all labels used are valid Java identifiers - we use camel case names.

The mechanism we came up with uses reflection to find all the properties of a DTO class. It then tries to fetch atoms with the same labels as the fields from the given container. This approach is called “standard over configuration” - instead of an excessive layer enabling the most flexible and elaborate mapping (that you’re never going to use anyway) it introduces a standard solution at the cost of a few restrictions (that you most often abide by already).
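The case-insensitive matching between atom labels and field names can be sketched without any CQ classes. The snippet below is not the mapper described in this post, just one possible way to index setters by property name while ignoring case; SetterIndex and LabelledDto are names made up for the example, and it uses generics, so it needs a newer Java than the listings further down.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical DTO used only to demonstrate the lookup.
class LabelledDto {
    private String myAtom;
    public void setMyAtom(String myAtom) { this.myAtom = myAtom; }
    public String getMyAtom() { return myAtom; }
}

final class SetterIndex {

    /**
     * Indexes the public single-argument setters of a class by property name.
     * The map compares keys case-insensitively, so the labels "myAtom",
     * "MyAtom" and "MYATOM" all resolve to setMyAtom. If two setters differ
     * only by case, the one visited last wins.
     */
    static Map<String, Method> index(Class<?> type) {
        Map<String, Method> setters = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (Method method : type.getMethods()) {
            String name = method.getName();
            if (name.startsWith("set") && name.length() > 3
                    && method.getParameterTypes().length == 1) {
                setters.put(name.substring(3), method); // "setMyAtom" -> "MyAtom"
            }
        }
        return setters;
    }

    public static void main(String[] args) {
        System.out.println(index(LabelledDto.class).get("myatom").getName());
    }
}
```

With such an index the mapper can look up a setter directly from an atom label instead of deriving the label from the setter name.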

Time for more implementation details. Here are some example code listings - not a complete solution, but should be enough to get you started. First - get all setters of a given class:

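// imports needed by both listings:
// java.lang.reflect.Method, java.util.ArrayList, java.util.Collection,
// java.util.Iterator, java.util.regex.Matcher, java.util.regex.Pattern

private static Collection/* <Method> */getSetters(Class clazz) {
  Method[] methods = clazz.getMethods();
  // \\w* (not \\w+) so that one-letter properties such as setX match too
  Pattern pattern = Pattern.compile("(set)([A-Z_]\\w*)");
  Collection/* <Method> */setters = new ArrayList/* <Method> */();

  for (int i = 0; i < methods.length; i++) {
    String methodName = methods[i].getName();
    Matcher matcher = pattern.matcher(methodName);
    if (matcher.matches()) {
      setters.add(methods[i]);
    }
  }
  return setters;
}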

Now, to get the values from a container and execute the setters (the DTO class is passed in so that it can be instantiated):

public static Object mapContainerToObject(Container content, Class clazz,
    Collection/* <Method> */setters) throws Exception {
  Pattern pattern = Pattern.compile("(get|set)([A-Z_]\\w*)");
  Object result = clazz.newInstance();

  for (Iterator it = setters.iterator(); it.hasNext();) {
    Method setter = (Method) it.next();
    Matcher matcher = pattern.matcher(setter.getName());
    if (!matcher.matches()) {
      continue; // group() may only be called after a successful match
    }
    // setMyAtom -> myAtom
    String propertyName = matcher.group(2);
    propertyName = propertyName.substring(0, 1).toLowerCase()
        + propertyName.substring(1);

    if (content.hasElement(propertyName)) {
      String value = content.getAtom(propertyName).getString();
      setter.invoke(result, new Object[] { value });
    }
  }
  return result;
}

Note - the above doesn’t implement case insensitivity and assumes that every value is a string. The implementation of those complicates the code significantly and I wanted to show the basic idea here. From this point however, it shouldn’t be much of a challenge to implement the following:

  • mapping to any basic data type (string, integer, boolean, floating-point, date, etc.) based on Java field type and pre-defined parsing rules,
  • mapping container label to one specific field (essential for list’s parNum),
  • mapping a container list to a list of objects.
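As a starting point for the first item, the conversion from an atom's string value to the setter's parameter type might be sketched as follows. This is an assumption-laden illustration rather than the mapper we actually use: AtomValueConverter and SampleDto are hypothetical names, only a handful of types are covered (dates are left out because they need an agreed format), and the parsing rules are simply those of the Java wrapper classes.

```java
import java.lang.reflect.Method;

// Hypothetical DTO with non-string fields, used only for the demonstration.
class SampleDto {
    private int index;
    private boolean visible;
    private String title;
    public void setIndex(int index) { this.index = index; }
    public void setVisible(boolean visible) { this.visible = visible; }
    public void setTitle(String title) { this.title = title; }
    public int getIndex() { return index; }
    public boolean isVisible() { return visible; }
    public String getTitle() { return title; }
}

final class AtomValueConverter {

    /** Converts a raw atom string into the type a setter expects. */
    static Object convert(String raw, Class<?> target) {
        if (target == String.class) return raw;
        if (target == int.class || target == Integer.class) return Integer.valueOf(raw.trim());
        if (target == long.class || target == Long.class) return Long.valueOf(raw.trim());
        if (target == double.class || target == Double.class) return Double.valueOf(raw.trim());
        if (target == boolean.class || target == Boolean.class) return Boolean.valueOf(raw.trim());
        throw new IllegalArgumentException("Unsupported field type: " + target.getName());
    }

    /** Finds a single-argument setter by name and calls it with the converted value. */
    static void set(Object dto, String setterName, String raw) {
        for (Method method : dto.getClass().getMethods()) {
            if (method.getName().equals(setterName) && method.getParameterTypes().length == 1) {
                try {
                    method.invoke(dto, convert(raw, method.getParameterTypes()[0]));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Cannot invoke " + setterName, e);
                }
                return;
            }
        }
        throw new IllegalArgumentException("No setter named " + setterName);
    }

    public static void main(String[] args) {
        SampleDto dto = new SampleDto();
        set(dto, "setIndex", " 7 ");
        set(dto, "setVisible", "true");
        System.out.println(dto.getIndex() + " " + dto.isVisible());
    }
}
```

In the mapper above, the setter.invoke call would then receive convert(value, setter.getParameterTypes()[0]) instead of the raw string.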

In fact, it is possible to implement a complete hibernate-like solution for Communiqué. Just remember - do not over-complicate. This piece of code lies at the very basis of all your projects. If it’s clean and simple, anything built on it will be so as well.

writing mvc framework in cq4 (part 2)

March 1st, 2009

Recently we discussed the Communiqué framework concept in general. Today let’s focus on those little details that will allow us to create such a framework.

The first issue I encountered is how to remove logic from the view. In the simplest cases one could use the <cfc:atom> tag to simply output a stored value. Once the value has to be transformed, or fetched from another page, it gets complicated, and more and more logic tends to be introduced in the view. What we need here is a way for the controller to output its data on the page.

Let’s assume we’ve got this controller already as a Java class that hides all the business logic and persistence from us. We could output data from it using <%= controller.getValue() %> but that still won’t deliver us from putting logic in there:

<%= controller.getValue().replaceAll("foo", "bar") %>

Fortunately, there is salvation - provided by Day in the JSTL integration package. Using the JSTL c:out tag one can output any value that was previously set as a request attribute.

example

Now, exactly what does it mean for us? Let’s assume we’ve got the simplest controller:


package pl.kuzniak.jan.myapp.controller;

import java.util.HashMap;
import java.util.Map;

import pl.kuzniak.jan.myapp.model.MyData;

import com.day.cq.delivery.DeliveryHttpServletRequest;

public class MyController {
  protected final Map/* <String, Object> */model;

  public MyController(DeliveryHttpServletRequest cqRequest) {
    model = new HashMap/* <String, Object> */();
    MyData data = new MyData();
    data.setIndex("1");
    model.put("myData", data);
    // expose the model to the view as a request attribute
    cqRequest.setAttribute("model", model);
  }
}

We can now use it in the JSP file just like that:


<%@page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8"

%><%@taglib uri="/cfc" prefix="cfc"
%><%@taglib uri="http://java.sun.com/jstl/core" prefix="c"

%><%@page import="pl.kuzniak.jan.myapp.controller.MyController"

%><cfc:defineObjects/><%
new MyController(cqRequest);
%>

<p><c:out value="${model.myData.index}" /></p>

As you can see - we’ve got a read-only way of accessing values here. Except for the line with controller’s constructor, there is no business logic here. The only part of Java code is in fact a mapping between the view and the controller - something you could find as obfuscated XMLs in some popular Java frameworks.

conclusion

So, here we’ve got a solution to the first problem:

  • use cqRequest.setAttribute to pass a value to the view
  • use JSTL to output the value in a read-only manner

Pros:

  • it’s easy this way - developers will like it and won’t go looking for ways around it,
  • it’s clean - keeps Java code away from the markup,
  • it’s standard - utilises well known JSTL and Expression Language,
  • it’s easy to extend - it’s easy to write custom tags that support EL and can e.g. externalise links or format values while outputting them.

Cons:

  • often requires extra effort to write a proper taglib instead of creating HTML markup from the controller itself,
  • reduced plain code readability (it’s slightly better with syntax highlighting, but still…); example:
    <body class="<c:out value="${model.body.htmlClass}" />">

If something is unclear - please comment. Next time I will talk about data persistence in CQ 4.

writing mvc framework in cq4 (part 1)

February 22nd, 2009

In the next few articles I would like to give you some guidance in writing an MVC framework for Communiqué 4.2. Why 4.2 you ask - good question. With version 5.1 finally out, I should probably be concentrating on it. The main reason is that clients still like CQ4, and most likely in 2009 there will still be a lot of new projects written on the old version of the platform, while we will be learning how to write in CQ5 and how to do it right.

Why would I want to write this framework? The default approach presented in the examples provided with the default installation is lightweight, easy and quick - that’s true. In the long term, however, I’ve found it hard to scale and manage. It’s hard to enforce standards and to QA ECMA scripts, plus, they tend to get complicated while developers hack their way through issues. Separating concepts and putting them into layers simplifies the architecture, and the cost of creating a bloated codebase is compensated by the ease of supporting such a project. And the only way I know to help a development team avoid messy projects is to create a framework.

Before I continue let me explain what I understand by framework. A framework is a solution to a high-level problem (here - creating web sites in CQ). This solution is provided incomplete but it can easily be extended and adapted to a given case (web site X in CQ). The spots that will be extended / overridden by the developer are clearly specified by the framework. A framework must be general enough to be reusable across projects while strictly enforcing that similar problems are solved in the same way; it must be easy to learn and hard to bypass.

Now, let’s have a look at CQ and see if we can define Model, View and Controller here.

  • Model - data displayed on the page. These data are probably persistent somewhere, have some business rules, are associated with default values and will be displayed somewhere - but the model is unaware of all that. It just stores plain data - no logic, no formatting - it’s raw.
  • View - a JSP page generating HTML, CSS, JavaScript and so on. It is responsible for formatting data and displaying them correctly. View doesn’t care about where the data came from and has as little logic inside as possible.
  • Controller - A Java class responsible for calling the persistence layer to retrieve the model, applying business rules and default values to it and passing it to the view. It doesn’t care what the view is - it just creates the model and exposes it so that it is visible to the view.

The problems that lie ahead are:

  • How to combine controller and view? In Communiqué the page is associated with the template, and template defines one entry point - the JSP script (view). How to define a controller while keeping the logic away from the view?
  • How to pass the model to the view? How to expose Java objects to the JSP file so that it would be impossible (or at least hard) to modify them from the view?
  • How to solve persistence in an easy and effective way? Most of the data will be stored in the Content Bus. How to enforce a standard but flexible mechanism for mapping them to model objects? This operation seems easy but is very error-prone when done manually (especially when refactoring).

In the next article I will answer those questions and try to provide some examples for improved readability, so keep reading!

Cheers,
Jan

karpacz

February 1st, 2009

Some time ago I went mountain hiking with my friends. We left on Friday afternoon and came back on Sunday, so it was short but intense. Snow had just fallen, making nature pure and beautiful - heaven for a man with his camera. Once again - thanks mates for taking me away from my computer and running me ragged over the hills.

See what I’ve seen here.

hdr revisited

September 8th, 2008

It still takes me a while to get a good HDR picture. Not only because of the large space of parameters I have to search to make sure I produce the best result, but mostly because I cannot tell for sure whether the picture I am taking at the moment will be a good source for the algorithm when I sit in front of my monitor. Maybe it will come with time…

Look at what I’ve done.

my first hdr

June 22nd, 2008

Ok, after some experiments with my friend Albert I think I nailed the whole HDR thing down. I take an underexposed (exposure: -2.0) picture, store it in RAW format and then create 9 exposures (from -2.0 to 2.0) out of it. Merging those into one is the hardest part because I have to play with the rather unintuitive parameters of the algorithm, but the result is worth it. They might not be the best HDR images I’ve seen, and they might look a little bit artificial, but still the result is better than I expected. Check it out!