July 13, 2014

Time and Date Confusion 2



Date conversion or interpretation functions

 Qlikview is able to automatically interpret dates that match te format set in your PC locale, or in the the Qlikview environment variables (usually automatically added to the start of the script in a new document)

Set TimeFormat='hh:mm:ss';
Set DateFormat='YYYY/MM/DD';
Set TimestampFormat='YYYY/MM/DD hh:mm:ss[.fff]';

However, the date and time values in your source systems are frequently in a different format. This is where the date interpretation functions come in:

  • Date#(stringvalue[, format])
  • Time#(stringvalue[, format])
  • Timestamp#(stringvalue[, format])
  • Interval#(stringvalue[, format]) 

The functions ensure that the date field is correctly interpreted. So lets assume that the field is in YYYYMMDD format (eg in a load statement):

    Date#(SourceField, ‘YYYYMMDD’) As DateField,

The interpretation functions only affect the numeric value of the dual. The above expression will return the correct numeric value, but the text format will still be YYYYMMDD format, as this what Qlikview inferred as the correct format from the first record. If you would like to corrrectly interpret the date and format it on the default format for you environment/locale, them combine the interpretation function with the relevamt formatting function, like this:

     Date(Date#(‘YYYYMMDD’)) As DateField,
     Date(Date#(‘YYYYMMDD’), ‘D MMM YYYY’) As DateField2,  

The first will produce a date formatted in the default date format and the second in D MMM YYYY format.
Hopefully this post and the previous one helps you to  understand how to use the date/time format and interpretation functions.

Time and Date Confusion 1

Part 1 Formatting Functions



A common cause of issues on the QlikCommunity forum is date fields and the use of the formatting and interpretation functions.

First of all, let me state that there is no specific date data type in Qlikview. There are only two data types in Qlikview – text (or character) and number (or numeric). All number values are actually ‘duals’ – a dual has a text representation and an underlying numeric value. A date is simply a dual value with formatted text representation and an underlying value described below.

The integer portion of a date is the number of days since midnight of December 31 1899. So 1 January 1900 is day 1. July 7 2014 is day number 41827. This is identical to Excel and SQL Server as well as many other products.

The fractional portion of a date/time value represents the proportion of 1 day. So 41827.0 is time 00:00 on 7 July. 41827.25 is 6:00am (0.25 * 24). A pure date has a fractional part of 0. A pure time has only a fractional part (0.000000 – 0.9999999). 

An interval has both parts where the integer part represents days and the fractional part represents portions of a day, So 0.75 is an interval of 18 hours (0.75 * 24) or 1 080 minutes (0.75 * 24 * 60) and 1.5 is an intrerval of 25 hours.

One important point is that the value of the date dual is the underlying numeric, regardless of the chosen text representation.

So lets move on to the formatting functions:

  • Date(numericvalue[, format])
  • Time(numericvalue[, format])
  • Timestamp(numericvalue[, format])
  • Interval(numericvalue[, format])

Where numeric value is the date value and format is the optional format. If the formatted is omitted it will format the value to the default for your environment/locale.

These functions format only the text representation of the date/time dual. The underlying numeric value is not affected. So the text output of the following two expressions are identical, but the values are not equal:

                Date(‘2014/07/10 09:00:00’, ‘YYYY/MM/DD’)  = 41830.375
                Date(‘2017/07/10’, ‘YYYY/MM/DD’) = 41830.000

I have also often seen constructions such as Max(Date(someField)) or Date(someField)+2. The Date() format function is not required in these expressions, and only serves to clutter the expression, and in extreme cases may also affect performance.    

The most important points to take away from this are:

  1. the value of the date is the numeric value of the dual and
  2. the formatting functions only affect the text representation of  the date/time

June 14, 2014

About Table Concatenation

Concatenation

 According to the Qlikview Reference Manual:
If two or more statements result in tables with identical field names, the tables are concatenated and treated as one logical table.
Let’s explore this statement. Consider the load script:

Table1:
LOAD * Inline
[
A, B, C
1, 2, 3
4, 5, 6 ];

Table2:
LOAD * Inline
[ A, B, C
6, 5, 4
4, 5, 6 ];

If you run this script, there will only be one table – Table1. This table contains all 4 lines loaded, including the line 4, 5, 6, which will be loaded twice. Click File | Table Viewer (or Ctrl-T) and check the table contents in Table1.

Now if we add the lines:

Table3:
LOAD * Resident Table1;

Table3 will not be created, as it contains the same fields as Table1 and will be concatenated. Now there will be 8 records in Table1. So if the tables have the same fields, they will autoconcatenate. The field order is unimportant, but remember that QV field names are case sensitive.


Avoiding Automatic Concatenation

If the table is created with different fields, then the second table will not be concatenated. But what if you need to reload the data from the resident table? Add the lines

Table4:
NoConcatenate
LOAD * Resident Table1;

Now Table4 is not concatenated to Table1. (If the scripts end at this point we will have a synthetic key, but that is the subject of another post).

Note that if the second table is created with different fields (a subset of the first table’s fields), then it will not be concatenated, even subsequent statements add the missing field(s) or drop the extra fields. Look at this example:

Table1:
LOAD * Inline
[ A, B, C
1, 2, 3
2, 5, 6
3, 6, 7 ];

Table2:
LOAD A, B
Resident Table1;

 Join(Table2)

LOAD A, (A+B) As C
Resident Table1;

Table2 will be created as a new table because it contains only a subset of Table1’s fields when it is created. This does not change when we add a column C to the table as the table already exists.

So the manual text should more accurately say:
If two or more statements create tables with identical field names, the tables are concatenated and treated as one logical table.

January 12, 2014

Understanding Aggr()

Aggr() is an important part of the toolkit that you need when building more complex expressions and calculated dimensions in Qlikview. This post attempts to demystify aggr() and show it can be used.
 

In a Qlikview chart, such as a straight table, the chart expressions are evaluated for all possible combinations of the dimension. An Aggr() function does the much same, so the expression:

Aggr(Sum(Sales), Region)

Produces a list of the summed values, one for each region. And this:

Aggr(Sum(Sales), Region, Period)

 
Produces an array of sales for each region/period combination. Three dimensions would construct a cube, and so on. 


You may use the aggr() expression anywhere that expects a list or array of values (technically an n-dimensional hypercube). This may include in a list box, a calculated dimension or in a nested aggregation. 

A nested aggregation is where one aggregation function (eg Sum, Min, Avg etc, not to be confused with the aggr() function) is contained within another. For example, lets say you would like to know the lowest possible summed value (eg Min(Sum(Sales))) for the sales outlets. Sum(Sales) will return a single value, but you need a list of Sum(Sales) by store. This is where aggr() comes in.  The expression becomes    

Min(Aggr(Sum(Sales), Store)) 


For another example, Lets say you want to know the lower 5% and upper 95% percentile of sales by Store:

Fractile(Aggr(Sum(Sales), Store), 0.05)
Fractile(Aggr(Sum(Sales), Store), 0.95)


If the data was arranged in months, and you wanted the expression should be calculated to each month (in a chart with month as a dimension):

Min(Aggr(Sum(Sales), Store, Month))
Fractile(Aggr(Sum(Sales), Store, Month), 0.05)
Fractile(Aggr(Sum(Sales), Store, Month), 0.95)

 
If you have more than one year, you would of course need to have a single year selected before using this chart/table, or you could do something like:

Min(Aggr(Sum({} Sales), Store, Month))
Fractile(Aggr(Sum({
} Sales), Store, Month), 0.05)
Fractile(Aggr(Sum({
} Sales), Store, Month), 0.95)

Aggr() expressions in a chart

 
When you use an aggr() in a chart, it is important to always include ALL the chart dimensions in the aggr() dimensions. This allows the expression to be calculated correctly in each cell in the chart or table.


This can present a problem if you are using calculated dimensions, as the dimensions in the aggr() statement have to be fields, not expressions. If the calculated dimension is simple and based on a single field, and each value corresponds to a single value of that field, then you may be able to use the field name in the aggr() expression.