Embedding a Qualtrics survey in another site

Qualtrics is used widely throughout academia to administer surveys and gather data. It is easy to set up surveys, which automatically look good and the data is stored reliably and is easily exportable.  And it turns out that you can embed Qualtrics surveys in your own web pages to gather data – so when I was faced with a task of recreating a 45 minute series of questionnaires for an online study, I decided to put this to the test.  

5 main benefits of using Qualtrics

  1. No mistakes in storage of data
  2. Can have non coders easily edit questions
  3. Advanced options such as conditional logic, and required questions are all taken care of
  4. No need to build separate infrastructure for accessing data
  5. Less effort needed to style questionnaire

How is it done?

It’s very easy, simply add and <iframe> element in the website with the url pointing to the survey’s anonymous link:

<iframe src="https://universityofsussex.eu.qualtrics.com/jfe/form/SV_xxxxxx" 
title="My questionnaire">
</iframe>

This works amazingly well however you will need to use CSS to adjust the size of the frame. Qualtrics will resize it’s elements but it’s best to give it as much space as possible.

How do I get rid of the ‘Powered by Qualtrics logo’?

This annoying little logo pops up, but you can use a bit of jQuery to remove it (jQuery is served in the Qualtrics iframe, so can be used). Note that qualtrics recommend not removing it, but it doesn’t seem to be against their T&Cs. Simply add following jQuery snippet to the header of your survey so it appears on every page. You can edit the header by going to the ‘Look and Feel’ section of the Qualtrics survey.

<script type="text/javascript">
Qualtrics.SurveyEngine.addOnReady(function() {
	 jQuery('#Plug').attr('style', 'display:none !important');
});
</script>

How do I link the survey data with the user on my app?

If you need to link the response to the survey with something on your main site, you can do it two ways:

  1. Send in an ID via the URL parameters and store it as a variable in qualtrics
  2. Send the Qualtrics response ID back to your app

I actually decided to do both, since it wasn’t apparent to me which would be best for later merging the datafiles.

Setting an ID on Qualtircs via URL parameters

This is as simple as adding ?my_id=xxxx to your URL in the iframe above. Then in Qualtrics, go to your survey, then ‘Survey Flow’ and add an ‘Embedded Data’ element and set the name as ‘my_id’ (or whatever you parameter you use. Do not set a value for this element and you will see it says ‘Value will be set from Panel or URL’. Move this new element to the beginning of the survey and it will cause the value of my_id to be stored as a separate variable with the response data.

Sending Qualtrics response ID back

There are probably a few ways to do this, but I just created a little end point on my application which takes an ID, and the Qualtrics response ID from the URL parameters (e.g. www.example.com/qualtrics_response_id.php?my_id=xxxxx&qualtrics_id=xxxxxx). The end point checks the id matches an existing id (the ids long random strings and so are hard to guess) and if so stores the Qualtrics id alongside. It’s then a case of getting Qualtrics to make a request. I just added some javascript to the first question and used Qualtric’s functionality to pipe in the response ID value:

Qualtrics.SurveyEngine.addOnReady(function()
{
	var myID ="${e://Field/my_id}";
	var responseID= "${e://Field/ResponseID}";
	$.ajax('http://www.example.com?my_id=' + myID + '&amp;qualtrics_id=' + responseID);

});

Allowing respondents to continue/retake survey after exiting

If you store a respondent’s Qualtrics response ID, you can now add it to the URL string with the parameter ?Q_R= to allow retaking of the survey. When they log back in, the existing questions they filled will be the same. Very handy for 45 minute long series of questionnaires!

Summary

Hopefully this is useful to someone out there, it would normally take me a very small amount of time to set up now, so if you need any help feel free to give me a shout!

Automatic longitudinal survey distribution with PHP, email and Qualtrics

With the outbreak of Covid-19, researchers at Sussex decided to move swiftly to measure the psychological impact of lockdown over the course of the pandemic. We had to move quickly using the tools we had: But we had a problem, the surveys were on Qualtrics and the package our University had bought didn’t have the tools to set up a longitudinal survey with 3 weekly surveys (and extra features like reminders etc) – that would enable easily linking the data for analysis later. Moreover the Qualtrics package didn’t include access to the API – which would have made the task a lot simpler!

In the end I found it was quite simple to make an automated solution that consisted of three parts:

  • Have Qualtrics survey send triggers to an email address (with IMAP and SMPT end points)
  • A php script to read the address and update participants’ progress through the longitudinal survey – the ‘reader’.
  • A php scripts to send any emails required to get people to do the next survey / remind them to complete – the ‘sender’.

We initially started the process manually running the scripts, verifying the behaviour, and then moved it to a CRON task, with a daily email update of new emails to be sent in the morning, with the emails themselves being sent later in the day (giving us time to inspect the email to check for any strange things going on). The process works as follows:

  1. A participant completes the initial baseline survey, and this triggers an email containing the date of completion and email address.
  2. The ‘reader’ script picks up the email, and adds the address to a table of users in the database, and assigns them a new id code. A second entry is made in a table of surveys in the database, storing the completion of this baseline survey with this user.
  3. Three weeks pass, and the on the daily run of the ‘sender’ script detects it is due to send a new email – it constructs a link containing URL parameters of the user’s id and the current iteration to the follow up survey. It also creates a new entry in the table of surveys (but marks it with a ‘waiting for completion’ status.
  4. When the participant completes the follow up survey, the URL parameters are stored with it (to allow data merging later) and a new email is triggered.
  5. The ‘reader’ script detects the follow up survey is complete and marks it as complete in the table of surveys – and then the process continues – unless –
  6. If the participant doesn’t complete a survey, the ‘sender’ script detects this from the table of surveys and sends reminders (after a few days).

The whole set up allows for a complete customisation of how to solicit follow up responses to the survey, and stores an explicit and clear representation of the state of each participant throughout the process. The database is specialised to deal only with the survey response state, and not any of the data itself, making it much easier to manage.

The main difficult of setting up was getting to grips with PHP’s imap functions such as imap_open() which wasn’t too easy, and required a fair amount of tinkering (see the comments on that page for very helpful advice). I used PHPMailer to send the emails, which was very easy to use. On the whole it was quite a satisfying small project to work on – there’s something nice and Rube Goldberg about the chain of events. I wondered for a while if it could scale into a usable piece of open source code – but decided that it was too niche to do so – especially considering the fact that the best solution would probably have been to purchase Qualtrics’s enhanced functionality – but if that’s not possible for you then consider this work around!

Modelling Personal Finance in Python

This is for people who want to understand how to make choices in personal finance and know how to program.  Normally, it’s easy enough to use rules of thumb, and to just follow advice / well trodden paths in finance – but if you’re inclined you can model things, and actually predict where you would be in a few years, dependent on choices! 

This post is not financial advice – do not make any real world decisions based on your models without consulting someone who knows what they’re doing and can check what you’ve done. I am not responsible for any decisions you make or any errors in this code.

Most of the time you can predict how a single investment or debt will behave using Excel, but if you want to see how various choices could combine, it quickly becomes difficult to control, and so I decided to make python classes that behave as accounts, mortgages etc so I could model the questions I had.

First I created an account class, that I can use to model savings and investments. It should be fairly self explanatory, it allows transferring between accounts, and has a few extra methods important in evaluating for any goal, the ROI (return on investment) and passive income. Most people’s goal is not simply to amass as much money as possible for no reason. Passive income is important, as the sum of passive income from all your assets is what you could potentially live off (although it’s more complex, see Safe Withdrawal Rates).

# Simple class to model an account with transactions and interest
class Account:

    def __init__(self, balance, apr, name):
        self.balance = balance
        self.apr = apr
        self.name = name

    def applyMonthlyInterest(self):
        self.balance = self.balance + self.passive_income()
        
    def deposit(self, amount):
        self.balance += amount
        
    def withdraw(self, amount):
        self.balance -= amount
        
    def transfer(self, amount, to):
        self.withdraw(amount)
        to.deposit(amount)
    
    def passive_income(self):
        return (self.balance * self.apr / 1200)
    
    def roi(self):
        return self.apr

This class is about as simple as you get for the basic purpose – I subclass it for various accounts which have withdrawal / purchase fees (e.g. funds) and would do also if my accounts had any other weird features (e.g. monthly charges etc). The idea is that with the class format you can lay out the rules for how the account should behave, and you can now run simulations, by iterating through months / years and checking balances.

Say that you want to decide whether to pay off a student loan as a priority or not, you can run a few simulations and edit the behaviour to see what the outcome would be.

# Simulation 1 - pay off student loan with 1000 from wage,
# switch to Investment ISA after done

n_years = 15
account1 = Account(-30000, 1.5, 'Student Loan')
account2 = Account(0, 5, 'Investment ISA')
accounts = [account1, account2]

for y in range(n_years):
        for m in range(12):
            
            save = 1000
            loan_left = 0 - account1.balance
            into_student_loan = save if save < loan_left else loan_left
            into_isa = save - into_student_loan
            account1.deposit(into_student_loan)
            account2.deposit(into_isa)
            
            for account in accounts:
                account.applyMonthlyInterest()

print('Simulation 1')               
for account in accounts:  
        print(account.name, account.value())              

print('Total:', sum(map(lambda x: x.value(), accounts)))



# Simulation 2 - pay of interest only on student loan,
# and put rest of monthly savings in ISA

account1 = Account(-30000, 1.5, 'Student Loan')
account2 = Account(0, 5, 'Investment ISA')
accounts = [account1, account2]

for y in range(n_years):
        for m in range(12):
            
            save = 1000
            interest_to_pay =  0 - (account1.balance + 30000)
            account1.deposit(interest_to_pay)
            account2.deposit(save - interest_to_pay)
            
            for account in accounts:
                account.applyMonthlyInterest()

print('Simulation 2')               
for account in accounts:  
        print(account.name, account.value())              

print('Total:',sum(map(lambda x: x.value(), accounts)))

After 15 years, simulation 1 gives you a final balance of 20,7623, whereas simulation 2 gives you a final balance of 22,8379. However this account doesn’t take into account the fact that I might not make 5% returns – I am not a financial advisor so I will not give you any advice there! There are other things to take into account, tax is a big one; I calculate my tax bill after carefully researching all the rules for allowances etc, I just add them to the simulation. Inflation is another one, I take it into account in my simulations – certain things inflate – costs, and the price of property – whereas balances do not – which is important when evaluating whether property is an investment. Obviously inflation rates are not guaranteed, so I run the simulation in best and worst case scenarios…

The other thing you can do asides from running a month by month strategy of putting cash here or there, is to have triggers at certain points – such as when you are saving for a house. Calculating everything by hand would be a nightmare. Here is a hypothetical situation, one is buying a house as soon as you have a 5% deposit, or waiting a certain amount of time to get a larger deposit and a better value mortgage. Here is a very simple class to deal with a house.

# Simple class to manage a house (rental or mortgage)
class House(Account):
    def __init__(self, balance, apr, name, monthlyExpenditure, borrowedAmount, investedAmount, yearsLeft):
        self.balance = balance
        self.apr = apr
        self.name = name
        self.monthlyExpenditure = monthlyExpenditure
        self.borrowedAmount = borrowedAmount
        self.investedAmount = investedAmount
        self.yearsLeft = yearsLeft
        
    def monthlyMortgageToPay(self):
        interest = self.borrowedAmount * (self.apr / 1200)
        if (self.yearsLeft > 0):
            repay = (self.borrowedAmount / self.yearsLeft) / 12
            return interest + repay
        else:
            return interest
                    
    def doMonthlyUpdate(self):
        self.balance += self.passive_income()
        
        if (self.yearsLeft > 0):
            repay = (self.borrowedAmount / self.yearsLeft) / 12
            self.borrowedAmount -= repay
            self.investedAmount += repay
            self.yearsLeft -= (1/12)
                    
    def value(self):
        return self.investedAmount + self.balance
   
    def passive_income(self):
        return 0 - self.monthlyExpenditure -self.monthlyMortgageToPay()
    
    def roi(self):
        return 1200 * self.passive_income() / self.investedAmount

In these models, we start off with a rental property (no loan but huge outgoing monthly expense) and wait for a trigger to purchase a property with a mortgage. I put the simulation in it’s own function so I can call it with different parameters.

# Run a simple simulation of saving, and once reaching a threshold, purchasing
# a house, to see how well I would do over 5 years

def purchase_simulation(deposit_percentage, mortgage_rate, mortgage_cost):
    
    # Starting situation with cash savings and rental outgoing.
    cashISA = Account(16000,1.5,'Cash ISA')
    house = House(0,0,'Rental flat',1250,0,0,0)

    accounts = [cashISA]
    
    # Calculate price of house transaction - rough UK estimate
    # for properties under 250000
    house_price = 200000
    stamp_duty = (0.02 * (house_price - 125000))
    house_purchase_cost = 4000 + stamp_duty + mortgage_cost
    deposit = house_price * deposit_percentage / 100
    saving_threshold = house_purchase_cost + deposit
    print('Saving threshold:',saving_threshold)
    
    
    bought_house = False
    n_years = 5
    
    for y in range(n_years):
        for m in range(12):

            # 1500 from wage, which is spent on living costs / savings.
            save = 1500
            cashISA.deposit(save)

            for account in accounts:
                account.applyMonthlyInterest()

            # Use the cash ISA to balance the house account
            house.doMonthlyUpdate()
            house.transfer(house.balance, cashISA)

            # Purchase the house immediately after saving enough
            if not bought_house and cashISA.balance >= saving_threshold:
                cashISA.withdraw(saving_threshold)
                house = House(0,mortgage_rate,'House',100,house_price - deposit, deposit, 25)
                bought_house = True
                print('Bought house year:', y + (m/12))
                
    print('Total assets:', sum(map(lambda x: x.balance, accounts)) + house.investedAmount)

# One simulation is based on saving a 5% deposit
print('Simulation - 5% deposit')    
purchase_simulation(5, 2.97, 1000)

# The other simulation is based on saving a 10% deposit, which gets you a better rate
print('Simulation - 10% deposit') 
purchase_simulation(10, 2.24, 1000)

I find that in the second simulation, not only does it take three more years to save, but that I am actually way worse off after 5 years:

Simulation - 5% deposit
Saving threshold: 16500.0
Bought house year: 0.08333333333333333
Total assets: 67365.21932369476
Simulation - 10% deposit
Saving threshold: 26500.0
Bought house year: 3.0833333333333335
Total assets: 43921.627199638795

I think I can say in this case that buying the house sooner is the better strategy.

My recommendation if you decide to do your own models: Use verbose code – you do not want to make any mistakes – spell everything out and double check everything!

Building a project for iOS or OSX with Armadillo in Xcode

Armadillo is a superb linear algebra library for C++, that makes doing things which are easy in MATLAB, almost as easy in C++ . I recently had to port a fairly complex curve fitting algorithm from MATLAB to an Xcode library, to use in an iPad Unity app. Setting the environment up requires a little care. Armadillo is from the C++ world, and it’s not as simple as copying a .framework. There are two different ways of getting Armadillo to work:

  1. Use armadillo simply as a template library, and link directly to LAPACK and BLAS (sounds complicated but isn’t)
  2. Build the runtime library as suggested in README.md

Armadillo mainly consists of headers which ‘wrap’ the more complicated and verbose commands of whichever library that is actually used for doing matrix calculations. For both iOS and OSX, this is Accelerate.framework, which contains optimised versions of LAPACK and BLAS, the components for doing the linear algebra fast.

Armadillo: To build or not to build

You do not have to build the Armadillo run-time library (e.g. using the installation instructions included), but you can if you wish. This includes which ever libraries the installation routine finds into a package, and conveniently for your purposes copies the headers to a sensible location. Alternatively you can just run Armadillo as a pure template library. This way, you include the headers, but don’t need to include the library. This is probably the easiest way and just requires a simple preprocessor directive before you include armadillo. I repeat: you do not have to build the Armadillo run-time library to get it to work on iOS.

If you try to include the run-time library you made on OSX in an iOS project it will not work! Really the only reason you would want to build this library is if you are not using Accelerate.framework for some reason.

Wait what?

Dylib files are dynamic libraries, which you can essential think of as precompiled code that is loaded at runtime, as opposed to code you compile yourself. In order to write code with the .dylib, you need header files, which tell you (and your compiler) what names to call the functions in the library. Libraries that you use in C++ generally come in this form, with both the library, and the header files. Frameworks are in fact just a neat package of dynamic libraries and headers. Armadillo is special – it’s not really a library per se, but is really just some very clever headers that sit on top of other dynamic libraries, which do the work. Since iOS and OSX come with the Accelerate framework as standard, you can just use Armadillo with that – and not bother creating the run-time library, which essentially is just a repackaging of existing libraries.

Problem with BLAS symbols on iOS store

For iOS, one problem that arises when you submit the app store is you will get a rejection unless you tell the armadillo library to NOT use BLAS. The first time I submitted an app I got the following rejection note:

ITMS-90338: Non-public API usage – The app references non-public symbols…_ddot_, _dgemm_, _dgemv_, _dsyrk_.

You might also get the same for any of these symbols:

_sasum_ _dasum_ _snrm2_ _dnrm2_ _sdot_ _ddot_ _sgemv_
 _dgemv_ _cgemv_ _zgemv_ _sgemm_ _dgemm_ _cgemm_ _zgemm_
 _ssyrk_ _dsyrk_ _cherk_ _zherk_

These are all used by BLAS for doing *faster* matrix calculation, but are in fact not necessary for basic functionality. To turn off BLAS go to the armadillo_bits/config.hpp wherever you are linking to armadillo and comment out line 26:

#if !defined(ARMA_USE_BLAS)
//#define ARMA_USE_BLAS
//// Comment out the above line if you don't have BLAS or a high-speed replacement for BLAS,
//// such as OpenBLAS, GotoBLAS, Intel MKL, AMD ACML, or the Accelerate framework.
//// BLAS is used for matrix multiplication.
//// Without BLAS, matrix multiplication will still work, but might be slower.
#endif

This will now mean you can use armadillo without BLAS and the app store will not reject the binary. If you WANT to use BLAS, you could have a look in wrapper_blas.h and def_blas.h and do some replacement with the provided public API (eg. cblas_dgemm) – but this was not necessary for my project and I didn’t have time to check if it would all work (and I have to admit I have more interesting free time projects…)

Recipe for iOS and OSX without run-time library:

  • Follow the installation instructions for Armadillo in the README.md file found in the downloaded archive. This involves building Armadillo with CMake, which will create a .dylib file and headers, and installing them to /usr/local/lib. OR Copy the headers to your project.
  • Add the header’s path (e.g. your project path or usr/local/include, if you installed Armadillo) to your ‘Header Search Paths’ entry in the target’s build settings.
  • Add Accelerate.framework (included with Xcode) to your linked frameworks and libraries.
  • Comment out the #define ARMA_USE_BLAS line in config.hpp if you are submitting on the app store
  • Add #define ARMA_DONT_USE_WRAPPER to the top of your code.Add #include <armadillo> afterwards – Thats it!
  • If you get an ‘armadillo’ file not found error (highlighting the #include <armadillo> this is because the preprocessor cannot find the headers, so you need to check your ‘Header Search Paths’

Recipe for OSX with run-time library:

  • Follow the steps above.
  • Add the dylib file in usr/local/lib to your Linked Frameworks and Libraries. If you add the alias rather than the library itself, when you update armadillo, you won’t have to update the projects that use it. Note that /usr is hidden and in this current version of Xcode, you have to press Command + Shift + . to show hidden files in the finder window.
  • Add the library’s path (e.g. usr/local/lib) to your ‘Library Search Paths’ entry in the target’s build settings. Also, add the headers path to ‘Header Search Paths’.
  • If you get a linker error such as ‘ld: library not found for larmadillo.9.20.7’ – it is because the library cannot be found by the linker. This occurs if you didn’t update your ‘Library Search Paths’ – or you didn’t copy the library to your projects directory, if that was the route you were taking.
  • Happy linear algebra fun in Xcode!

Reading CVPixelBuffer in Objective-C

I wanted to inspect individual pixels in a CVPixelBuffer, but I couldn’t find a recipe online, so after figuring it out, here it is. You can adjust this code to iterate through the CVPixelBuffer too if that’s what you need!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
    CVPixelBufferRef pixelBuffer = _lastDepthData.depthDataMap;
   
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);
   
    size_t cols = CVPixelBufferGetWidth(pixelBuffer);
    size_t rows = CVPixelBufferGetHeight(pixelBuffer);
    size_t bytesPerRow = CVPixelBufferGetBytesPerRow( pixelBuffer );
    uint8_t *baseAddress = CVPixelBufferGetBaseAddress( pixelBuffer );
   
    // This next step is not necessary, but I include it here for illustration,
    // you can get the type of pixel format, and it is associated with a kCVPixelFormatType
    // this can tell you what type of data it is e.g. in this case Float32

    OSType type = CVPixelBufferGetPixelFormatType( pixelBuffer);
   
    if (type != kCVPixelFormatType_DepthFloat32) {
        NSLog(@"Wrong type");
    }
   
    // Arbitrary values of x and y to sample
    int x = 20; // must be lower that cols
    int y = 30; // must be lower than rows
   
    // Get the pixel.  You could iterate here of course to get multiple pixels!
    int baseAddressIndex = y  * bytesPerRow + x  * sizeof(Float32);
    const Float32 pixel = *(Float32*)(&baseAddress[baseAddressIndex]);

    CVPixelBufferUnlockBaseAddress( pixelBuffer, 0 );

Note that the first thing you need to determine is what type of data is in the CVPixelBuffer – if you don’t know this then you can use CVPixelBufferGetPixelFormatType() to find out. In this case I am getting depth data at Float32, if you were using another type e.g. Float16, then you would need to replace all occurrences of Float32 with that type.

Note that it’s important to lock and unlock the base address using CVPixelBufferLockBaseAddress and CVPixelBufferUnlockBaseAddress.

Python in SPSS

Using Python in SPSS is great if you want to do any complex calculations, without having to leave the SPSS environment. Python is much more flexible than SPSS syntax, and it’s actually very easy to use. It is especially useful when you are collaborating with people who are not willing to do all their analysis in python (e.g. with Spyder), yet require complex data processing steps in their analysis – for example converting between colour spaces. The documentation online is actually pretty good, but I thought I’d post a very simple use case, converting a colour from sRGB to LAB colour space. Doing this in SPSS syntax would be very tiring, but in python I can just cut and paste code into a loop and it’s done. Here is the program:

1
2
DELETE VARIABLES LAB_L LAB_A LAB_B.
OUTPUT CLOSE *.

I start off with some syntax that deletes any variables with the same names as those I am about to create. This is useful especially when developing as you might run the script many times and don’t want to have to delete the variables manually each time.

1
2
3
4
5
6
7
8
9
10
11
BEGIN PROGRAM Python.
import spss


spss.StartDataStep()
datasetObj = spss.Dataset()

# Manipulation of variables goes here!

spss.EndDataStep()
END PROGRAM.

This is the standard boilerplate code needed in most Python SPSS scripts. BEGIN PROGRAM and END PROGRAM determine the area in which you are writing python. You then ‘import spss’ and start a data step. Finally you get a dataset object which allows you to iterate through rows and perform manipulations

1
2
3
4
# Create the new variables
datasetObj.varlist.append('LAB_L',0)
datasetObj.varlist.append('LAB_A',0)
datasetObj.varlist.append('LAB_B',0)

Here I create the new variables, initialised with 0. LAB is a colour space with three dimensions, L, A and B.

1
2
3
4
5
6
7
# Get the variable names
rIndex = datasetObj.varlist['r'].index
gIndex = datasetObj.varlist['g'].index
bIndex = datasetObj.varlist['b'].index
LIndex = datasetObj.varlist['LAB_L'].index
AIndex = datasetObj.varlist['LAB_A'].index
BIndex = datasetObj.varlist['LAB_B'].index

In order to perform data manipulations you need the index of the variable in the dataset. I get all these indexes at the start and store them in their own variables.

1
2
3
4
5
6
7
8
9
10
11
12
for idx, row in enumerate(datasetObj.cases):
    r = row[rIndex]
    g = row[gIndex]
    b = row[bIndex]
    if r >= 0 and g >= 0 and b >=0:
        LAB = calc_LAB(r,g,b)
    else:
        LAB = [None, None, None]
       
    datasetObj.cases[idx, LIndex] = LAB[0]
    datasetObj.cases[idx, AIndex] = LAB[1]
    datasetObj.cases[idx, BIndex] = LAB[2]

Here I iterate each row (or case), pulling the R, G and B values into python variables, using the variable indexes (i.e. rIndex, gIndex and bIndex). Then, if each is above 0, I send them to a function calc_LAB (which I will define later). Finally I take the LAB values and put them into the correct place in the dataset.

You can see how powerful this can be, calc_LAB is actually a lengthy function that would be a real chore to program in syntax. Here is the full program with the function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
* Encoding: UTF-8.
*Importing all of the data.
DELETE VARIABLES LAB_L LAB_A LAB_B.
OUTPUT CLOSE *.
BEGIN PROGRAM Python.
import spss


spss.StartDataStep()
datasetObj = spss.Dataset()

# Create the new variables
datasetObj.varlist.append('LAB_L',0)
datasetObj.varlist.append('LAB_A',0)
datasetObj.varlist.append('LAB_B',0)

# Get the variable names
rIndex = datasetObj.varlist['r'].index
gIndex = datasetObj.varlist['g'].index
bIndex = datasetObj.varlist['b'].index
LIndex = datasetObj.varlist['LAB_L'].index
AIndex = datasetObj.varlist['LAB_A'].index
BIndex = datasetObj.varlist['LAB_B'].index


def calc_LAB(R,G,B):
    var_R = float(R) / 255.0
    var_G = float(G) / 255.0
    var_B = float(B) / 255.0


    if ( var_R > 0.04045 ):
        var_R = pow(( ( var_R + 0.055 ) / 1.055 ), 2.4)
    else:                  
        var_R = var_R / 12.92;
    if ( var_G > 0.04045 ):
        var_G = pow(( ( var_G + 0.055 ) / 1.055 ), 2.4)
    else:
        var_G = var_G / 12.92;
    if ( var_B > 0.04045 ):
        var_B = pow(( ( var_B + 0.055 ) / 1.055 ), 2.4)
    else:
        var_B = var_B / 12.92

    var_R = var_R * 100
    var_G = var_G * 100
    var_B = var_B * 100


    X = var_R * 0.4124 + var_G * 0.3576 + var_B * 0.1805
    Y = var_R * 0.2126 + var_G * 0.7152 + var_B * 0.0722
    Z = var_R * 0.0193 + var_G * 0.1192 + var_B * 0.9505

    var_X = X / 95.047
    var_Y = Y / 100.000
    var_Z = Z / 108.883

   
    third = 1.0/3.0
   
    if ( var_X > 0.008856 ):
        var_X = pow(var_X,third)
    else:
        var_X = ( 7.787 * var_X ) + ( 16.0 / 116.0 )
    if ( var_Y > 0.008856 ):
        var_Y = pow(var_Y,third)
    else:
        var_Y = ( 7.787 * var_Y ) + ( 16.0 / 116.0 )
    if ( var_Z > 0.008856 ):
        var_Z = pow(var_Z,third)
    else:
        var_Z = ( 7.787 * var_Z ) + ( 16.0 / 116.0 )


    L = ( 116 * var_Y ) - 16
    A = 500 * ( var_X - var_Y )
    B = 200 * ( var_Y - var_Z )


    return [L, A, B]


for idx, row in enumerate(datasetObj.cases):
    r = row[rIndex]
    g = row[gIndex]
    b = row[bIndex]
    if r >= 0 and g >= 0 and b >=0:
        LAB = calc_LAB(r,g,b)
    else:
        LAB = [None, None, None]
       
    datasetObj.cases[idx, LIndex] = LAB[0]
    datasetObj.cases[idx, AIndex] = LAB[1]
    datasetObj.cases[idx, BIndex] = LAB[2]

spss.EndDataStep()
END PROGRAM.

Draggable Piechart JS class

For a recent project I made a draggable piechart. I thought this might be useful, so I have made it open source for others to use: https://github.com/jamesalvarez/draggable-piechart

Here is the default example:

Your browser is too old!

The default behaviour gives each slice a minimum width. You can change colours and set it to allow collapsing pies:

Your browser is too old!

However you need to provide a way to uncollapse the segments, as below. You can also override the drawing routines, providing your own format arguments for each segment, and create a UI for it to interact with:

Your browser is too old!

Random non-overlapping circles in MATLAB

For a recent project I had to generate a stimulus set consisting of random non-overlapping circles in MATLAB.  My solution was to keep a list of the outside of a gradually increasing outer ring of circles, and add new ones, expanding the ring.  This requirement is similar to circle-packing, except I needed to have gaps between the circles.

If you use especially large gaps, or the ratio between your maximum and minimum circle sizes is great, I detected the occasional overlap.  But for the parameters I needed, I tested over 10000 tests of 400 circles, with no overlaps.

Here is how it works:

1
2
3
4
5
6
% Parameters for the circles
setup.min_circle_radius = 15;
setup.max_circle_radius = 20;
setup.min_gap = 10;
setup.max_gap = 20;
setup.n_circles = 400;

First I set up some parameters – minimum and maximum circle radiuses and gaps, plus the number of circles I want to generate.

1
2
3
4
5
6
7
8
9
10
11
12
function [ circles ] = CirclePacking( setup )
    % Start with a circle in the middle
    first_circle = generate_random_radius_circle( setup );
    first_circle.position = [ 0, 0 ];
 
    % Add second circle at random angle
    second_circle = generate_random_radius_circle( setup );
 
    gap = randi([setup.min_gap, setup.max_gap]);
    total_distance = first_circle.radius + gap + second_circle.radius;
    [x, y] = pol2cart(2 * pi * rand, total_distance);
    second_circle.position = [x, y];

Next I begin with two circles, one in the centre, and another at a random angle from the first.

1
2
3
function [circle] = generate_random_radius_circle( setup )
    circle.radius = randi([setup.min_circle_radius, setup.max_circle_radius]);
end

I represent circles as structs, first defined solely by radius, but then by position. All circles are first generated using this function which makes a random radius.

Back to the main function:

1
2
3
4
5
6
7
8
9
first_circle.before = 2;
first_circle.after = 2;
second_circle.before = 1;
second_circle.after = 1;

% Setup doubly linked list
first_circle.id = 1;
circles(1) = first_circle; % have to set first circle manually
circles = save_circle(second_circle, circles);

I set up a doubly linked list, that is, each circle keeps track of the circle before and after it. With just two, before and after are the same.

1
2
3
4
5
6
function [circles, circle] = save_circle(circle, circles)
    if (~isfield(circle, 'id'))
        circle.id = size(circles,2) + 1;
    end
    circles(circle.id) = circle;
end

‘save_circle’ adds or saves a circle to the main circles array. I couldn’t work out after a google search how to initialise an empty array of structs, so I didn’t waste time doing so.

Back to main function:

1
2
3
4
current_position = 1;
for i = 1:setup.n_circles
    [circles, current_position] = add_new_circle( setup, circles, current_position );
end

To finish the main function, I just call ‘add_new_circle’, maintaining a variable ‘current_position’ which points to a particular circle in the circles array.

This is the end of the main function, most of what happens is going on inside ‘add_new_circle’, but before I go into it, I will show a few more important helper functions:

1
2
3
4
5
6
7
8
9
10
function circles_overlap = circles_overlap(circle1, circle2, min_gap)
    dist = edist(circle1.position, circle2.position);

    circles_overlap = dist &lt; circle1.radius + circle2.radius + min_gap;
end

function edist = edist(a, b)
    % euclidean distance
    edist = sqrt((a(1)-b(1))^2+(a(2)-b(2))^2);
end

The above functions are self explanatory, just checking whether circles overlap, and a function for euclidean distance.

1
2
3
4
5
6
7
function next_circle = next_circle(circle, circles)
    next_circle = circles(circle.after);
end

function previous_circle = previous_circle(circle, circles)
    previous_circle = circles(circle.before);
end

The above functions allow me to get the next and previous circles – this is not necessary really, but I find it’s easier on the eyes using these functions.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
function new_circle = poisiton_new_circle(new_circle, circle1, circle2, gap1, gap2)

    distance1 = new_circle.radius + gap1 + circle1.radius;
    distance2 = new_circle.radius + gap2 + circle2.radius;
    distance_between = edist(circle1.position, circle2.position);

    % Calculate angle to new circle
    angle = get_law_cosines_angle_a(distance1, distance2, distance_between);

    % Calculate angle of first two circles
    vector_first_to_second = circle2.position - circle1.position;
    angle_first_to_second = atan2(vector_first_to_second(2), vector_first_to_second(1));

    angle_of_new_from_first = mod(angle_first_to_second + angle, 2 * pi);
    [x, y] = pol2cart(angle_of_new_from_first, distance1);
    new_circle.position = [x, y] + circle1.position;
end

function angle = get_law_cosines_angle_a(ab,bc,ca)
    num = (ab * ab) + (ca * ca) - (bc * bc);
    den = 2 * ab * ca;

    angle = acos(num / den);
end

The ‘position_new_circle’ function puts a circle next to circle1 and circle2, taking into account the gaps passed in too. It always results in the anticlockwise order circle1, new_circle, circle2. It is using the law of cosines to do this – I won’t explain the maths here.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
function [circles, first_circle_index] = add_new_circle( setup, circles, first_circle_index )
    % circles must contain 2 or more circles

    gap_with_first = randi([setup.min_gap, setup.max_gap]);
    gap_with_second = randi([setup.min_gap, setup.max_gap]);

    first_circle = circles(first_circle_index);
    second_circle = next_circle(first_circle, circles);
    new_circle = generate_random_radius_circle(setup);

    % Check backwards for overlaps
    for i = 1:8;

        new_circle = poisiton_new_circle(new_circle, first_circle, second_circle, gap_with_first, gap_with_second);
        before_circle = previous_circle(first_circle, circles);
        number_to_check = min(size(circles,2) ,8);

        overlaps = false;

        for j=1:number_to_check
            % Check if circle overlaps
            if (before_circle.id == second_circle.id)
                break;
            end

            if (circles_overlap(new_circle, before_circle, setup.min_gap))
                first_circle = before_circle;
                first_circle_index = first_circle.id;
                overlaps = true;
                break;
            end

            before_circle = previous_circle(before_circle, circles);
        end

        if (~overlaps)
            break;
        end
    end

    % Update linked list
    circles = update_list(circles, first_circle, second_circle, new_circle);
end

This is the main part of the algorithm that works out where to put a new circle. It starts by setting the gaps, and taking the key circle, and the circle after that. ‘first_circle_index’ indexes the key circles, all circles are generated around this circle, until overlaps. Often, a new circle will overlap one of the previous circles in the chain. When this occurs I make the overlapping circle the new key circle, and generate circles around that.

I found that checking 8 circles back was more than enough (on the idea that you can fit roughly 6 circles around another circle – if they are similar size) – but this can be modified to check all circles, at a performance hit.

So after positioning a circle between the key circle and the next one, I check circles before the key (making sure I don’t check the same circle). If the circles overlap, I set the key circle to the one that overlaps.

If there are no overlaps, then I update the linked list with that circle.

1
2
3
4
5
6
7
8
9
10
11
function circles = update_list(circles, circle1, circle2, new_circle)
    new_circle.before = circle1.id;
    new_circle.after = circle2.id;
    [circles, new_circle] = save_circle(new_circle, circles);

    circle1.after = new_circle.id;
    circles = save_circle(circle1, circles);

    circle2.before = new_circle.id;
    circles = save_circle(circle2, circles);
end

Note this can very obviously be massively sped up – but remembering that ‘premature optimization is the root of all evil’ – I have not yet needed to so.

A simple introduction to Core Audio

I had to learn Core Audio for a project recently, which despite being notoriously difficult, has been great fun. Starting out, I would have killed for a basic example audio player that didn’t have any unnecessary bells and whistles to just get the basics.  I ended up creating this project, which specifically does two things:
 
Loads an audio file entirely into memory: 
The audio files for my project were very short, but many could be playing at the same time, which means that loading them from disk could be a bottleneck.
Plays the loaded data with the Audio Unit API: 
This is the most low-level way to play audio, thus offers the most control and lowest latency.
 
I have created a project which does this here: 
https://github.com/jamesalvarez/iosCoreAudioPlayer

The code is very straightforward once you become familiar with the API, but in this post I’ll go over the above tasks, with some extra notes which could be useful.  You need to know basic audio processing terms like samples, channels, frames etc, as well as the C language.

 

AudioStreamBasicDescription

This struct contains information that defines a specific format for an audio data stream.  It can get very complicated, but thankfully you can just define the one you want and load the data and play it back using this format.  I chose to use interleaved floats, which means the data comes in a single buffer of floats, with the left and right channels alternating – thus the two samples per stereo frame are always next to each other.  Non-interleaved means you get separate data buffers for the left and right channels.

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
#define CAP_SAMPLE_RATE 44100
#define CAP_CHANNELS 2
#define CAP_SAMPLE_SIZE sizeof(Float32)

AudioStreamBasicDescription const CAPAudioDescription = {
    .mSampleRate = CAP_SAMPLE_RATE,
    .mFormatID = kAudioFormatLinearPCM,
    .mFormatFlags = kAudioFormatFlagIsFloat,
    .mBytesPerPacket = CAP_SAMPLE_SIZE * CAP_CHANNELS,
    .mFramesPerPacket = 1,
    .mBytesPerFrame = CAP_CHANNELS * CAP_SAMPLE_SIZE,
    .mChannelsPerFrame = CAP_CHANNELS,
    .mBitsPerChannel = 8 * CAP_SAMPLE_SIZE, //8 bits per byte
    .mReserved = 0
};

When using other data formats you can sometimes have more than one frame per packet, but here this is not the case so the values are straightforward to fill out, using the size of Float32.

 

ExtAudioFile

It takes quite a few lines to load a file with ExtAudioFile, but the result is that you get your data in whatever format you like.  In the Github example, I add error checking but here for more clarity I will just call the functions without checking that they were successful. When dealing with large audio files, it is better to use a ring buffer, where you load more data from the file into memory as it is played rather than loading it all at once. For short files it’s ok to load them completely as I do here.

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
ExtAudioFileRef audioFile;

// Open file
ExtAudioFileOpenURL(url, &audioFile);

// Get files information
AudioStreamBasicDescription fileAudioDescription;
UInt32 size = sizeof(fileAudioDescription);
ExtAudioFileGetProperty(audioFile,
    kExtAudioFileProperty_FileDataFormat,
    &size,
    &fileAudioDescription);

// Apply audio format
ExtAudioFileSetProperty(audioFile,
    kExtAudioFileProperty_ClientDataFormat,
    sizeof(CAPAudioDescription),
    &CAPAudioDescription);

The first command is to load the audio file with a URL from a CFURLRef, which bridges directly from a NSURL*.  Next we get the the AudioStreamBasicDescription of the file.  We don’t use this for any other purpose than to calculate the length of the file in frames when allocating buffers to load the file into.  Next we set our predefined AudioStreamBasicDescription on the file, so now when we request data, it will come in this format.

 

1
2
3
4
5
6
7
8
9
10
// Determine length in frames (in original file's sample rate)
UInt64 fileLengthInFrames;
size = sizeof(fileLengthInFrames);
ExtAudioFileGetProperty(audioFile,
    kExtAudioFileProperty_FileLengthFrames,
    &size,
    &fileLengthInFrames);

// Calculate the true length in frames, given the original and target sample rates
fileLengthInFrames = ceil(fileLengthInFrames * (CAPAudioDescription.mSampleRate / fileAudioDescription.mSampleRate));

Here, we get the number of frames of the file (in the original sample rate), and calculate the number of frames for the file using our desired sample rate and the original sample rate.

 

1
2
3
4
5
6
7
8
9
10
11
// Prepare AudioBufferList: Interleaved data uses just one buffer, non-interleaved has two
int numberOfBuffers = 1;
int channelsPerBuffer = CAPAudioDescription.mChannelsPerFrame;
int bytesPerBuffer = CAPAudioDescription.mBytesPerFrame * (int)fileLengthInFrames;

AudioBufferList *bufferList = malloc(sizeof(AudioBufferList) + (numberOfBuffers-1)*sizeof(AudioBuffer));

bufferList->mNumberBuffers = numberOfBuffers;
bufferList->mBuffers[0].mData = calloc(bytesPerBuffer, 1);
bufferList->mBuffers[0].mDataByteSize = bytesPerBuffer;
bufferList->mBuffers[0].mNumberChannels = channelsPerBuffer;

Here we create an AudioBufferList, which is used to store the loaded audio. Before doing so we need to know the number of buffers, the number of channels per buffer and the number of bytes per buffer. Since we are using interleaved data, we only need one buffer, which contains two channels. The number of bytes is simply the number of bytes per frame times the file length in frames. The Github example contains more complex code which can handle non-interleaved data – also in this snippet I have excluded checking to see if the malloc and calloc are successful – just for clarity.

 

1
2
3
4
5
// Create a stack copy of the given audio buffer list and offset mData pointers, with offset in bytes
char scratchBufferList_bytes[sizeof(AudioBufferList)];
memcpy(scratchBufferList_bytes, bufferList, sizeof(scratchBufferList_bytes));
AudioBufferList * scratchBufferList = (AudioBufferList*)scratchBufferList_bytes;
scratchBufferList->mBuffers[0].mData = (char*)scratchBufferList->mBuffers[0].mData;

Next we create a second AudioBufferList which is a copy of the first. This is used to load the data in piece by piece. After loading in a chunk of data, the pointers on the scratchBufferList are changed to point to the next section of data on the heap, ready to load the next chunk of data. I copied this technique from the excellent TAAE library.

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// Perform read in multiple small chunks (otherwise ExtAudioFileRead crashes when performing sample rate conversion)
UInt32 readFrames = 0;
while (readFrames < fileLengthInFrames) {
    UInt32 framesLeftToRead = (UInt32)fileLengthInFrames - readFrames;
    UInt32 framesToRead = (16384 < framesLeftToRead) ? framesLeftToRead : 16384; // Set the scratch buffer to point to the correct position on the real buffer
    scratchBufferList->mNumberBuffers = bufferList->mNumberBuffers;
    scratchBufferList->mBuffers[0].mNumberChannels = bufferList->mBuffers[0].mNumberChannels;
    scratchBufferList->mBuffers[0].mData = bufferList->mBuffers[0].mData + (readFrames * CAPAudioDescription.mBytesPerFrame);
    scratchBufferList->mBuffers[0].mDataByteSize = framesToRead * CAPAudioDescription.mBytesPerFrame;

    // Perform read
    ExtAudioFileRead(audioFile, &framesToRead, scratchBufferList);

    // Break if no frames were read
    if ( framesToRead == 0 ) break;

    readFrames += framesToRead;
}

Now we loop whilst reading in frames to the scratch buffer list, which is updated each time to point to the next section of data. There is a maximum number of frames, this seems to be the done thing, I’m not 100% sure why.

 

1
2
3
4
5
6
// Clean up
ExtAudioFileDispose(audioFile);

// BufferList and readFrames are the audio we loaded
audioPlayer->bufferList = bufferList;
audioPlayer->frames = readFrames;

The last thing is to clean up, which just consists of calling ExtAudioFileDispose. I save bufferList and readFrames, to a custom struct which will be used later during the render callback when playing audio.

 

Audio Unit Output

It takes slightly fewer lines to set up the most basic stream for output using Audio Units. Since we only have one Audio Unit we don’t need to use a graph or anything like that. We simply create an output component, set the stream to the correct format, set the render callback and switch it on.

 

1
2
3
4
5
6
7
8
9
10
11
// Description for the output AudioComponent
AudioComponentDescription outputcd = {
    .componentType = kAudioUnitType_Output,
    .componentSubType = kAudioUnitSubType_RemoteIO,
    .componentManufacturer = kAudioUnitManufacturer_Apple,
    .componentFlags = 0,
    .componentFlagsMask = 0
};

// Get the output AudioComponent
AudioComponent comp = AudioComponentFindNext (NULL, &outputcd);

In this first step we create an AudioComponentDescription, which is a struct that describes a particular Audio Unit. In this case, we choose type: kAudioUnitType_Output and sub type: kAudioUnitSubType_RemoteIO to get the AudioUnit which deals in outputting audio to the device. AudioComponentFindNext finds this AudioComponent, so we can begin to use it.

 

1
2
3
4
5
6
7
8
9
10
11
// Create a new instance of the AudioComponent = the AudioUnit
// outputUnit is type AudioUnit
AudioComponentInstanceNew(comp, &outputUnit);

// Set the stream format
AudioUnitSetProperty(outputUnit,
    kAudioUnitProperty_StreamFormat,
    kAudioUnitScope_Input,
    0,
    &CAPAudioDescription,
    sizeof(CAPAudioDescription));

In this step, we create a new instance of the AudioComponent, which gives us the AudioUnit itself, and then we set the stream format using the same stream as we set the file we loaded. This makes it easy when it comes to filling the buffers as no conversion is needed.

 

1
2
3
4
5
6
7
8
9
10
11
12
// Set the render callback
AURenderCallbackStruct callBackStruct = {
    .inputProc = CAPRenderProc,
    .inputProcRefCon = player
};

AudioUnitSetProperty(outputUnit,
    kAudioUnitProperty_SetRenderCallback,
    kAudioUnitScope_Global,
    0,
    &callBackStruct,
    sizeof(callBackStruct));

Here we create a struct that contains the name of our render callback ‘CAPRenderProc’ and void* pointer to anything we like, that will be passed in each time the callback is called. I created a struct which amongst other things points to the buffer of data that was loaded earlier.

 

1
2
3
4
5
// Initialize the Audio Unit
AudioUnitInitialize(outputUnit);

// Start the Audio Unit (sound begins)
AudioOutputUnitStart(outputUnit);

Finally we initialize the audio unit and start it. This will begin calling the render callback for new audio data.

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
static OSStatus CAPRenderProc(void *inRefCon,
    AudioUnitRenderActionFlags *ioActionFlags,
    const AudioTimeStamp *inTimeStamp,
    UInt32 inBusNumber,
    UInt32 inNumberFrames,
    AudioBufferList * ioData) {

    CAPAudioOutput *audioOutput = (CAPAudioOutput*)inRefCon;
    CAPAudioPlayer *audioPlayer = &audioOutput->player;

    UInt32 currentFrame = audioPlayer->currentFrame;
    UInt32 maxFrames = audioPlayer->frames;

    Float32 *outputData = (Float32*)ioData->mBuffers[0].mData;
    Float32 *inputData = (Float32*)audioPlayer->bufferList->mBuffers[0].mData;

    for (UInt32 frame = 0; frame < inNumberFrames; ++frame) {
        UInt32 outSample = frame * 2;
        UInt32 inSample = currentFrame * 2;

        (outputData)[outSample] = (inputData)[inSample];
        (outputData)[outSample+1] = (inputData)[inSample + 1];

        currentFrame++;
        currentFrame = currentFrame % maxFrames; // loop
    }

    audioPlayer->currentFrame = currentFrame;

    return noErr;
}

This is the render callback (designed for clarity as performance isn’t really going to be an issue yet!). Here I am just copying interleaved samples from the previously loaded buffer to the output buffer. I have a struct called CAPAudioPlayer which contains the AudioBufferList, the number of frames, and the current frame, which I first extract. Then I set pointers to the two data buffers. Next I loop over the number of frames the callback has requested, copying from the stored AudioBuffer to the output buffer. Finally I updated the currentFrame, so that audio picks up from the correct spot next time the callback occurs.

The project contains more code, especially how to detect errors, dispose of things, and also how to deal with non-interleaved data. This is about the most basic start to using Core Audio, I hope it was useful!