As discussed previously, clustering requires a descriptor for each image in the collection. We therefore start by generating a descriptor for each image and save all descriptors into a single array, which becomes the input to clustering. The input is the image collection produced by the pre-cluster phase. Our focus is on highly dissimilar representative images, so we use local image features. The local approach represents each image by a set of local feature descriptors computed at interest points inside the image. We used the SIFT algorithm to detect these points and compute the descriptors of each image.

We then apply the k-means algorithm to the array of image descriptors. In statistics and data mining, k-means clustering is a method of cluster analysis that aims to partition n observations into k clusters such that each observation belongs to the cluster with the nearest mean. Because the data set is large (a minimum of 1000 images and a maximum of 5000 for N=5 windows) and our goal is to generate a precise, minimally redundant, diverse and informative overview of the image collection, we decided to apply k-means twice. We first apply k-means with the chosen value of k to generate a small subset of the image set, and then apply k-means again to that subset to obtain a further reduced, more diverse set. Each run of k-means returns the cluster assignments, the centroids, the per-cluster sums of distances and the point-to-centroid distances.
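To make the four outputs named above concrete, the following is a minimal pure-Python sketch of Lloyd's k-means. It is an illustration only, not the implementation used in this work: the function name `kmeans`, the random initialisation from input points, the `seed` and `iterations` parameters and the use of plain Euclidean distance are all assumptions, and the toy 2-D points stand in for real image descriptors.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's k-means on a list of equal-length feature vectors.

    Returns (labels, centroids, sumd, dists): cluster assignments,
    centroids, per-cluster sums of distances, and the distance from
    every point to every centroid.
    """
    rng = random.Random(seed)
    # Assumption: initialise the centroids with k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)

    for _ in range(iterations):
        # Distance from every point to every centroid.
        dists = [[math.dist(p, c) for c in centroids] for p in points]
        # Assign each point to its nearest centroid.
        new_labels = [row.index(min(row)) for row in dists]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]

    # Recompute the distances against the final centroids.
    dists = [[math.dist(p, c) for c in centroids] for p in points]
    sumd = [
        sum(row[j] for row, lab in zip(dists, labels) if lab == j)
        for j in range(k)
    ]
    return labels, centroids, sumd, dists


# Toy data: two well-separated groups of 2-D "descriptors".
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids, sumd, dists = kmeans(points, k=2)
print(labels)
```

Applying k-means twice, as described above, would amount to calling such a function once on all descriptors, keeping one representative per cluster, and calling it again on the descriptors of those representatives with a smaller k.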

Next, we fetch the centroid image of every cluster, which serves as the representative image of that cluster. The idea is to find, for each cluster, the image with the least distance to the centroid. We take the distances from the k-means output, sort them, and select for each cluster the image nearest to its centroid. These images form the representative set of the image collection.
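The selection step can be sketched as follows. This is a hedged illustration under stated assumptions: the helper name `representatives` is hypothetical, and it assumes the k-means output is available as a list of cluster labels plus a distance table in which `dists[i][j]` is the distance from image i to centroid j. The numbers are made up for the example.

```python
def representatives(labels, dists):
    """For each cluster, return the index of the image nearest its centroid.

    labels[i] is the cluster of image i; dists[i][j] is the distance from
    image i to centroid j (this layout is an assumption of the sketch).
    """
    best = {}  # cluster id -> (smallest distance so far, image index)
    for i, (lab, row) in enumerate(zip(labels, dists)):
        d = row[lab]  # distance of image i to its own cluster's centroid
        if lab not in best or d < best[lab][0]:
            best[lab] = (d, i)
    # Map each cluster id to the index of its representative image.
    return {lab: idx for lab, (_, idx) in sorted(best.items())}


# Five images in two clusters, with illustrative distances.
labels = [0, 0, 0, 1, 1]
dists = [
    [2.0, 9.0],
    [0.5, 8.0],
    [1.5, 7.5],
    [6.0, 1.2],
    [7.0, 0.3],
]
reps = representatives(labels, dists)
print(reps)  # {0: 1, 1: 4}
```

A single pass that tracks the minimum per cluster gives the same result as sorting each cluster's distances and taking the first entry, without needing a full sort.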

After obtaining the subset from the first k-means run, we apply k-means a second time to this representative set, which yields a small and precise representative set of the large image data set. This phase therefore generates the representative set, which is also used in the next phase, the ranking mechanism. The procedure for generating the representative set is given in Algorithm 3.

Algorithm 3: Clustering and generating the representative set

1: input: result image set of the pre-cluster phase
2: output: the representative set
3: for each image img do
       // get the key points and descriptor by calling the sift function
       [image, descriptor] = sift(img)
       // save each image descriptor in an array
       descriptor_images[img] = descriptor
4: end for
5: set the number of clusters k and apply k-means on the descriptor_images array:
       [Id, C, D] = kmeans(descriptor_images, k)
   where Id is the image identification number, C is the assigned cluster number and D is the distance from the assigned cluster and from the other clusters as well.
6: find the centroid image of each cluster:
   for each pair of images i and j of cluster C
       if distance_image_i < distance_image_j
           // store the least-distance image
           centroid = image_i
       end if
       save centroid in the output directory of the representative set
7: end for
