Tag Archives: Analytics
I have been writing freelance for kdnuggets.com.
It's a great learning experience that is helping me become a better writer, especially on analytics and programming.
Here is a list of the articles (interviews are in bold); I will keep updating this list as new ones are added.
- Guide to Data Science Cheat Sheets 2014/05/12
- Book Review: Data Just Right 2014/04/03
- Exclusive Interview: Richard Socher, founder of etcML, Easy Text Classification Startup 2014/03/31
- Trifacta – Tackling Data Wrangling with Automation and Machine Learning 2014/03/17
- Paxata automates Data Preparation for Big Data Analytics 2014/03/07
- etcML Promises to Make Text Classification Easy 2014/03/05
- Wolfram Breakthrough Knowledge-based Programming Language – what it means for Data Science? 2014/03/02
Suppose, just suppose, you want to create random numbers that are reproducible and derived from time stamps.
Here is the code in R
Note: you can also apply a custom function to the system time (I used the log) when generating the random numbers. This creates a list of pseudo-random numbers (since nothing machine-driven is purely random in the strict philosophical sense of the word).
 39621645 99451316 109889294 110275233 278994547 6554596 38654159 68748122 8920823 13293010
 57664241 24533980 174529340 105304151 168006526 39173857 12810354 145341412 241341095 86568818
Possible applications: things that need both random numbers (like encryption keys) and time stamps (like events, web or industrial logs, or pseudo-random pass codes in Google two-factor authentication).
Note: I used the rnorm function, but the distribution function itself (rnorm or rcauchy, for example) could also be picked as a random input.
Again, I would trust my own randomness more than one generated by an arm of the US Govt (see http://www.nist.gov/itl/csd/ct/nist_beacon.cfm ).
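A minimal sketch of the idea, not the exact code that produced the numbers above. The helper names seed_from_time and timestamp_random are hypothetical, and the scaling constants are assumptions chosen only so that time stamps one second apart give different seeds. Base R's generators are not designed for cryptographic use, so treat this as an illustration rather than a way to make real keys.

```r
# Hypothetical helper: turn a time stamp into an integer seed.
seed_from_time <- function(ts, transform = log) {
  # Seconds since 1970-01-01, read in UTC so the result does not depend
  # on the local time zone.
  secs <- as.numeric(as.POSIXct(ts, tz = "UTC"))
  # Apply the custom transform (log, as in the post), scale it so that
  # consecutive seconds map to different values, then wrap the result
  # into the range of a valid integer seed.
  as.integer(floor(transform(secs) * 1e10) %% .Machine$integer.max)
}

# Hypothetical helper: n reproducible pseudo-random integers for a time stamp.
# `draw` can be rnorm, rcauchy, or any function taking a sample size.
timestamp_random <- function(ts, n = 10, draw = rnorm) {
  set.seed(seed_from_time(ts))
  # Scale the draws into large non-negative whole numbers.
  abs(round(draw(n) * 1e8))
}

first_run  <- timestamp_random("2014-05-12 10:00:00")
second_run <- timestamp_random("2014-05-12 10:00:00")

# The same time stamp always reproduces the same numbers.
print(identical(first_run, second_run))
print(first_run)
```

Passing draw = rcauchy swaps the distribution without changing how the seed is derived.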
Update- Random numbers in R
The currently available RNG kinds are given below. kind is partially matched to this list. The default is "Mersenne-Twister".
- "Wichmann-Hill": The seed, .Random.seed[-1] == r[1:3], is an integer vector of length 3, where each r[i] is in 1:(p[i] - 1), where p is the length 3 vector of primes, p = (30269, 30307, 30323). The Wichmann–Hill generator has a cycle length of 6.9536e12 (= prod(p-1)/4, see Applied Statistics (1984) 33, 123 which corrects the original article).
- "Marsaglia-Multicarry": A multiply-with-carry RNG is used, as recommended by George Marsaglia in his post to the mailing list ‘sci.stat.math’. It has a period of more than 2^60 and has passed all tests (according to Marsaglia). The seed is two integers (all values allowed).
- "Super-Duper": Marsaglia’s famous Super-Duper from the 70s. This is the original version which does not pass the MTUPLE test of the Diehard battery. It has a period of about 4.6*10^18 for most initial seeds. The seed is two integers (all values allowed for the first seed: the second must be odd).
We use the implementation by Reeds et al. (1982–84).
The two seeds are the Tausworthe and congruence long integers, respectively. A one-to-one mapping to S’s .Random.seed[1:12] is possible but we will not publish one, not least as this generator is not exactly the same as that in recent versions of S-PLUS.
- "Mersenne-Twister": From Matsumoto and Nishimura (1998). A twisted GFSR with period 2^19937 - 1 and equidistribution in 623 consecutive dimensions (over the whole period). The ‘seed’ is a 624-dimensional set of 32-bit integers plus a current position in that set.
- "Knuth-TAOCP-2002": A 32-bit integer GFSR using lagged Fibonacci sequences with subtraction. That is, the recurrence used is
X[j] = (X[j-100] - X[j-37]) mod 2^30
and the ‘seed’ is the set of the 100 last numbers (actually recorded as 101 numbers, the last being a cyclic shift of the buffer). The period is around 2^129.
- "Knuth-TAOCP": An earlier version from Knuth (1997).
The 2002 version was not backwards compatible with the earlier version: the initialization of the GFSR from the seed was altered. R did not allow you to choose consecutive seeds, the reported ‘weakness’, and already scrambled the seeds.
Initialization of this generator is done in interpreted R code and so takes a short but noticeable time.
- "L'Ecuyer-CMRG": A ‘combined multiple-recursive generator’ from L’Ecuyer (1999), each element of which is a feedback multiplicative generator with three integer elements: thus the seed is a (signed) integer vector of length 6. The period is around 2^191.
The 6 elements of the seed are internally regarded as 32-bit unsigned integers. Neither the first three nor the last three should be all zero, and they are limited to less than 4294967087 and 4294944443 respectively.
This is not particularly interesting of itself, but provides the basis for the multiple streams used in package parallel.
- "user-supplied": Use a user-supplied generator.
RNGkind allows user-coded uniform and normal random number generators to be supplied.
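A short sketch of how the list above can be checked from an R session. It recomputes the Wichmann–Hill cycle length from the three primes, then switches generators with RNGkind and looks at how many seed values each one stores. The kind names are the ones R documents for these generators; the helper seed_length and the seed value 2014 are hypothetical choices made for this illustration.

```r
# Cycle length of Wichmann-Hill, from the three primes quoted above.
p <- c(30269, 30307, 30323)
wh_cycle <- prod(p - 1) / 4
print(format(wh_cycle, digits = 5))

# Remember the current generator so it can be restored afterwards.
old_kind <- RNGkind()

# Hypothetical helper: number of seed values a generator keeps.
seed_length <- function(kind) {
  RNGkind(kind)        # select the uniform generator by name
  set.seed(2014)       # initialise it from a single integer
  # .Random.seed lives in the global environment; its first element
  # encodes the RNG kind, so it is not counted as part of the seed.
  length(get(".Random.seed", envir = globalenv())) - 1L
}

kinds <- c("Wichmann-Hill", "Mersenne-Twister", "L'Ecuyer-CMRG")
seed_lengths <- sapply(kinds, seed_length)
print(seed_lengths)

# Put the original uniform and normal generators back.
RNGkind(old_kind[1], old_kind[2])
```

The lengths printed should line up with the descriptions in the list: three integers for Wichmann–Hill, 624 integers plus a position for the Mersenne Twister, and six integers for L’Ecuyer’s generator.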
I am hosting a six-weekend live online certification course on Business Analytics with R, starting June 1 at Edureka. Check www.edureka.in/r-for-analytics for more details. The course has been designed to offer more open data science than the current expensive offerings, which are tech-oriented rather than business-oriented, and more support and customization than a MOOC. This is because many business customers don’t care whether it is lapply or ddply, command line or GUI, as long as they get a good ROI on the time and money spent in shifting to R from other analytics software.
A message from our sponsors, and my favorite analytics conference (if only I could attend a cool analytics conference nearby in Asia, say Singapore or Turkey. Sighs. Will even useR never come to Asia?)
This is the number 1 conference for analytics in the world, and it is next month in Chicago, USA. So you think you have the best analytics software, product or service? Here is where you can find out!
It’s time to amp-up your analytics strategy. It’s time to beef up your analytics strategy by attending Predictive Analytics World Chicago, June 10-13, 2013. With over 30 case studies from leading organizations across a spectrum of industries, this is the must-attend event for anyone serious about their analytics strategy.
Registration options for all budgets.
PAW Chicago has a variety of conference pass options available to meet budgets of all sizes.
UPDATED: Here are three great examples of a visualization making a process easy to understand. Please click on the images to read them clearly.
1) A visualization of CRISP-DM made by Nicole Leaper (http://exde.wordpress.com/2009/03/13/a-visual-guide-to-crisp-dm-methodology/)
2) A visualization of KDD (Knowledge Discovery in Databases) by Fayyad, whom I have interviewed here at http://www.decisionstats.com/interview-dr-usama-fayyad-founder-open-insights-llc/ , with work by Gregory Piatetsky-Shapiro, who has also been interviewed by this website
3) A visual representation of SEMMA from http://www.dataprix.net/en/blogs/respinosamilla/theory-data-mining
Here is an interview with Pranay Agrawal, Executive Vice President- Global Client Development, Fractal Analytics – one of India’s leading analytics services providers and one of the pioneers in analytics services delivery.
Ajay- Describe Fractal Analytics’ journey as a startup to a pioneer in the Predictive Analytics Services industry. What were some of the key turning points in the field of analytics that you have noticed during these times?
Pranay- In 2000, Fractal Analytics started as a pure-play analytics services company in India with a focus on financial services. Five years later, we spread our operation to the United States and opened new verticals. Today, we have the widest global footprint among analytics providers and have experience handling data and deep understanding of consumer behavior in over 150 countries. We have matured from an analytics service organization to a productized analytics services firm, specializing in consumer goods, retail, financial services, insurance and technology verticals.
We are at the forefront of a massive inflection point with Big Data Analytics at the center. We have witnessed the transformation of analytics within our clients from a cost center to the most critical division that drives competitive advantage. Advances are quickly converging in computer science, artificial intelligence, machine learning and game theory, changing how analytics is consumed by B2B and B2C companies. Companies that use analytics well are poised to excel in innovation, customer engagement and business performance.
Ajay- What are analytical tools that you use at Fractal Analytics? Are there any trends in analytical software usage that you have observed?
Pranay- We are tool-agnostic so that we can serve our clients on whatever platforms they need to quickly and effectively operationalize the results we deliver. We use R, SAS, SPSS, SpotFire, Tableau, Xcelsius, Webfocus, Microstrategy and Qlikview. We are seeing an increase in adoption of open source platforms such as R, and specialized tools for dashboards like Tableau/Qlikview, plus an entire spectrum of emerging tools to process, manage and extract information from Big Data that support Hadoop and NoSQL data structures.
Ajay- What are Fractal Analytics plans for Big Data Analytics?
Pranay- We see our clients being overwhelmed by the increasing complexity of the data. While they are all excited by the possibilities of Big Data, the on-the-ground struggle to realize its full potential continues. The analytics paradigm is changing in the context of Big Data. Our solutions focus on making it super-simple for our clients, combined with the analytics sophistication that is possible with Big Data.
Let’s take our Customer Genomics solution for retailers as an example. Retailers are collecting information about shopper behavior through every transaction. Retailers want to transform their business to make it more customer-centric but do not know how to go about it. Our Customer Genomics solution uses advanced machine learning algorithms to label every shopper across more than 80 different dimensions. Retailers use these to identify which products they should deep-discount depending on what price-sensitive shoppers buy. Armed with this intelligence, they are transforming the way they plan their assortment, planogram and targeted promotions.
We are also building harmonization engines using Concordia to enable real-time update of Customer Genomics based on every direct, social, or shopping transaction. This will further bridge the gap between marketing actions and consumer behavior to drive loyalty, market share and profitability.
Ajay- What are some of the key things that differentiate Fractal Analytics from the rest of the industry? How are you different?
Pranay- We are one of the pioneering pure-play analytics firms, with over a decade of experience consulting with Fortune 500 companies. What clients most appreciate about working with us includes:
- Experience managing structured and unstructured Big Data (volume, variety) with a deep understanding of consumer behavior in more than 150 countries
- Advanced analytics leveraging supervised machine-learning platforms
- Proprietary products, for example: Concordia for data harmonization, Customer Genomics for consumer insights and personalized marketing, Pincer for pricing optimization, Eavesdrop for social media listening, Medley for assortment optimization in the retail industry and Known Value Item for retail stores
- Deep industry expertise that enables us to leverage cross-industry knowledge to solve a wide range of marketing problems
- The lowest attrition rates in the industry and a very selective hiring process, which make us a great place to work
Ajay- What are some of the initiatives that you have taken to ensure employee satisfaction and happiness?
Pranay- We believe happy employees create happy customers. We are building a great place to work by taking a personal interest in grooming people. Our people are highly engaged as evidenced by 33% new hire referrals and the highest Glassdoor ratings in our industry.
We recognize the accomplishments and contributions made through many programs such as:
- FractElite – where peers nominate and defend the best of us
- Recognition board – where anyone can write a visible thank you
- Value cards – where anyone can acknowledge great role model behavior in one or more values
- Townhall – a quarterly all hands where we announce anniversaries and FractElite awards, with an open forum to ask questions
- Employee engagement surveys – to measure and report out on satisfaction programs
- Open access to managers and leadership team – to ensure we understand and appreciate each person’s unique goals and ambitions, coach for high performance, and laud their success
Ajay- How happy are Fractal Analytics customers quantitatively? What is your retention rate- and what plans do you have for 2013?
Pranay- As consultants, delivering value with great service is critical to our growth, which has nearly doubled in the last year. Most of our clients have been with us for over five years and we are typically considered a strategic partner.
We conduct client satisfaction surveys during and after each project to measure our performance and identify opportunities to serve our clients better. In 2013, we will continue partnering with our clients to define additional process improvements from applying best practice in engagement management to building more advanced analytics and automated services to put high-impact decisions into our clients’ hands faster.
Pranay Agrawal - Pranay co-founded Fractal Analytics in 2000 and heads client engagement worldwide. He has an MBA from the Indian Institute of Management (IIM) Ahmedabad, a Bachelor's in Accounting from Bangalore University, and is a Certified Financial Risk Manager from GARP. He is also available online at http://www.linkedin.com/in/pranayfractal
Fractal Analytics is a provider of predictive analytics and decision sciences to the financial services, insurance, consumer goods, retail, technology, pharma and telecommunication industries. Fractal Analytics helps companies compete on analytics and in understanding, predicting and influencing consumer behavior. Over 20 Fortune 500 financial services, consumer packaged goods, retail and insurance companies partner with Fractal to make better data-driven decisions and institutionalize analytics inside their organizations.
Fractal sets up analytical centers of excellence for its clients to tackle tough big data challenges, improve decision management, help understand, predict & influence consumer behavior, increase marketing effectiveness, reduce risk and optimize business results.