Holiday Fun: Analyzing Facebook Privacy for Ads

So you got a Facebook ID, ticked the boxes in a hurry AND added your work info. Bad Choice. Even small advertisers like me (with 225 fans for Decisionstats) can see aggregate numbers of work info BEFORE even advertising.
This can lead to hilarious results-

See Screenshots below- AND note the numbers

1) 400 US females > age 18 work at IBM, SAP, Oracle or Microsoft AND are interested in Women

2) 2940 US females or males > age 18 work at IBM, SAP, Oracle or Microsoft AND are interested in Women

3) 480 US females > age 18 work at IBM, SAP, Oracle or Microsoft AND are interested in Men AND are married

4) 440 US males > age 18 work at IBM, SAP, Oracle or Microsoft AND are interested in Men

5) 40 US males > age 18 work at IBM, SAP, Oracle or Microsoft AND are interested in Men AND are married

Interested in males/females while giving out your work info AND your marital status. I hope these are, ahem, False Positives, but seriously, do you think these are violations of privacy or not?

PS- I decided not to advertise after seeing the, err, statistics.
PPS- This is meant to showcase lax ad-related privacy for professionals rather than any individual preference or judgment.

PAWS goes to SF

Conference: Message on the LinkedIn group of Decisionstats

 

Predictive Analytics World, Feb 16-17 in San Francisco

The agenda for Predictive Analytics World – Feb. 16-17 2010 in San Francisco – has been posted: http://www.pawcon.com/sanfrancisco/2010/agenda_overview.php

February’s PAW covers hot topics and advanced methods such as social data, uplift modeling (net lift), text mining, massively parallel analytics, in-cloud deployment, and innovative applications that benefit organizations in new and creative ways.

Be sure to register by December 18 for the Super Early Bird to save $400 off the Regular Price:
http://www.predictiveanalyticsworld.com/register.php

And take an additional $50 off the Super Early Bird with discount code: LIN150

Below is some more info – let me know if you have any questions.

-Eric Siegel, Conference Chair

———–

PAW-2010 includes 25 sessions across two tracks, so you can witness how predictive analytics is applied at 1-800-FLOWERS, Amazon.com, AT&T, BBC, Canadian Automobile Association, Charles Schwab, Continental Airlines, Deutsche Postbank, Google, Group RCI, IBM, PASSUR Aerospace, PayPal (eBay), Sun Microsystems, U.S. Army, Visa, Walmart Financial Services, and Younoodle, plus special examples from the U.S. government agencies CBP, NCMI, NGIC, NSA, and SSA.

Keynote speakers include Kim Larsen, Director Advanced Analytics at Charles Schwab, Andreas S. Weigend, Ph.D., Former Chief Scientist at Amazon.com, and Program Chair Eric Siegel, Ph.D., President of Prediction Impact and former Columbia University professor.

Predictive Analytics World is the business-focused event for predictive analytics professionals, managers and commercial practitioners, covering today’s commercial deployment of predictive analytics, across industries and across software vendors.

For more information, including three pre- and post-event workshops:
http://www.predictiveanalyticsworld.com

How to be a BAD blogger?

Here are some tips on being a BAD blogger. This assumes that –

  • you are intelligent enough to know what you speak of (NO-STUPID CLAUSE),
  • are otherwise an interesting person in your offline life,
  • have a good story to tell about yourself, your product or your company (NO-BORING CLAUSE),
  • can spell-check (mostly) (NOT LAZY CLAUSE),
  • can create a free account on wordpress.com or have access to a website where you can post material (NOT LAZY AND STUPID CLAUSES)
  • AND otherwise have a desire to try and be a good blogger.
BAD

Step 1

Credibility

On the Internet everyone is an experienced expert in something.

Ways to wreck credibility-

  1. Offer ads from Adsense before your blog traffic crosses an average of 100 visitors a day and a maximum of 200 visitors a day (visitors, not views).
  2. Take offers like free travel, books and software from people, products and companies, don’t disclose them, and pump them up with flattering reviews.
  3. Scratch the back of a fellow blog monkey. Also known as: you praise me in your blog, I will praise you in mine, and we think we fooled everyone into believing that we are just networking.
  4. Use shock words and images to differentiate.
  5. Offer ads from non-Adsense advertisers before your traffic crosses an average of 500 visitors a day and a maximum of 1000 visitors a day (visitors, not views).
  6. Have only ONE advertiser and offer PRIME placement to news of it AND IGNORE corporate rivals completely.
  7. Claim to know people intimately whom you only know via Facebook Mafia Wars.
  8. Offer stuff to guest bloggers and forget to follow up on the promise.
  9. Spam people on email and tell them how you are spamming them to HELP them with NEW stuff.
  10. Take money from sponsors, and free content from people. Call it aggregation and community. Pocket all the money.
  11. Accept advertising from pornography. Claim you did not know what it was.
  12. Give tips on hacking websites. What goes around will never come around, right?

That should wreck your credibility completely. To build up your credibility, do the reverse of the above.

Hard Work

Hard work never killed anyone, but try to blog on boring stuff. Or on politics, guns, gays and religion (preferably at the same time).

  1. Post a stupid picture of yourself on the about page and tell yourself people don’t care about photos anyway.
  2. Touch up your photo in Adobe Photoshop or post an image that is 10 years younger (or 10 pounds thinner).
  3. Choose a bad theme. Like a violet background and a yellow font.
  4. Post images of your kids or your vacation in a professional blog AND/OR post images of your computer or conferences in a personal blog.
  5. DO NOT SPELL CHECK.
  6. Use HTML 4.0. Pretend that CSS is a hit TV show.
  7. Pretend SEO, Tags and Categories are for others. DO NOT make it easy to search your blog.

WRITING

Coleridge was a drug addict. Poe was an alcoholic. Marlowe was killed by a man whom he was treacherously trying to stab. Pope took money to keep a woman’s name out of a satire then wrote a piece so that she could still be recognized anyhow. Chatterton killed himself. Byron was accused of incest. Do you still want to be a writer – and if so, why?

Bennett Cerf (from http://koti.mbnet.fi/pasenka/quotes/q-writ.htm#Writing%20is%20hell)

  1. Write on politics and guns on a tech blog, or technology on a politics blog.
  2. Write disjointed sentences in a hurry and claim it’s okay, people won’t notice anyway.
  3. Write only in text without ANY Images.
  4. Write 5 posts a day. Or write once in 5 weeks.
  5. Never explore VIDEO or AUDIO in your blog. Podcasts are for frozen peas.
  6. Have an ego bigger than your talent. Write about it.
  7. Be an expert in social media without crossing 1.5 years of blogging, or 25,000 unique visitors, or 100,000 views on the Internet. Twitter followers and LinkedIn connections don’t count. Facebook Fans don’t count either.
  8. Generally make an ass of yourself by not editing or not proof reading your posts.

This should generally make sure that you become a BAD blogger, your blog traffic never crosses into two digits a day and you get back to work on your day job which you are probably good at.

If you do that, tell everyone blogs don’t matter in the 2010’s just as websites never mattered in the 1990’s, or Novels in the 1980’s, or TV in the 1950’s or Talking Pictures in the 1930’s.

Yup.

Born in the USA?

Here is some econometric search-ing I did

Using Google Public Data, Wolfram Alpha and the Bureau of Labor Statistics

United States

United States – Monthly Data

Data Series                       May 2009    June 2009   July 2009   Aug 2009    Sept 2009   Oct 2009
Unemployment Rate (1)             9.4         9.5         9.4         9.7         9.8         10.2
Change in Payroll Employment (2)  -303        -463        -304        -154        (P) -219    (P) -190
Average Hourly Earnings (3)       18.53       18.54       18.59       18.66       (P) 18.67   (P) 18.72
Consumer Price Index (4)          0.1         0.7         0.0         0.4         0.2         0.3
Producer Price Index (5)          0.2         1.7         (P) -1.0    (P) 1.7     (P) -0.6    (P) 0.3
U.S. Import Price Index (6)       1.7         2.7         (R) -0.6    (R) 1.5     (R) 0.2     (R) 0.7

Footnotes
(1) In percent, seasonally adjusted. Annual averages are available for Not Seasonally Adjusted data.
(2) Number of jobs, in thousands, seasonally adjusted.
(3) For production and nonsupervisory workers on private nonfarm payrolls, seasonally adjusted.
(4) All items, U.S. city average, all urban consumers, 1982-84=100, 1-month percent change, seasonally adjusted.
(5) Finished goods, 1982=100, 1-month percent change, seasonally adjusted.
(6) All imports, 1-month percent change, not seasonally adjusted.
(R) Revised
(P) Preliminary

United States – Quarterly Data

Data Series                 3rd Qtr 2008  4th Qtr 2008  1st Qtr 2009  2nd Qtr 2009  3rd Qtr 2009
Employment Cost Index (1)   0.6           0.6           0.3           0.4           0.4
Productivity (2)            -0.1          0.8           0.3           6.9           9.5

Footnotes
(1) Compensation, all civilian workers, quarterly data, 3-month percent change, seasonally adjusted.
(2) Output per hour, nonfarm business, quarterly data, percent change from previous quarter at annual rate, seasonally adjusted.
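
To put the monthly numbers above in perspective, the payroll changes can simply be added up. Here is a minimal Python sketch that uses only the figures quoted above. The variable names are my own, and some of the later months are flagged preliminary (P) in the source, so treat the totals as rough.

```python
# Monthly figures copied from the BLS table quoted above (May to Oct 2009).
# Some later months are flagged preliminary (P) in the source, so totals are rough.
months = ["May", "June", "July", "Aug", "Sept", "Oct"]
payroll_change_thousands = [-303, -463, -304, -154, -219, -190]
unemployment_rate_pct = [9.4, 9.5, 9.4, 9.7, 9.8, 10.2]

# Net change in payroll employment over the six months, in thousands of jobs.
total_payroll_change = sum(payroll_change_thousands)

# Change in the unemployment rate from May to October, in percentage points.
rate_change = round(unemployment_rate_pct[-1] - unemployment_rate_pct[0], 1)

print(f"Net payroll change, {months[0]} to {months[-1]} 2009: {total_payroll_change} thousand jobs")
print(f"Unemployment rate change: {rate_change:+.1f} percentage points")
```

On these numbers, roughly 1.6 million payroll jobs went away in six months.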

Also included are the average salaries of teachers and the average hourly wages of some offshoring-prone occupations:

http://www.bls.gov/oes/2008/may/oes_nat.htm#b25-0000

http://www.bls.gov/oes/2008/may/oes_nat.htm#b11-0000

and

http://www.google.com/publicdata?ds=usunemployment&met=unemployment_rate&idim=state:ST370000:ST540000:ST510000&tdim=true

WHAT THEY PAY TEACHERS (MAY 2008)

Education, Training, and Library Occupations: Wage Estimates

Code     Occupation Title                                  Employment (1)  Median Hourly  Mean Hourly  Mean Annual (2)  Mean RSE (3)
25-0000  Education, Training, and Library Occupations      8,451,250       $21.26         $23.30       $48,460          0.5 %
25-1011  Business Teachers, Postsecondary                  69,690          (4)            (4)          $77,340          1.0 %
25-1021  Computer Science Teachers, Postsecondary          32,520          (4)            (4)          $74,050          1.0 %
25-1022  Mathematical Science Teachers, Postsecondary      45,710          (4)            (4)          $68,130          0.9 %
25-1031  Architecture Teachers, Postsecondary              6,430           (4)            (4)          $75,450          1.9 %
25-1032  Engineering Teachers, Postsecondary               32,070          (4)            (4)          $90,070          1.1 %
25-1041  Agricultural Sciences Teachers, Postsecondary     10,000          (4)            (4)          $77,770          1.6 %
25-1042  Biological Science Teachers, Postsecondary        51,930          (4)            (4)          $83,270          2.7 %

WHAT THEY PAY THEMSELVES

Management Occupations: Wage Estimates

Code     Occupation Title                                  Employment (1)  Median Hourly  Mean Hourly  Mean Annual (2)  Mean RSE (3)
11-0000  Management Occupations                            6,152,650       $42.15         $48.23       $100,310         0.2 %
11-1011  Chief Executives                                  301,930         $76.23         $77.13       $160,440         0.5 %
11-1021  General and Operations Managers                   1,697,690       $44.02         $51.91       $107,970         0.2 %
11-1031  Legislators                                       64,650          (4)            (4)          $37,980          1.1 %

and JOBS PRONE TO SHORTAGE /OFFSHORING

Computer and Mathematical Science Occupations: Wage Estimates

Code     Occupation Title                                  Employment (1)  Median Hourly  Mean Hourly  Mean Annual (2)  Mean RSE (3)
15-0000  Computer and Mathematical Science Occupations     3,308,260       $34.26         $35.82       $74,500          0.3 %
15-1011  Computer and Information Scientists, Research     26,610          $47.10         $48.51       $100,900         1.1 %
15-1021  Computer Programmers                              394,230         $33.47         $35.32       $73,470          0.6 %
15-1031  Computer Software Engineers, Applications         494,160         $41.07         $42.26       $87,900          0.4 %
15-1032  Computer Software Engineers, Systems Software     381,830         $44.44         $45.44       $94,520          0.5 %
15-1041  Computer Support Specialists                      545,520         $20.89         $22.29       $46,370          0.3 %
15-1051  Computer Systems Analysts                         489,890         $36.30         $37.90       $78,830          0.4 %
15-1061  Database Administrators                           115,770         $33.53         $35.05       $72,900          0.8 %
15-1071  Network and Computer Systems Administrators       327,850         $31.88         $33.45       $69,570          0.3 %
15-1081  Network Systems and Data Communications Analysts  230,410         $34.18         $35.50       $73,830          0.4 %
15-1099  Computer Specialists, All Other                   191,780         $36.13         $36.54       $76,000          0.5 %
15-2011  Actuaries                                         18,220          $40.77         $46.14       $95,980          1.4 %
15-2021  Mathematicians                                    2,770           $45.75         $45.65       $94,960          1.7 %
15-2031  Operations Research Analysts                      60,860          $33.17         $35.68       $74,220          0.8 %
15-2041  Statisticians                                     20,680          $34.91         $35.96       $74,790          1.5 %
15-2091  Mathematical Technicians                          1,100           $18.46         $20.24       $42,100          2.7 %
15-2099  Mathematical Science Occupations, All Other       6,600           $26.44         $31.55       $65,630          4.3 %
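
The three occupation groups above are easier to compare as ratios. Here is a rough Python sketch that uses only the group-level mean annual wages quoted above (the 25-0000, 11-0000 and 15-0000 rows). The dictionary and variable names are my own, and it says nothing about the reasons for the gap.

```python
# Mean annual wages (May 2008) for the three major occupation groups quoted above.
mean_annual_wage = {
    "Education, Training, and Library": 48_460,
    "Management": 100_310,
    "Computer and Mathematical Science": 74_500,
}

# Use the education group as the baseline for comparison.
baseline = mean_annual_wage["Education, Training, and Library"]

# Ratio of each group's mean annual wage to the education group's mean.
ratio_to_education = {
    group: round(wage / baseline, 2) for group, wage in mean_annual_wage.items()
}

for group, ratio in ratio_to_education.items():
    print(f"{group}: {ratio:.2f}x the education mean")
```

So on average, management pays a little over twice what education does, with computer and mathematical jobs in between.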

 

UNEMPLOYED IN THE USA (above)

BY STATE (below)

16 million people out of work. Give or take a million.

How can America pay 5.6 million people UNEMPLOYMENT BENEFITS

Keep another 10 million unemployed,

another 10 million only partially employed.

and still claim aggregate cost savings from offshoring jobs?

M2009 Interview Peter Pawlowski AsterData

Here is an interview with Peter Pawlowski, who is the MTS for Data Mining at Aster Data. I ran into Peter at the AsterData booth during M2009, and followed up with an email interview. Also included is a presentation that he co-authored.

Ajay- Describe your career in Science leading up till today.

Peter- Went to Stanford, where I got a BS & MS in Computer Science. I did some work on automated bug-finding tools while at Stanford.
( Note- that sums up the career of almost 60 % of CS scientists)

Ajay- How is life working at Aster Data- what are the challenges and the great stuff?

Peter- Working at Aster is great fun, due to the sheer breadth and variety of the technical challenges. We have problems to solve in optimization, languages, networking, databases, operating systems, etc. It’s been great to think about problems end-to-end & consider the impact of a change on all aspects of the system. I worked on SQL/MR in particular, which had lots of interesting challenges: how do you define the API? how do you integrate with SQL? how do you make it run fast? how do you make it scale?

Ajay- Do you think Universities offer adequate preparation for in-demand skills like MapReduce, Hadoop and Business Intelligence?

Peter- Probably not BI–I learned everything I know about BI while at Aster. In terms of M/R, it’d be useful to have more hands-on experience with distributed systems while at school. We read the MapReduce paper but didn’t get a chance to actually play with M/R. I think that sort of exposure would be useful. We recently made our software available to some students taking a data mining class at Stanford, and they came up with some fascinating use cases for our system, esp. around the Netflix challenge dataset.

Ajay- Describe some of the recent engineering products that you have worked with at Aster

Peter- SQL/MR is the main aspect of nCluster that I’ve worked with–the interesting challenges are described in #2.

Ajay- All BI companies claim to crunch data the fastest, at the lowest price and at the highest quality, as per their marketing brochures. How would you validate your product’s performance scientifically and transparently?

Peter- I’ve found that the hardest part of judging performance is to come up with a realistic workload. There are public benchmarks out there, but they may or may not reflect the kinds of workloads that our customers want to run. Our goal is to make our customers’ experience as good as possible, so we focus on speeding up the sorts of workloads they ask about.
And here is a presentation at Slideshare.net on more of what Peter works on.

SAS and JMP : Visual Data Discovery

While R package developers have a lot to be proud of in R’s graphics packages, the truth of the matter is that the lack of a GUI, even for graphical analysis, makes R’s powerful graphics harder to adopt for statistical analysis. In contrast, SAS and JMP have been combined in the SAS Visual Data Discovery environment.

I really liked the GUI of JMP (which is very rich in stats testing), and combined with the powerful desktop data handling capabilities of SAS, this is clearly an outstanding effort to create terrific graphics (see below).

Note the combination of the two- Great Graphics WITH a GUI. In R, the GUI that comes closest to matching JMP is R Commander, but its graphical capabilities are kept basic, as it is not meant to replace the beloved Kommand prompt

(maybe an expanded plugin for graphics or hexbin would help)

It would be interesting to see an on-demand, EC2 cloud-hosted version of visual data discovery by SAS (with JMP as the front end), even as a limited six-month pilot targeted at the SMB segment. Or a Salesforce.com application that integrates Salesforce.com data with the tests and standard procedures in SAS and JMP.

Note of Discontent- The JMP Website is terrible. It has a different font from the SAS Website (they could at least use the same CSS) and overall is the worst part of the otherwise excellently elegant JMP. Hope they upgrade their website soon (they haven’t done it this year at least).

Screenshot Citation-

http://www.sas.com/technologies/analytics/statistics/datadiscovery/index.html