This program computes the Pearson correlation coefficient (R) for a user-supplied dataset of up to 100 (X, Y) pairs and then renders a scatter plot using a 17×25 character array. It calculates the means (MX, MY), the standard deviations (SX, SY), and the cross-product mean to derive R, guarding against division by zero when either standard deviation is zero. The scatter plot maps data points into a grid of 17 rows by 25 columns stored in the string array A$, using a sequence of characters ("*", "2"–"9") to represent overplotted points in the same cell. The vertical axis, which carries X in this program, is drawn with block-graphic vertical bar characters, and the labels of the horizontal (Y) axis are printed using TAB positioning with calculated interval values.
Program Analysis
Program Structure
The program is divided into four logical phases:
- Initialisation (lines 10–90): Prints a title in inverse video, dimensions the string array A$(17,25) and the numeric arrays X(100) and Y(100), and zeroes the accumulators.
- Data entry (lines 100–180): Prompts the user for a sample size N and then collects each X/Y pair in a loop, clearing the screen between entries.
- Statistics computation (lines 190–340): Accumulates sums to compute the means MX and MY, the standard deviations SX and SY, and the Pearson correlation coefficient R.
- Scatter plot (lines 350–720): Finds the data range, normalises coordinates into the 17×25 grid, populates A$ with overplot symbols, and prints the chart with axis labels.
Statistical Computation
The program uses the computational form of variance and covariance. The mean-of-squares minus the square-of-means formula is used for standard deviations:
- SX = SQR((MSQX/N) - MX^2)
- SY = SQR((MSQY/N) - MY^2)
- R = ((CROSS/N) - (MX*MY)) / (SX*SY)
A guard at line 300 skips the R calculation if either standard deviation is zero (constant data), and line 330 prints an appropriate message. Note that ABS is applied before squaring at lines 230–240, which is mathematically redundant since squaring always yields a non-negative result, but causes no harm.
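The calculation described above can be sketched in Python. This is a minimal illustration rather than a port of the listing: the function name `pearson_r` is hypothetical, the variable names follow the BASIC program, and the `max(..., 0.0)` clamp is an added assumption to absorb rounding error that the original does not guard against.

```python
import math

def pearson_r(xs, ys):
    """Return (mx, my, r); r is None when either standard deviation is zero."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    msqx = sum(x * x for x in xs)               # sum of squares; ABS is not needed
    msqy = sum(y * y for y in ys)
    cross = sum(x * y for x, y in zip(xs, ys))  # sum of cross-products
    # Mean of squares minus square of the mean. The clamp to 0.0 is an
    # assumption added here; the BASIC listing has no such guard.
    sx = math.sqrt(max(msqx / n - mx ** 2, 0.0))
    sy = math.sqrt(max(msqy / n - my ** 2, 0.0))
    if sx == 0 or sy == 0:
        return mx, my, None                     # mirrors "R IS NOT COMPUTABLE"
    r = (cross / n - mx * my) / (sx * sy)
    return mx, my, r
```

For perfectly linear data such as `pearson_r([1, 2, 3], [2, 4, 6])` the result for R is 1 up to floating-point rounding, and a constant X column yields `None` in place of R.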
Overplot Symbol Encoding
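Rather than using a simple presence/absence marker, the program encodes point density in the character stored at each grid cell. The cascade of IF tests at lines 560–640 runs from the highest symbol down to the lowest, so each point advances its cell by exactly one step. Tested in the opposite order, a freshly written "*" would immediately match the next test and the cell would cascade all the way to "9" in a single pass. The symbols are: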
| Cell contents | Meaning |
|---|---|
| " " | No data point |
| "*" | 1 point |
| "2"–"9" | 2–9 overlapping points |
The sequence saturates at 9; any additional points beyond 9 at the same cell are silently ignored because no IF branch matches "9" to increment it further.
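To see why the order of the tests matters, the cascade can be modelled in Python. This is an illustrative sketch only: `SYMBOLS`, `DESCENDING`, `ASCENDING` and `cascade` are hypothetical names, and each (old, new) pair stands for one IF ... THEN LET line of the listing.

```python
SYMBOLS = " *23456789"

# Each pair (old, new) models one "IF cell=old THEN LET cell=new" line.
ASCENDING = list(zip(SYMBOLS[:-1], SYMBOLS[1:]))   # " "->"*", "*"->"2", ...
DESCENDING = list(reversed(ASCENDING))             # "8"->"9" first, as in the listing

def cascade(cell, pairs):
    """Apply every test in order; later tests see the result of earlier ones."""
    for old, new in pairs:
        if cell == old:
            cell = new
    return cell

# Plot four points into the same (initially empty) cell.
cell = " "
for _ in range(4):
    cell = cascade(cell, DESCENDING)
print(cell)  # prints 4
```

With `DESCENDING`, an empty cell becomes "*" after one point and a "9" stays at "9". With `ASCENDING`, a single point would drive an empty cell straight to "9", which is the failure the listing's ordering avoids.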
Coordinate Normalisation
Data values are scaled to fit the grid using the formulae at lines 470–480:
- X(I) = INT(((X(I)/XRANGE)*16) + 1.5), which maps to rows 1–17
- Y(I) = INT(((Y(I)/YRANGE)*24) + 1.5), which maps to columns 1–25 (the largest Y value gives INT(25.5) = 25)
The +1.5 offset effectively rounds to the nearest integer while shifting the origin from 0 to 1. This assumes all data values are non-negative; negative X or Y values would produce out-of-bounds indices and likely cause an error.
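The mapping can be checked with a few values in Python. In this sketch `to_cell` is a hypothetical helper, and `math.floor` stands in for the BASIC INT function, which rounds down.

```python
import math

def to_cell(value, vmax, steps):
    """Scale value in 0..vmax to a 1-based grid index, as lines 470-480 do."""
    return math.floor(value / vmax * steps + 1.5)

print(to_cell(0, 10, 16))    # prints 1   (bottom row)
print(to_cell(10, 10, 16))   # prints 17  (top row)
print(to_cell(10, 10, 24))   # prints 25  (rightmost column)
print(to_cell(-5, 10, 16))   # prints -7  (outside the grid)
```

The last call shows how a negative value produces an index that is not a valid subscript for A$.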
Bugs and Anomalies
- Loop variable mismatch (lines 660–700): The outer loop uses L as the control variable (FOR L=17 TO 1 STEP -1), but the IF conditions and PRINT statements at lines 670–680 reference I, and line 700 closes the loop with NEXT I. Since I still holds the value left over from the overplot loop at lines 550–650, the row selection and axis labels are broken as listed. The variable should be L throughout.
- Only non-negative data supported: The range-finding loop (lines 350–400) tracks only the maximum values (XH, YH), and the normalisation divides by these maxima, so datasets containing negative values produce out-of-range indices, and a maximum of zero causes a division by zero.
- A$ column dimension: A$ is dimensioned as (17,25) and the largest Y value maps to column 25, but the clearing loop at lines 500–540 only covers columns 1–24. This is harmless, because DIM already fills a string array with spaces.
- Redundant variables: XRANGE and YRANGE are assigned the same values as XH and YH respectively (lines 410–420), so they add nothing. P is computed at line 440 for the horizontal axis labels, while Z is initialised to XH (line 450) and decremented by IX on each row, so the vertical labels descend from the maximum at the top row to 0 at the bottom. That is the usual orientation for a vertical axis; what is unusual is that X, rather than Y, is plotted vertically.
- Lines 950–970: A CLEAR/SAVE/RUN sequence follows the STOP at line 900 and is unreachable during normal execution; it serves as a manual save utility when invoked directly (for example with GOTO 950).
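The intended behaviour of the row-printing loop, with a single loop variable used throughout, might look like the following Python sketch. It assumes the fix suggested in the first bullet; `render_rows` is a hypothetical name, "|" stands in for the block-graphic bar, and `ljust(4)` approximates TAB 4.

```python
import math

def render_rows(grid, xh):
    """grid[0] is row 1 (lowest X). Return the chart lines from top to bottom."""
    ix = xh / 16            # label step per row, as at line 430
    z = xh                  # label for the top row, as at line 450
    lines = []
    for l in range(17, 0, -1):                   # FOR L=17 TO 1 STEP -1
        label = "0" if l == 1 else str(math.floor(z + 0.5))
        lines.append(label.ljust(4) + " |" + grid[l - 1])
        z -= ix
    return lines

# Example: an empty 17x25 grid with one point in the top-left cell.
grid = [" " * 25 for _ in range(17)]
grid[16] = "*" + " " * 24
for line in render_rows(grid, 16):
    print(line)
```

Because the same variable selects the row and drives the loop, every row of the grid is printed once, with labels running from XH at the top down to 0 at the bottom.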
Notable BASIC Idioms
- The title at line 10 uses inverse-video character escapes (%K%W%I%K%P%L%O%T) to display "KWIKPLOT" in inverse video.
- The horizontal axis border at line 710 is constructed from ZX81 block-graphic characters to draw a continuous line across the chart.
- The TAB keyword is used at line 720 to position a zero and four computed axis labels at evenly spaced column positions in a single PRINT statement.
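The label values printed at line 720 can be reproduced with a short Python sketch. The helper name `axis_labels` is hypothetical; the column numbers are the TAB arguments from the listing, and the final label uses INT(YRANGE) without the +.5 rounding, exactly as line 720 does.

```python
import math

def axis_labels(yrange):
    """Return (column, label) pairs for the horizontal axis, as at line 720."""
    p = yrange / 4                                   # interval, as at line 440
    values = [0]
    values += [math.floor(yrange - k * p + 0.5) for k in (3, 2, 1)]
    values.append(math.floor(yrange))                # last label is not rounded
    return list(zip((5, 10, 16, 22, 28), values))

print(axis_labels(100))
# prints [(5, 0), (10, 25), (16, 50), (22, 75), (28, 100)]
```

When YRANGE is not a multiple of 4 the intermediate labels are rounded to the nearest integer, so `axis_labels(10)` gives the values 0, 3, 5, 8 and 10.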
Content
Source Code
10 PRINT "%K%W%I%K%P%L%O%T"
20 DIM A$(17,25)
30 LET MX=0
40 LET MY=0
50 LET CROSS=0
60 LET MSQX=0
70 LET MSQY=0
80 DIM X(100)
90 DIM Y(100)
100 PRINT AT 10,0;"INPUT YOUR SAMPLE SIZE."
110 INPUT N
120 CLS
130 PRINT AT 10,0;"INPUT YOUR DATA WITH EACH X(VRT)FOLLOWED BY ITS Y(HRZ)."
140 FOR I=1 TO N
150 INPUT X(I)
160 INPUT Y(I)
170 CLS
180 NEXT I
190 FOR I=1 TO N
200 LET MX=MX+X(I)
210 LET MY=MY+Y(I)
220 LET CROSS=CROSS+X(I)*Y(I)
230 LET MSQX=MSQX+(ABS X(I))**2
240 LET MSQY=MSQY+(ABS Y(I))**2
250 NEXT I
260 LET MX=MX/N
270 LET MY=MY/N
280 LET SX=SQR ((MSQX/N)-(MX)**2)
290 LET SY=SQR ((MSQY/N)-(MY)**2)
300 IF SX=0 OR SY=0 THEN GOTO 330
310 LET R=((CROSS/N)-(MX*MY))/(SX*SY)
320 PRINT "MX=";MX;" MY=";MY;" R=";R
330 IF SX=0 OR SY=0 THEN PRINT "R IS NOT COMPUTABLE"
340 PRINT
350 LET XH=X(1)
360 LET YH=Y(1)
370 FOR I=1 TO N
380 IF X(I)>XH THEN LET XH=X(I)
390 IF Y(I)>YH THEN LET YH=Y(I)
400 NEXT I
410 LET XRANGE=XH
420 LET YRANGE=YH
430 LET IX=XRANGE/16
440 LET P=YRANGE/4
450 LET Z=XH
460 FOR I=1 TO N
470 LET X(I)=INT (((X(I)/XRANGE)*16)+1.5)
480 LET Y(I)=INT (((Y(I)/YRANGE)*24)+1.5)
490 NEXT I
500 FOR I=1 TO 17
510 FOR J=1 TO 24
520 LET A$(I,J)=" "
530 NEXT J
540 NEXT I
550 FOR I=1 TO N
560 IF A$(X(I),Y(I))="8" THEN LET A$(X(I),Y(I))="9"
570 IF A$(X(I),Y(I))="7" THEN LET A$(X(I),Y(I))="8"
580 IF A$(X(I),Y(I))="6" THEN LET A$(X(I),Y(I))="7"
590 IF A$(X(I),Y(I))="5" THEN LET A$(X(I),Y(I))="6"
600 IF A$(X(I),Y(I))="4" THEN LET A$(X(I),Y(I))="5"
610 IF A$(X(I),Y(I))="3" THEN LET A$(X(I),Y(I))="4"
620 IF A$(X(I),Y(I))="2" THEN LET A$(X(I),Y(I))="3"
630 IF A$(X(I),Y(I))="*" THEN LET A$(X(I),Y(I))="2"
640 IF A$(X(I),Y(I))=" " THEN LET A$(X(I),Y(I))="*"
650 NEXT I
660 FOR L=17 TO 1 STEP -1
670 IF I=1 THEN PRINT "0";TAB 4;" \: ";A$(I)
680 IF I>1 THEN PRINT INT (Z+.5);TAB 4;" \: ";A$(I)
690 LET Z=Z-IX
700 NEXT I
710 PRINT TAB 4;"\''\':\''\''\''\''\':\''\''\''\''\''\''\':\''\''\''\''\''\':\''\''\''\''\''\':"
720 PRINT TAB 5,0;TAB 10,INT ((YRANGE-3*P)+.5);TAB 16;INT ((YRANGE-2*P)+.5);TAB 22;INT ((YRANGE-P)+.5);TAB 28;INT (YRANGE)
900 STOP
950 CLEAR
960 SAVE "1006%9"
970 RUN
Note: Type-in program listings on this website use ZMAKEBAS notation for graphics characters.
