{ "cells": [ { "cell_type": "markdown", "id": "e9d72aa5", "metadata": {}, "source": [ "# Get data statistics\n", "\n", "This notebook shows how to use the ``get_stats`` method to get statistics from the AsyncAPI.\n", "It returns statistics about the data without loading the data itself, which is useful for getting an overview and deciding which data to load.\n", "\n", "API-24SEA endpoint: [https://api.24sea.eu/routes/v1/datasignals/stats](https://api.24sea.eu/docs/v1/#/operations/datasignals_metrics_stats)\n" ] }, { "cell_type": "code", "execution_count": 5, "id": "cc5263c5", "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "# **Package Imports**\n", "# - From the Python Standard Library\n", "import logging\n", "import os\n", "import sys\n", "\n", "# - API-24SEA\n", "from api_24sea.version import __version__, parse_version\n", "from api_24sea.datasignals.core import API\n" ] }, { "cell_type": "code", "execution_count": 6, "id": "e7b5b41d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Package Version(major=2, minor=1, patch=6, release=None, num=None)\n" ] } ], "source": [ "# **Package Version**\n", "print(f\"Package {parse_version(__version__)}\")\n", "\n", "# **Notebook Configuration**\n", "logger = logging.getLogger()\n", "logger.setLevel(logging.WARNING)\n" ] }, { "cell_type": "markdown", "id": "adebe789", "metadata": {}, "source": [ "
\n", "

Login Credentials

\n", "

Do not store API credentials in plain text in your notebook. Instead, use the python-dotenv package to load them as environment variables from a .env file.
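
As a minimal sketch of the idea, the snippet below copies KEY=VALUE pairs from a .env file into os.environ using only the standard library. The helper name load_env_file is hypothetical and only stands in for python-dotenv's load_dotenv(); the variable names API_24SEA_USERNAME and API_24SEA_PASSWORD are the ones used in this notebook, and the .env content shown in the comments is an assumption.

```python
import os
from pathlib import Path


def load_env_file(path='.env'):
    # Hypothetical stand-in for python-dotenv's load_dotenv().
    # Returns True if the file was found and read, False otherwise.
    env_path = Path(path)
    if not env_path.is_file():
        return False
    for raw_line in env_path.read_text(encoding='utf-8').splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without KEY=VALUE.
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, _, value = line.partition('=')
        # Values are taken verbatim (no quote handling); variables that
        # are already set in the environment are left untouched.
        os.environ.setdefault(key.strip(), value.strip())
    return True


# Assumed .env content (keep this file out of version control):
#   API_24SEA_USERNAME=your.username
#   API_24SEA_PASSWORD=your-password
found = load_env_file()
username = os.environ.get('API_24SEA_USERNAME')
password = os.environ.get('API_24SEA_PASSWORD')
```

With python-dotenv installed, importing load_dotenv from the dotenv module and calling load_dotenv() plays the same role and also handles quoted values.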

\n", "
" ] }, { "cell_type": "code", "execution_count": 7, "id": "8b6454c1", "metadata": {}, "outputs": [], "source": [ "# **Set Sample API Credentials**\n", "os.environ[\"API_24SEA_USERNAME\"] = \"Sample.User\"\n", "os.environ[\"API_24SEA_PASSWORD\"] = \"CheckOutSomeData!\"\n" ] }, { "cell_type": "markdown", "id": "37dbfb82", "metadata": {}, "source": [ "## Statistics Example\n", "\n", "Get statistical parameters for the selected metrics and time range for the specified project and one or more locations.\n", "\n", "The statistics calculated for every metric are the following:\n", "\n", "* ``max_``: maximum value\n", "* ``min_``: minimum value\n", "* ``avg_``: average value\n", "* ``stddev_``: standard deviation\n", "* ``variance_``: variance\n", "* ``count_``: count\n", "* ``countnotnull_``: count of non-null values\n", "* ``median_``: median value\n", "* ``q1_``: first quartile\n", "* ``q3_``: third quartile\n", "* ``mode_``: mode\n", "* ``mintimestamp_``: timestamp of the minimum value\n", "* ``maxtimestamp_``: timestamp of the maximum value\n" ] }, { "cell_type": "code", "execution_count": null, "id": "75cc3229", "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
 "<thead>\n",
 "<tr><th></th><th>metric</th><th>avg</th><th>count</th><th>countnotnull</th><th>max</th><th>maxtimestamp</th><th>median</th><th>min</th><th>mintimestamp</th><th>mode</th><th>q1</th><th>q3</th><th>stddev</th><th>variance</th><th>site</th><th>location</th><th>data_group</th><th>statistic</th><th>short_hand</th><th>print_str</th></tr>\n",
 "</thead>\n",
 "<tbody>\n",
 "<tr><th>0</th><td>mean_WF_A01_pitch</td><td>53.305442</td><td>12887</td><td>12887</td><td>181.092861</td><td>2020-04-30T08:50:00+00:00</td><td>45.153488</td><td>-86.131771</td><td>2020-03-28T18:00:00+00:00</td><td>66.398788</td><td>44.013488</td><td>54.996638</td><td>19.628382</td><td>385.273376</td><td>WindFarm</td><td>WFA01</td><td>scada</td><td>mean</td><td>pitch</td><td>Pitch</td></tr>\n",
 "<tr><th>1</th><td>mean_WF_A01_power</td><td>887.962483</td><td>12887</td><td>12887</td><td>3765.570578</td><td>2020-03-11T16:00:00+00:00</td><td>403.762354</td><td>-57.143372</td><td>2020-05-19T15:50:00+00:00</td><td>0.0</td><td>-3.392881</td><td>1275.300051</td><td>1163.053762</td><td>1352694.052959</td><td>WindFarm</td><td>WFA01</td><td>scada</td><td>mean</td><td>power</td><td>Power</td></tr>\n",
 "<tr><th>2</th><td>mean_WF_A01_rpm</td><td>6.743165</td><td>12887</td><td>12887</td><td>11.244427</td><td>2020-03-12T12:20:00+00:00</td><td>7.025176</td><td>0.0</td><td>2020-03-01T00:20:00+00:00</td><td>0.0</td><td>4.949134</td><td>9.488106</td><td>3.510194</td><td>12.321463</td><td>WindFarm</td><td>WFA01</td><td>scada</td><td>mean</td><td>rpm</td><td>Rotor speed</td></tr>\n",
 "<tr><th>3</th><td>mean_WF_A01_winddirection</td><td>148.826113</td><td>12880</td><td>12880</td><td>340.510079</td><td>2020-03-27T12:00:00+00:00</td><td>183.758007</td><td>7.621053</td><td>2020-05-14T12:20:00+00:00</td><td>21.577095</td><td>60.963421</td><td>211.973614</td><td>81.504743</td><td>6643.023207</td><td>WindFarm</td><td>WFA01</td><td>scada</td><td>mean</td><td>winddirection</td><td>Wind direction</td></tr>\n",
 "<tr><th>4</th><td>mean_WF_A01_windspeed</td><td>11.567049</td><td>12875</td><td>12875</td><td>31.392411</td><td>2020-03-11T19:10:00+00:00</td><td>10.340554</td><td>1.218154</td><td>2020-03-01T01:00:00+00:00</td><td>8.18815</td><td>8.245163</td><td>13.968249</td><td>4.738203</td><td>22.450571</td><td>WindFarm</td><td>WFA01</td><td>scada</td><td>mean</td><td>windspeed</td><td>Wind speed</td></tr>\n",
 "<tr><th>5</th><td>mean_WF_A01_yaw</td><td>200.028123</td><td>12875</td><td>12875</td><td>359.915861</td><td>2020-03-21T06:40:00+00:00</td><td>177.033961</td><td>-43.158374</td><td>2020-03-28T20:50:00+00:00</td><td>315.542861</td><td>107.300361</td><td>326.850061</td><td>116.137796</td><td>13487.987697</td><td>WindFarm</td><td>WFA01</td><td>scada</td><td>mean</td><td>yaw</td><td>Yaw</td></tr>\n",
 "<tr><th>6</th><td>mean_WF_A02_pitch</td><td>55.277081</td><td>12789</td><td>12789</td><td>181.042861</td><td>2020-03-17T14:50:00+00:00</td><td>45.343388</td><td>-89.21203</td><td>2020-04-22T14:50:00+00:00</td><td>66.398788</td><td>44.020188</td><td>56.678288</td><td>23.057883</td><td>531.665981</td><td>WindFarm</td><td>WFA02</td><td>scada</td><td>mean</td><td>pitch</td><td>Pitch</td></tr>\n",
 "<tr><th>7</th><td>mean_WF_A02_power</td><td>838.757361</td><td>12789</td><td>12789</td><td>3765.994109</td><td>2020-03-21T01:00:00+00:00</td><td>356.91991</td><td>-59.017616</td><td>2020-05-19T19:00:00+00:00</td><td>0.0</td><td>-7.338905</td><td>1177.507995</td><td>1131.943198</td><td>1281295.404286</td><td>WindFarm</td><td>WFA02</td><td>scada</td><td>mean</td><td>power</td><td>Power</td></tr>\n",
 "<tr><th>8</th><td>mean_WF_A02_rpm</td><td>6.500222</td><td>12789</td><td>12789</td><td>11.699909</td><td>2020-03-08T05:40:00+00:00</td><td>6.999671</td><td>0.0</td><td>2020-03-01T00:10:00+00:00</td><td>0.0</td><td>4.35985</td><td>9.231297</td><td>3.680881</td><td>13.548884</td><td>WindFarm</td><td>WFA02</td><td>scada</td><td>mean</td><td>rpm</td><td>Rotor speed</td></tr>\n",
 "<tr><th>9</th><td>mean_WF_A02_winddirection</td><td>149.820522</td><td>12778</td><td>12778</td><td>358.519112</td><td>2020-04-09T00:10:00+00:00</td><td>184.18562</td><td>5.366621</td><td>2020-03-09T13:50:00+00:00</td><td>18.63033</td><td>60.98125</td><td>213.173637</td><td>81.914104</td><td>6709.920367</td><td>WindFarm</td><td>WFA02</td><td>scada</td><td>mean</td><td>winddirection</td><td>Wind direction</td></tr>\n",
 "<tr><th>10</th><td>mean_WF_A02_windspeed</td><td>11.240079</td><td>12771</td><td>12771</td><td>30.895205</td><td>2020-03-11T19:10:00+00:00</td><td>10.106204</td><td>0.815915</td><td>2020-03-09T12:20:00+00:00</td><td>7.95811</td><td>7.935652</td><td>13.818259</td><td>4.687034</td><td>21.968291</td><td>WindFarm</td><td>WFA02</td><td>scada</td><td>mean</td><td>windspeed</td><td>Wind speed</td></tr>\n",
 "<tr><th>11</th><td>mean_WF_A02_yaw</td><td>196.959672</td><td>12771</td><td>12771</td><td>359.996361</td><td>2020-03-21T02:20:00+00:00</td><td>174.742861</td><td>-70.056842</td><td>2020-03-28T08:10:00+00:00</td><td>14.642861</td><td>102.685811</td><td>330.054061</td><td>118.936776</td><td>14145.9567</td><td>WindFarm</td><td>WFA02</td><td>scada</td><td>mean</td><td>yaw</td><td>Yaw</td></tr>\n",
 "</tbody>\n",
 "</table>
" ], "text/plain": [ " metric avg count countnotnull max \\\n", "0 mean_WF_A01_pitch 53.305442 12887 12887 181.092861 \n", "1 mean_WF_A01_power 887.962483 12887 12887 3765.570578 \n", "2 mean_WF_A01_rpm 6.743165 12887 12887 11.244427 \n", "3 mean_WF_A01_winddirection 148.826113 12880 12880 340.510079 \n", "4 mean_WF_A01_windspeed 11.567049 12875 12875 31.392411 \n", "5 mean_WF_A01_yaw 200.028123 12875 12875 359.915861 \n", "6 mean_WF_A02_pitch 55.277081 12789 12789 181.042861 \n", "7 mean_WF_A02_power 838.757361 12789 12789 3765.994109 \n", "8 mean_WF_A02_rpm 6.500222 12789 12789 11.699909 \n", "9 mean_WF_A02_winddirection 149.820522 12778 12778 358.519112 \n", "10 mean_WF_A02_windspeed 11.240079 12771 12771 30.895205 \n", "11 mean_WF_A02_yaw 196.959672 12771 12771 359.996361 \n", "\n", " maxtimestamp median min \\\n", "0 2020-04-30T08:50:00+00:00 45.153488 -86.131771 \n", "1 2020-03-11T16:00:00+00:00 403.762354 -57.143372 \n", "2 2020-03-12T12:20:00+00:00 7.025176 0.0 \n", "3 2020-03-27T12:00:00+00:00 183.758007 7.621053 \n", "4 2020-03-11T19:10:00+00:00 10.340554 1.218154 \n", "5 2020-03-21T06:40:00+00:00 177.033961 -43.158374 \n", "6 2020-03-17T14:50:00+00:00 45.343388 -89.21203 \n", "7 2020-03-21T01:00:00+00:00 356.91991 -59.017616 \n", "8 2020-03-08T05:40:00+00:00 6.999671 0.0 \n", "9 2020-04-09T00:10:00+00:00 184.18562 5.366621 \n", "10 2020-03-11T19:10:00+00:00 10.106204 0.815915 \n", "11 2020-03-21T02:20:00+00:00 174.742861 -70.056842 \n", "\n", " mintimestamp mode q1 q3 \\\n", "0 2020-03-28T18:00:00+00:00 66.398788 44.013488 54.996638 \n", "1 2020-05-19T15:50:00+00:00 0.0 -3.392881 1275.300051 \n", "2 2020-03-01T00:20:00+00:00 0.0 4.949134 9.488106 \n", "3 2020-05-14T12:20:00+00:00 21.577095 60.963421 211.973614 \n", "4 2020-03-01T01:00:00+00:00 8.18815 8.245163 13.968249 \n", "5 2020-03-28T20:50:00+00:00 315.542861 107.300361 326.850061 \n", "6 2020-04-22T14:50:00+00:00 66.398788 44.020188 56.678288 \n", "7 2020-05-19T19:00:00+00:00 0.0 -7.338905 
1177.507995 \n", "8 2020-03-01T00:10:00+00:00 0.0 4.35985 9.231297 \n", "9 2020-03-09T13:50:00+00:00 18.63033 60.98125 213.173637 \n", "10 2020-03-09T12:20:00+00:00 7.95811 7.935652 13.818259 \n", "11 2020-03-28T08:10:00+00:00 14.642861 102.685811 330.054061 \n", "\n", " stddev variance site location data_group statistic \\\n", "0 19.628382 385.273376 WindFarm WFA01 scada mean \n", "1 1163.053762 1352694.052959 WindFarm WFA01 scada mean \n", "2 3.510194 12.321463 WindFarm WFA01 scada mean \n", "3 81.504743 6643.023207 WindFarm WFA01 scada mean \n", "4 4.738203 22.450571 WindFarm WFA01 scada mean \n", "5 116.137796 13487.987697 WindFarm WFA01 scada mean \n", "6 23.057883 531.665981 WindFarm WFA02 scada mean \n", "7 1131.943198 1281295.404286 WindFarm WFA02 scada mean \n", "8 3.680881 13.548884 WindFarm WFA02 scada mean \n", "9 81.914104 6709.920367 WindFarm WFA02 scada mean \n", "10 4.687034 21.968291 WindFarm WFA02 scada mean \n", "11 118.936776 14145.9567 WindFarm WFA02 scada mean \n", "\n", " short_hand print_str \n", "0 pitch Pitch \n", "1 power Power \n", "2 rpm Rotor speed \n", "3 winddirection Wind direction \n", "4 windspeed Wind speed \n", "5 yaw Yaw \n", "6 pitch Pitch \n", "7 power Power \n", "8 rpm Rotor speed \n", "9 winddirection Wind direction \n", "10 windspeed Wind speed \n", "11 yaw Yaw " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# **API Initialization**\n", "api = API()\n", "# The Metrics Overview is a table containing all the locations and metrics\n", "# available in the API, together with their metadata. It is useful to query it\n", "# before getting data, to check which locations and metrics are available for\n", "# the site you want to query.\n", "m_o = api.metrics_overview\n", "\n", "# site is the wind farm name you want to query. The parameter can also be\n", "# a list of site names, e.g. [\"windfarm\", \"windfarm2\"]. 
Matching is\n", "# case-insensitive, e.g. \"windfarm\" will match \"WindFarm\". The site ID is also\n", "# accepted, e.g. \"WF\" or \"wf\".\n", "site = \"windfarm\"\n", "# Locations listed in the Metrics Overview for the specified site.\n", "# Partial location names are also accepted, e.g. \"a01\" will match\n", "# all locations containing \"a01\" in their name, such as \"A01\" or \"a01\". Matching\n", "# is case-insensitive.\n", "locations = m_o[m_o[\"site\"].str.lower() == site][\"location\"].unique().tolist()\n", "# Metrics: partial matches and regexes are accepted.\n", "# Spaces are interpreted as .* in the name.\n", "# If you want to query all metrics, pass \"all\" or [\"all\"].\n", "metrics = [\"mean windspeed\", \"mean power\", \"mean yaw\", \"mean rpm\",\n", "           \"mean pitch\", \"mean winddirection\"]\n", "# Start and end timestamps in ISO 8601 format. The API also accepts other\n", "# formats, such as the date-math syntax described at\n", "# https://www.elastic.co/docs/reference/elasticsearch/rest-apis/common-options#date-math\n", "start_timestamp = \"2020-03-01T00:00:00Z\"\n", "end_timestamp = \"2020-06-01T00:00:00Z\"\n", "\n", "stats_df = api.get_stats(site, locations, metrics, start_timestamp, end_timestamp)\n", "stats_df\n" ] } ], "metadata": { "jupytext": { "custom_cell_magics": "kql", "encoding": "# -*- coding: utf-8 -*-" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.1" } }, "nbformat": 4, "nbformat_minor": 5 }