Training an ML Model

Unavailable or self-documenting column names are marked with an «NA».

MachineIdentifier: Individual machine ID
ProductName: Defender state information e.g. win8defender
EngineVersion: Defender state information e.g. 1.1.12603.0
AppVersion: Defender state information e.g. 4.9.10586.0
AvSigVersion: Defender state information e.g. 1.217.1014.0
IsBeta: Defender state information e.g. false
RtpStateBitfield: NA
IsSxsPassiveMode: NA
DefaultBrowsersIdentifier: ID for the machine’s default browser
AVProductStatesIdentifier: ID for the specific configuration of a user’s antivirus software
AVProductsInstalled: NA
AVProductsEnabled: NA
HasTpm: True if machine has TPM
CountryIdentifier: ID for the country the machine is located in
CityIdentifier: ID for the city the machine is located in
OrganizationIdentifier: ID for the organization the machine belongs to. The organization ID is mapped to both specific companies and broad industries
GeoNameIdentifier: ID for the geographic region a machine is located in
LocaleEnglishNameIdentifier: English name of Locale ID of the current user
Platform: Platform name, calculated from OS-related properties and the processor property
Processor: The processor architecture of the installed operating system
OsVer: Version of the current operating system
OsBuild: Build of the current operating system
OsSuite: Product suite mask for the current operating system.
OsPlatformSubRelease: Returns the OS Platform sub-release (Windows Vista, Windows 7, Windows 8, TH1, TH2)
OsBuildLab: Build lab that generated the current OS. Example: 9600.17630.amd64fre.winblue_r7.150109-2022
SkuEdition: The goal of this feature is to use the Product Type defined in the MSDN to map to a ‘SKU-Edition’ name that is useful in population reporting. The valid Product Types are defined in %sdxroot%\data\windowseditions.xml. This API has been used since Vista and Server 2008, so there are many Product Types that do not apply to Windows 10. The ‘SKU-Edition’ is a string value that is in one of three classes of results. The design must handle each class.
IsProtected: A calculated field, derived from the Spynet Report’s AV Products field, that indicates whether a machine is protected. Returns: a. TRUE if there is at least one active and up-to-date antivirus product running on this machine. b. FALSE if there is no active AV product on this machine, or if the AV is active but is not receiving the latest updates. c. null if there are no antivirus products in the report.
AutoSampleOptIn: This is the SubmitSamplesConsent value passed in from the service, available on CAMP 9+
PuaMode: Pua Enabled mode from the service
SMode: This field is set to true when the device is known to be in ‘S Mode’, as in, Windows 10 S mode, where only Microsoft Store apps can be installed
IeVerIdentifier: NA
SmartScreen: This is the SmartScreen enabled string value from the registry. This is obtained by checking in order, HKLM\SOFTWARE\Policies\Microsoft\Windows\System\SmartScreenEnabled and HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\SmartScreenEnabled. If the value exists but is blank, the value «ExistsNotSet» is sent in telemetry.
Firewall: This attribute is true (1) for Windows 8.1 and above if the Windows firewall is enabled, as reported by the service.
UacLuaenable: This attribute reports whether or not the «administrator in Admin Approval Mode» user type is disabled or enabled in UAC. The value reported is obtained by reading the regkey HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System\EnableLUA.
Census_MDC2FormFactor: A grouping based on a combination of Device Census level hardware characteristics. The logic used to define Form Factor is rooted in business and industry standards and aligns with how people think about their devices. (Examples: Smartphone, Small Tablet, All in One, Convertible…)
Census_DeviceFamily: AKA DeviceClass. Indicates the type of device that an edition of the OS is intended for. Example values: Windows.Desktop, Windows.Mobile, and iOS.Phone
Census_OEMNameIdentifier: NA
Census_OEMModelIdentifier: NA
Census_ProcessorCoreCount: Number of logical cores in the processor
Census_ProcessorManufacturerIdentifier: NA
Census_ProcessorModelIdentifier: NA
Census_ProcessorClass: A classification of processors into high/medium/low. Initially used for Pricing Level SKU. No longer maintained and updated
Census_PrimaryDiskTotalCapacity: Amount of disk space on the primary disk of the machine in MB
Census_PrimaryDiskTypeName: Friendly name of Primary Disk Type – HDD or SSD
Census_SystemVolumeTotalCapacity: The size of the partition that the System volume is installed on in MB
Census_HasOpticalDiskDrive: True indicates that the machine has an optical disk drive (CD/DVD)
Census_TotalPhysicalRAM: Retrieves the physical RAM in MB
Census_ChassisTypeName: Retrieves a numeric representation of what type of chassis the machine has. A value of 0 means xx
Census_InternalPrimaryDiagonalDisplaySizeInInches: Retrieves the physical diagonal length in inches of the primary display
Census_InternalPrimaryDisplayResolutionHorizontal: Retrieves the number of pixels in the horizontal direction of the internal display.
Census_InternalPrimaryDisplayResolutionVertical: Retrieves the number of pixels in the vertical direction of the internal display
Census_PowerPlatformRoleName: Indicates the OEM preferred power management profile. This value helps identify the basic form factor of the device
Census_InternalBatteryType: NA
Census_InternalBatteryNumberOfCharges: NA
Census_OSVersion: Numeric OS version. Example: 10.0.10130.0
Census_OSArchitecture: Architecture on which the OS is based. Derived from OSVersionFull. Example – amd64
Census_OSBranch: Branch of the OS extracted from the OsVersionFull. Example- OsBranch = fbl_partner_eeap where OsVersion = 6.4.9813.0.amd64fre.fbl_partner_eeap.140810-0005
Census_OSBuildNumber: OS Build number extracted from the OsVersionFull. Example – OsBuildNumber = 10512 or 10240
Census_OSBuildRevision: OS Build revision extracted from the OsVersionFull. Example- OsBuildRevision = 1000 or 16458
Census_OSEdition: Edition of the current OS. Sourced from HKLM\Software\Microsoft\Windows NT\CurrentVersion@EditionID in the registry. Example: Enterprise
Census_OSSkuName: OS edition friendly name (currently Windows only)
Census_OSInstallTypeName: Friendly description of the install type used on the machine, e.g. clean
Census_OSInstallLanguageIdentifier: NA
Census_OSUILocaleIdentifier: NA
Census_OSWUAutoUpdateOptionsName: Friendly name of the WindowsUpdate auto-update settings on the machine.
Census_IsPortableOperatingSystem: Indicates whether the OS is booted up and running via Windows-To-Go on a USB stick.
Census_GenuineStateName: Friendly name of OSGenuineStateID. 0 = Genuine
Census_ActivationChannel: Retail license key or Volume license key for a machine.
Census_IsFlightingInternal: NA
Census_IsFlightsDisabled: Indicates if the machine is participating in flighting.
Census_FlightRing: The ring that the device user would like to receive flights for. This might differ from the ring of the currently installed OS if the user changed the ring after getting a flight from a different ring.
Census_ThresholdOptIn: NA
Census_FirmwareManufacturerIdentifier: NA
Census_FirmwareVersionIdentifier: NA
Census_IsSecureBootEnabled: Indicates if Secure Boot mode is enabled.
Census_IsWIMBootEnabled: NA
Census_IsVirtualDevice: Identifies a Virtual Machine (machine learning model)
Census_IsTouchEnabled: Is this a touch device?
Census_IsPenCapable: Is the device capable of pen input?
Census_IsAlwaysOnAlwaysConnectedCapable: Retrieves information about whether the battery enables the device to be AlwaysOnAlwaysConnected.
Wdft_IsGamer: Indicates whether the device is a gamer device or not based on its hardware combination.
Wdft_RegionIdentifier: NA
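
Several of the Census_OS* entries above (architecture, branch, build number, build revision) are described as being extracted from a full OS version string. As a rough sketch of that extraction, the Python snippet below splits such a string into its parts. The layout is inferred only from the two examples given in the list (the OsVersion example under Census_OSBranch and the OsBuildLab example). The names `BuildInfo` and `parse_build_string`, and the stripping of a trailing `fre`/`chk` build flavor from the architecture, are assumptions made for illustration and not the method used to produce the dataset.

```python
import re
from typing import NamedTuple, Optional


class BuildInfo(NamedTuple):
    """Hypothetical container for the parts of a build string."""
    build_number: int
    build_revision: int
    architecture: str
    branch: str
    timestamp: str


# Assumed layout, inferred from the examples in the column list:
#   [<major>.<minor>.]<build>.<revision>.<arch>[fre|chk].<branch>.<yymmdd-hhmm>
# The leading <major>.<minor>. is optional so the OsBuildLab example also parses.
_BUILD_PATTERN = re.compile(
    r"^(?:\d+\.\d+\.)?"                      # optional major.minor prefix
    r"(?P<build>\d+)\.(?P<rev>\d+)\."        # build number and revision
    r"(?P<arch>[a-z0-9]+?)(?:fre|chk)?\."    # architecture, flavor suffix dropped
    r"(?P<branch>[A-Za-z0-9_]+)\."           # branch name
    r"(?P<stamp>\d{6}-\d{4})$"               # build timestamp
)


def parse_build_string(value: str) -> Optional[BuildInfo]:
    """Split a build string into its parts, or return None if it does not match."""
    match = _BUILD_PATTERN.match(value.strip())
    if match is None:
        return None
    return BuildInfo(
        build_number=int(match["build"]),
        build_revision=int(match["rev"]),
        architecture=match["arch"],
        branch=match["branch"],
        timestamp=match["stamp"],
    )


if __name__ == "__main__":
    # Both strings are the examples quoted in the column descriptions.
    print(parse_build_string("6.4.9813.0.amd64fre.fbl_partner_eeap.140810-0005"))
    print(parse_build_string("9600.17630.amd64fre.winblue_r7.150109-2022"))
```

Returning `None` for strings that do not match, rather than raising, is a deliberate choice here: in a column with missing or irregular values, the function can be applied row by row and the unparsed rows inspected afterwards.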

Training an ML Model

Problem statement

Description of the Kaggle dataset

Column descriptions

Code organization

Test notebook (development)

Importing libraries

Loading the data

EDA

RFE

Model selection

Training notebook

Importing libraries

Loading the data

Feature engineering

Model definition and training

Inference notebook

Importing libraries

Loading the data

Feature engineering

Predicting the results

Results and conclusion
